Cultural Analytics



Exploring 1,000,000 manga pages on HIPerSpace visual supercomputer at Calit2.


PROJECTS / IMAGES / ANIMATIONS / EXHIBITIONS

Cultural Analytics projects

still visualizations (Flickr)

selected exhibitions (Flickr)


OVERVIEW

Cultural Analytics is the use of computational methods for the analysis of massive cultural data sets and flows.

The term "cultural analytics" was coined by Lev Manovich in 2007. At the Software Studies Initiative, we focus on one particular part of the analytics paradigm: using digital image analysis and visualization to work with large image and video collections.

Here are some of the questions which drive our work:

How do we analyze millions of digitized visual artifacts from the past?
How do we explore billions of digital photos and videos (both user-generated content and professional media)?
How do we research interactive media processes and experiences (evolution of web design, playing a video game)?
What new concepts and models do we need to deal with the new scale of born-digital culture?
How can the use of computational techniques and massive cultural data sets help develop cultural theory for the 21st century?

To address these challenges, we are developing techniques and software and applying them to progressively larger image and video sets. In addition to digital humanities, these techniques can also be used in cinema studies, game studies, media studies, ethnography, exhibition design, and other fields.

Cultural analytics shares many ideas and approaches with visual analytics ("the science of analytical reasoning facilitated by visual interactive interfaces") and visual data analysis, defined as follows:

"Visual data analysis blends highly advanced computational methods with sophisticated graphics engines to tap the extraordinary ability of humans to see patterns and structure in even the most complex visual presentations. Currently applied to massive, heterogeneous, and dynamic datasets, such as those generated in studies of astrophysical, fluidic, biological, and other complex processes, the techniques have become sophisticated enough to allow the interactive manipulation of variables in real time. Ultra high-resolution displays allow teams of researchers to zoom in to examine specific aspects of the renderings, or to navigate along interesting visual pathways, following their intuitions and even hunches to see where they may lead. New research is now beginning to apply these sorts of tools to the social sciences and humanities as well, and the techniques offer considerable promise in helping us understand complex social processes like learning, political and organizational change, and the diffusion of knowledge."


Our lab has created a set of open source software tools that cover all parts of visual data analysis for the humanities:

- image and video preparation;

- feature extraction (using digital image analysis);

- interactive visual exploration;

- rendering of high-resolution still and animated visualizations.
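
As a rough illustration of the feature extraction step, the following Python sketch computes two simple visual features, mean brightness and brightness standard deviation, for a grayscale image given as rows of 0-255 pixel values. The function name `extract_features` and the choice of features are assumptions made for illustration only; they are not taken from the lab's tools.

```python
from statistics import mean, pstdev

def extract_features(pixels):
    """Return simple brightness features for a grayscale image.

    `pixels` is a list of rows, each a list of 0-255 intensity values.
    """
    # Flatten the rows into one list of intensity values.
    flat = [value for row in pixels for value in row]
    if not flat:
        raise ValueError("image has no pixels")
    return {
        "mean_brightness": mean(flat),
        "brightness_std": pstdev(flat),
    }

# Hypothetical 2x2 image, for illustration only.
tiny_image = [[0, 255], [255, 0]]
features = extract_features(tiny_image)
```

In practice, each image in a collection would yield one such record of features, and those records would feed the exploration and rendering steps.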

Our software works on regular laptops and desktop computers, as well as on next-generation large-scale displays such as the HIPerSpace visual supercomputer, which has a combined resolution of 42,000 x 8000 pixels. We use open-source technologies whenever possible in our development. Our core methods can be used by people without technical training - an important consideration for their adoption in the humanities. We borrow ideas from information visualization, media design, interfaces of media editing applications, media art, and digital art.

To date, we have applied our techniques to films, animations, video games, comics, magazines, books and other print publications, artworks, photos, and other media content. Examples include all pages of Science and Popular Science magazines published in 1872-1922, hundreds of hours of videogame recordings, all paintings by van Gogh, and one million manga pages. For details, see the Cultural Analytics projects page; you can also find many more visualizations on Flickr and YouTube.

Our past and present collaborators include the Getty Museum, Wikimedia, the Austrian Film Museum, Magnum Photos, the Netherlands Institute for Sound and Image, and other institutions that are interested in using our methodology with their media collections. Since 2009, our work has been shown in 10 exhibitions (Graphic Design Museum, San Diego Museum of Contemporary Art, gallery@calit2, and other venues).

Cultural Analytics research is supported by the National Science Foundation (NSF), the National Endowment for the Humanities (NEH), National Energy Research Scientific Computing Center (NERSC), University of California Humanities Research Institute (UCHRI), University of California, San Diego (UCSD), California Institute for Telecommunications and Information Technology (Calit2), and the Singapore Ministry of Education.

Beginning in Fall 2011, we are releasing fully documented tools we developed as open source to make it easier for people in digital humanities to work with large image and video data sets.



Cultural Analytics software on 287 megapixel HIPerSpace supervisualization system (YouTube)


POWERPOINT PRESENTATIONS

Style Space: Analysis and Visualization of 1 million Manga pages (06/2010). [key 56.6 MB]. [ppt 17.9 MB]

Cultural Visualization Techniques (10/2009). [key 10.3 MB]. [ppt 1.1 MB]

Learning from Software (11/2009). [key 1 MB]. [ppt 500 KB]

Cultural Analytics: vision (last update: 10/2009). [key 26.3 MB]. [ppt 12.8 MB]

Cultural Analytics: case studies (last update: 06/2009). [key 32.4 MB]. [ppt 8.2 MB]


ARTICLES

See Publications


CULTURAL ANALYTICS AND DIGITAL HUMANITIES

The idea of quantitative analysis and visualization of massive cultural visual datasets in the humanities context was originally proposed by Lev Manovich in 2005. The formation of the Software Studies Initiative in 2007 made it possible to begin practical research in Cultural Analytics.

What will happen when more humanists start using interactive supervisualizations as a standard tool in their work, the way many scientists already do? If slides made art history possible, and if a movie projector and video recorder enabled film studies, what new cultural disciplines may emerge out of the use of interactive visualization and data analysis of massive cultural data sets?

Our key goals:
- Better represent the complexity, diversity, variability, and uniqueness of cultural processes and artifacts.
- Create much more inclusive cultural histories and analyses - ideally taking into account all available cultural objects created in a particular cultural area and time period ("art history without names").
- Develop techniques to describe the dimensions of cultural artifacts and processes that until now have received little or no attention (such as gradual historical changes over long periods) and/or are difficult to describe using natural languages (such as motion).
- Create visualization techniques and interfaces for exploring cultural data that operate across multiple scales (think Google Earth) - from the structural details of an individual cultural artifact or process (such as a single shot in a film) to massive cultural data sets and flows (such as all films made in the 20th century).

The cultural analytics paradigm is related to culturomics, introduced by another research team in 2010. However, while culturomics is focused on historical text data ("digitize and analyze data about culture on extremely large scales: all books, all newspapers, all manuscripts, etc."), cultural analytics is more general - its ultimate goal is to analyze both all existing human cultural records in all media and contemporary born-digital cultural data and cultural and social flows.

Our work is closely aligned with the vision of digital humanities put forward by the Office of Digital Humanities at the National Endowment for the Humanities (the U.S. federal agency which funds humanities research). The description of the joint NEH/NSF Digging into Data competition (2009) opens with these questions: “How does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data -- far more than they could read in a lifetime -- what does that mean for research?” The same questions guide our research.

The following is our statement responding to these questions (from our Digging Into Data 2011 Competition application):

"Digitization of massive amounts of cultural artifacts, the rise of born-digital and social media, and the progress in computational tools that can process massive amounts of data makes possible a fundamentally new approach to the humanities. Scholars no longer have to choose between data size and data depth. They can study exact trajectories formed by billions of cultural expressions and conversations in space and time, zooming into particular cultural texts and zooming out to see larger patterns.

New super-visualization technologies specifically designed for research purposes allow interactive exploration of massive media collections which may contain tens of thousands of hours of video and millions of still images. Researchers can quickly generate new questions and hypotheses and immediately test them. This means that researchers can quickly explore many research questions within a fraction of the time previously needed to ask just one question.

Computational analysis and visualization of large cultural data sets allows the detailed analysis of gradual historical patterns that may only manifest themselves over tens of thousands of artifacts created over number of years. Rather than describing the history of any media collection in terms of discrete parts (years, decades, periods, etc.), we can begin to see it as a set of curves, each showing how a particular dimension of form, content, and reception changes over time. In a similar fashion, we can supplement existing data classification with new categories that group together artifacts which share some common characteristics. For instance, rather than only dividing television news programs according to producers, air dates and times, or ratings, we can generate many new programs clusters based on patterns in rhetorical strategies, semantics, and visual form. In another example, we can analyze millions of examples of contemporary graphic design, web design, motion graphics, experience design and other recently developed cultural fields to create their maps which would reveal if they have any stylistic and content clusters."
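
The idea of describing a collection as a set of curves, each tracing one dimension over time, can be sketched in a few lines of Python: group artifacts by year and average one measured feature per year. The records, the feature name `brightness`, and the function `feature_curve` below are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from statistics import mean

def feature_curve(records, feature):
    """Average `feature` per year and return (year, mean) pairs sorted by year."""
    by_year = defaultdict(list)
    for record in records:
        by_year[record["year"]].append(record[feature])
    return [(year, mean(values)) for year, values in sorted(by_year.items())]

# Made-up records for illustration; not real measurements.
records = [
    {"year": 1900, "brightness": 100},
    {"year": 1900, "brightness": 120},
    {"year": 1901, "brightness": 90},
]
curve = feature_curve(records, "brightness")
```

Plotting one such curve per feature would give the kind of continuous view of a collection that the statement contrasts with division into discrete periods.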



Mapping Time exhibition at Calit2, Fall 2010


CULTURAL ANALYTICS AND MEDIA INDUSTRY

Computational analysis of massive cultural and social data sets and data flows is used widely in the media and web industries. It structures the contemporary media universe, cultural production and consumption, and cultural memory. Search engines, spam detection, Netflix and Amazon recommendations, Last.fm, Flickr "interesting" photo rankings, movie success predictions, tools such as Google Books Ngram Viewer, Insights for Search, and Search by Image, and numerous other applications and services all rely on cultural analytics. This work is carried out in media industries and in academia by researchers in data mining, social computing, media computing, music information retrieval, computational linguistics, and other areas of computer science.

As humanities and social science researchers start to apply computational techniques to large data sets in their fields (see the Digging Into Data 2011 competition), many questions arise. What are the new possibilities for studying culture and society made possible by "big data"? Do humanists and social scientists need to develop their own methodologies for working with big data? What is "data" in the case of interactive media? How can new computational methods be combined with more established humanities approaches and theories? Is it possible to study massive media sets without in-depth technical knowledge?

In addition to our practical work on digital humanities projects, we at the Software Studies Initiative (established 2007) are equally interested in exploring such larger questions. We believe that they can only be productively addressed using a "software studies" approach, i.e., an in-depth understanding of the software technologies behind cultural analytics.

Lev Manovich.
