The Nature of Code - new book by Daniel Shiffman


The Processing guru Daniel Shiffman has released his new book, The Nature of Code:

"Can we capture the unpredictable evolutionary and emergent properties of nature in software? Can understanding the mathematical principles behind our physical world help us to create digital worlds? This book focuses on the programming strategies and techniques behind computer simulations of natural systems using Processing."

The book's release is also a perfect example of how publishing should work. You can read the whole book in HTML online (and, of course, the design is excellent). The HTML version includes clear illustrations of all concepts as well as numerous embedded Processing sketches. As you go through the book, you see code examples and, next to them, running sketches (so you can see exactly what the code does dynamically).

You can also download all the code examples, as well as all of the source files for building the book.

You can also buy the book in PDF format, selecting any price you want from $0.00 to $50.00. And you can donate any percentage of the money to the Processing Foundation.

This is simply great. I am looking forward to reading the book and trying the examples, so I can use the techniques discussed by Shiffman for new visualizations. (In my graduate seminar "Big Data, Visualization, and Digital Humanities" at the CUNY Graduate Center in Spring 2013, I plan to introduce humanities students to Processing.)

Data stream, database, timeline (new article by Lev Manovich, part 1)

Data stream, database, timeline: the forms of social media

Lev Manovich

[Part 1]

The interiors of Buddhist temples and Medieval cathedrals, store displays, books, newspapers, magazines, films, motion graphics, manga, museums, car dashboards, wayfinding systems in airports, email, chat, blogs, social networks and every other communication media organize information in particular ways, creating distinct user experiences. These information design patterns [ ] specify how information is presented visually and/or spatially, how it is updated over time, and how users can interact with it. Some of these design patterns are unique to particular communication technologies; others are more general and shared by multiple technologies. We can call such shared patterns the forms of information. In this article, I will discuss three common information forms of web-based social network services: a data stream, a database, and a timeline.

Any discussion of the interfaces of social network services has to be done carefully. First, Twitter, Facebook, Google+ and other companies periodically change the interfaces of their products (think of the Facebook Timeline interface introduced at the end of 2011). Second, the companies are expected to provide access to these services on a number of platforms – full web sites, mobile web sites, apps for iOS and Android, and smart TVs. [ ] Because of the differences in screen size, the types of files which can be accessed, and other technical details, the interface for each platform will have some differences. Third, numerous third-party web sites and mobile apps designed for interacting with the social platforms may also offer alternative interfaces. For example, instead of the single timeline offered by the native Twitter interface, TweetDeck allows the user to curate multiple columns, each displaying a different type of content. (In 2011, over one million applications and clients for Twitter were registered.) [ ] Fourth, users also use third-party applications which aggregate the data from many social networks, providing a single command-center interface (for example, HootSuite). Because of these considerations, it is not meaningful to talk about a single "Twitter interface" or "Facebook interface" as seen by all users. So when I discuss below the details of how these services organize information, these details refer only to the services' own web sites and mobile clients as they are implemented currently (10/2012). However, the general information forms of a data stream, a database, and a timeline are central to both the native and third-party interfaces.

Before digital computers, data was typically recorded in some permanent medium. This meant that the format in which it was presented was also fixed. The invention of interactive graphical computing in the 1960s made it possible to display the same data in various ways on the computer screen. The user experience of the data was no longer dependent on how it was stored (files, relational databases, object-oriented databases, etc.). (More precisely, we should say that the physical representation of data, its logical representation, and its user representation became separated.) In addition, the display could now be updated dynamically in real time, which added more possibilities for displaying data.

When the first Macs in 1984 brought a graphical user interface to the masses, they popularized this new independence. The Finder allowed users to view the files and applications as icons or as a list of items, and sort any view in four different ways.[ ]

The rise of the World Wide Web in the 1990s and of social media in the 2000s expanded this principle to other collections of data. As an implementation of a hypertext system, the design of the web and of graphical web browsers emphasized a particular information form – hyperlinks between separate pages (thus the logical model of the web and the interface view were closely aligned). In many popular illustrations of the web at that time, it was similarly shown as a network of linked documents. However, users were free to link web pages in whatever way they preferred. This led to the emergence of certain common patterns for organizing data that were not originally planned by computer scientists. Rather than being sets of pages all linked to each other, the actual web sites created by users and companies in the 1990s often followed a different organization: a single page presenting a large collection of linked documents, i.e., a curated catalog of data objects. Examples included lists of "favorites" (other sites a user liked), collections of personal photographs, separate radio shows archived on the site of a radio station, etc. Such digital catalogs were also very common in stand-alone digital media products such as CD-ROMs presenting artworks from museum collections.

In my 1998 article Database as a Symbolic Form I called this information form a "database" and opposed it to the historically dominant way of organizing information – a narrative. [ ] I used the word "database" to describe a catalog of objects that does not have a default sort order. (Metaphorically, we can say that in a catalog the objects are organized in space rather than in time.)

As I explained in the beginning of that article, actual computer databases are anything but simple catalogs of objects. Databases allow users to compute and retrieve many kinds of information about the stored data, generate and combine subsets of data, create different views of the data, and perform many other operations. These operations are enabled by database languages such as SQL. Similar to the separation between how data is stored and how it is represented to the user in general computer interfaces, database design also differentiates between three levels of data organization: external (specific database views designed for end users), conceptual (the global view only available to the database administrator), and internal (database implementation). [ ] I wrote: "From the point of view of user's experience a large proportion of them [museum multimedia, personal web pages, company sites, and other types of 1990s new media] are databases in a more basic sense. They appear as collections of items on which the user can perform various operations: view, navigate, search. The user experience of such computerized collections is therefore quite distinct from reading a narrative or watching a film or navigating an architectural site." To underline the importance of the computerized collection, I called the database the "symbolic form" of our time.
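The operations mentioned above – viewing, searching, and subsetting a catalog of objects via SQL – can be illustrated with a small sketch. The catalog contents, table, and column names here are invented for the example; Python's built-in sqlite3 module stands in for whatever database system an actual site would use.

```python
import sqlite3

# A toy catalog of data objects (artworks are a stand-in for any collection).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artworks (title TEXT, artist TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO artworks VALUES (?, ?, ?)",
    [("Water Lilies", "Monet", 1906),
     ("The Scream", "Munch", 1893),
     ("Composition VII", "Kandinsky", 1913)],
)

# "View": the catalog has no inherent order; any sort is imposed at query time.
by_year = [t for (t,) in
           conn.execute("SELECT title FROM artworks ORDER BY year")]

# "Search": retrieve only the subset matching a condition.
modern = [t for (t,) in
          conn.execute("SELECT title FROM artworks WHERE year >= 1900")]

print(by_year)  # ['The Scream', 'Water Lilies', 'Composition VII']
print(modern)   # ['Water Lilies', 'Composition VII']
```

The point of the sketch is the separation the article describes: the stored rows stay the same while each query produces a different user-facing view.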

In the 2000s, the web was reshaped by new economic, social, and technological forces: web commerce (e.g., Amazon, iTunes), blogs, social networks and social media (e.g., Orkut, Flickr, YouTube, Facebook, Twitter, Sina Weibo, and many other media sharing sites and social networks), and mobile computing (smart phones, tablets, ultraportables). So what happens to the database form in this decade? Is it still the key information form, co-existing with other forms (for instance, the spatial narrative in video games)?

I want to suggest that in social media as it has developed so far (2004-2012), the database no longer rules. Instead, social media brings forward a new form: the data stream. Instead of browsing or searching a collection of objects, a user experiences a continuous flow of events. These events are typically presented as a single column. New events appearing on top push the earlier ones out of immediate view. The most important event is always the one about to appear next, because it heightens the experience of the "data present." All events below immediately become "old news" – still worth reading, but not in the same category.
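The mechanics described above – a single column where each new event is prepended and pushes earlier ones out of the immediate view – can be sketched in a few lines. The class and method names here are illustrative, not any platform's actual API.

```python
from collections import deque

class DataStream:
    """Minimal sketch of the data stream form: newest events on top,
    a fixed-size 'immediate view', older events pushed down but still
    reachable by scrolling."""

    def __init__(self, visible=3):
        self.events = deque()   # newest first
        self.visible = visible  # size of the immediate view

    def post(self, event):
        self.events.appendleft(event)  # the newest event appears on top

    def view(self):
        # Only the most recent events are in view; everything below is
        # "old news" – still present, just no longer in the data present.
        return list(self.events)[:self.visible]

stream = DataStream(visible=3)
for n in range(1, 6):
    stream.post(f"event {n}")

print(stream.view())  # ['event 5', 'event 4', 'event 3']
```

Note that, unlike a database catalog, the stream has exactly one inherent order: reverse chronology, centered on whatever arrives next.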

The data streams of Facebook and Twitter are perfect examples of this information form. (In design pattern terminology, it has been called the "activity stream" pattern.) [ ] At the center of Facebook is the News Feed, featuring a constantly updated list of a user's friends' activities: conversations, status updates, profile changes, and other types of events. Even more immediate is the Facebook Ticker, which displays the updates instantly. [ ]

Twitter's design also puts forward the "social stream" experience. Depending on how many users you follow, your experience may be ambient (infrequent updates) or very dynamic, with new tweets from different users appearing rapidly one after another. (According to a Twitter help article, each user is currently limited to 1,000 tweets a day, and a user is normally allowed to follow up to 2,000 other users.) [ ]

In the Facebook and Twitter interfaces, individual broadcasts from spatially distributed users are formatted into a single, constantly growing montage. (However, since no single author organized this montage, the events often have no connection to each other, so "montage" may not be the best term. Nor can we compare this with the surrealists' intentional juxtapositions of completely unrelated objects; if you have many friends with similar backgrounds and interests, at least parts of your stream are likely to refer to similar topics and experiences.)

Watching the collective data stream formatted into a single column can be fascinating and mesmerizing. There is a pleasure in being "in the stream," in watching a rapidly growing conversation or series of comments, in the anticipation of what new messages will appear next. And if you are switching your attention back and forth between the data stream and other social activities such as walking, talking with a friend, or doing homework, nothing important is lost, because you can always scroll down to see the recent events you missed.

The data stream can be called a quintessentially modern experience ("Make it New"), but intensified and sped up. Yet comparing data streams generated by hundreds of millions of people at the same time to navigating a modern metropolis or reading a newspaper a hundred years ago is as useful as comparing 2012 feature films (shot in 4K and put through software in which every pixel can be adjusted) to the first flicks by Edison and the Lumière brothers. What they share pales in comparison to all their differences. You also can't call a stream user a "voyeur," since s/he actively participates in constructing the stream, posting and responding to posts by others.

If you do want to evoke some modern phenomenon, it may be more meaningful to compare the data stream experience to that of the flaneur, as described by Balzac, Charles Baudelaire, Walter Benjamin, and many other modern writers. The flaneur navigates through the flows of passersby and the city streets, enjoying the density of stimuli and information provided by the modern metropolis. He can intensify his experience of "being in the flow" by choosing particular places and times of day. (My own favorite place is the Garosugil area in the Gangnam district of Seoul, which is busy all day, seven days a week.) [ ] Although the Twitter or Facebook user's experience of the data stream may appear passive – just watching from the outside as updates flow across the screen – in fact she is like a flaneur, because she actively seeks information density. Like the flaneur, she can also control it. By subscribing and unsubscribing to different people, groups, and lists, choosing what kinds of events will appear, and controlling her stream in other ways, she can adjust the density of the experience and how predictable or unpredictable the information will be.

In the end, all such comparisons to 19th- or 20th-century modern figures have limited usefulness. Because social networks are used by people for many different purposes and in different ways, with the patterns of use varying between age groups and genders, no single figure (voyeur, flaneur, etc.) can capture it all. (For example, a survey of U.S. Facebook users by the Pew Research Center found that "Women average 21 updates to their Facebook status per month while men average 6." It also found that the average number of friends a user has varies dramatically between age groups.) [ ]

Turning to art, the first artistic representation of the collective web data stream was the amazing installation Listening Post by Mark Hansen and Ben Rubin (2002). [ ] In this installation, bits of conversations pulled from multiple Internet chat rooms and bulletin boards were displayed simultaneously in six dynamic layouts across a large display wall made of 300+ small screens. Listening Post anticipated the data stream interfaces of Facebook and Twitter by about five years – and today it keeps reminding us that these interfaces are not the only possible ways to format data streams.

[End of Part 1]

"Big Data, Visualization, and Digital Humanities" - course at CUNY Graduate Center, Spring 2013

One million manga pages
Exploring a visualization of 1 million manga pages on the 287-megapixel HIPerSpace visualization system at Calit2, 2010.

Big Data, Visualization, and Digital Humanities
CUNY (City University of New York) Graduate Center, 365 5th Avenue, New York City.
Instructor: Lev Manovich
Course numbers: IDS 81650 / MALS 78500
Format: graduate seminar open to PhD and MA students

Want to visit the course and sit in on particular meetings?
It is possible but please email me first [manovich dot lev at gmail dot com].

Classes meet on Mondays, 2pm-4pm. Room: 3309.
First class meeting: Monday, January 28.
CUNY Graduate Center 2012-2013 academic calendar

Schedule, lectures, readings, tutorials, resources (will be updated during the semester).

Use the #digitalgc hashtag on twitter and blog posts.

Graduate Center Digital Initiatives:

Other Digital Humanities / Digital Media courses at CUNY this semester:

Arienne Dwyer - MALS 75500 – Digital Humanities: Methods and Practices - Mondays, 11:45 a.m.-1:45 p.m.

Luke Waltzer and Chris Stein - ITCP 70020 - Interactive Technology and the University: Theory, Design, and Practice - Tuesdays - 4:15 pm - 6:15 pm

Current courses at other universities which cover related topics:

Lauren Klein (CUNY 2011 PhD graduate), Georgia Institute of Technology: Studies in Communication and Culture: Data

Stefan Sinclair, McGill University: Digital Studies/Citizenry

Katy Börner, David Polley, Scott Weingart. Indiana University. Information visualization

Course description:

The explosive growth of social media on the web, combined with the digitization of cultural artifacts by libraries and museums, opens up exciting new possibilities for the study of cultural processes. For the first time, we have access to massive amounts of cultural data from both the past and the present.

How do we navigate and interact with massive cultural collections (billions of objects)?

How do we combine close reading of individual artifacts and “distant reading” of patterns across millions of these artifacts?

What visualization and computational tools are particularly suited for working with large cultural data sets?

How do we use exploratory visualization as a research method in the humanities and social sciences?

How do we understand visualization theoretically in relation to other visual media, past and present?

This course explores the possibilities, the methods, and the tools for working with large cultural data sets, with a particular focus on data visualization and the analysis of visual media (images and video). It also covers relevant work from digital art and design, media theory, and software studies.

We will also discuss cultural, social and technical developments that placed "information" and "data" in the center of contemporary social and economic life (the concepts of information society, network society, software society).

We will critically examine the fundamental paradigms developed by modern societies to analyze patterns in data - statistics, visualization, data mining. This will help us to employ computational tools more reflexively. At the same time, the practical work with these tools will help us to better understand how they are used in society at large - the modes of thinking they enable, their strengths and weaknesses, the often unexamined assumptions behind their use.

Finally, we also want to ask general questions about theory and art in a "postdigital" society.
The arrival of social media and the gradual move of all knowledge and media distribution and cultural communication to networked digital forms has created a new cultural landscape which challenges our existing methods and assumptions:

What new theoretical concepts and models do we need to deal with the new scale of born-digital culture?

What will be covered?

The course is suitable for students from any area of humanities or social sciences.
No technical skills are required beyond basic digital media literacy.

Because I expect people from a variety of backgrounds, I will not go deeply into statistics, data analytics, and data mining. As examples of what we will cover, I will explain PCA and show how to do it in R; I will also talk about the concepts of "features" and "feature space" and show examples of features for text, sound, image, and spatial data.
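The course will show PCA in R, but the idea is language-independent: center the data, decompose it, and project onto the directions that carry the most variance. Here is a minimal sketch in Python/NumPy (not the course material itself); the toy data set is invented, with one feature made nearly redundant so that two components capture almost everything.

```python
import numpy as np

# Toy data: 100 observations, 3 features, with the third feature
# almost a linear copy of the first (so the data is effectively 2D).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = 2 * X[:, 0] + 0.01 * X[:, 2]

Xc = X - X.mean(axis=0)                            # 1. center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # 2. SVD of centered data
explained = S**2 / np.sum(S**2)                    # variance per component
scores = Xc @ Vt[:2].T                             # 3. project onto top 2 PCs

print(scores.shape)  # (100, 2): each observation mapped into feature space
```

For this data, the first two components explain essentially all the variance, which is exactly the kind of redundancy PCA is used to reveal in large cultural feature sets.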

I will demo data analysis and visualization tools and demonstrate their use in class. However, there will be no required technical assignments. (Given the variety of student backgrounds, such assignments would likely be too simple for some and too challenging for others.) You are strongly encouraged to try the tools and techniques shown in class outside of the class meetings, and to use them in your practical course projects. I will provide simple data sets and exercises to work on if you want to learn these tools.

As my own research focuses on the analysis and visualization of images, video, and interactive media, visual media will be the focus of the demos and examples. We will learn about image processing, how to extract features from images, and how to visualize image and video collections. (I will not be giving a comprehensive overview of all the DH tools people use to work with texts, maps, or historical data, but we will cover the basic concepts.)

Info about my research work:

You may want to take a look at my most recent classes, since the new class will draw from them:

Course structure:

1/3 instructor presentations;
1/3 discussions of readings and relevant projects from digital media, art, design, cinema, artistic visualization, architecture, museum design, digital humanities;
1/3 software demos and tool tutorials.

A typical "large cultural data analysis" project involves three parts: data, analysis/visualization, and interface. Each will be discussed in relation to current industry approaches, relevant projects from the arts and design (historical and recent), media and software studies theory, and practical techniques and tools.

Course requirements:

Students will have a choice of the following: a final paper; a series of blog posts examining concepts or presenting a project; or a practical project, which can be done individually or in collaboration with other students.

Digital Art History events in NYC, 12/2012 and 2/2013

Digital wave is finally reaching art history - here are two forthcoming events (I will be speaking in both):

Digital Art History Colloquium at the Institute of Fine Arts.

Organized by:
Jim Coddington (Chief Conservator, MoMA).

Dates: Friday, November 30th and Saturday, December 1st.

Location: New York City.

The program is here.

THATCamp CAA - a digital art history unconference in association with the College Art Association.

Organized by:
Beth Harris, Dean, Art & History, Khan Academy
Steven Zucker, Dean, Art & History, Khan Academy
Barbara Rockenbach, Director, Humanities and History Libraries, Columbia University
Carole Ann Fabian, Director, Avery Architectural and Fine Arts Library, Columbia University
Ileana Selejan, THATCamp CAA Coordinator

Dates: Monday, February 11th (12-5) and Tuesday, February 12th (9-3). You need to register to attend.

Location: New York City.

At CAA 2011, I co-organized a panel on "Visualization as a Method in Art History" – it was very well attended, and some people were very clearly excited. I am looking forward to these two forthcoming events in NYC – let's hope that some adventurous graduate students and institutions are ready to get into computational and visualization work with visual data.

Veja.vis: preliminary results

The Veja.vis research project is an innovative project in the field of the so-called Digital Humanities that seeks to demonstrate the potential of social science research combined with quantitative data analytics. In our specific case, we follow Lev Manovich's theoretical proposal of also analyzing cultural factors through large-scale data processing, producing and applying "cultural" algorithms for the recognition of faces, gender, and other attributes, in order to address questions that would otherwise take perhaps decades to answer through extensive research in libraries or archives. To process this data we used high-performance grid computing, and the images that follow were rendered using this method. The cases analyzed below demonstrate the potential of our study group's approach within the emerging field known as the Digital Humanities. In our project, beyond quantitative data, we created some "qualitative" algorithms intended to provoke a discussion about how the magazine with the largest national circulation has been treating certain themes, even if unconsciously, on its covers. Because of limitations in our hardware capacity and in the computing time needed to render all the images, we used only the digitized cover images of the magazine, which were the object of our analysis in this academic project, developed by Marcio Santos (MA in Communication) under the supervision of Cicero Inacio da Silva. The results address qualitative and quantitative questions, yielding findings such as:

a) black women appear on 0.33% of all covers, even though the latest IBGE census (2010) shows that half of the female population is black or mixed-race;

b) only 12% of Revista Veja's covers feature women as the main subject, even though women today make up 51% of the population;

c) black women are presented in only three types of themes: war (refugees), Carnival, and politics (1 cover = 0.0041%), reinforcing the classic stereotype of women's role in Brazilian society;

d) black men appear in a slightly higher percentage, a total of 2% of all VEJA covers (47 covers);

e) among the covers featuring black men, three themes stand out: sports (40%), crime (11%), and politics (19%);

f) of the covers featuring black men, 66% are Brazilians and the rest are foreign personalities.

The Veja.Vis project was presented at numerous national and international conferences, among them Digital Humanities 2012 in Sheffield, England.

Cultural Analytics in the Middle East

I was invited to present our cultural analytics research at a wonderful event in Beirut:

Inverted Worlds - Congress on Cultural Motion in the Arab Region

In parallel with this congress, another event taking place on the same dates brought together a larger number of internet activists from across the Middle East:

Share Beirut: Internet, Activism, Culture

Beirut is a remarkable and vibrant place - if you have not visited it before, you don't know what you are missing.

The Interface Century

Google Scholar's alert service found this sentence from one of my tweets, which made its way into a spread in the new book Interactive Design: An Introduction to the Theory and Application of User-centered Design by A. Pratt and J. Nunes. I like the design of the spread, so here it is:


The tweet was originally quoted by Aaron Koblin in his 2011 TED talk. Aaron's work is always a source of inspiration to me – he finds imaginative new interfaces to data, turning data into art and making it human.