We are hiring: Computer Vision researcher position (freelance - you don't need to move)

Visualization of 23,581 photos taken during 24 hours in the Brooklyn area. See http://phototrails.net/instagram-cities/

We are a digital humanities research lab (www.softwarestudies.com) at the University of California, San Diego (UCSD) and The Graduate Center, City University of New York (CUNY). We analyze and visualize cultural patterns in large sets of images and video. Examples of our projects include analysis/visualization of 2.3 million Instagram photos, 1 million manga pages, and all paintings of Vincent van Gogh.

We started this research in 2007, and we call it cultural analytics.

We are looking for a Computer Vision researcher to join us (freelance working remotely, or full-time).

You can live anywhere (as long as you have a US Social Security number) and work freelance, choosing your own hours. So if you already have a full-time position but have some free time, this job is for you. (Email, Skype or Google Hangout - we love them all.) This freelance position is available for up to 1 year.

Or, if you happen to live in San Diego (where our lab is located) and prefer to work full-time (with benefits), this position is available for up to 2 years.


So far, we have only used low-level visual features in our projects, and we want to go further - with your help:

You will use appropriate computer vision techniques to analyze visual characteristics and content of large sets of user-captured photographs from media sharing sites. Specifically:

1) You will use state of the art scene classification methods to automatically classify large sets of photographs from media sharing sites. [1 - See examples below]

2) Photo classification for a few selected object types that can be identified with high accuracy (such as faces / figures). [3 - see reference for the current state of the art in the object classification task]

3) Analysis of visual attributes such as color, lines, shapes, composition. (We will use them not only as input to (1) and (2) but also for visualizing visual patterns over time, and the differences between image sets.)

In addition to photographs, you will also work on analyzing cultural images such as paintings, comics, or magazine covers and pages. The goal is to implement features which can capture stylistic evolution in image sets (for example, all works by an artist, or all pages in a magazine over a number of years). [2 - see examples below]

You can use any software (OpenCV, Matlab, etc.).

The analysis does not need to run in real-time.

Some photo sets may have EXIF data and other capture metadata; others may have only titles and tags; still others may have even less data associated with them.

Note that our goal is not to come up with new or best possible algorithms - instead, we want to apply existing algorithms to large sets of images (and possibly video) to find out interesting things about culture (patterns in photos shared online, evolution of artists, etc.).

We are flexible and open - there are lots of cultural image sets waiting to be analyzed. The idea is to take well-performing computational methods and apply them to image sets where we can get interesting results. We don't have particular narrow "problems" which need to be solved - instead, we want to explore patterns in any interesting cultural image set where computer methods can produce results.

Research outputs:

Our work is presented on the web (see, for example, the Phototrails web site), in international exhibitions, and in articles and book chapters. Your name will appear on all our outputs which use your work. If you want to publish technical paper(s) about the research done with our lab, you can be the first author on these publications.

Here are examples of media coverage of our most recent project Phototrails:

Wired: Using 2 Million Instagram Pics to Map a City’s Visual Signature.

Fast Company Co.Create: See Your City's Unique Visual Signature, Created by its Instagram Photos.

Creators Project: What Do Your Instagram Photos Say About Your City?

The Atlantic Cities: The Visual Signature of Your City.

The Guardian: San Francisco viewed through Instagram photos.

Qualifications - Required:

1) PhD in computer vision, image processing, machine learning, or related fields (MA considered).

2) Record of computer science publications in scene classification and/or object recognition in photographs.

Qualifications - Desired:

1) Experience with analyzing art images / aesthetic features of photographs, including implementing high-level "artistic features" (composition, etc.)

2) Experience with analyzing big image sets (millions of images).

3) Experience with automatic video analysis.

4) Experience with image collection visualization.


Salary:

Open - depends on your experience and demonstrated results in relevant areas.

To apply:

Send an email with your CV, links to publications and projects, salary requirements and availability (starting date, number of hours per week) to: manovich@softwarestudies.com.

The review of applications begins now (July 25, 2013), and the offer will be made as soon as we find the right person.


[1] Examples of scene classification research:

Automatic Context Analysis for Image Classification and Retrieval.

Photo Classification by Integrating Image Content and Camera Metadata.

[2] Examples of analyzing art images / aesthetic features of photographs:

Affective Image Classification using Features Inspired by Psychology and Art Theory.

Studying Aesthetics in Photographic Images Using a Computational Approach.

[3] State of the art in object classification in photographs:

Visual Object Classes Challenge 2012 (VOC2012).

summary: http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/workshop/history_analysis.pdf

The cover of "Software Takes Command" - visualizing 62.5 hours of video gameplay

Software Takes Command cover spread
The front and back cover of Lev Manovich's Software Takes Command (Bloomsbury Academic, 2012).

The background image is taken from a visualization of video gameplay created in our lab by William Huber. The data are the gameplay sessions of the video game Kingdom Hearts (2002, Square Co., Ltd.). The game was played from beginning to end in 29 sessions over 20 days. Altogether, these sessions took 62.5 hours.

The video captured from all game sessions was assembled into a single sequence. The sequence was sampled at 6 frames per second. This resulted in 225,000 frames for Kingdom Hearts gameplay. The visualizations use only every 10th frame from the complete frame set (22,500 frames). Frames are organized in a grid in order of gameplay (left to right, top to bottom).
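The sampling and grid layout described above can be sketched in a few lines of Python. This is only an illustration, not the code used in our lab: the function names and the grid width of 150 columns are assumptions made for the example, and integer indices stand in for the actual frames.

```python
def sample_every_nth(frames, step=10):
    """Keep every `step`-th frame, starting with the first one."""
    return frames[::step]


def grid_position(index, columns):
    """Return (row, column) for a frame placed left to right, top to bottom."""
    return divmod(index, columns)


# Frame indices stand in for the captured frames themselves.
all_frames = range(225_000)
kept = sample_every_nth(all_frames, step=10)

columns = 150  # hypothetical grid width, not taken from the project
rows = -(-len(kept) // columns)  # ceiling division

print(len(kept))                    # 22500
print(rows)                         # 150
print(grid_position(0, columns))    # (0, 0)
print(grid_position(151, columns))  # (1, 1)
```

Because the frames keep their original order, reading the grid row by row follows the game from its first session to its last.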

Kingdom Hearts is a franchise of video games and other media properties created in 2002 via a collaboration between Tokyo-based videogame publisher Square (now Square-Enix) and The Walt Disney Company, in which original characters created by Square travel through worlds representing Disney-owned media properties (e.g., Tarzan, Alice in Wonderland, The Nightmare Before Christmas, etc.). Each world has its distinct characters derived from the respective Disney-produced films. Each world also features a distinct color palette and rendering style, which are related to the visual style of the corresponding Disney film.

Like other software-based artifacts, video games can have infinitely varied realizations (since each game traversal is unique). Compressing many hours of gameplay into a single image and placing a number of such visualizations next to each other allows us to see the patterns of similarity and difference between these realizations. Such visualizations are also useful in comparing different releases of popular games – such as the two releases of Kingdom Hearts shown in the two visualizations below.

Kingdom Hearts videogame traversal
The complete visualization of Kingdom Hearts (2002) game play: 62.5 hours, in 29 sessions over 20 days. Full size visualization is 10810 x 8000 pixels (download from Flickr).

Kingdom Hearts II videogame traversal
Visualization of Kingdom Hearts II (2005) game play: 37 hours, in 16 sessions over 18 days. Full size visualization is 10759 x 8000 pixels (download from Flickr).

Visualization of 33292 Instagram photos on The Big Wall (66 million pixels tiled display)

Phototrails update 1

Last week we published our new project Phototrails - analysis and visualizations of 2.3 million Instagram photos from 13 global cities. Each visualization shows large numbers of photos organized by various attributes such as upload times, filters, or visual characteristics (hue, saturation, etc.)
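As a rough illustration of organizing photos by a visual characteristic, the sketch below orders a few images by the hue of their average color. It is not the Phototrails code: the function name and the tiny hand-made "photos" are hypothetical, and taking the hue of the mean color is just one simple way to summarize an image.

```python
import colorsys


def average_color_hsv(pixels):
    """Convert the mean RGB color of an image to (hue, saturation, value)."""
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n / 255
    g = sum(p[1] for p in pixels) / n / 255
    b = sum(p[2] for p in pixels) / n / 255
    return colorsys.rgb_to_hsv(r, g, b)


# Hypothetical stand-ins for photos: a name and a few (r, g, b) pixels each.
photos = {
    "sunset": [(250, 120, 30), (240, 100, 20)],
    "sea": [(20, 80, 200), (30, 90, 220)],
    "park": [(40, 180, 60), (50, 200, 70)],
}

# Order the photos by the hue of their average color.
by_hue = sorted(photos, key=lambda name: average_color_hsv(photos[name])[0])
print(by_hue)  # ['sunset', 'park', 'sea']
```

The same pattern works for saturation or brightness by sorting on the second or third value returned by `average_color_hsv`.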

Such big visual data looks best on big displays - such as The Big Wall constructed by Calit2 (California Institute for Telecommunications and Information Technology). You can see both the details of individual photos and the larger patterns at the same time. In contrast, when you use standard desktop applications or media sharing sites, you can see either one or the other, but not both at the same time.

Below are a few photos showing two of the Phototrails collaborators (Lev Manovich and Jay Chow) with one of our Instagram visualizations on The Big Wall. The visualization shows 33,292 Instagram photos shared by people in Tel Aviv during one week. You can also view and download this and other visualizations (at smaller size) from the project site, or from our Flickr set.

The Big Wall is a tiled display environment consisting of 32 narrow-bezel 55" LCD displays. Each of the displays has full HD resolution (1920 x 1080 pixels), adding up to 66 million pixels on the entire wall (15,360 x 4,320 pixels).
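The numbers above can be checked with a little arithmetic. The short Python sketch below derives the tile arrangement from the quoted resolutions; the variable names are ours, and the 8 x 4 layout is inferred from the figures rather than stated in a specification.

```python
TILE_WIDTH, TILE_HEIGHT = 1920, 1080     # full HD resolution of one display
WALL_WIDTH, WALL_HEIGHT = 15_360, 4_320  # wall resolution quoted above

columns = WALL_WIDTH // TILE_WIDTH   # displays per row
rows = WALL_HEIGHT // TILE_HEIGHT    # displays per column
total_pixels = WALL_WIDTH * WALL_HEIGHT

print(columns, rows, columns * rows)  # 8 4 32
print(total_pixels)                   # 66355200
```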

Photos by Alex Matthews, Calit2.

Illustrations of early media computing systems - now in high resolution


1972-1973: Dick Shoup creates SuperPaint, the first complete 8-bit paint system, including hardware and software, at Xerox Palo Alto Research Center (PARC). This black and white illustration shows the SuperPaint menu in 1975. For a detailed history of early paint systems, see Alvy Ray Smith, Digital Paint Systems: An Anecdotal and Historical Overview, IEEE Annals of the History of Computing, 2011. We scaled the original illustration to the larger size of 2667 x 2000 pixels and cleaned up the result, so everything looks sharp.

Download high resolution image: 2667 x 2000 pixels.

My new book Software Takes Command discusses key computer systems from the 1960s-1970s that led to modern media authoring software (Photoshop, Illustrator, After Effects, etc.) As I was looking for illustrations of these systems, I was struck by the quality of documentation which survived.

There are a number of video clips showing Sketchpad (for example, this 10-minute TV show) and Xerox Alto in action. We also have the full video of Douglas Engelbart's famous 90-minute 1968 demo of his NLS system (which came to be called The Mother of All Demos).

These clips are invaluable because they allow us to see these systems in action, and to understand their innovations. But the resolution of all of them is low, and details are hard to see. There are a few illustrations in the articles and technical papers that were published about these systems, but their digital copies available online do not reproduce them well.

Together with Jay Chow (researcher at Software Studies Initiative), we took the best available images, scaled them up and cleaned the results in Photoshop and Illustrator. They appear in my book, but you can also download our high-resolution image files right here. Since I know that many people discuss these early media computing systems in their classes in a number of fields - software art, human computer interaction, media history, media archeology, etc. - I hope that these high resolution images will be useful. I have also added the descriptions of the corresponding systems/concepts from my book next to the images.

Ivan Sutherland. "Sketchpad: A Man-Machine Graphical Communication System." 1963. Frames from Sketchpad demo video illustrating the program’s use of constraints. Left column: a user selects parts of a drawing. Right column: Sketchpad automatically adjusts the drawing. (The captured frames were edited in Photoshop to show the Sketchpad screen more clearly.)

"Created by Sutherland as a part of his PhD thesis at MIT, Sketchpad deeply influenced all subsequent work in computational media (including that of Kay) not only because it was the first interactive media authoring program but also because it made it clear that computer simulations of physical media can add many exciting new properties to the media being simulated. In Sutherland’s own words, “The major feature which distinguishes a Sketchpad drawing from a paper and pencil drawing is the user’s ability to specify to Sketchpad mathematical conditions on already drawn parts of his drawing which will be automatically satisfied by the computer to make the drawing take the exact shape desired.” For instance, if a user drew a few lines, and then gave the appropriate command, Sketchpad automatically moved these lines until they were parallel to each other. If a user gave a different command and selected a particular line, Sketchpad moved the lines in such a way that they would be parallel to each other and perpendicular to the selected line."

Download the full resolution image: 2932 x 4374 pixels.


Examples of “view control” as implemented in Douglas Engelbart's NLS. During the demo, Engelbart shows how the same information can be presented in multiple ways. Top left: a hierarchical view of a shopping list. Top right: a collapsed view sorted by location. Bottom: a graph view showing the sequence of locations. (Text and graphics were traced from the original video of Engelbart’s 1968 demo.)

"As Engelbart points out, the new writing medium could switch at the user’s wish between many different views of the same information. A text file could be sorted in different ways. It could also be organized as a hierarchy with a number of levels, as in outline processors or outlining mode of contemporary word processors such as Microsoft Word. For example, a list of items can be organized by categories and individual categories can be collapsed and expanded. Engelbart next shows another example of view control, which today, forty-five years after his demo, is still not available in popular document management software. He makes a long 'to do' list and organizes it by locations. He then instructs the computer to display these locations as a visual graph (a set of points connected by lines.) In front of our eyes, representation in one medium changes into another medium—text becomes a graph."

Download the full resolution image: 1287 x 1768 pixels.


A diagram of the Xerox Star UI from D. Smith, C. Irby, R. Kimball, B. Verplank, E. Harslem, “Designing the Star User Interface,” Byte, vol. 7, issue 4 (1982), 242–82. The universal commands are located on the dedicated keypad on the left part of the keyboard. (The original illustration from the article was redrawn in Illustrator.)

"Let us look at contemporary omnipresent Copy, Cut and Paste commands. These operations already existed in some computer text editors in the 1960s. In 1974–1975 Larry Tesler implemented these commands in a text editor as part of Xerox PARC’s work on a personal computer. Recognizing that these commands can be used in all types of applications, the designers of Xerox Star (released in 1981) put dedicated keys for these commands in a special keypad. The keypad contained keys marked Again, Find, Same, Open, Delete, Copy, Merge, and Move. A user could select any object in an application or on the desktop and then select one of these commands. Xerox PARC team called them “universal commands.” Apple similarly made these commands available in all applications running under its unified GUI but got rid of the dedicated keys. Instead, the commands were placed under the Edit pull-down menu."

Download the full resolution image: 4000 x 5800 pixels.


P.S. You can also find many high quality images of early computer hardware and other artifacts in the online exhibition of the Computer History Museum.

Our new paper "Zooming into an Instagram City: Reading the local through social media" proposes "multi-scale reading"


Zooming into an Instagram City: Reading the local through social media

Nadav Hochman and Lev Manovich

Published in First Monday, July 1, 2013.

"This paper combines perspectives from social computing, digital humanities, and software studies in order to “read” and analyze visual social media data. Similar to researchers in the field of social computing, we study large sets of contemporary user generated social media, and use computational approaches in our analysis. We respond to the key question of digital humanities – how to combine “distant reading” of patterns with “close reading” of particular artifacts – by proposing a multi–scale reading. To accomplish this in practice, we use special visualization techniques (radial image plot, and image montage), which show all images in a large set organized by metadata and/or visual properties. Finally, we follow software studies paradigm by looking very closely at the interfaces, tools and affordances of the software (in this case Instagram) that enable the practice of social media."

More visualizations: phototrails.net

Phototrails: visualizing 2.3 M Instagram photos from 13 global cities

What can billions of Instagram photographs tell us about the world? How can we see larger cultural patterns contained in such massive visual social data? Do these images reflect the specificity of local places?

A group of researchers from the Art History department at the University of Pittsburgh, the Software Studies Initiative at the California Institute for Telecommunications and Information Technology and the Computer Science program at The Graduate Center, City University of New York collaborated to investigate these questions.

Their research is the first academic study to investigate Instagram’s big visual data. The result is a project called Phototrails (phototrails.net), which developed new visualization techniques to analyze and compare more than 2.3 million publicly shared Instagram photos from 13 cities such as New York, San Francisco, London and Tokyo.

The team’s findings are published in the July issue of First Monday (http://www.firstmonday.org), an open-access peer–reviewed journal. In addition, all visualizations and findings are available on the project’s web site at www.phototrails.net.

The researchers found that each city has its own unique visual signature on Instagram. Based on measurements of multiple visual attributes such as hue, brightness, and line orientation, Bangkok was found to be the most visually different from other cities, followed by Singapore and Tokyo.

“Our visualizations allow us to uncover the aggregated visual characteristics of each city as well as to examine the impact of exceptional events such as Hurricane Sandy,” says Nadav Hochman, a Ph.D. student in the History of Art and Architecture department at the University of Pittsburgh.

The study also looked at the patterns of Instagram use among 312,694 people during a four-month period. The great majority of people uploaded only one or a few photos. The proportion of more active users varies significantly from city to city. For example, the percentages of people who uploaded more than 30 photos are 2% in NYC, 6.7% in Moscow, and 10.9% in Tel Aviv.
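A percentage of this kind can be computed from a list of uploads in a few lines. The Python sketch below is an illustration only, not the study's code: the function name is hypothetical and the sample data is invented, with one user id per uploaded photo.

```python
from collections import Counter


def share_of_active_users(photo_owners, threshold=30):
    """Percentage of users who uploaded more than `threshold` photos."""
    counts = Counter(photo_owners)
    active = sum(1 for n in counts.values() if n > threshold)
    return 100 * active / len(counts)


# Hypothetical sample: one user id per uploaded photo.
uploads = ["a"] * 45 + ["b"] * 2 + ["c"] + ["d"] * 31 + ["e"] * 30
print(share_of_active_users(uploads))  # 40.0
```

Here two of the five users ("a" and "d") uploaded more than 30 photos, so the share is 40%; user "e", with exactly 30 photos, does not count.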

The researchers also found differences in the use of Instagram filters. In their sample, the proportion of photos to which Instagram users applied filters varies between 68 and 81 percent. The cities with the highest percentage of filtered photos are Tel Aviv, London, and San Francisco, while the city with the lowest percentage is New York.

The project is a collaboration between Nadav Hochman (PhD student, History of Art and Architecture, University of Pittsburgh), Lev Manovich (Professor at The Graduate Center, CUNY, Visiting Researcher @ Calit2, and Director of Software Studies Initiative), and Jay Chow (graduate of the Interdisciplinary Computing and the Arts undergraduate program at UCSD, and Researcher @ Software Studies Initiative).

Project web site: http://phototrails.net/

For further details and high-resolution visualizations contact: