Exploring Time in Structured Video

How can we characterize large quantities of video in ways that make them more tractable for data exploration without losing aesthetic richness? Three new techniques are briefly discussed here:

  1. Time Frames, in which the time axis is collapsed to create a single combined image (much like a long-exposure photograph)
  2. Time Slices, in which the video is "sliced" along the time axis to reveal recurrent patterns not visible on the surface (much like counting the rings in a tree trunk)
  3. Time Structures, in which the video is arranged as a stack of images and then sculpted in 3d (much like a 3d medical image such as a brain scan)

The resulting combinations, cross-sections, and 3d models can all be used:
  • as thumbnails to characterize complex video clips
  • as analytic images to suggest regions and segments for further analysis
  • as classifiers for clustering large video clip data sets

Like common techniques of representing video such as film strips or montages of key frames, the above representations all work to make time simultaneously apprehensible. Unlike film strips and montages, however, these frames, slices, and structures all emphasize variation and continuity over time.

The examples below deal exclusively with video gameplay recordings; however, experiments applying these techniques to other types of software-generated video, and to video in general, are underway.

Time Frames: collapsing the time axis of video

Asteroids (Atari 2600): 3 sample frames

Asteroids (Atari 2600): mean frame of 10 mins
Each pixel represents the arithmetic mean of that location over the entire 10 minute recording of gameplay. The player

Asteroids (Atari 2600): standard deviation frame of 10 mins

Each pixel represents the amount of change at that location (its standard deviation) over the course of the 10 minute recording of gameplay. Dangerous asteroids spawn thickly at the perimeter, and thin out as they approach the clear center column protected by the spinning player.
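
As a rough illustration of how mean and standard deviation frames like the ones above could be computed, here is a minimal sketch using NumPy. It assumes the clip has already been decoded into an array of frames with time as the first axis; the function name time_frames and the tiny synthetic clip are hypothetical and stand in for a real recording.

```python
import numpy as np

def time_frames(frames):
    """Collapse a clip along its time axis.

    frames: array of shape (T, H, W) or (T, H, W, C), time first (assumed layout).
    Returns (mean_frame, std_frame), each with the shape of a single frame.
    """
    stack = np.asarray(frames, dtype=np.float64)
    mean_frame = stack.mean(axis=0)  # per-pixel arithmetic mean over time
    std_frame = stack.std(axis=0)    # per-pixel standard deviation over time
    return mean_frame, std_frame

# Synthetic stand-in for a recording: 4 grayscale frames of 2x2 pixels.
clip = np.zeros((4, 2, 2))
clip[:, 0, 0] = [0, 255, 0, 255]  # a pixel that flickers on and off
clip[:, 1, 1] = 100               # a pixel that never changes

mean_frame, std_frame = time_frames(clip)
```

In this toy clip the flickering pixel is mid-gray in the mean frame and bright in the standard deviation frame, while the unchanging pixel is dark in the standard deviation frame. To view the results as images, both arrays would still need to be rescaled to the display range.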

Orbient (Nintendo Wii): 3 sample frames

Orbient (Nintendo Wii): mean frame of 10 levels

Orbient (Nintendo Wii): mean frame for each of 9 levels

Desktop Tower Defense 1.5 (Internet): mean frame for each of two players

Desktop Tower Defense 1.5 (Internet): mean frame of 40 players

Time Slices: exploring the time axis of video

Cubello (Nintendo Wii): sample frame

Cubello (Nintendo Wii): mean frame of 5 minutes of play
The image highlights consistent HUD-style user interface elements along three edges of the screen as possible areas of interest.
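
One possible way to turn this kind of visual reading into a candidate mask is sketched below. This is an assumption-laden illustration rather than the method used for the image above: it flags pixels that are both steady over time (low standard deviation) and not background-dark (mean above a floor). The function name candidate_static_regions and both thresholds are hypothetical, and the clip is random noise with a constant bar added, not real gameplay.

```python
import numpy as np

def candidate_static_regions(frames, max_std=5.0, min_mean=32.0):
    """Flag pixels that stay roughly constant and are not near-black.

    frames: array of shape (T, H, W), time first (assumed layout).
    max_std, min_mean: illustrative thresholds, not values from the source.
    """
    stack = np.asarray(frames, dtype=np.float64)
    mean_frame = stack.mean(axis=0)
    std_frame = stack.std(axis=0)
    return (std_frame <= max_std) & (mean_frame >= min_mean)

# Synthetic clip: noisy "play field" with a constant bright bar on the top row
# and a constant black border on the bottom row.
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(50, 6, 6)).astype(np.float64)
clip[:, 0, :] = 200.0
clip[:, 5, :] = 0.0

mask = candidate_static_regions(clip)
```

Here only the top row is flagged: the noisy interior varies too much, and the black border is steady but too dark. Interface elements that animate, such as meters, would need a looser max_std or a different criterion.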

Cubello (Nintendo Wii): time slices of 5 minutes of play
Slices are taken from the ammo magazine column (left) and from the "bonus time" row (bottom). Each slice can be read as a graph.
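
A time slice of this kind can be sketched as a simple array indexing operation. The following is a minimal example under stated assumptions: frames are stored time-first, a column slice is laid out with time running left to right, and a row slice with time running top to bottom. The function names and the orientation choices are hypothetical, not taken from the original images.

```python
import numpy as np

def time_slice_column(frames, x):
    """Take pixel column x from every frame; result has shape (H, T[, C])."""
    stack = np.asarray(frames)
    # stack[:, :, x] has shape (T, H[, C]); swap so time runs left to right.
    return np.swapaxes(stack[:, :, x], 0, 1)

def time_slice_row(frames, y):
    """Take pixel row y from every frame; result has shape (T, W[, C])."""
    stack = np.asarray(frames)
    return stack[:, y]  # time runs top to bottom

# Synthetic clip: 5 frames of 3x4 pixels, each frame filled with its own index.
clip = np.zeros((5, 3, 4))
for t in range(5):
    clip[t] = t

col_slice = time_slice_column(clip, x=0)
row_slice = time_slice_row(clip, y=2)
```

Because each frame contributes exactly one line of pixels, a bar or meter that grows and shrinks in the chosen column or row traces out a curve in the slice, which is why the result can be read as a graph.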

Frogger (Atari 2600): 3 sample screenshots

Frogger (Atari 2600): mean frame of 5 minutes of play

Frogger (Atari 2600): "lily pads" time slice of 5 minutes of play

Reading from the top, green bars indicate how long each pad has been marked complete (filled with a Frogger face). Orange marks indicate the appearance of bonus flies.

Frogger (Atari 2600): "countdown timer" time slice of 5 minutes of play
Reading from the top, orange bars indicate the changing score and the blue bar indicates the countdown timer, which resets every time Frogger either fills a lily pad or loses a life.

Time Structures: sculpting the space-time cube of video

Rockband (Nintendo Wii): sample frame

Rockband (Nintendo Wii): regions of interest in user interface

Rockband (Nintendo Wii): mean frame with two time slices
Left: time slice of "Band meter" bar. Bottom: time slice of musical note sequence as tablature.

Rockband (Nintendo Wii): orthogonal views of band meter and musical note time slices.

Rockband (Nintendo Wii): 3-dimensional structures extracted from video regions of interest: tablature in 3d view, wheel-shaped meter (from center bottom of UI) and instrument marker (from inside left of UI)
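
The idea of stacking frames and extracting a 3d structure from a region of interest can be sketched as follows. This is only an illustration under assumptions: "sculpting" is approximated here by cropping the stack to a rectangular region and thresholding brightness, which may differ from how the structures above were produced. The function names, the region bounds, and the threshold are hypothetical, and rendering the resulting voxels is left out.

```python
import numpy as np

def roi_volume(frames, top, bottom, left, right):
    """Crop every frame to a rectangular region; result has shape (T, h, w)."""
    stack = np.asarray(frames)
    return stack[:, top:bottom, left:right]

def sculpt(volume, threshold):
    """Keep only voxels brighter than threshold (boolean mask, same shape)."""
    return volume > threshold

# Synthetic clip: 6 dark frames of 8x8 pixels with one bright marker
# that moves one pixel to the right in each frame.
clip = np.zeros((6, 8, 8))
for t in range(6):
    clip[t, 2, t] = 255

volume = roi_volume(clip, top=0, bottom=4, left=0, right=8)
mask = sculpt(volume, threshold=128)

# (t, y, x) coordinates of the surviving voxels, ready for a 3d viewer.
voxels = np.argwhere(mask)
```

In this toy case the moving marker becomes a diagonal line of six voxels through the space-time cube; a real interface element such as a meter or note track would leave a thicker, more sculptural trace.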