3.2 VideoScheme System Overview

"There are only five musical notes, yet the combination of these five give rise to more melodies than can ever be heard."
-- Sun Tzu (500-200 BC). The Art of War. Translated by Thomas Cleary, Shambhala Publications, 1988

VideoScheme is implemented as an application for the Apple Macintosh, written in C and totaling approximately 100KB of executable code. It provides a visual browser for viewing and listening to digital movies, using Apple's QuickTime system software for movie storage and decompression [Qui92]. The browser displays video and audio tracks in a time-line fashion, at various levels of temporal detail. Clicking on the video tracks displays individual frames in a "flip book" fashion, while clicking on the audio track plays it back; clicking twice in rapid succession plays back both audio and video.


Figure III.3 The VideoScheme System

The idea of making an editing program more powerful through programmability is not new [Ten93]. Common text editing programs such as Microsoft Word and Emacs provide a programming engine and build complex functions in their associated languages; Emacs, in fact, has a very LISP-like programming language. VideoScheme is the first video editor to implement this concept. Of course, programming is an advanced skill, and many users will not have the experience or knowledge needed to use VideoScheme's programming features. To accommodate them, VideoScheme lets a skilled developer write potentially complex editing functions in the Scheme programming language, while still providing simple capabilities for the novice user. Once a program is developed and tested, it can be mapped to a menu item or keystroke for the ordinary user. Hence, the capabilities of the editor can be easily extended.

VideoScheme includes an interpreter for Scheme, a dialect of LISP, built on the SIOD (Scheme-in-one-Defun) implementation, along with text windows for editing and executing Scheme functions [Sio92]. Functions typed into the text windows can be selected and evaluated immediately. The text windows co-exist with the video windows, allowing very quick switches between manual editing operations and programming (see Figure III.3). The environment, while deficient in debugging facilities, offers such standard LISP/Scheme programming features as garbage collection and a context-sensitive editor (for parenthesis matching). In addition, it offers a full complement of arithmetic functions for dynamically sized arrays, an important feature for handling digital video and audio.

Scheme was chosen over other alternatives (such as Tcl, Pascal, and HyperTalk) for a number of reasons. Scheme treats functions as first-class objects, so they can be passed as arguments to other functions. This makes it easier to compose new functions out of existing ones, and adds greatly to the expressive power of the language. Scheme is also easily interpreted, a benefit for rapid prototyping. Scheme includes vector data types, which map very naturally to the basic data types of digital multimedia, namely pixel maps and audio samples. Finally, Scheme is easily implemented in a small amount of portable code, an advantage for research use. The most significant drawback of Scheme is its syntax, which non-programmers (and even some programmers) find difficult to use. Desirable alternatives include Logo and Dylan, which share the positive properties of Scheme but have a more attractive syntax.
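The benefit of first-class functions and vector data can be illustrated with a small sketch. It is written in Python rather than Scheme purely for illustration, and every name in it (compose, gain, clip, louder) is hypothetical rather than part of VideoScheme. It treats an audio track as a flat vector of samples and builds a new editing function out of two existing ones.

```python
from typing import Callable, List

Samples = List[int]                        # an audio track as a flat vector of samples
Transform = Callable[[Samples], Samples]   # any function from samples to samples


def compose(*steps: Transform) -> Transform:
    """Return a new transform that applies the given steps left to right."""
    def pipeline(samples: Samples) -> Samples:
        for step in steps:
            samples = step(samples)
        return samples
    return pipeline


def gain(factor: float) -> Transform:
    """Build a transform that scales every sample by `factor`."""
    return lambda samples: [int(s * factor) for s in samples]


def clip(limit: int) -> Transform:
    """Build a transform that clamps every sample to [-limit, limit]."""
    return lambda samples: [max(-limit, min(limit, s)) for s in samples]


# A new editing function assembled from existing ones, with no new loop code.
louder = compose(gain(2.0), clip(100))
```

Because functions are ordinary values here, louder can itself be passed to compose again, which is the kind of composition the paragraph above attributes to Scheme.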

We chose the SIOD Scheme interpreter for its small size, its support for array data types, and its extensibility. This last feature made it possible to add new built-in functions that bridge the gap between the Scheme environment and video editing. The functions are designed to be independent of the lower-level QuickTime-based implementation; they could be re-implemented on another platform, allowing VideoScheme programs to be ported (Figure III.4).
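One way to picture this layering is the following minimal sketch, again in Python rather than Scheme or C. It assumes a hypothetical MovieBackend interface standing in for the platform layer (QuickTime in VideoScheme) and a hypothetical built-in, frame_difference, written only against that interface. None of these names come from VideoScheme itself.

```python
from abc import ABC, abstractmethod
from typing import List

Frame = List[int]  # assumption: one frame as a flat vector of grey-level pixels


class MovieBackend(ABC):
    """Platform layer: the only code that knows how movies are stored."""

    @abstractmethod
    def frame_count(self) -> int: ...

    @abstractmethod
    def get_frame(self, index: int) -> Frame: ...


class InMemoryBackend(MovieBackend):
    """Stand-in for a real platform backend, used here so the sketch runs."""

    def __init__(self, frames: List[Frame]) -> None:
        self._frames = frames

    def frame_count(self) -> int:
        return len(self._frames)

    def get_frame(self, index: int) -> Frame:
        return self._frames[index]


def frame_difference(movie: MovieBackend, i: int, j: int) -> int:
    """Portable built-in: sum of absolute pixel differences between two frames."""
    a, b = movie.get_frame(i), movie.get_frame(j)
    return sum(abs(x - y) for x, y in zip(a, b))


movie = InMemoryBackend([[0, 0, 0], [0, 0, 1], [9, 9, 9]])
```

Porting to another platform would then mean writing a second MovieBackend subclass; frame_difference and any program built on it would not change.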


Figure III.4 VideoScheme's functional layout

Because Scheme is an interpreted language, it is well suited to a distributed environment. VideoScheme can be considered to have two major components: a graphical front end and a Scheme back end. These two components need not always run on the same machine. The graphical front end can run on a local machine while the interpreter back end runs on a remote video server. What the two share is the Scheme language itself: because programs are interpreted rather than compiled for a particular machine, the same program text is meaningful in both environments, even when their architectures and operating systems are entirely different.
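The split described above can be sketched as a front end that ships program text to a back end, which parses and evaluates it. The sketch is in Python, and its toy evaluator handles only integer arithmetic in prefix notation; it is not SIOD, and the FrontEnd, BackEnd, and transport names are hypothetical. The transport here is a direct function call standing in for a network connection.

```python
import operator
from typing import Callable, List, Union

Expr = Union[int, str, list]


def tokenize(text: str) -> List[str]:
    """Split program text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> Expr:
    """Turn a token list into nested Python lists, consuming tokens as it goes."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing parenthesis
        return items
    try:
        return int(token)
    except ValueError:
        return token  # a symbol such as "+"


class BackEnd:
    """Interpreter side: could live on a remote video server."""

    def __init__(self) -> None:
        self.env = {"+": operator.add, "-": operator.sub, "*": operator.mul}

    def evaluate(self, expr: Expr):
        if isinstance(expr, int):
            return expr
        if isinstance(expr, str):
            return self.env[expr]
        fn, *args = [self.evaluate(e) for e in expr]
        return fn(*args)

    def handle_request(self, request: bytes) -> bytes:
        expr = parse(tokenize(request.decode("utf-8")))
        return str(self.evaluate(expr)).encode("utf-8")


class FrontEnd:
    """Editor side: only sends text and displays the reply."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self.transport = transport

    def run(self, program_text: str) -> str:
        return self.transport(program_text.encode("utf-8")).decode("utf-8")


server = BackEnd()
client = FrontEnd(server.handle_request)  # direct call in place of a socket
```

Only plain text crosses the boundary between the two classes, so neither side needs to know the other's architecture or operating system.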