originally published in "Music Works," v. 56, p. 48-54, 1993

by Don Ritter


The first device capable of synchronizing imagery with music was developed in 1734 by Louis-Bertrand Castel, a priest, mathematician and philosopher, who created an instrument which provided coloured light in response to music (Peacock 1988). His instrument, a modified clavichord, contained coloured tapes illuminated from behind by candles. Several decades later, Karl von Eckartshausen, author of Disclosures of Magic from Tested Experiences of Occult Philosophic Sciences and Veiled Secrets of Nature, wrote of a similar device which had been influenced by Castel's design.

    I have long tried to determine the harmony of all sense impressions, to make it manifest and perceptible. To this end I improved the ocular music invented by Pere Castel. I constructed this machine in all its perfection, so that whole chords can be produced, just like tonal chords. Here is a description of the instrument. I had cylindrical glasses, about half an inch in diameter, made of equal size, and filled them with diluted chemical colors. I arranged these glasses like the keys of a clavichord, placing the shades of color like the notes. Behind these glasses I placed little lobes of brass, which covered the glasses so that no colour could be seen. These lobes were connected by wires with the keyboard of the clavichord, so that the lobe was lifted when a key was struck, rendering the color visible. Just as a note dies away when the finger is removed from a key, so the color disappears, since the metal lobe drops quickly because of its weight, covering the color. The clavichord is illuminated from behind by wax candles. The beauty of the colors is indescribable, surpassing the most splendid of jewels. Nor can one express the visual impression awakened by the various color chords. (von Eckartshausen, 1791)

Over the centuries, the synchronization of imagery with music has been known as ocular music, visual music, color-music, colour organ, light organ, and music for the eyes (Burnham 1968; Castel 1725; Eisenstein 1947; Klein 1927). Even though this field of study originates in the early eighteenth century, some contemporary researchers who are pursuing similar work with computerized equipment claim the concept began with sequencing software and real time computer graphics (Pfitzer 1991). In Colour-Music: The Art of Light, a book devoted to the exploration of synchronized sound and image, Adrien Bernard Klein (1927) wrote "...it is an odd fact that almost everyone who develops a color-organ is under the misapprehension that he, or she, is the first mortal to attempt to do so."

During this century, the best-known composition using synchronized imagery and music is probably Scriabin's Prometheus, the Poem of Fire, written in 1910 and first performed in New York City in 1915. The piece used the Chromola, a manual device which presented various patterns of light as indicated in the score (Galeyev 1988). Although composers and visual artists have created other works which incorporated synchronized image and sound during this century (Burnham 1968), the most refined utilization is found within sound film.

This application was pioneered by filmmaker Sergei Eisenstein, as discussed in his books The Film Sense (1942) and The Film Form (1947), and incorporated into his sound films, such as Alexander Nevsky (1938) and Ivan the Terrible Part I (1944). His approach, however, did not support the absolute pairing of musical qualities with visual qualities--such as pitch with colour--but rather proposed that music be combined with imagery to provide a unified and desired mood: "The search for correspondence must proceed from the intention of matching both picture and music to the general, complex 'imagery' produced by the whole" (Eisenstein 1947, p. 78). This approach opposes that of Castel and others, who had proposed strict correspondences between visual and musical qualities.

    The question must, nevertheless, be undertaken, for the problem of achieving such absolute correspondence is still disturbing many minds, even those of American film producers. Only a few years ago I came across in an American magazine quite serious speculations as to the absolute correspondence of the piccolo's tone to--yellow!
    (Eisenstein 1942, p. 109)


Synchronized music and imagery within a film, however, is different from a performance of imagery which responds to live music. Unlike a performance, a film does not provide a live experience to an audience; instead, it displays the result of various procedures separated in time and location. Before an audience can view a film, footage is typically shot at numerous locations, the film is processed in a lab, music and audio are recorded elsewhere, sounds and images are combined in an editing room, and the completed film is eventually projected at thousands of locations around the world.

This disconnected process produces a result that is similar to recorded music because viewers of film experience an intricately composed product rather than a work's creation. "An essential part of "real" music is the live element, the indefinable but undeniable interaction between players and audience which makes music exciting" (Puckette 1991).

My own involvement with performances of synchronized video and music is motivated by a desire to create moving imagery that is composed directly before an audience. While the moment of creation corresponds with the moment of experience for some visual art forms--such as performance art--most visual media have separated these acts into creation and exhibition. During an exhibition of paintings, for example, an audience is typically presented with the result of creative acts that were probably executed at distant studios. Although it is technically possible to display the creation of a painting to an audience--as Yves Klein did with his action-spectacles (Restany 1982)--for most visual media, including painting, an audience is usually not given this opportunity.

Musical instruments, by design, permit the playing of music by a performer and the simultaneous hearing of that music by an audience. Although the presentation of music is usually the result of two separate creative acts--composing and performing--an audience experiencing improvised music can observe the entire creation of a work.

After numerous collaborations with improvising musicians, it became apparent to me that traditional visual media are unable to combine the moment of a work's creation with the moment of its experience by an audience. Having become accustomed to creating works alone in my studio and later presenting them in a gallery, I found that creating an interactive video performance before a live audience was an exciting manner of presentation, completely different from other visual media. Perhaps the most appealing quality of a work of interactive video is that it seems to come alive when responding to stimulation, such as improvised music. When a visual medium is manipulated live before an audience, the evolution of a work, rather than a finished work, is observed.

As a result of these observations, I have concluded that musical media-- especially improvised music--are expressions of life, while visual media, such as painting, drawing and sculpture, are expressions of death. This difference becomes obvious by comparing the experience of visual media with the experience of live music. In a museum or gallery, for example, visitors calmly observe inanimate objects, rarely speaking and never clapping or cheering in approval. These viewers are like the bereaved at a wake, paying respect to a friend who will later be entombed in a storage room.

While music performances are presented in a variety of styles, from classical to new music to rap, they all provoke audiences to clapping, cheering, dancing, and a whole range of physical activities that are strictly verboten in a museum of fine arts. Pop music performances can stir thousands of screaming fans into uncontrollable frenzies of exuberance, but a viewer who claps and cheers before a favorite painting within a museum might be considered insane and ushered to the door by a solemn guard. The distinction between dead and living art is not intended to be qualitative, but rather to emphasize perceptual and social differences.

As an attempt to make visual art a living medium, I have been collaborating with musicians to create visual works live before an audience. These performances present video controlled by music, though the images can also influence the musicians as they improvise.

I was fascinated with mechanical and electrical objects when I was a boy, and by my tenth birthday I was repairing the family television set. By high school I was infatuated with audio equipment and considered myself an audiophile, an addiction supported by a part-time job in a stereo store. While electronics and audio equipment satisfied my mechanical curiosity and aural needs, visual art was my real passion. I decided to be practical after completing high school, however, and began studying electronics engineering in Edmonton. After completing these studies, I was hired by the telecommunications manufacturer Northern Telecom in Toronto, where I designed telecommunication hardware and trained other designers. A few years later, I began studying fine arts and psychology at the University of Waterloo. During summer months of those studies, I worked for Northern Telecom and also for Bell-Northern Research in Ottawa, where I researched and designed human interfaces for software and telecommunications equipment. Although I was constantly using computer equipment at that time, my paintings and sculptures were constructed from traditional art materials.

After completing the arts degrees, I began graduate studies at the Center for Advanced Visual Studies and the Media Lab at MIT, both places where art and technology are explored. After working in these two fields separately for nearly twenty years, at MIT I began combining them in the form of videotapes, electronic music, video installations, and computer animation. During the second year of studies, I created animation software that allowed real-time control of video through a computer keyboard. While enrolled in a class at the MIT Media Lab, I worked with guest musician George Lewis, who was interested in presenting a performance of video controlled by his improvised trombone playing. The software was then modified to allow control of video through music, or more specifically, through MIDI. Our performance was presented during the Hyperinstruments concert at the MIT Media Lab in 1988, a public event featuring four interactive music systems created by students in the class. The event was well attended and covered by some Boston newspapers, including the MIT student paper:

    More audience abuse followed, however, with George Lewis and Don Ritter's onanistic Nose Against Glass for trombone and computer animation. A quarter hour of strange, meaningless sounds triggered projections on a large screen. At first a large pink blotch appeared, expanding and contracting like a giant amoeba chewing gum. Next came a sequence which would be at home in the world of Monty Python--a human face with arms and hands coming out of its orifices. Some people laughed, others recoiled in disgust at the tastelessness of the animation, the pretentiousness of the random sounds masquerading as music, and the degree to which the concert's promoters felt they could insult those who had come to listen and observe. (Richmond 1988)

Thanks to Phill Niblock, Lewis and I performed a few months later at the Alternative Museum in New York City in collaboration with Leo Smith and Richard Teitelbaum. That performance used two interactive video systems with imagery controlled by Lewis' trombone and Teitelbaum's keyboards.

During 1989 and 1990, Lewis and I performed approximately 20 interactive video and music pieces at various venues, including New Music America 89 (New York), The School of the Art Institute of Chicago, The Verona Jazz Festival, and FIMAV (Victoriaville). Although the imagery varied for these performances, the technical format was always similar. A radio microphone attached to Lewis' trombone transmitted his signal to a pair of computers which analyzed the notes and presented synchronized imagery on a video projection screen. These computers ran my software, which was christened Orpheus in 1989.

Over the years of development, Orpheus has gone through many changes, though its interactive character has always been to provide real-time video in response to music. Within the software, music in the form of MIDI data is first categorized according to selected combinations of pitch, dynamics, note duration, rest length, tempo, rhythm, intervals, note density and measure. These musical elements can be combined in different ways to create appropriate strategies for different musical styles and instruments. A musical categorization is like an aesthetic judgement which permits the pairing of imagery with music having similar qualities. Each categorization of music provokes an associated sequence of digital video frames accompanied by a cinematic effect. Because individual frames can be presented up to 30 times per second, the video output appears to be a videotape synchronized with music rather than a series of static images controlled by individual musical elements (Ritter 1992).
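The principle of categorizing MIDI data and pairing each category with an image sequence can be sketched in code. The following Python fragment is a hypothetical illustration only, not Orpheus itself: the thresholds, category names, and frame sequences are invented for the example.

```python
# Hypothetical sketch of music-to-image categorization in the spirit of
# Orpheus. All thresholds, categories, and frame names are invented.

def categorize(note):
    """Assign a MIDI note event to an aesthetic category.

    note is a (pitch, velocity, duration) tuple: pitch and velocity
    range 0-127, duration is in seconds.
    """
    pitch, velocity, duration = note
    if velocity > 100 and duration < 0.2:   # loud and short
        return "percussive"
    if pitch > 72:                           # above C5
        return "high_lyrical"
    return "low_sustained"

# Each category provokes an associated sequence of video frames.
SEQUENCES = {
    "percussive":    ["flash_01", "flash_02"],
    "high_lyrical":  ["face_01", "face_02", "face_03"],
    "low_sustained": ["amoeba_01", "amoeba_02"],
}

def frames_for(note):
    """Return the frame sequence provoked by a single note event."""
    return SEQUENCES[categorize(note)]

print(frames_for((80, 60, 1.5)))  # a high, soft, long note
```

A real system would of course weigh many more elements (rest length, tempo, intervals, note density) and present the frames at video rate, but the mapping from musical judgement to image sequence is the essential idea.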

A variation of the typical performance occurred in 1990 at A Space in Toronto during Media Play: Given the Trombone and the Television. During the piece, a television channel was randomly selected and its video imagery was mixed with interactive images of tooth brushing, shaving and washing. These images were controlled by the television's live audio, while another layer of interactive imagery was controlled by Lewis' trombone. The resulting combination of broadcast and video imagery with broadcast audio and improvised trombone completely transformed the television program and its advertisements. Later that year, I presented a solo performance, titled Media Play: Given the Television, at STEIM in Amsterdam, which used the same images combined with and controlled by a World Cup soccer game.

During New Music America 90 in Montreal, David Rokeby, Thomas Dimuzio and I presented three collaborative works using Rokeby's interactive music system and Orpheus. The performances used interactive music and interactive video controlled by the movements of dancer Leslie Coles, who performed adjacent to a video projection screen.

One of my most recent performances was A Structural Theory of Emotions, presented with percussionist Trevor Tureski at the Monterey Convention Center in California. The work was motivated by physiologist and musician Dr. Manfred Clynes who proposes that music is the best form to communicate emotional experiences among ourselves (Clynes 1977). In this performance, my system rated Tureski's percussive and sampled sounds and assigned them into one of eight emotional categories. These categories--such as sad, happy and angry--stimulated imagery of a human face having similar emotions.

Since 1989, I have presented other performances of interactive video controlled by music in collaboration with bassist Lisle Ellis (Montreal), saxophonist Amy Denio (Seattle), keyboardist Thomas Dimuzio (Boston), the CEE (Toronto), guitarist Nick Didkovski (NYC) and trumpeter Ben Neill (NYC). For these events, the form of the responsive imagery was typically based on the performer's instrument and playing style.

Audience reactions to these performances have varied, ranging from extreme enthusiasm to hostile rejection. Some viewers commented that they were unable to see any synchronization between music and imagery, while others stated the correspondence was too tight. Some audience members also praised the musicians for playing along in perfect time with the videotape they assumed was being shown.

Presenting performances of synchronized video and music has been a pleasant and unexpected consequence of my personal interests in art and technology. I am most grateful to have collaborated with superb performers whose musical skills and patience for my naive musical questions were influential in the creation of a contemporary instrument which brings visual art to life.


Burnham, J. 1968. Beyond Modern Sculpture. New York: George Braziller

Castel, L.B. 1725. Clavecin pour les yeux. in Mercure de France, p. 2557-2558. cited in Peacock, K. 1988. Instruments to Perform Color-Music: Two Centuries of Technological Experimentation. Leonardo Vol. 21(3), p. 399

Clynes, M. 1977. Sentics. New York: Anchor Press

von Eckartshausen, K. 1791. Disclosures of Magic from Tested Experiences of Occult Philosophic Sciences and Veiled Secrets of Nature. cited in Eisenstein, Sergei. 1942. The Film Sense. New York: Harcourt, Brace & World, p. 88

Eisenstein, Sergei. 1942. The Film Sense. New York: Harcourt, Brace & World

Eisenstein, Sergei. 1947. The Film Form. New York: Harcourt, Brace & Jovanovich

Galeyev, B.M. 1988. The Fire of Prometheus: Music-Kinetic Art Experiments in the USSR. in Leonardo Vol. 21(4), p. 383-396

Klein, A. B. 1927. Colour-Music: The Art of Light. London: Crosby Lockwood and Son. p.21

Peacock, K. 1988. Instruments to Perform Color-Music: Two Centuries of Technological Experimentation. in Leonardo Vol. 21(3), p. 399

Pfitzer, Gary. 1991. Music & Motion. in Computer Graphics World, July, p. 68-74

Puckette, M. 1991. Something Digital. in Computer Music Journal 15(4), p. 65

Restany, Pierre. 1982. Yves Klein. New York: Harry N. Abrams

Richmond, J. 1988. Electronic noodling around in Media Lab doesn't make music. in The Tech, Cambridge, MA: MIT. June 21, p. 9

Ritter, Don. 1992. A Flexible Approach for Synchronizing Video With Live Music. unpublished manuscript