"Your mission, should you choose to accept it, is to produce the music and sound effects for a major international documentary series, largely in a home MIDI studio." Could it be done? In Part 1 of their two‑part assignment, Special Agents PAUL D. LEHRMAN and STEVE OLENICK found out... This is the first article in a two‑part series.
How far has MIDI come in 10 years? Pundits (including the authors) have long been spreading the gospel of MIDI as a force for artistic democracy: it has brought the means of production down to the financial level where even a modest home or project studio can create audio worthy of the finest professional clients. But is this really true? Yes, post‑production houses have MIDI suites, Hollywood film composers use extensive MIDI setups on the scoring stage, and countless performing acts go before club and stadium crowds with nothing but their sequencers and samplers. But what happens when you decide to produce all of the audio for a major international television project using nothing but MIDI gear? Has anyone ever done it? Can it be done?
We don't know the answer to the first question, but we were willing to test the second. And we almost succeeded: with the exception of dialogue editing and final mixing, all of the sound for an ambitious three‑hour documentary series was produced, recorded, and synchronised using MIDI sequencers, synthesizers, and samplers. And the producers — experienced documentary filmmakers with no particular interest in the technology, only the results — could not have been more pleased. This article, and the second part to follow next month, details the decisions, equipment and techniques that went into pulling it off.
The project was called Blood & Iron: The Story of the German War Machine. In three one‑hour episodes, it detailed the history of the relationship between Germany's large industrial cartels and the nation's government, and how the lust for profits and expansion steered the country's highly destructive foreign policies. A nice light topic, to be sure. The time frame was from the 1860s — the reign of Bismarck — to the 1950s, when Adenauer's Federal Republic of Germany was admitted into NATO. The footage was both triumphant and grim: battles at sea, Hitler inaugurating the Autobahn, the trenches of France, ceremonial unveilings of new ships and factories, aerial dogfights, Zeppelins terrorizing London, and concentration camp slave labour. The producers were Robert Ross and Herb Krosney, partners in a New York production company called Krossfire Enterprises, who brought to the project very impressive worldwide credits for documentaries and books on international relations, warfare, and economics.
Although there was to be a constant spoken narrative, as well as readings of historical quotes by actors, Robert and Herb were counting on music and sound to contribute a great deal to the storytelling. Still photographs and graphics were an important part of the project, and the great majority of the motion picture footage, especially in the first two episodes, was silent. The producers' initial estimates were that about 65% of the footage would require new effects and/or music, but this turned out to be closer to 90%, and the figure for the first episode was almost 100%. Some of the footage was familiar to war buffs, but a lot of it was not: many reels of historical film had only recently emerged from the archives of the former East German government, and had not been seen outside the Soviet bloc, if at all, in over 50 years.
I work from my home in the Boston area, 200 miles from New York, and initially bid on doing both music and sound effects, explaining to the producers how MIDI and SMPTE could be used to streamline the production, and showing them multiple examples of my video work. Using only electronic instruments and mixing in a home studio meant that costs could be kept low. When, somewhat to my surprise, my proposal was accepted, I realised immediately that I had bitten off far more than I could chew, especially given the three‑month production schedule. So I sub‑contracted Steve Olenick, a more experienced sound designer who is also a composer — and also in the Boston area — to handle the effects. We decided to operate in parallel: although we were constantly communicating with each other for scheduling and technical purposes, only occasionally would one of us listen to what the other was producing. This was so that we wouldn't get bogged down in deciding whether the effects or the music should be emphasised in any one segment or shot, but instead could save those decisions for the mixing phase. We did, however, check in periodically to make sure the sounds we were coming up with would not grossly interfere with each other. The producers visited us on a couple of occasions to assure themselves everything was going smoothly.
This parallel production scenario is very popular in Hollywood these days: not because it makes better films or even because it's more efficient, but strictly due to economics. The cost of borrowing the money to produce a major feature is so enormous that every day which can be cut from the production schedule means significant savings, no matter how it's done.
Several practical considerations, besides the obvious ones of limited time and money, dictated how we were going to work. A number of underwriters had contributed to the show, including the Arts & Entertainment (A&E) network, a cable service in North America; the Discovery Channel in Europe; and the France 3 broadcast network. Each of these required a different version of the series: the A&E version needed to allow spaces for ads; the Discovery Channel version needed to fill almost the full hour (this version was also going to be used for home video sales and rentals in America); and the French network required a version with music and effects (M+E) but no voice‑overs — they would be dubbed in later by a French cast, reading from a translation of the original script. The M+E version would also serve in other non‑English speaking countries where Krossfire hoped to sell the series. Therefore, keeping all of the elements separate until the final stage of mixing was crucial.
We were also confronted with constantly changing programs: each week Krossfire would send us new tapes, with scenes shortened or lengthened, added or cut, or moved around. This meant working in an environment where the audio could be arranged in 'chunks' and manipulated as required — in other words, the perfect environment for MIDI sequencing. The video editing was being done on an Avid (Macintosh‑based) off‑line nonlinear system. Early in the process, it was cheaper for me to dub SMPTE time code and generate 'window burn' in my studio (using Mark of the Unicorn's MIDI and Video Time Pieces) than for Krossfire to try to construct a timecode track on the Avid, so the first few edits were sent un‑striped. I would make two copies with timecode and 'window burn', and give one to Steve.
Exactly what timecode numbers were used for each version was not particularly important — eventually, of course, they would all have to match up, but this would just be a matter of adjusting the start (offset) times of each sequence, and the rest of the cues would fall right into place. Soon, however, Krossfire started supplying timecoded tapes, and the matter became moot.
A major criterion for the sound effects was that they couldn't overwhelm the visuals: it might be great fun to garnish the sinking of the Lusitania with gobs of explosions and screams, but it would seem out of balance, even grotesque, against the grainy stills and the jumpy black‑and‑white movie footage. Hitting every single gunshot and splash of an oar into the water with an appropriate sound would likewise be highly inappropriate. So we made the decision that except for very dramatic battle scenes, the effects would be more ambience‑oriented than event‑linked. Still photographs would have no effects under them at all. Some of the truly horrific scenes — gas attacks, shots of concentration camp labour — would also have no effects, and would hopefully be more chilling that way.
The equipment used to manipulate the sound effects consisted of a Roland S750 sampler with 18Mb of RAM, and a Kurzweil K2000S sampler with 40Mb. Both were under the control of a Macintosh running Mark of the Unicorn's Performer Version 4 software. We'll talk more about how the sound effects were created, edited, and placed in the second part of this article next month.
The criteria for the music were different. It needed to sound historically correct, although trying to make it sound as if it were recorded at the time of the scene being shown would be difficult, especially with MIDI equipment. Besides, a deliberately poorly‑recorded score would be distracting and would probably soon become annoying. So although we used modern instrumental sounds, we strove to make the music compositionally authentic.
To make the sound as 'classical' as possible, I composed on a Kurzweil K2000S equipped with the Orchestral ROM sound block and 8Mb of sample RAM. This was augmented by an Emu Proteus/1 with orchestral expansion, a Kurzweil 1000 PX with Sound Blocks A&B, a Roland SC55 Sound Canvas (great honky‑tonk piano, and very clean percussion), and three Yamaha TX7 modules. A Kat dk10 percussion controller was used for the ubiquitous drum rolls and military tattoos. Sequencing was completed on a Macintosh with Passport Designs' Master Tracks Pro 5.2.
Matching the music (see box 'The Music: Getting What the Producer Really Wants') to the video was, for the most part, a straightforward process. I'm an old hand at using SMPTE and MIDI Time Code with sequencers, the only difference in this project being the sheer size of it.
One interesting technical issue was that the producers were using drop‑frame SMPTE timecode — something you lucky folks outside the US, Canada, and Japan never have to deal with. Drop‑frame code is used in North American broadcast television because of the odd frame rate of NTSC video: about 29.97 frames per second. Since SMPTE code counts 30 frames in a second, SMPTE time and 'real' time can drift apart as a program goes along. By 'dropping' a couple of frames every minute or so (actually, their numbers are simply skipped — no picture is lost), you can keep the two time bases from drifting too far apart.
For industrial, educational, and other video programs that are not highly time‑sensitive, drop‑frame code is not usually necessary — the other kind, 'non‑drop', is close enough, and is easier to work with. But for broadcast television productions, which have to be timed accurately to the fraction of a second after 30 or 60 minutes or more, it's essential. Most sequencers that work with SMPTE and MIDI Time Code (MTC) have a provision for reading drop‑frame code. However, if you're not used to it (and I wasn't), you can get very confused: if you have two music cues exactly 15 seconds apart, and the first one starts at SMPTE frame 01:04:58:15, the second one doesn't begin where you would expect, at 01:05:13:15. Because two SMPTE frames are 'dropped' on the minute boundary, the starting frame of the second cue should actually be 01:05:13:17! On many occasions during the composing stage when the music seemed to go slightly out of sync with the picture, this arcane arithmetic turned out to be the culprit.
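The arithmetic is mechanical enough to sketch in code. Assuming the standard NTSC drop‑frame rule (frame numbers 00 and 01 are skipped at the start of every minute, except minutes divisible by ten), a pair of hypothetical helper functions can convert between timecode labels and an absolute frame count, confirming the 01:04:58:15 example above:

```python
def df_to_frames(h, m, s, f):
    """Absolute frame count for a drop-frame timecode label.
    Two frame numbers are skipped every minute, except every tenth minute."""
    total_minutes = 60 * h + m
    dropped = 2 * (total_minutes - total_minutes // 10)
    return 108000 * h + 1800 * m + 30 * s + f - dropped

def frames_to_df(n):
    """Inverse: drop-frame timecode label for an absolute frame count."""
    block, rem = divmod(n, 17982)          # a ten-minute block holds 17982 frames
    if rem < 1800:                         # first minute of a block drops nothing
        minute, frame = 0, rem
    else:
        minute = 1 + (rem - 1800) // 1798  # the other minutes hold 1798 frames each
        frame = (rem - 1800) % 1798 + 2    # frame numbers 00 and 01 were skipped
    total_minutes = block * 10 + minute
    h, m = divmod(total_minutes, 60)
    s, f = divmod(frame, 30)
    return (h, m, s, f)

# A cue 15 timecode-seconds (450 frames) after 01:04:58:15 lands on 01:05:13:17
start = df_to_frames(1, 4, 58, 15)
print(frames_to_df(start + 450))  # → (1, 5, 13, 17)
```

The function names are ours, for illustration only — any sequencer that reads drop‑frame MTC does this bookkeeping internally.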
Much use was made of Master Tracks Pro 5's 'Fit Time' feature. My first task when scoring a segment was to insert markers into a blank sequence conforming to visual events that I wanted to accent, or 'hit'. Often I could compose from one marker to the next by improvising at the keyboard. With the sequencer locked to the video (through a MIDI Time Piece), I watched the picture on the TV monitor, and as the markers passed on the computer screen, I'd play along with the changes in action and mood. If I liked a section I'd played, I would stop the sequencer, take it off‑line from the video, and fill out the orchestration. Then I'd add or subtract measures to make the cue fit the action more tightly (see screens, far right). Classical music often uses a technique called 'liquidation', in which a theme is repeated in shorter and shorter segments, as a means of building tension and approaching a climax, and I often made use of this technique to stretch cues. With a final trim of the tempo using 'Fit Time', the musical changes were placed absolutely dead‑on.
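Under the hood, 'Fit Time' is solving a simple equation: given the number of beats in a cue and the SMPTE duration it must occupy, there is exactly one tempo that makes the last beat land on the hit point. A minimal sketch of that calculation (the function name is ours, not Passport's):

```python
def fit_tempo(beats, duration_seconds):
    """Tempo in BPM that makes `beats` beats span exactly `duration_seconds`."""
    return 60.0 * beats / duration_seconds

# e.g. a 32-beat cue that must fill 16 seconds between two markers
print(fit_tempo(32, 16.0))  # → 120.0
```

In practice the sequencer rounds the result to its tempo resolution, which is why a final visual check against the picture is still worthwhile.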
I composed in short sequences, but so that things could go relatively quickly at the mixing stage, I then strung them together to make longer ones, some as long as 20 minutes. (Although the software can easily handle much longer sequences, moving around a huge sequence to find an individual measure or note can be very slow.) Serious problems could have arisen if each sequence used a different set of instruments and patches: cutting and pasting them together in Pro 5's 'linear' format would wreak havoc if a track that held a K2000 horn in one sequence held a Proteus orchestral percussion section in the sequence being appended to it. I therefore decided that every sequence would, at least initially, assign the same instruments and patches to the same tracks and MIDI channels.
To cut down the confusion even further, I limited myself at first to using just the K2000, and loaded into it Kurzweil's own General MIDI sound set (available on disk). Pro 5 has a built‑in 'device file' that lists all of the General MIDI patch assignments, which made it very easy to find sounds. Of course, a lot of the GM sounds were not particularly useful for the project (distorted guitar, shanai, koto, most of the sound effects bank, to name but a few), so I cleared those slots and filled them with numerous variations on orchestral strings, woodwinds, and brasses, as well as some of the Kurzweil's more effective atmospheric patches. I also made up a new percussion bank, drawing from the main ROM and the orchestral ROM, and including such useful sounds as snare rolls, orchestral cymbals and bass drums, gongs, police whistles, and train bells.
Using a little‑known feature of Pro 5, I saved an empty file, with all of my channel and program assignments, under the name 'template'. Every time I opened a new sequence after that, the template opened (with the name 'Untitled'). Even though I would be using different specific sounds in different sections, I designated four tracks for strings, five for woodwinds, four for brass, one for timpani, and two (assigned to the same channel) for percussion. The last channel had a 'silent' patch with a MIDI‑controllable reverb attached to it. I tried to minimise the number of patch changes within any sequence, so that when transferring material from one sequence to another there wouldn't be any unpleasant surprises. It wasn't possible to totally eliminate patch changes, however, and this led to some frantic searching for renegade program commands when sequences were combined.
Early on, the producers and I decided that the music track should be essentially mono. A wide stereo stage, or even a narrow one with instruments at different locations, would be distracting and out of place given the historical (and monochrome) nature of the footage. The keyboard version of the K2000 has only four independent outputs, so breaking the orchestra up into more than two stereo pairs would have been impossible anyway. So mixing in mono had the added advantage of being a lot easier.
Since the same channels were used in every sequence for similar instruments, output assignments could be made at the MIDI channel level, rather than having to go into each program and change the output parameters. String channels were assigned to the K2000's left 'A' output, and woodwinds to the right 'A' output. Brass went to the left 'B' output, and timpani and percussion to the right 'B' jack. The 'A' outputs went through the K2000's internal effects processor, while the 'B' outputs didn't. MIDI volume commands were inserted into every sequencer track to set relative levels, so that there would be little or no need to worry about fader movements during the music mix.
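A MIDI volume command is just a three‑byte Control Change message — controller number 7 on the relevant channel. A sketch of how such a message is built (the helper name is hypothetical; a sequencer constructs this for you when you insert a volume event):

```python
def channel_volume(channel, value):
    """Three-byte MIDI Channel Volume message: CC#7 on `channel` (0-15)."""
    if not (0 <= channel <= 15):
        raise ValueError("MIDI channel must be 0-15")
    if not (0 <= value <= 127):
        raise ValueError("controller value must be 0-127")
    return bytes([0xB0 | channel, 7, value])  # status byte, controller number, value

# e.g. set a string channel a little louder than the default
print(channel_volume(0, 100).hex())  # → 'b00764'
```

Because these events sit in the track data, the relative balance travels with every copy of the sequence — which is exactly why no fader riding was needed at the music mix.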
Extra synths were used not so much for their unique sounds, but because several cues needed more polyphony than the K2000 could comfortably provide. For example, holding down the sustain pedal during a timpani roll makes it sound much more realistic, but it also eats up voices fast. To take full advantage of their flexibility, the Proteus, Kurzweil 1000, and Sound Canvas were similarly set up as multiple‑mono sources: four solo string patches (for a string quartet) on the Proteus were assigned to the two Main and two Sub1 outputs, while brass and timpani went to the two Sub 2 outputs. Strings on the Kurzweil 1000 all went out at the left side, with horns and pads assigned to the right. The Sound Canvas piano was sent out of the left channel and percussion out of the right, with the on‑board reverb on the Sound Canvas being used on the percussion track. Some of the reverb leaked into the left channel, but it was no problem.
The ability of reverb to help make the instruments sound more realistic was important to me. As mentioned, the K2000's 'A' outputs went through its internal processor, with the instrument set up so that the 'master effects' channel was 16; this contained the aforementioned silent patch, whose reverb and pre‑delay could be controlled from a dedicated track on the sequencer. Changes in spatial characteristics were thus programmable and automated within every sequence. The K2000's 'B' outputs were fed into the two inputs of an AKG ADR68K reverb, set up in 'split' mode to simulate two different 1‑in/2‑out devices. The brass side used a program with a pronounced slapback echo and a long, fairly dull main reverb, while the percussion side used no pre‑delay, and a short, bright reverb. Their parameters were also controllable over MIDI, and another sequencer track was assigned to that duty.
Mixing was handled by a Mackie Designs CR1604, which has six effects sends. The Yamaha TX7s, Kurzweil 1000, and Sound Canvas, as well as the Proteus's Sub 2 outputs, were manually routed to the AKG as needed, using the mixer sends. The solo strings on the Proteus were fed to a Lexicon LXP1, set to a bright chamber program without any MIDI control.
All of the source signals were panned dead centre on the mixer. The outputs of all of the reverbs were stereo, however, and we decided to take advantage of that to give the sound some spaciousness. A second CR1604, combined with the first using Mackie's Mixer Mixer, was used to handle the reverb returns. The reverb was panned wide — if, when we got to the final mix, it sounded too wide, it would be easy enough to collapse. A dbx 166 stereo compressor/limiter was wired into the desk's outputs and set for modest peak limiting.
Next month, we'll discuss how the sound effects were recorded, organised, processed, sequenced, and synchronised, and also the process of mixing all of the audio elements of Blood & Iron together — using yet more Macintoshes!
Co‑producer Robert Ross, with the assistance of editor Cindy Kaplan‑Rooney, supplied me with 'scratch tracks' of music on the audio portion of early versions of the shows. Robert and Cindy are both very knowledgeable musically, and so were able to find recorded music that they considered appropriate for many scenes and put it on the scratch track. It was my job to interpret their ideas: sometimes I imitated the scratch music closely, although taking care not to copy it (at one point, Cindy was so fooled by the imitation that she was afraid we might all get sued); sometimes I borrowed from it sparingly, and sometimes I ignored it completely and substituted my own ideas. In almost all cases, the producers agreed with my decisions. One notable exception was a jaunty Hogan's Heroes‑style theme accompanying scenes of Allied tanks rolling into the Ruhr near the end of World War II. Robert explained that when the scene was shot there was still plenty of fighting going on, and in fact he had asked Steve at that point to put in distant sounds of gunfire — so the music shouldn't be quite so cheerful.
In a few cases, Robert had very specific pieces in mind for the soundtrack, which he sent to me on CDs that he had bought in Germany when he was researching the film. Mostly military marches, these pieces were all in the public domain. The recordings, however, were recent, and still under copyright. Rather than get licenses for the recordings, Robert asked me to transcribe the pieces from the records and re‑record them with the sequencer and synths. This didn't happen too often, which was fortunate for me — not because it was any more difficult than writing original music would be, but because my future composing royalties are based on the amount of original music in the show. Public domain music, alas, no matter how cleverly rearranged, doesn't generate any royalties.
I borrowed from the styles of a number of composers to underlay the historical footage. Between Robert, Cindy, and myself, ideas were borrowed from Mahler, von Weber, Satie, Debussy, Richard and Johann Strauss, Glenn Miller, Elgar, and Webern. I cheated a little bit in the first show (which ran though the end of World War I), by using a piece of imitation Beethoven I had written some years ago in a very different context.
While at college, I had been given a term project assignment to write a piece closely imitating a classical composer. I chose to use as my model a Beethoven piano sonata, the 'Waldstein', but using entirely new melodic, rhythmic and harmonic material. As an academic assignment, it was very successful, but as a piano piece, it was unplayable (too much unrelenting left‑hand repetition). The piece, however, seemed to fit very well into the film score, and I got even more mileage out of it by arranging sections for string quartet, wind band and full orchestra, and using them to convey very different moods.
For one segment, describing the attack on the Lusitania, Robert wanted something very close to Beethoven's 'Moonlight' Sonata, but he didn't want to use the actual piece — it was so well known that it would undoubtedly seem corny. So I took the mood and rhythm of Beethoven's piece, and imposed on them the melody and harmony (in minor mode) of my academic opus. The result was a haunting, original piece that would seem somehow familiar to audiences, even though they would never be able to positively identify it.
Other music that Robert and Cindy supplied as scratch material included the soundtracks from Das Boot and Schindler's List. These tracks were so appropriate for various scenes that I had to take extra care to make sure they were imitated very indirectly, as anything too similar to the originals would be easily identified. (We can't resist repeating a wonderful story told by composer John Williams about when he was first approached by director Steven Spielberg to score Schindler's List. Williams was so affected by the film that he told Spielberg, "You need a better composer than I for this." "You're right," replied Spielberg, "but they're all dead.")