
A Practical Guide To Working With Pictures, Part 3

Tips & Tricks By Hugh Robjohns
Published August 2000

The major players in the market for post‑production and dubbing mixers are Harrison, SSL and AMS Neve. This is the Neve Logic 2 desk in the Television Dubbing Theatre at Pinewood Studios.

Hugh Robjohns looks at how a music track for film or video is handled at the dubbing stage, and explains much of the jargon associated with this element of a programme's construction. This is the last article in a three‑part series.

Composing, recording, mixing and producing your music tracks may seem a hard enough achievement — and it undoubtedly is — but there is still a long way to go before your music sees the light of the cathode ray tube! The dubbing stage is where all of the sound elements are combined to produce the final soundtrack and, as with any specialist area of sound production, the many complex techniques and the indecipherable jargon can be completely bewildering to the unfamiliar. The aim of this article is to help to demystify the process so that, should you be invited to attend the dub, you will stand a fair chance of understanding what is going on.


Akai's DD8 digital dubber is a popular magneto‑optical 8‑track recorder designed for post‑production work.

Long before the dub (or possibly running alongside if it is a very long dubbing process, such as for a feature film), the dubbing editor(s) will have been assembling the required sound elements. In days of old, this would have meant transferring everything to an audio equivalent of the picture film — essentially 16 or 35mm recording tape with sprocket holes punched down one side, called 'sep‑mag' (short for separate magnetic track). By 'everything' I am referring to all of the constituent sounds — the dialogue and sound effects recorded on location, ADR (dialogue replacement tracks recorded in the studio), foley effects (specific synchronous sound effects, such as footsteps, replicated by someone watching the on‑screen action in the studio), library sound effects typically from CD collections, and music (your bit!).

Dozens (or even tens of dozens) of these sound reels would typically be created. Each reel would contain a logical group of sounds such as just location dialogues, various background effects, foreground 'spot' effects, music and so on. Each sound element would be physically edited to the right length and positioned within the reel using silent leader tape to ensure accurate synchronisation with the associated pictures (which would already have been edited).

Having created each reel, the dubbing editor would draw up a 'dubbing chart'. This consisted of a collection of large sheets of paper, each divided into vertical strips representing the various reels. Shaded areas on the chart represented sound elements on each reel to show the mixer what sounds were on which reels and at what times. The shape of the shading gave an indication as to how the editor envisaged the various tracks being combined, and there were usually written notes too.

These days, the role of dubbing editor remains, but the numerous sound elements are typically compiled and loaded into some form of hard‑disk audio workstation (Pro Tools or Akai systems, for example). Once on the hard drive or magneto‑optical (MO) disk, these elements can then be assembled in a playlist to create many tracks (performing the role of the old reels) with each audio element positioned to match the required synchronisation. Another advantage of workstations is that the playlist screen forms an automatically scrolling dubbing chart, allowing the mix engineer to see when various sound elements are about to happen, and on which tracks.

The Dub

The sheer number of different elements involved in a feature film soundtrack means that major dubbing studios can require substantial desks. This is the SSL 5000M in Theatre 1 at Pinewood.

For the dub, the various pre‑edited audio components, arranged on a collection of hard drives (often from a range of different systems), are moved across to the dubbing theatre and configured to replay into the mixing console. It is not unusual to find that the dialogues have been compiled on one workstation system, the sound effects on another and the music on a third — dubbing editors, not unreasonably, like to use the editing tool best suited for each particular task. Until recently, it has not been possible to take a disk pack recorded on a Pro Tools system and install it in, say, an Akai system or an Audiofile, and still be able to edit and modify the files (although it has been possible to play native files across many machines for some time).

However, the recent AES31 specification for hard disk audio systems defines a common file format and protocol which should make audio workstations from different manufacturers entirely compatible... when they have all implemented it, anyway. At last, we might re‑attain the ability to work with material recorded on any platform, on any other platform — just as a reel of quarter‑inch tape recorded on a Revox could be replayed and edited on a Tascam, Studer or Otari machine!

Not all audio sources at a dub will necessarily be replayed from hard‑disk formats. There may also be a range of material on tape‑based systems, typically DTRS machines, but also timecode DAT. All of these disparate audio replay devices would be synchronised to the picture source via timecode — the pictures being projected from a conventional film projector, a video master, or possibly from a professional random‑access digital video recorder of some form. Film does not have an inherent timecode capability, and film projectors instead produce something called a 'bi‑phase' control signal. This conveys the speed and direction of the film transport, but not its absolute position. However, bi‑phase‑to‑timecode converters are available for synchronising more familiar transports.
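As a rough sketch of what a bi‑phase‑to‑timecode converter has to do, the code below accumulates signed transport pulses (forward or reverse) into a frame count relative to a known sync mark, then formats the result as timecode. The pulses‑per‑frame figure and the 24fps frame rate are illustrative assumptions, not values from this article.

```python
# Illustrative sketch only: real converters and pulse resolutions vary.
PULSES_PER_FRAME = 4   # assumed bi-phase resolution per film frame
FPS = 24               # assumed film frame rate

def frames_from_pulses(pulses):
    """Accumulate signed pulse counts (+1 forward, -1 reverse) into a
    frame position relative to a known sync mark."""
    count = 0
    for p in pulses:
        count += p
    return count // PULSES_PER_FRAME

def timecode(frame, fps=FPS):
    """Format an absolute frame number as HH:MM:SS:FF."""
    secs, ff = divmod(frame, fps)
    mins, ss = divmod(secs, 60)
    hh, mm = divmod(mins, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

# Ten seconds of forward running from the sync mark:
pos = frames_from_pulses([+1] * (10 * FPS * PULSES_PER_FRAME))
print(timecode(pos))  # 00:00:10:00
```

The key point is that bi‑phase only yields relative motion; an absolute position emerges only once a sync mark has been established.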

Although digital workstations locate and begin playing very quickly, and modern tape transports synchronise far faster than any previous generation, it still takes several seconds for a complex timecode synchronisation system to achieve frame‑lock and stable speed. This, however, is a lot quicker than a machine room full of 'clockwork' sep‑mag recorders!

The Dubbing Theatre

Dubbing theatres come in many different shapes and sizes. The smallest are one‑man operations, typically centred around something like a Yamaha O2R console with an Akai or SADiE workstation. Moving up the scale are AMS Neve Logic consoles with Audiofile editors, and the Fairlight FAME systems, to name but two from the many alternatives. The top‑end dubbing theatres, generally only used for mega‑budget feature films, employ two or three audio engineers sitting behind an enormous console with a vast array of audio sources. Harrison, SSL, and AMS Neve are the major console players in this part of the market, and digital consoles are increasingly becoming de rigueur.

The idea of the multi‑operator console is that each engineer can look after a specific subset of all the audio sources during each premix pass, thereby making it much faster and easier to handle the huge number of original sound components and to produce the finished result. This approach makes a lot of sense when you realise that a major feature film could easily have well over a hundred reels of audio tracks running at any one time, many of these sound elements being stereo or even surround encoded already, all needing sophisticated equalisation, panning and other types of signal processing!

The Process

The dubbing process usually starts with 'premixes'. This is the first stage of submixing to reduce the hundreds of separate audio elements into something rather more manageable! The various background effects might be mixed first to create a couple of different background premixes, then the spot effects, foleys and so on. The dialogues will also be premixed, combining the location and ADR voices to produce a seamless voice track.

Eventually, the premixes will, in turn, be mixed to produce the M&E (see the 'Music & Effects' box) and, finally, the finished soundtrack. Another term frequently used instead of premix is 'stem'. A stem is a complete premix, usually in a multitrack format arranged for surround sound. So you might have an effects stem, a music stem, a dialogue stem, and so on.

Again, the practice has changed somewhat with modern technology. This process used to involve recording the premix onto a new reel of sep‑mag so that the dozens of separate original reels would gradually be replaced with a smaller number of premix reels. Now, the premixes are either recorded directly onto a digital workstation or, in the case of the largest consoles with enough inputs to accommodate all the replay audio sources simultaneously, the audio is not recorded at all — the console automation data is recorded instead!

One of the major problems with physical premixes is that if some part of the balance doesn't work when all of the premixes are combined (if, for example, just one spot effect is a little too loud or quiet in the context of the complete mix), the offending premix reel has to be remixed to correct the problem. In the past that meant taking off all the other premix reels, loading up the original effects tracks and remixing all (or part) of the relevant premix. Then the source reels were replaced by the other premixes and another trial of the final balance could be done — a very slow and laborious way of working!

The advantage of the huge modern digital consoles with advanced automation is that they can accommodate a vast number of input sources with negligible deterioration of the noise performance, and tweaking a premix balance is very easy and incredibly fast. Making a minor change to the automation data can usually be done on a computer screen in seconds, or the appropriate control can be accessed and adjusted on the fly, the necessary alteration being made live during a second pass.


Music & Effects

M&E stands for 'music and effects' and is a complete soundtrack mix of everything except the dialogue tracks. The M&E is used as the basis for foreign‑language versions of a programme, where all the voices would be dubbed in the relevant language in a specialist studio. One of the reasons for performing premixes, aside from the need to reduce the number of sound components to something easier to handle when constructing the final mix, is that it naturally leads to the M&E mix as part of the process.

Any film or television programme with potential overseas sales should have an M&E track made up, although many production companies try to cut costs by not having this premix recorded separately. It is, of course, far easier (and considerably cheaper) to prepare an M&E at the time of the dub, rather than having to come back to it some time later and work through all the premixes again!

It is also possible that a number of different versions of the M&E mix might have to be made because of copyright clearances on the various music tracks. For example, it might be possible to acquire rights to use a commercial music track in a programme for public viewing in the UK, but not for use overseas. In this case, a separate 'world' M&E would have to be rebuilt using alternative music tracks.

Final Mix

The nature of the final mix depends, to some extent, on the intended audience. For example, the acoustic character of a large cinema is entirely different to that of a domestic living room, and the balance (both spectrally and dynamically) would be adjusted accordingly. Loudspeaker monitoring systems are aligned to well‑documented standards for different end‑listening conditions — the 'X‑curve' is used for cinema mixes, for example, where a lot of HF energy is built in to the mix to compensate for the air losses in a large auditorium. That is one reason why THX‑approved home‑cinema systems roll off the high end a little to avoid feature films appearing excessively bright in a domestic environment.
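The X‑curve trade‑off mentioned above can be put into numbers. The sketch below computes the commonly quoted target monitor response (flat to 2kHz, then rolling off at 3dB per octave); the knee frequency and slope are the widely cited figures rather than values from this article, so treat them as assumptions to be checked against the standard itself.

```python
import math

# Illustrative figures, commonly attributed to the SMPTE/ISO X-curve.
def x_curve_db(freq_hz, knee_hz=2000.0, slope_db_per_octave=3.0):
    """Target cinema monitor response in dB at a given frequency:
    flat up to the knee, then attenuated per octave above it."""
    if freq_hz <= knee_hz:
        return 0.0
    octaves_above = math.log2(freq_hz / knee_hz)
    return -slope_db_per_octave * octaves_above

print(x_curve_db(1000))   # 0.0
print(x_curve_db(8000))   # -6.0 (two octaves above the knee)
```

Because the cinema monitors are rolled off like this, mixers push extra HF energy into the mix to compensate — which is exactly why the same mix sounds over‑bright on flat domestic systems unless the home‑cinema chain rolls it off again.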

The final mix, whether in stereo, matrix or discrete surround, would usually be archived onto a timecode DAT or DTRS tape and eventually laid onto the film or video master, again with timecode as the synchronisation reference. If it is anticipated that some further dubbing work might need to be done (such as re‑editing a long programme into a series, or vice versa), then the various premixes might also be archived as stems onto tape or hard drives to minimise the amount of remixing (and therefore cost and time) required.

Reverse EQ

Sep‑mag machines really only ran at normal speed — either forwards or backwards — which meant that if you ran through a section to rehearse a premix pass, you had to run backwards through it again to get back to the top. Dubbing mixers quickly learnt to use this reverse play pass to their advantage and to minimise wasted time, by building the fader balance on the forward pass and, for example, making any equalisation adjustments as the tracks replayed backwards! With digital workstations, the ability to instantly recue has made this skill redundant, but it was a fascinating thing to watch and hear.

Lightpipes Explained

The dubbing chart is a major aid to the dubbing mixer, but another visual assistant, usually only found in the older, film‑based theatres, is the lightpipe. This system looks a little like a collection of giant LED bar‑graph meters under the projection screen, but in fact each represents the timeline of a specific replay track. A typical system might have 100 LEDs in each bar with a driver circuit which detects the presence of audio above a preset threshold. The audio signal would be derived from a pre‑read head on an analogue replay system (or in conjunction with a delay unit in a digital system) operating typically four seconds in advance of the real output appearing at the sound desk.

If audio is present, the first LED would be illuminated, and this illumination then clocked down the line once per video frame (25 frames per second). The visual effect is of a block of illuminated LEDs travelling across the bar graph, the length of the block directly related to the length of the sound. When the block reaches the right‑hand edge of the bar‑graph, the sound is heard. In this way, the mixer can prepare to open and close the fader with greater accuracy than is possible from looking at the dubbing chart and comparing timecode readings.
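As a rough illustration of the logic described above, the sketch below models a single lightpipe bar as a 100‑stage shift register clocked at 25 frames per second — which is why the four‑second pre‑read matches the bar's travel time (100 ÷ 25 = 4 seconds). The function and variable names are hypothetical; real driver hardware obviously differs.

```python
LEDS = 100   # LEDs per bar, as described in the text
FPS = 25     # one shift per video frame

def advance(bar, audio_present):
    """Shift the bar one LED towards the right-hand (screen) end and load
    the new audio-presence flag at the left. Returns (new_bar, audible):
    a flag falling off the right edge means that sound is heard now."""
    audible = bar[-1]
    new_bar = [audio_present] + bar[:-1]
    return new_bar, audible

# A one-second burst of audio entering an empty bar:
bar = [False] * LEDS
for frame in range(FPS):          # 25 frames with audio present
    bar, _ = advance(bar, True)
for frame in range(3 * FPS):      # three further seconds of silence
    bar, _ = advance(bar, False)
# One more frame and the leading edge falls off the right-hand end:
bar, audible = advance(bar, False)
print(audible)  # True -- the sound becomes audible four seconds after it lit up
```

The travelling block of lit LEDs is simply the audio‑presence flags working their way along the register, so its length on screen is directly proportional to the duration of the sound.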


Dropping In

The chances of mixing several dozen effects reels to produce the perfect premix in a single pass are pretty slim, even for the most experienced dubbing mixers. Consequently, it is necessary to have some means of dropping in to a mix just before the point where it all went horribly wrong, and then continuing to mix until the next time it falls apart! Drop‑ins are nothing new, of course — musicians have been dropping in on separate tracks of a multitrack recorder to replace a guitar part, for example, for decades. However, dropping in on a complete mix is a little more challenging.

The problem is that at the point of the drop‑in the old mix and the new mix have to be 100 percent identical, otherwise the mismatch will produce a click, a gap or some other clearly audible artefact. Also, if the synchronisation of the disparate replay sources is not stable, there might even be some audible phasing or flanging through the drop‑in crossfade period! With console automation, matching the positions of the faders is relatively easy, but in days of old the positions of everything had to be remembered (or marked with wax pencils on the fader escutcheons!) and the balance matched by ear.

This is where the term 'PEC/Direct' switching raises its head. Look at a brochure for any dubbing console and you will find mention of this obscure‑sounding facility. In fact, this is really nothing more than a set of multi‑channel monitoring switches — multi‑channel because of the need to record either stereo, matrix surround (four tracks), or discrete surround (six or eight tracks). PEC stands for 'photo‑electric cell', which was the original method of decoding the optical sound track on a film. Although optical tracks are still available on every theatrical release film print, they are not generally used in the dubbing process any more. However, the term has continued to be used, these days referring to the output of the multi‑channel recording device (tape, sep‑mag or hard disk). The 'Direct' position refers to the mix output of the console.

The idea is that by operating the PEC/Direct switch rapidly back and forth as the film plays over the section immediately prior to the point where it all went wrong, it is possible to compare the previous recording pass with the current output of the console. The dubbing engineer can then adjust the faders and other controls until the sound is identical in both positions of the switch. At this point it is possible to drop into record, safe in the knowledge that the punch‑in will be inaudible, and have another bash at mixing. If it all goes wrong again the next drop‑in would normally be performed slightly earlier than the last one, otherwise you could end up with a sequence of drop‑ins in close proximity which might become audible.
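The matching step can be illustrated in code. The sketch below compares a recorded pass ('PEC') with the live console output ('Direct') and judges the drop‑in safe when the level of the difference between the two is far enough below the programme level. The −40dB threshold and the test signals are purely illustrative assumptions, not figures from the article.

```python
import math

def matches(pec, direct, threshold_db=-40.0):
    """Return True if the RMS level of the difference between the recorded
    pass and the live mix, relative to the recorded pass, sits below the
    threshold -- i.e. the two would crossfade without an audible step."""
    n = len(pec)
    diff_rms = math.sqrt(sum((p - d) ** 2 for p, d in zip(pec, direct)) / n)
    ref_rms = math.sqrt(sum(p ** 2 for p in pec) / n)
    if diff_rms == 0.0:
        return True
    return 20 * math.log10(diff_rms / ref_rms) < threshold_db

# 10ms of a 440Hz tone at 48kHz as a stand-in for programme material:
recorded = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(480)]
live_matched = [0.999 * s for s in recorded]   # faders matched almost exactly
live_wrong = [0.5 * s for s in recorded]       # one fader 6dB low

print(matches(recorded, live_matched))  # True
print(matches(recorded, live_wrong))    # False
```

In practice, of course, the dubbing mixer performs this comparison by ear while flicking the PEC/Direct switch, but the principle is the same: the residual mismatch must be small enough to disappear under the programme.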