Recording The Spoken Word

Published June 1994

Recording the spoken word requires a slightly different approach from recording sung vocals, with intelligibility taking over from musicality as the prime consideration. Paul White spreads the word.

When talking about recording the human voice, it's all too easy to make the assumption that we mean singing or vocals, but with the emergence of multimedia, and with more sound engineers working on sound for picture, recording the spoken word is an area of great importance in its own right. Even if you spend most of your time recording music, the chances are that at some time or another, you'll be approached to record an advertising jingle, or maybe a radio play, so it's important to know how to tackle voice recording when the need arises.

We are used to hearing our own voices in the context of typical domestic or workplace acoustics, but when we hear an announcer's voice on the radio, we are hearing it in a relatively dry studio. Radio studios aren't totally non‑reverberant, because that would sound unnatural, but it is essential that the reverberation time be short in order to maintain intelligibility.

Because various means can be used to simulate a natural acoustic environment, as may be necessary when recording a play or commercial, it is generally best to record the voice with as little contribution from room acoustics as possible. Most professional 'voice‑over' studios include a dedicated vocal booth which is almost anechoic, but this may not be a practical proposition in a home or small private studio, in which case there is a need to improvise. Instead of using a completely absorptive booth, we have to find other ways of ensuring that what enters the microphone is mainly direct vocal sound and that reflected sound from the room is, as far as possible, excluded.

A clear vocal sound starts with a good microphone, and a capacitor microphone is the best choice for two reasons:

  • Firstly, its extended bandwidth enables it to capture the subtle high‑frequency vocal detail that dynamic microphones tend to miss.
  • Secondly, virtually any capacitor microphone will produce less noise at a given recording level than a dynamic model.

Even a good back‑electret mic will yield excellent results, the best‑selling AKG C1000 being an excellent choice, but with the introduction of low‑cost Russian and (former) East German models such as the Oktava and Microtech ranges, almost anyone who's serious about recording can afford a good‑sounding mic.

Assuming that a vocal booth is out of the question, using a cardioid pattern mic will go a little way to excluding reflected sound coming from the sides and from behind the mic. A more serious problem is sound reflecting back into the mic from the wall behind the speaker, so to get the inverse square law on our side, the speaker should be situated well away from any walls, somewhere near (but not exactly at) the centre of the room. In a typical room, this will still result in some reflected sound finding its way into the mic, so the next step is to improvise some absorptive baffles. These needn't be complicated or expensive, and for a temporary solution, a duvet or sleeping bag draped over a clothes drying frame works exceedingly well. Ideally, an area of at least one square metre directly behind the speaker's head needs to be treated. A similar effect can be achieved using thick furniture foam or acoustic foam tiles, and if these are spaced away from the walls, they will be more effective than if they are fixed directly to them.
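To put rough numbers on the inverse square law argument, the short Python sketch below compares the path length of the direct sound with that of a single reflection from the wall behind the speaker. The distances are purely illustrative assumptions, and the calculation ignores wall absorption, the directivity of the voice and the mic's polar pattern, so treat the result as a ballpark figure only.

```python
import math


def level_drop_db(near_m: float, far_m: float) -> float:
    """Level difference in dB between sound that has travelled near_m
    and sound that has travelled far_m, assuming simple inverse square
    (free-field) spreading and nothing else."""
    return 20.0 * math.log10(far_m / near_m)


# Hypothetical set-up: mouth 0.2 m from the mic, wall 2 m behind the speaker.
mouth_to_mic = 0.2
mouth_to_wall = 2.0

# The reflection travels from the mouth to the wall, then back past the
# speaker to the mic.
reflected_path = mouth_to_wall + (mouth_to_wall + mouth_to_mic)

drop = level_drop_db(mouth_to_mic, reflected_path)
print(f"Reflection arrives roughly {drop:.1f} dB below the direct sound")
```

Under these assumptions, halving the distance to the wall shortens the reflected path and brings the reflection noticeably closer in level to the direct sound, which is why a position well away from the walls helps before any absorption is added.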

Having sorted out the area directly behind the speaker, remember that cardioid mics aren't completely 'deaf' behind and to the sides, so further absorption may be needed in these areas too. Fortunately, this needn't be elaborate either, and simple blankets hung over frames, as shown in Figure 1, will usually suffice. If the ceiling is very low or exceptionally reflective, it may help to suspend a foam acoustic tile on a boom stand about a foot above the mic.

As with sung vocals, a pop shield should be considered an essential, not an option — in the absence of any background music to hide the flaws, any pops or thuds will stand out like Cilla Black at an elocution exam. An improvised shield of fine mesh stocking material stretched over an embroidery hoop or wire frame (the classic bent coathanger) will work as well as any commercially available pop shield, and should be positioned between three and six inches away from the microphone, directly in‑line with the speaker's mouth.


Should you or should you not compress? The answer depends both on the speaker and on the project. If you're working on a jingle, you may need to use quite a lot of hard‑knee compression to keep the sound level even and to create the illusion of immediacy. In the case of a radio announcement or documentary voice‑over for TV, compression may still be desirable to keep the overall level even, but to keep the sound natural, it may be better either to choose soft‑knee compression, or to select a lower compression ratio in the case of hard‑knee models. For jingle work, a compression ratio of 8:1 or more might be needed, whereas for routine level control, a ratio of 4:1, or maybe even less, may be enough.

The lower the threshold is set, the more gain reduction will take place and the more obvious the compression will be. For routine work, trimming between 6 and 10dB off the peak level is usually enough, but for a really intimate, up‑front sound, you might have to go as far as 15dB or more.

Keep in mind that once the level has been brought back up after compression, 10dB of gain reduction also means a 10dB reduction in the signal‑to‑noise ratio, so if you need to use heavy compression, you have to ensure that your source material is as quiet as possible. To this end, it may help to apply some compression while recording and more when mixing, rather than doing it all in one go. If noise does become a problem, the judicious use of a noise gate or, better still, a dynamic noise filter, will usually improve matters considerably.
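The relationship between threshold, ratio and gain reduction can be sketched in a few lines of Python. This is only the idealised static curve of a hard‑knee compressor, with attack, release and soft‑knee behaviour left out, and the threshold and peak figures are illustrative assumptions rather than recommended settings.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level of an idealised hard-knee compressor (no make-up gain).
    Below the threshold the signal passes unchanged; above it, every
    `ratio` dB of input increase gives 1 dB of output increase."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


def gain_reduction_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """How many dB the compressor takes off a signal at input_db."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)


def noise_floor_after_makeup_db(noise_floor_db: float, reduction_db: float) -> float:
    """Noise floor once make-up gain equal to the peak gain reduction has
    been applied: the noise rises by the same number of dB."""
    return noise_floor_db + reduction_db


peak = 0.0  # hypothetical peak level in dB
for threshold, ratio in [(-12.0, 4.0), (-12.0, 8.0), (-20.0, 4.0)]:
    reduction = gain_reduction_db(peak, threshold, ratio)
    print(f"threshold {threshold} dB, ratio {ratio}:1 -> {reduction} dB off the peak")
```

With the peak at 0dB, a 4:1 ratio and a threshold of -12dB takes 9dB off the peak, and dropping the threshold to -20dB at the same ratio takes off 15dB. Restoring the peak level with make-up gain then lifts any background noise by the same amount, which is the signal‑to‑noise penalty described above.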


The secret of good spoken word recording is to use restraint when it comes to adding effects. Most of the time, a small amount of short room reverb is all that's needed — if you actually notice the reverb, it's probably too loud, unless you're trying to simulate a headmaster's speech or a church sermon. The room settings available on most reverb units are suitable for spoken word treatment, and if they are too bright, try rolling the top off the reverb returns; real reverb is invariably duller than its synthetic counterpart.

When trying to create the illusion of distance in a radio play, rather than just turning down the level, try rolling off a little top too. Conversely, when a person is supposed to be very close, brighten up the sound very slightly and reduce the level of reverb. Indeed, the reverb and tone controls can be used almost as a front‑to‑back panpot, as you can demonstrate using a recording of footsteps on a gravel path, created simply by walking on the spot. Start off with the footsteps quiet, roll off some top and add just a hint of reverb. Now, turn up the level, and at the same time, bring up the top end (back towards normal) and turn the reverb return level down. You may need three hands to do this, but with a little practice, you'll have the illusion perfected.
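If you are automating a mix rather than riding three controls by hand, the same front‑to‑back 'panpot' idea can be expressed as a single distance value driving level, top end and reverb return together. The Python sketch below is a hypothetical mapping: the function name and every dB figure in it are assumptions chosen for illustration, and in practice you would set the end points by ear.

```python
def depth_settings(distance: float) -> dict:
    """Map a distance control (0.0 = very close, 1.0 = far away) to three
    mixer settings, all in dB. The end-point values are illustrative
    assumptions, not measured or recommended figures."""
    d = min(max(distance, 0.0), 1.0)  # clamp to the 0..1 range

    def lerp(near: float, far: float) -> float:
        # Straight-line blend between the 'near' and 'far' settings.
        return near + (far - near) * d

    return {
        "level_db": lerp(0.0, -18.0),            # quieter with distance
        "treble_db": lerp(2.0, -6.0),            # slightly bright close up, duller far away
        "reverb_return_db": lerp(-30.0, -12.0),  # more reverb with distance
    }


# Footsteps approaching: sweep the control from far (1.0) to near (0.0).
for step in range(5):
    distance = 1.0 - step / 4
    print(distance, depth_settings(distance))
```

A straight‑line blend is the simplest choice; a curve that changes faster at the near end may feel more natural, but that is a matter for experiment.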

Working with the spoken word doesn't need elaborate or expensive equipment, and you can achieve quite acceptable results using a dynamic mic if you don't have a capacitor model (though a capacitor model will be noticeably better). It's true that you need a quiet environment, and you may have to drape a few blankets around the place, but you should be able to achieve professional results with only a little experimentation.

Matters Of Tone

  • If distant traffic rumble is a problem, try using the high‑pass filter on your mic or desk (or both at once).
  • If you feel the voice needs more impact, try an exciter or a specialist equaliser such as the SPL Vitalizer.
  • As a rule, the less EQ you use, the more natural the sound will be, but if you do have to use EQ, use it sparingly or you may end up with something hard hitting but grossly unnatural and unpleasant.
  • Pay particular attention to sibilance; if you can cure it by moving the mic position or changing the microphone type, that's preferable to trying to salvage a bad recording using a de‑esser.
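For anyone treating rumble after the event in software, the high‑pass filtering mentioned in the first tip can be illustrated with a very simple first‑order filter in Python. This is a sketch only: the 80Hz cutoff and 48kHz sample rate are assumed values, and the filters fitted to mics and desks are often steeper than the gentle 6dB/octave slope produced here.

```python
import math


def high_pass(samples, cutoff_hz=80.0, sample_rate=48000.0):
    """First-order (6 dB/octave) high-pass filter over a list of samples.
    Uses the standard RC-style recurrence:
        y[n] = a * (y[n-1] + x[n] - x[n-1])
    with the input and output assumed to be zero before the first sample."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    a = rc / (rc + dt)

    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A constant (0 Hz) offset is removed: the output decays towards zero.
steady = high_pass([1.0] * 48000)
print(abs(steady[-1]) < 1e-6)
```

Because the slope is so gentle, a filter like this will also thin out the lower part of a deep voice slightly, so it is worth checking the result by ear rather than applying it as a matter of course.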


Resist the temptation to position any scripts on a flat table, as sound reflected from the table top will adversely affect the recording. Pro voice‑over studios use metal mesh tables covered in fabric to prevent reflections, but a cheap music stand often works as well. If the scripts do need to be set out on a flat surface, place a 2‑inch thickness of furniture foam on top of a lightweight table and then put the scripts on the foam. In the case of a long script, laying the pages out side by side like this may be the only way to avoid having to turn them, and as you might imagine, it's almost impossible to turn pages without the rustling of paper being picked up by the mic.