With your lead vocal performance comped and tuned, and its timing tightened, is there anything else to do before you start mixing?
Over the past couple of months, I’ve written about comping vocal performances from multiple takes and correcting any tuning/timing issues they might have. And, up until about a decade ago, that was pretty much all there was to the mix‑preparation process, as far as vocals were concerned. In recent years, however, facilities for manipulating individual audio regions/clips within your DAW have continually improved, and professional vocal producers now routinely use the hell out of those functions to refine lead‑vocal sonics and lyric intelligibility. In this article I’m going to explore some of these techniques, and how you might get the best out of them.
The first vital tool is the humble ‘clip gain’ setting, which lets you apply a simple level offset to any audio region, and it’s advisable to use this to even out big level disparities between words/phrases before you apply any general mixdown compression.
But isn’t compression supposed to sort this kind of stuff out anyway? Well, yes, but compressors often won’t actually do what’s most musically appropriate. For example, if your singer accidentally moved closer to the mic for one phrase, then the compressor will just see more signal level and will compress more aggressively, potentially giving that phrase a different sound flavour or emotional character. By using clip gain to turn down the overloud phrase instead, before it hits the compressor, the compression and vocal sound remain more consistent. Similarly, if you’ve set up your compressor with a slow release time in search of a fairly natural sound, but then the singer suddenly hits one particularly loud note, the compressor may trigger heavy gain reduction that lingers too long, making subsequent notes too quiet. Again, a simple clip‑gain move neatly side‑steps this problem.
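To put some rough numbers on that first scenario, here's a minimal Python sketch. It assumes an idealised hard‑knee compressor with a ‑18dB threshold and a 4:1 ratio (both figures are purely illustrative, as are the function names), and it ignores attack and release behaviour entirely. It simply compares how much gain reduction a normal phrase and an overloud phrase would receive, before and after a clip‑gain trim.

```python
def db_to_linear(db):
    """Convert a level offset in decibels to a linear gain factor."""
    return 10 ** (db / 20)


def apply_clip_gain(samples, start, end, gain_db):
    """Return a copy of `samples` with a fixed level offset applied
    to the region samples[start:end], as a clip-gain setting would."""
    gain = db_to_linear(gain_db)
    return [s * gain if start <= i < end else s
            for i, s in enumerate(samples)]


def gain_reduction_db(level_db, threshold_db=-18.0, ratio=4.0):
    """Static gain reduction (in dB) of an idealised hard-knee compressor.

    Threshold and ratio are illustrative assumptions, not recommendations.
    """
    over = level_db - threshold_db
    if over <= 0:
        return 0.0  # below threshold: no compression
    return over - over / ratio


normal_phrase_db = -12.0  # hypothetical peak level of a typical phrase
loud_phrase_db = -6.0     # singer moved closer to the mic

print(gain_reduction_db(normal_phrase_db))       # 4.5
print(gain_reduction_db(loud_phrase_db))         # 9.0
# Trim the loud phrase by 6dB with clip gain before the compressor:
print(gain_reduction_db(loud_phrase_db - 6.0))   # 4.5
```

Under these assumptions the untreated loud phrase gets twice as much gain reduction as its neighbours, which is why it can end up sounding different in character. Once it has been trimmed with clip gain, the compressor treats it exactly like the rest of the performance.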
You might also want to use clip gain for more detailed rebalancing within vocal phrases. So where you’re planning on creating a heavily compressed vocal sound, say, it can make sense to fade down the breaths and sibilants with clip gain, knowing that these will invariably be overemphasised by most compressor designs otherwise. Plus some elements of a vocal recording are inherently more difficult to bring out in a busy mix than others, so I regularly use clip‑gain adjustments to improve the intelligibility of low‑energy consonants such as ‘h’, ‘l’, ‘m’, ‘n’, ‘ng’, and ‘v’. That said, sometimes singers will swallow a consonant so badly that there’s actually nothing there to boost! In those cases, I’ll often search for a replacement consonant from an alternate vocal take if possible, and occasionally a consonant from elsewhere in the timeline will also work, although your success there will depend heavily on the specific adjacent vowels in each instance.
Although clip‑gain settings are already tremendously useful on their own, the real power of region‑specific adjustment lies in the ability to apply EQ to individual audio snippets. For instance, if your singer has ‘popped’ the mic, resulting in strong low‑frequency wind‑blast pulses on plosive consonants such as ‘p’ and ‘b’, the best way to sort this out is by high‑pass filtering just the offending moments. Admittedly, you might be able to get similar plosive reduction by just high‑pass filtering the whole vocal channel, but that runs the risk of thinning out other sections of the performance too, unlike the more precisely targeted region‑specific strategy. Furthermore, it’s easy to apply different high‑pass filter settings to different plosives if some of them are more severe than others (as is often the case) — or even to deal with other sporadic low‑frequency problems such as the thumps of accidental mic‑stand collisions or a musician absentmindedly tapping their foot on the stand base.
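For readers who like to see the idea spelled out, here's a rough Python sketch of region‑specific high‑pass filtering. It assumes a very basic first‑order (6dB/octave) filter and a 150Hz cutoff, both of which are illustrative choices rather than recommendations, and the function names are my own. It stands in for whatever region‑based EQ facility your DAW provides.

```python
import math


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """Simple first-order (6dB/octave) high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    # Start from the first sample so the filter sees no jump at the region edge.
    prev_x = samples[0] if samples else 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def highpass_region(samples, start, end, cutoff_hz, sample_rate):
    """Filter only samples[start:end], leaving everything else untouched."""
    filtered = one_pole_highpass(samples[start:end], cutoff_hz, sample_rate)
    return samples[:start] + filtered + samples[end:]


# Hypothetical example: a 30Hz 'thump' lasting one second at 44.1kHz,
# with only the middle half-second treated as the offending region.
sr = 44100
thump = [math.sin(2 * math.pi * 30 * n / sr) for n in range(sr)]
cleaned = highpass_region(thump, sr // 4, 3 * sr // 4, 150.0, sr)
```

Only the samples inside the chosen region are altered, so the rest of the performance keeps its full low end, and each plosive can be given its own cutoff simply by calling the function again with different settings. Bear in mind that this sketch makes no attempt to smooth the region boundaries, so in practice you'd probably want short crossfades at the edit points to avoid clicks.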
A similar tactical approach can also elegantly address over‑prominent breaths or noise consonants (such as ‘k’, ‘ch’, ‘s’, ‘sh’, and ‘t’ sounds), because you can tailor the EQ processing to respond to each moment’s unique demands. For example, many studio condenser mics have a strong brightness peak in the 10‑12 kHz zone that doesn’t subjectively enhance vowel sounds nearly as much as it does noise consonants, so it’s easy to end up with an apparently paradoxical mix scenario in which the singer simultaneously seems to lack ‘air’ (because the vowels need more high end) and sound abrasive (because the noise consonants are over‑hyped). Applying simple 12dB/octave low‑pass filtering to just the abrasive consonants can sort this out instantly, although you may want to tweak the filter’s cutoff frequency...