Just how do you make a mix sound loud without squeezing the life out of the music? Find out in our in-depth guide...
Whatever your views on the 'Loudness Wars' and the irreversible sonic damage (discussed throughout this article) that can be done to tracks in pursuit of loudness, when it comes to rock or pop music, and some electronic and urban styles, some degree of loudness processing inevitably becomes necessary. Why? While some playback devices include automatic level balancing, not all do, and tracks played on the radio or in shuffle mode on an MP3 player with no automatic level balancing need to be at nominally the same subjective level as commercial tracks if they're to bear comparison with those tracks.
Even if you don't want to mix loud, if the client expects it, you'll probably have little choice. You could educate the client on the benefits of a quieter, more dynamic mix; walk away from the project; or do your best to achieve a good but loud mix. If you choose to ignore their expectations, they can still invite a mastering engineer to make your mix loud, and you'd have no control over the impact this has on the tonal balance of your mix. Far better in this scenario to do your mixing with a loud result in mind. In this article, we'll explore tools and techniques to use during the mixing and mastering stages to make your mixes subjectively louder, and to create a tonal balance that will work in a loud track.
The Human Factor
Loudness seems such a simple concept: the higher the acoustic level, the greater the excursion of the eardrum, and our brains interpret this as meaning louder. So, you push a fader up or apply an EQ boost and things get louder; drag it down or cut frequencies and things become quieter. But our perception of loudness depends on much more. It's perfectly possible for two pieces of audio that a meter tells us are equal in level to be very different in perceived volume, and there are a number of reasons for this. If you're to follow any strategies for loudness, you need first to understand a little about how we humans perceive loudness...
1. Our 'Frequency Response'
The human hearing system is far from being ruler-flat. The ear's sensitivity to different frequencies varies dramatically with the acoustic intensity of the sound. The biggest difference is in our perception of low frequencies, and the flattest response occurs for sounds around 80-90dB SPL. Even then, it isn't flat, because of the ear-canal resonance and a natural insensitivity to high frequencies, as shown in the Fletcher-Munson graphs, or 'Equal Loudness Contours' (pictured). These comprise a set of sensitivity curves at different levels, corresponding to the way a typical human hears sound.
At a nominal listening level (say, 83dB SPL), the response below 1kHz is pretty flat (within about 5dB). The sensitivity peak between roughly 2kHz and 6kHz is always there regardless of SPL, because it is caused by a physical resonance of the ear canal. Above about 6kHz, the sensitivity reduces by almost 20dB at 20kHz (ref. 1kHz). It follows that the more energy there is in the 2-6kHz region, the louder something will seem to be. That's one reason why an excessive boost here can make a mix sound aggressive and almost painful to listen to when played back at higher levels. It also explains why a subtle boost in this area, or choosing a bass synth with plenty of mid-range content, can help some sounds stand out better in a mix.
2. Energy: Frequency & Duration
Natural or acoustic low-frequency sounds generally contain more energy than higher-frequency ones. This is one reason (though not the only one) why, when you put a spectrum analyser across a well-balanced mix, bass frequencies are usually shown to be the highest in amplitude, with the level falling off (at about -3dB per octave) as you move up the spectrum. Don't be tempted to aim for a 'flat-line' frequency balance in your mix! Our loudness perception is also influenced by the duration of sounds, again because we're sensitive to the amount of energy in a sound. The longer a sound lasts, the more energy it contains, so very short sounds (drum hits, for example) will sound quieter than sustained sounds of the same amplitude and frequency content. This is one reason why gated reverb makes drums sound bigger: it makes the individual hits longer.
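To put rough numbers on the duration point, here is a minimal Python sketch (our own illustration, not a loudness meter) that compares a sustained tone with a short burst of the same peak amplitude. The 440Hz test tone, the 8kHz sample rate and the one-second measurement window are all arbitrary assumptions, and RMS level is only a crude stand-in for perceived loudness.

```python
import math

FS = 8000  # sample rate in Hz (arbitrary choice for this illustration)


def tone(freq_hz, on_seconds, total_seconds, fs=FS):
    """Sine tone that sounds for on_seconds, then silence up to total_seconds."""
    n_on = int(round(on_seconds * fs))
    n_total = int(round(total_seconds * fs))
    return [
        math.sin(2.0 * math.pi * freq_hz * n / fs) if n < n_on else 0.0
        for n in range(n_total)
    ]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def rms_db(samples):
    """RMS level over the whole window, in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


sustained = tone(440.0, 1.0, 1.0)  # lasts the whole window
burst = tone(440.0, 0.1, 1.0)      # same peak, one tenth of the duration

print(f"peaks: {peak(sustained):.3f} vs {peak(burst):.3f}")
print(f"RMS:   {rms_db(sustained):.1f} dB vs {rms_db(burst):.1f} dB")
```

Both signals would read the same on a peak meter, yet the burst carries about 10dB less energy over the window, which is the gap that techniques for lengthening short sounds try to close.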
3. Dynamic Range
The human hearing system also responds to the dynamic range of sounds, by which we mean the contrast between the loud and quiet parts of a sound. Broadly speaking, the greater the difference between loud and quiet, the greater the impression of the loudness of the loudest bits. This is why mixes, or even individual sounds, that have been over-compressed seem to lose their vitality. Dynamic range is more complex than this description implies, particularly when it comes to actually measuring it and our perception of it. For example, dynamic range might be a very short term thing, or it might relate to the changes in level over the duration of a song. To explore the subject in more depth, see the 'Metering' box, and read Emmanuel Deruty's article in SOS September 2011 (/sos/sep11/articles/loudness.htm). With a basic understanding of how we perceive loudness, we can formulate strategies to increase a track's subjective level, so let's look at the techniques we can use.
Where To Start?
We've said it already, but it's worth repeating: making a mix sound loud is not all about loudness-enhancement plug-ins or the mastering process. Many 'newbs' are thrilled at the promise offered by strapping a brick-wall limiter, or a 'maximising' plug-in like Sonnox Inflator or one of the Waves L-series plug-ins, across the mix-bus or after bouncing a track down: everything seems louder and better; there's more 'in-your-face attitude'; and so on. There can be a place for such plug-ins during the mix and mastering stages — but they're a very bad place to start in your quest for loudness.
In a digital recording system, there's a limit beyond which the actual level of the waveform simply cannot be further increased, and we call this 'Digital Full Scale'. Any loudness enhancing trickery has to be performed below and up to this level, as there's nowhere for the signal to go above that — if you try, the waveform will simply be clipped and anharmonic aliasing distortions will be generated as a result. (If you know what you're doing, clipping can nonetheless be a useful tool during the mix — more about that later).
There are more ways to manipulate loudness below this 0dBFS ceiling than you might think: we can control the average signal level using dynamics processing (for example, compression, transient design), control peak excursions using dynamic control (limiting, clipping) and make spectral adjustments (EQ, harmonic enhancers). We can also automate faders, and use tools such as stereo-width enhancers to good effect. However, don't forget that we can also influence the perception of loudness by choosing appropriate sounds in the first place. Initial musical arrangement and instrumentation decisions are extremely important. (You can find some bonus web-only material about this at /sos/mar12/articles/loudandproudmedia.htm).
Sculpting The Spectrum
Several tools and techniques fall into the category of spectral manipulation, from choosing an initial sound with an appropriate harmonic balance to using EQ and distortion to modify the sound's harmonic balance. If you trawl through some of the bass patches in a software synth and adjust them so they all read the same on your DAW's peak meters, you can compare how loud or quiet they seem. You'll find that the more they have in the way of mid-range harmonics, the louder they'll appear to be. Bass sounds that are close to being pure sine waves sound much quieter. (We may still choose to use them in some forms of music, just to get trousers flapping in front of a big club PA system!)
Low-frequency sounds can cause particular problems, as you can end up with huge-amplitude signals at extremely low frequencies that are almost inaudible. This wastes headroom, thus inhibiting your ability to make the track loud. Bass instruments and kick drums are obviously supposed to generate low frequencies, but the beater impact on a kick-drum skin or the hand pushing on the bass strings can generate extremely low-frequency components that aren't musically useful. Even vocals can cause problems: a pop filter may solve audible issues, but you'll often still see superfluous low-frequency energy peaking on plosive sounds.
The answer is to use low-cut (high-pass) filtering to remove anything below the area of interest. Cutting the lows on a bass or kick drum below around 40Hz can really help conserve headroom, enabling you to make the useful part of the sound louder. For non-bass instruments, cutting below 60 to 80Hz can help clean up the sound while saving headroom, but you can get away with surprisingly high turnover frequencies on some instruments in a busy mix, and this will help reduce muddiness in your mix too. Set the turnover frequency by ear, making judgements in the context of the mix, and you shouldn't go far wrong.
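For anyone curious about what such a filter does numerically, the sketch below builds a second-order high-pass using the widely published 'Audio EQ Cookbook' biquad formulas and prints its gain at a few frequencies. It is only an illustration under our own assumptions (a 12dB/octave Butterworth-style slope, a 44.1kHz sample rate and the 40Hz turnover mentioned above); the filters in your DAW may use different slopes and designs.

```python
import cmath
import math


def highpass_coeffs(cutoff_hz, fs, q=1.0 / math.sqrt(2.0)):
    """Second-order high-pass biquad ('Audio EQ Cookbook' form).

    Returns (b, a) normalised so that a[0] == 1.
    """
    w0 = 2.0 * math.pi * cutoff_hz / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def gain_db(b, a, freq_hz, fs):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))


def apply_biquad(samples, b, a):
    """Filter a list of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


FS = 44100
b, a = highpass_coeffs(40.0, FS)  # 40Hz turnover, as suggested for kick and bass
for f in (10, 20, 40, 80, 200):
    print(f"{f:>4} Hz: {gain_db(b, a, f, FS):6.1f} dB")
```

With these settings the response is about 3dB down at the turnover frequency and roughly 12dB down an octave below it, while content at 200Hz passes practically untouched. That is why the audible part of a bass sound survives while the sub-sonic energy that eats headroom is reduced.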
EQ: Boosts & Cuts
EQ boosts can be applied to emphasise the most audible elements of a sound — but it's all too easy to make something sound abrasive and unpleasant. A safer way of increasing loudness through brightness is to use a high shelving filter, set somewhere between 8 and 10 kHz, and then add a little boost to open up the high end. This can add detail and clarity without boosting the 4kHz region where an unpleasant harshness might otherwise result. This is a very common technique for lifting vocals and acoustic instruments out of a mix, but bear in mind that some elements need to sound more forward than others: if you try to lift out everything, you'll just end up with an unpleasantly bright mix!
An alternative approach is to use one very high quality EQ to apply a shelving boost on the master bus at the start of your mix, and sculpt individual tracks accordingly while mixing. The theory is that you don't need to boost with so many 'lesser' EQs on individual sources. That's less of an issue with powerful computers and the availability of multiple instances of plug-ins, but it can still be helpful if using the very best DSP-intensive software EQs in this role.
Another application of EQ is to use a mid-range dip to mimic those Fletcher-Munson curves mentioned earlier. This is essentially what the 'Loudness' button does on some hi-fi systems: a 'smile' curve (so-called because of the appearance on a frequency/amplitude graph such as you find on many plug-in EQs) is applied so that the lows and highs seem to be pushed forward slightly, while the 200Hz to 1.5kHz region is pulled back a touch.
You can do this with individual elements in a mix where appropriate, a typical use being to create that 'scooped mid' rock guitar sound. Not only does the mid-range scoop create the impression of a big sound, even at lower volumes, it also makes a little more room for other voices and instruments competing for the same spectral space. Be careful of taking this too far when processing a full mix, though, because cranking up the levels on a playback system will reinforce the effect, which may sound terrible.
Adding Harmonic Interest
EQ can only rebalance frequencies that are already present: there's no point trying to EQ a sine-wave bass line, as there's nothing there other than the fundamental pitch, so all EQ will do is make the sound louder or quieter. This is where distortion comes into play. Distortion is not just for guitar, and we're not talking about fuzz boxes here! Subjecting any audio signal to a non-linear amplification process (such as tape saturation, a hard-driven valve circuit or an overloaded mixer input) will add harmonics that weren't present in the original signal.
When it comes to loudness enhancement, distortion works on two levels. First, the added harmonics make the sound more audible, pushing it towards the front of the mix. Second, overdriving an analogue circuit or analogue tape (or an emulation of them) also alters the dynamics of the sound. At lower levels, there's very little distortion, but as the level increases soft clipping is applied to the signal peaks, thus reducing the dynamic range. As the level is increased further, the amount of 'squash' applied to the signal peaks increases, until eventually the signal reaches the maximum voltage that the circuit can pass — at which point hard clipping occurs.
In some ways, then, this progressive distortion is like compression. It can reduce the dynamic range of the signal, and this has the effect of increasing its average level... and as we know, increasing the average level of a sound also increases its subjective loudness. The fundamental difference is that compression works on the overall level envelope of the sound, so little harmonic distortion is introduced, whereas overdriving a circuit changes the shape of the waveform.
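Both effects of distortion can be seen in a few lines of Python. The sketch below is a toy model, not an emulation of any particular tape machine or valve circuit: it assumes a tanh waveshaper as the 'soft clipper' and a drive setting of 3, then measures the crest factor (peak-to-RMS ratio) and the level of the third harmonic before and after.

```python
import cmath
import math

N = 1000      # samples analysed
CYCLES = 10   # whole cycles in the window, so harmonics land on exact DFT bins


def sine(n_samples=N, cycles=CYCLES):
    """Full-scale sine-wave test tone."""
    return [math.sin(2.0 * math.pi * cycles * n / n_samples) for n in range(n_samples)]


def soft_clip(samples, drive):
    """tanh waveshaper: near-linear at low levels, progressively squashes peaks."""
    return [math.tanh(drive * s) for s in samples]


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


def harmonic_amplitude(samples, harmonic, cycles=CYCLES):
    """Amplitude of one harmonic of the test tone, taken from a single DFT bin."""
    n_samples = len(samples)
    k = harmonic * cycles
    acc = sum(
        s * cmath.exp(-2j * math.pi * k * n / n_samples)
        for n, s in enumerate(samples)
    )
    return 2.0 * abs(acc) / n_samples


clean = sine()
driven = soft_clip(clean, drive=3.0)  # drive value is an arbitrary assumption

for name, signal in (("clean", clean), ("driven", driven)):
    print(
        f"{name:>6}: crest factor {crest_factor_db(signal):.2f} dB, "
        f"3rd harmonic {harmonic_amplitude(signal, 3):.3f}"
    )
```

The clean sine has a crest factor of about 3dB and no third harmonic. After the waveshaper, the crest factor is smaller (the average level has moved closer to the peak level) and a third harmonic has appeared, which is the combination of 'squash' and added brightness described above.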
The best-known device that uses deliberate distortion combined with compression, to increase both subjective level and brightness, is the Aphex Aural Exciter. By adding distortion and compression only to frequencies in the mid range and above, and mixing these with the original clean sound, the illusion of improved clarity is created, along with an impression of enhanced loudness. There are plenty of plug-ins that offer a variation on this basic theme, from the likes of Universal Audio, Melda Production, Waves and BBE. Again, it's easy to overdo this and end up with a brittle-sounding mix, so tread with care.
Dynamic Range Processing
While compression from analogue circuits can sound very pleasant, it can be hard to control and you don't always want the distortion. That's why we usually look to dedicated processors such as compressors and limiters when manipulating the dynamic range.
A typical compressor has a threshold control, so that only signals exceeding the threshold are 'turned down'. If you then adjust the output gain, or 'make-up gain', to put the maximum signal-level back to where it was before you added compression, the quieter parts of the signal will be louder, and the average level greater. It's worth bearing in mind that compressors will react differently according to whether their side-chain sensing circuit is peak or RMS sensing. The latter mode makes far more sense in musical applications and provides more gentle gain-reduction, though some compressors offer the option of switching between the two. In contrast, limiters, which we'll discuss shortly, employ peak-sensing detection circuits.
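The arithmetic behind threshold, ratio and make-up gain is easy to check. This sketch models only the static behaviour of an idealised hard-knee compressor (no attack, release or knee shaping), and the threshold, ratio and signal levels are example values of our own choosing.

```python
def compressor_out_db(level_db, threshold_db, ratio, makeup_db=0.0):
    """Static (steady-state) output level of a hard-knee compressor, in dB."""
    if level_db > threshold_db:
        # Above the threshold, every 'ratio' dB of input yields 1dB of output
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


THRESHOLD_DB = -20.0
RATIO = 4.0
PEAK_IN_DB = 0.0     # loudest moment of the part
QUIET_IN_DB = -30.0  # a quieter passage, below the threshold

# Make-up gain chosen to put the peak back where it started
makeup_db = PEAK_IN_DB - compressor_out_db(PEAK_IN_DB, THRESHOLD_DB, RATIO)

peak_out = compressor_out_db(PEAK_IN_DB, THRESHOLD_DB, RATIO, makeup_db)
quiet_out = compressor_out_db(QUIET_IN_DB, THRESHOLD_DB, RATIO, makeup_db)

print(f"make-up gain: {makeup_db:.1f} dB")
print(f"peak:  {PEAK_IN_DB:.1f} dB in -> {peak_out:.1f} dB out")
print(f"quiet: {QUIET_IN_DB:.1f} dB in -> {quiet_out:.1f} dB out")
```

With these example settings the peak ends up exactly where it began, but the quiet passage comes out 15dB higher than it went in: the dynamic range has shrunk from 30dB to 15dB, and the average level has risen accordingly.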
A problem with compression is that it only smacks down the loud parts of a signal, and in the sort of rock and pop mixes we're talking about here, this probably means the snare and kick drums, which tend to trigger most of the gain reduction. This leaves you with a conundrum: with a very fast attack time, a compressor can bring the drum hits under control very quickly, but the drum sounds can lose impact; yet, if you set a longer attack time to allow the transients to get through, you'll retain clarity but will still need to leave headroom to accommodate those brief transient spikes that escape the gain-reduction.
So how do you make this work? The first strategy is to control the dynamic range of percussive parts at the individual track level in a mix. This way, you can reduce the peak level of the transients while EQ'ing or otherwise processing the part to suit the track — and as the transient peaks aren't so prominent, they then won't interfere unduly with any drum-group or master-bus compression.
Quite what processing you apply to these individual sounds is another question, as there are several approaches that can be used on their own or combined: which tools and techniques work best is a subjective decision. The most common approach is to use compression, and two stages of more gentle compression in series (one compressor after the other) tend to be more sympathetic to the original sound than a single stage of heavy compression. A variation on this theme is to follow up the compressor(s) with a fast limiter to catch the transient peaks. In this way, you can usually claw back 3 or 4 dB of peak level without strangling the transients.
If compression and limiting appear to be dulling the transients, you can use EQ or harmonic enhancement of some sort to help compensate for this. The secret is not to overdo any single stage of processing: there's always more that can be done when mastering, so trying to make your finished mixes sound as loud as a commercial record, which has already been mastered, isn't really something to strive for.
Clipping By Design
Deliberate clipping can also enhance loudness by both increasing the average signal level and adding distortion (and thus harmonics in that part of the spectrum to which human hearing is most sensitive) — but it has to be used with care, because it can easily ruin an otherwise great recording.
Analogue clipping via your A-D converters (loop the track out of the computer through a device that adds gain and bring it back in again) tends to be a little more forgiving, as it generates harmonic distortions — where the added distortion components appear higher in frequency than the clipped fundamentals. Digital clipping, on the other hand, generates aliasing distortions which are anharmonic, and the added distortion artifacts will appear below the clipped fundamentals. Moreover, the distortion artifacts from analogue harmonic distortion are musically related to the fundamental frequencies, whereas those produced by digital anharmonic distortion are mathematically related to the sampling frequency and not the musical source.
In the case of sounds that are quite noise-like (cymbals, snare drums, hand claps and so on), the anharmonic distortions produced by digital clipping will also be noise-like, which is why you can get away with it. But sources with strong harmonic structures (pianos, voices and so on) sound very artificial and very obviously wrong when digitally clipped, even with the tiniest hint of anharmonic distortion. The same advice applies to your full mix!
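A little arithmetic shows where the anharmonic components of digital clipping come from. Symmetrical hard clipping adds odd harmonics; in a sampled system, any of those that would fall above half the sampling frequency fold back ('alias') into the audio band. The sketch below assumes a 5kHz tone clipped at a 44.1kHz sample rate with no oversampling; both figures are our own example values.

```python
def alias_frequency(freq_hz, fs_hz):
    """Frequency at which a component appears once sampled at fs_hz."""
    folded = freq_hz % fs_hz
    return fs_hz - folded if folded > fs_hz / 2 else folded


FS = 44100
FUNDAMENTAL = 5000  # Hz; symmetrical hard clipping adds odd harmonics

for harmonic in (3, 5, 7, 9):
    ideal = harmonic * FUNDAMENTAL
    heard = alias_frequency(ideal, FS)
    note = "harmonic" if heard == ideal else "aliased, not a multiple of 5kHz"
    print(f"harmonic {harmonic}: {ideal:>6} Hz -> {heard:>6} Hz ({note})")
```

The third harmonic stays put at 15kHz, but the fifth, seventh and ninth land at 19.1kHz, 9.1kHz and 900Hz respectively. None of those is musically related to the 5kHz tone, and the last sits well below it. Their positions depend on the sampling frequency rather than on the music.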
If, as discussed earlier, conventional compression seems to be choking the life out of transients, parallel compression can be an effective tool to counter this, both when mixing and mastering. With parallel compression, we do what all the entry-level books tell us not to do — we feed the compressor from an auxiliary send, as we typically would with effects such as reverb or delay. (If you do this in a DAW, make sure the plug-in delay compensation is turned on, as the dry and compressed signals must be in perfect alignment.)
The parallel compressor is usually set to a relatively high ratio, and the threshold low enough that the amount of gain reduction can be as much as 20-30 dB, depending on the source. (If you solo this signal it will sound pretty disgusting!) Use a fast attack and set the release so that any audible gain pumping reinforces the rhythm of the song. Mixing this signal back in with the dry signal has the effect of making the sound seem bigger, louder, fatter and more punchy. Normally, you only need add a small amount of the compressed signal to make the magic happen — typically -15dB or so relative to the dry signal. Adjust the amount using only the compressor channel's fader rather than the send level, as adjusting that would affect the amount of compression taking place.
As the processed signal is usually so highly compressed that it is at an almost constant level, when you blend this back in with the dry signal, it forms only a small fraction of what you hear during loud peaks. However, when the dry signal is quieter, the compressed signal makes up a greater proportion of what you hear. Essentially then, instead of reducing the dynamic range by cutting down the signal peaks, we're achieving it by adding more to the sound during quieter sections. The technique also serves to lengthen percussive sounds, so you score extra loudness points there. Best of all, parallel compression still lets the drum hits breathe because they're no longer being heavily compressed. Parallel compression can also often be used during mastering to add life to dance tracks, or to rock mixes that have been mixed with insufficient attitude. So, if you haven't tried it before, it's worth setting aside some time to experiment with it on your next mix.
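The way the blend changes with level can be sketched numerically. This is a deliberately simplified model under our own assumptions: the compressed return is treated as sitting at a perfectly constant level, 15dB below the dry signal's peak, and the two paths are assumed to be time-aligned so that they sum in phase.

```python
import math


def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def gain_to_db(gain):
    """Convert a linear amplitude factor to dB."""
    return 20.0 * math.log10(gain)


def parallel_mix_db(dry_db, wet_db):
    """Level of the dry and compressed paths summed in phase (time-aligned)."""
    return gain_to_db(db_to_gain(dry_db) + db_to_gain(wet_db))


WET_DB = -15.0  # compressed return: assumed constant, 15dB below the dry peak

for dry_db in (0.0, -10.0, -20.0, -30.0):
    mixed = parallel_mix_db(dry_db, WET_DB)
    print(f"dry {dry_db:6.1f} dB -> mix {mixed:6.1f} dB (lift {mixed - dry_db:4.1f} dB)")
```

In this model the loudest peaks rise by only about 1.4dB, whereas a passage 30dB down is lifted by more than 16dB. The peaks pass through almost untouched and the quiet detail is brought up from underneath.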
The more work you do to shape the individual sounds, the less you should need to rely on processors on the master bus. Some engineers prefer to work with no bus-processing at all, but when aiming for the impression of loudness, processing on the bus can play a crucial role.
The most popular tool used on the mix bus is the compressor. There's nothing fundamentally different in principle about using a compressor on the mix bus than on anything else, but there are subtle nuances. We explored this issue in depth back in SOS May 2008, and you can read that article at /sos/may08/articles/mixcompression.htm — it is well worth a read.
Your choice of compressor(s) can be critical, simply due to the complexity of the material and the tonality imparted by different designs. Whether you want the classic 'glue' effect of an SSL bus compressor at a 2:1 ratio, or the smoothness or attitude of a different model, though, it's rare that you'll want any more than about 3-4dB of gain reduction. A gentle ratio and a low threshold will give more transparent results, as will deploying a pair of compressors in series. Parallel compression is a perfectly valid approach here, though you'll often find that you don't need to be so aggressive in terms of gain reduction on the mix bus as we suggested earlier.
There's little point adding a mix-bus compressor when you're putting the final touches to a mix: if you're putting it on at the end, you might as well leave this until the mastering stage! The whole point is that you mix into it and make adjustments to your mix in light of the compressor's tonal impact — so start by getting a rough static mix balance together with, say, drums, bass and key elements like vocals and guitars, then strap your compressor across the mix-bus and start mixing in earnest.
There can be a place for EQ on the mix bus, though it's less frequently used than compression. As well as the trick mentioned earlier about using a single, high-quality, high-shelf boost on the mix bus, it can be common to use a 'safety' low-cut/high-pass filter, just to catch any rogue low frequencies before you hit a compressor. You could also consider using a high-pass filter in the side-chain of your bus compressor, so that its detection circuit ignores high-level low-frequency sounds like kick drums and electric bass. The result is usually a slightly cleaner sound with less audible pumping.
Of course, pumping might be exactly what you want, simply because it can give the impression that a mix is trying to break free of its digital confines. To some extent, this is mimicking the human hearing system's natural protective compression when confronted with very loud sounds. You can achieve this effect by side-chaining the compressor from the kick-drum, with the attack set to let the drum through before clamping down on the signal. You then balance the release with the tempo of the song, so that the level is only just restored by the next drum beat. You can get a similar, but more subtle effect that's suitable for other genres by playing with the attack, and particularly the release times, but without having to side-chain the compressor.
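If you want a starting point for a tempo-related release time, the calculation is simple. The helper names below are hypothetical, and the result is only a first guess: different compressors define 'release time' in different ways, so the final setting still has to be made by ear.

```python
def beat_interval_ms(bpm, beats_between_kicks=1.0):
    """Time between kick hits in milliseconds."""
    return 60000.0 / bpm * beats_between_kicks


def release_starting_point_ms(bpm, attack_ms=10.0, hold_ms=0.0, beats_between_kicks=1.0):
    """Time left for the gain to recover before the next kick arrives."""
    return beat_interval_ms(bpm, beats_between_kicks) - attack_ms - hold_ms


for bpm in (90, 120, 128, 140):
    print(
        f"{bpm} BPM: kick every {beat_interval_ms(bpm):.0f} ms, "
        f"try a release of about {release_starting_point_ms(bpm):.0f} ms"
    )
```

At 120 BPM with a kick on every beat, for example, the hits are 500ms apart, so with a 10ms attack there is roughly 490ms for the level to recover before the next one arrives.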
It's perfectly acceptable to use an EQ for other purposes on the mix bus — after all, what sounds right is right. But do try to be conscious about why you're using an EQ at that stage. Often it's far more effective to go back and juggle the faders: remember that the best way to make one sound seem louder is to 'reveal it', by dipping frequencies or fader levels for other elements of the mix. Also, remember that a bus compressor will react differently according to whether the EQ precedes or follows it. Both approaches are valid, and you can even use both together: EQ is typically used pre-compressor (or in the compressor's side-chain) to get rid of problem areas that would otherwise be emphasised, and post-compressor for tonal shaping.
Mastering 'Preview' Plug-ins
An important consideration at the mixing stage is to try to gauge the effect any further dynamics processing might have — for example at the mastering stage, or if broadcast on radio or TV. Compression and limiting tend to emphasise dominant frequencies, so the tonal balance of the mix needs to be judged with this sort of processing in mind. You can't be certain what settings a professional mastering engineer will use, but it can still be useful to insert a compressor and limiter setup in your main mix bus to get some kind of feel for the effect mastering might have.
Use a low-ratio compressor, as described in the following section, and set both the compressor and limiter to give between 3 and 4 dB of gain reduction on the signal peaks. It's probably not a good idea to mix through such plug-ins all through the mix, though. Instead, keep them bypassed most of the time, and from time to time preview your mix through them — just as you'd periodically check things for mono compatibility.
Once you're happy that the combined effect of these notional mastering plug-ins and your mix/group bus processing isn't harming your mix, make sure you bypass the mastering plug-ins prior to bouncing your mix. You can always create an alternative bounce with the processing on, to show a mastering engineer the sort of tonal balance and loudness that you're aiming for. Even if you're planning on doing the mastering/finalising process yourself, it's better to do this after the mix stage, as you can approach the finer details of your loudness processing alongside other mastering-style enhancements, such as stereo width and detailed EQ'ing. To get that right, it helps to have a fresh pair of ears!
The main reason to employ a mastering engineer is the combination of their ears and experience. You're unlikely to be able to match the quality of a decent, sympathetic professional mastering job, but if you want to go the DIY route, you can get reasonably good results if you know what you're doing. Clearly, you won't have the same accurate acoustics and monitoring as a pro mastering house, but if you check out your mixes on a number of different systems and adjust the mixes accordingly until they sound acceptable on all those systems, you can give it a reasonable stab. There are many aspects to mastering, but we'll concentrate on the processes that can be used to make a mix sound both loud and great — or, if misused, loud and horrible!
The processing chain usually includes EQ, which again may appear before or after a compressor, or both. Assuming you don't possess esoteric hardware, the better plug-in emulations of passive equalisers, such as the Pultec or the Manley Massive Passive, tend to sound the sweetest. A subtle mid-range scoop and a decibel or so of high shelving boost can help clarify a mix in a suitably discreet way, but it's all down to what the track needs, rather than sticking to a fixed recipe. Adjustments may be made to compensate for shortcomings in the original mixing environment, but any changes you make here should be quite gentle: there's always the option of firing up the mix again and doing more precise tweaks on individual sources.
When it comes to using dynamic gain-reduction to maximise the loudness of your stereo track, the usual process is to take a full-band compressor set to a very low ratio (maybe as low as 1.1:1 or 1.2:1) and then pull the threshold right down until you see 4-5 dB of gain reduction on the peaks. Using such a low ratio pulls in the dynamic range of the whole track, rather than bludgeoning the peaks, so you get a more cohesive sound without losing the impact of the transient hits. Compressors with auto release settings can work well here, as the optimum release time may change throughout a track. Many dedicated bus compressors have a minimum 2:1 ratio, which isn't low enough for this job, unless you're using the compressor in parallel.
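The reason the threshold has to come 'right down' at such low ratios falls out of the basic compressor arithmetic. For an idealised hard-knee compressor, gain reduction equals the amount by which the signal exceeds the threshold, multiplied by (1 - 1/ratio). The sketch below uses that textbook relationship; real compressors with soft knees and programme-dependent behaviour will deviate from it.

```python
def gain_reduction_db(over_threshold_db, ratio):
    """Gain reduction for a signal over_threshold_db above a hard-knee threshold."""
    return over_threshold_db * (1.0 - 1.0 / ratio)


def overshoot_needed_db(target_reduction_db, ratio):
    """How far above the threshold the peaks must sit to get the target reduction."""
    return target_reduction_db / (1.0 - 1.0 / ratio)


TARGET_DB = 4.0  # gain reduction we want to see on the peaks

for ratio in (1.1, 1.2, 2.0, 4.0):
    over = overshoot_needed_db(TARGET_DB, ratio)
    print(f"{ratio}:1 -> peaks must be {over:.1f} dB above the threshold")
```

At 1.2:1, peaks have to be 24dB over the threshold to produce 4dB of gain reduction, and at 1.1:1 the figure is 44dB, so most of the track sits above the threshold and is gently pulled in. At 2:1 only the top 8dB is affected, which is why a conventional bus compressor behaves so differently in this role.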
At the very end of the chain — at least the last thing before dither is applied — comes the limiter. Rather than set the ceiling of these at 0 or -0.1 dBFS, as many presets seem to do, it's a good idea to set them with a ceiling of -1dBFS, or even slightly lower. This allows some headroom for the digital filtering process involved in any subsequent MP3 conversion, and reconstruction of the waveform by consumer D-A converters without clipping (see the 'Metering' box for more on this). It's usually OK to trim 3-4 dB off the signal peaks, but use the meters as a guide only: let your ears decide what's right. Too much compression and/or limiting can make a mix sound really confused, aggressively 'shouty' and hard to listen to — and there's little point in making your mix sound loud if the end user simply gets the urge to turn it down or off because it sounds unpleasant!
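The need for headroom below 0dBFS is easiest to see with a contrived example in which the waveform's crests fall between the samples. The sketch below assumes a test tone at a quarter of the sample rate with a 45-degree phase offset, and evaluates the known sine function on a finer time grid rather than modelling a real reconstruction filter, so treat it as an illustration of the principle only.

```python
import math

FS = 44100
FREQ = FS / 4.0        # test tone at a quarter of the sample rate
PHASE = math.pi / 4.0  # samples straddle each crest instead of landing on it


def waveform(t_seconds):
    """The underlying continuous sine wave, with a true peak of exactly 1.0."""
    return math.sin(2.0 * math.pi * FREQ * t_seconds + PHASE)


def peak_db(values):
    """Largest absolute value, in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(max(abs(v) for v in values))


# What a sample-peak meter sees
sample_values = [waveform(n / FS) for n in range(64)]

# The same waveform on an 8x finer time grid, standing in for what an
# oversampling meter or a D-A converter's reconstruction would reveal
fine_values = [waveform(n / (8.0 * FS)) for n in range(64 * 8)]

print(f"sample peak: {peak_db(sample_values):.2f} dBFS")
print(f"actual peak: {peak_db(fine_values):.2f} dBFS")
```

Here the samples never read higher than about -3dBFS even though the waveform itself reaches full scale. Had a limiter pushed those samples up to 0dBFS, the reconstructed waveform would overshoot by around 3dB. Real music rarely behaves this badly, but heavily limited material can still overshoot between samples.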
In the quest for 'more of everything', the above compressor-plus-limiter approach is often augmented by an additional stage of compression. One option is to follow the low-ratio compressor with a second compressor using a more conventional setting at a higher ratio (between 2:1 and 4:1 or thereabouts) to squash the peaks by a modest amount before the signal hits the limiter. If you can pull the peaks down by a further 2dB without obvious sonic damage, you'll gain 2dB of subjective loudness. An alternative is to use valve gear or analogue tape (or an emulation plug-in such as the UAD Ampex or Tone Boosters Ferox) to add a little organic squash.
For a mix to sound loud, it needs to sound 'big', and that's as much about making best use of the stereo panorama as it is about levels. It can be very helpful during mastering to think of the stereo mix in terms of Mid/Sides signals, rather than Left/Right. For example, it's perfectly possible for a track to be firmly pinned down in the centre, while having much more dynamic range in the sides signal. You're likely to perceive such a track as being both louder and more dynamic than a stereo track with a similar overall 'crest factor' (see the 'Metering' box) that measures equal on both its middle and sides signals.
This balance will affect the tonality of the mix — with it typically sounding brighter the more Sides signal that's present — and can alter the overall level, so it makes sense to place any M/S widening processors before the limiter. The two processors will interact to a certain extent, so it's not necessarily a case of set and forget: you'll need to set the limiter, adjust the stereo balance, then re-tweak the limiter settings accordingly.
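For reference, Mid/Sides is just a sum-and-difference view of an ordinary stereo signal, and basic width control amounts to scaling the Sides component before converting back. The sketch below uses the common (L+R)/2 and (L-R)/2 convention; some processors scale the two signals differently, and the function names are our own.

```python
def encode_ms(left, right):
    """Convert Left/Right sample lists to Mid/Sides."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    sides = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, sides


def decode_ms(mid, sides):
    """Convert Mid/Sides sample lists back to Left/Right."""
    left = [m + s for m, s in zip(mid, sides)]
    right = [m - s for m, s in zip(mid, sides)]
    return left, right


def set_width(left, right, width):
    """Scale the Sides signal: 1.0 is unchanged, 0.0 is mono, above 1.0 is wider."""
    mid, sides = encode_ms(left, right)
    return decode_ms(mid, [s * width for s in sides])


left = [0.5, -0.25, 1.0]
right = [0.25, 0.75, -1.0]

print("unchanged:", set_width(left, right, 1.0))
print("mono:     ", set_width(left, right, 0.0))
print("wider:    ", set_width(left, right, 2.0))
```

Note that in the 'wider' result some sample values exceed the original peaks, which illustrates why width processing can change the overall level and belongs before the limiter.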
Several M/S width processors also offer the option of 'collapsing' the signal to mono beneath a user-defined frequency. It's a technique that used to be employed to ensure that bass frequencies would sit nicely in a vinyl record, but it's also useful in our quest for loudness. By placing bass frequencies in the centre, you're ensuring that the powerful bass is distributed equally between the left and right channels of any dynamics processor — not to mention between speakers during playback. Usefully, this tactic can also add a reassuring solidity to the bottom end.
This brings us neatly on to various other multi-band processors. Arguably, if you need to use multi-band processing to tweak the mix tonality, that's a sign that the track could have been better mixed, but they can be useful and powerful tools if used sensibly. A long-time favourite processor is Drawmer's Masterflow, a hardware mastering box that splits the audio into three bands, in which compression, limiting, distortion and stereo width can then be controlled independently. Most of these functions are also available within plug-ins. Some, such as Izotope Ozone, present a range of processors in one plug-in, while others, such as the Melda range, offer dedicated multi-band versions of just about any processor you can think of. If you're struggling to find the right processor, you could always construct your own, using a crossover in a modular plug-in hosting environment, such as DDMF's excellent Metaplugin.
Multi-band compression can be useful when you need to keep low-frequency, high-excursion sounds (such as kick and bass) under control without the compression also 'punishing' the mids and the highs. It's also true that most of our subjective impressions of the quality of a track are based on the musical mid-range where speech resides, so by using multi-band processing, you can keep the mid-range relatively natural-sounding while still being quite assertive with the low end. Multi-band limiting can also be used to minimise mid-range processing while keeping peaks at the extremes of the audio spectrum under firm control, and compressor/limiter recovery times can be optimised to their respective frequency bands to minimise waveform/envelope distortion.
It seems that new multi-band processors become available all the time, and there are a couple of types that are particularly worth investigating. Multi-band distortion can be incredibly useful, but you really do have to use it in moderation on a mix. Again leaving the mid-range fairly clean, you can add more drive to the lows to give warmth and fatness to the bass sounds, while increasing drive to the highs creates an exciter-like brightness. Another option, which will, again, interact quite a lot with any other dynamics processing, particularly the limiter, is a multi-band transient designer. Used in moderation, these can help to compensate for the side-effects of heavy limiting. Note that we're getting into restoration territory here, though, so if you find yourself using these tools too often, ask yourself the question: "Have I pushed things too far?"
Finding your own combination of plug-ins and processors to create the optimum sense of loudness without inducing tears can be very rewarding. When it comes to processing the final stereo mix or stereo submixes, several manufacturers have combined many of the ingredients discussed above into plug-ins that present the user with a mercifully small number of controls that let you manipulate the loudness of a mix or submix in a very intuitive way. We've tried a number of them, including Izotope's Ozone 5, the various Waves L-series processors, the Sonnox Inflator and Limiter, the UA Precision Limiter, and Steve Slate Digital's FG-X, which won our coveted SOS Editor's Choice award at the NAMM show last year, as well as very capable offerings from TC Electronic, DDMF, Melda Production... and many more. As we said at the outset, it's easy to get initially impressive-sounding results using such plug-ins. Get to know them well and try various combinations on different material and you can achieve some great results. We don't have the space here, but we'll soon be testing a range of these processors side-by-side, and presenting our conclusions in these pages.
But anyone can place these processors on a mix and lower the threshold, and it's very easy to get truly bad results if you don't understand what the tools are doing. The key to being able to mix loud is the same as the key to being able to mix well per se: you need to understand, place, control and balance the various individual sounds that together make up your mix. Get the right balance of frequencies, stereo interest and dynamics control, and work to fool the psychoacoustic aspects of the human hearing system, and you'll be able to push your maximising processors even further — if you really want to!
Metering: Measuring Loudness, Distortion & Dynamic Range
Given the complexity of the human hearing system, it's no surprise that measuring perceived loudness is less straightforward than we'd assume it to be. Although some DAWs and audio editors offer off-line analysis facilities, the basic real-time metering tools included with most DAWs aren't really up to the task — so you'll need to look towards third-party plug-ins if you want a visual indication of loudness while mixing. Here's a quick round-up of the different types of meter available, and what they can offer you in your quest for sonically pleasing loudness. Just remember that you need to use your ears as well as your eyes!
Sample Peak Meters: Although sample peak meters — such as you get by default in the mixers of most DAWs — do have a role in avoiding clipping, they tell you very little about how loud a sound will appear, because the tiniest transient might make the meters appear 'hot' on your screen, even though the average level is actually pretty modest. In fact, a sample peak meter might not even tell you the whole picture when it comes to peaks: the highest point of the reconstructed waveform between two sample points might appear between the individual samples and thus not register on the meter. This means that even if you can't hear problems when mixing, someone playing back your track on a consumer playback system with oversampling or delta-sigma D-A converters might hear distortion. Equally, such peaks can cause audible nasties when converting to MP3 or other data-compressed formats. To spot such problems, it's well worth downloading SSL's freeware X-ISM (www.solidstatelogic.com/music/X-ISM), which highlights exactly this problem in real time.
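The inter-sample peak problem is easy to demonstrate. The Python sketch below oversamples a signal by zero-padding its spectrum, which is a simple offline stand-in for the reconstruction a D-A converter performs; the function name and the test tone are hypothetical, and dedicated meters will use their own interpolation filters.

```python
import numpy as np

def oversample(x, factor=4):
    """Band-limited interpolation by zero-padding the spectrum (treats x as periodic)."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    if n % 2 == 0:
        spectrum[-1] *= 0.5                    # share the Nyquist bin correctly
    padded = np.zeros(n * factor // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    return np.fft.irfft(padded, n=n * factor) * factor

# A tone at a quarter of the sample rate, phased so that every sample
# lands either side of the crest rather than on it.
n = np.arange(1024)
x = np.sin(2 * np.pi * 0.25 * n + np.pi / 4)

sample_peak = np.max(np.abs(x))
true_peak = np.max(np.abs(oversample(x, factor=4)))
print(f"sample peak: {20 * np.log10(sample_peak):.2f}dBFS")
print(f"estimated true peak: {20 * np.log10(true_peak):.2f}dBFS")
```

In this deliberately awkward case the sample peak meter under-reads by about 3dB, so a track normalised to 0dBFS on such a meter could overshoot full scale by that amount once reconstructed.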
RMS Averaging Meters: Meters that display an average level over a defined time period — the VU meter being the most common example — give a better indication of loudness, partly because they cater for our perception that sustained sounds are louder than short ones. However, a standard VU meter will not give us a reliably accurate and consistent impression of perceived loudness, as it fails to take into account how we perceive the relative loudness at different frequencies.
Crest Factor & Dynamic Range: The 'crest factor' is the ratio of the peak to RMS measurements of a signal, and it tells us slightly more about why a sound can be compressed to within an inch of its life, yet fail to sound loud to the human ear. The greater the crest factor, the greater the 'dynamic range' of the material, and as we've seen in the main text this is one of several factors that makes us perceive music as being loud. We all accept that compressors and limiters reduce the overall dynamic range, but there are two things going on here: the first is the instantaneous reduction in crest factor by pulling down transient peaks, but the other is the longer-term effect of reducing the amplitude difference between the ppps (pianississimo) and fffs (fortississimo).
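As a rough illustration of the peak-to-RMS ratio described above, the following Python sketch measures the crest factor of a sine wave before and after it is driven hard into a clipper. The function name and test signals are hypothetical, and the whole-file measurement is a simplification of what a real-time meter does over short windows.

```python
import numpy as np

def crest_factor_db(x):
    """Ratio of sample peak to RMS level, expressed in dB."""
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(peak / rms)

sample_rate = 48000
t = np.arange(sample_rate) / sample_rate
sine = np.sin(2 * np.pi * 1000 * t)

# Drive the same tone 6dB harder into a hard clipper at full scale.
clipped = np.clip(2.0 * sine, -1.0, 1.0)

print(f"sine: {crest_factor_db(sine):.2f}dB")
print(f"clipped: {crest_factor_db(clipped):.2f}dB")
```

Both signals peak at full scale, but the clipped version has a higher RMS level and therefore a smaller crest factor, which is the short-term reduction that heavy limiting and clipping produce.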
A few plug-ins offer a measurement that is based either wholly or in part on the crest factor, including Z-Plane's v3 PPMulatorXL, which uses a two-needle meter displaying both peak and average readings, Roger Nichols Digital's Finis, Brainworx' bx_Meter and the TT Dynamic Range Meter from the Pleasurize Music Foundation (the last is free, but only if you sign up to the Foundation). Such meters will indicate whether the 'dynamic range' (we're talking about short-term variations here, not the macro-dynamics of the whole track) has been reduced too far or whether it might usefully be further reduced. Interpreting the reading isn't as straightforward as you might imagine, though. First, the desirable dynamic range of a track will naturally vary according to its genre and constituent sounds, so you need to look out for different numbers according to the material you're mixing. Second, not all plug-ins seem to measure exactly the same thing, as they don't always give readings that are consistent with each other. Perhaps more importantly, they don't all take account of the variation in dynamic range between the middle and sides (M/S) signals, which is discussed in the main article. Usefully, the Brainworx meter offers L/R, M/S and summed modes, which allows you to investigate all of this, so it's well worth checking out.
New Broadcast Standards For Loudness Metering: Loudness metering standards (ITU-R BS.1770-1 and EBU R128, if you want to look them up) have been developed by the ITU (International Telecommunication Union) and the EBU (European Broadcasting Union), and look likely to be adopted by the AES (Audio Engineering Society) and other major players in the broadcast and music industries (including Apple, for their ubiquitous iTunes playback software), and some software meters already reflect these standards. Like some of the meters already discussed, these display frequency-weighted or filtered RMS levels in reference to a defined maximum peak level (which can be different for different broadcasters). To take account of the equal loudness contours, the meters reflect a signal in which the low frequencies are attenuated and high frequencies are treated to a shelving boost. They also discount any sound below a fixed threshold below the peak, in order to avoid results being skewed by things such as fades and drop-outs in the arrangement. The intention is that broadcasters are able to use software to maintain a much more consistent level across different programme material simply by boosting or attenuating, without having to further compress the audio, and that sound balancers can achieve very consistent programme volumes when mixing live because they'll have a reliable loudness meter.
Plug-in versions of such meters include Nugen Audio's excellent Audio VisLM-H, which offers real-time metering and displays a loudness history of the track (you can even generate a detailed graph showing all sorts of useful information). Such tools are indispensable when mixing for broadcast or anywhere else where there are defined maximum levels and consistent loudness requirements. Of course, they're designed precisely to ensure that over-loud programme material can be automatically attenuated — so it's possible that these standards will lead to the end of the loudness wars, particularly if adopted for iTunes, YouTube, Soundcloud and the like, because there'll no longer be any subjective advantage in trying to make a track as loud as possible: such mixes will simply get turned down to be the same loudness as everything else! Matt Houghton & Hugh Robjohns
Given that our ears respond to frequencies differently at different levels, you need a consistent reference level while monitoring if you're to be able to judge reliably and consistently how 'loud' a track sounds, and what the sonic trade-offs are when compressing and limiting. Unfortunately, search engines will throw up plenty of well-intentioned misinformation about monitoring levels! The SMPTE RP200 standard dictates an average monitoring level of 85dB SPL, C-weighted from each loudspeaker, when pink noise is played at an RMS level of -20dBFS. The implication is that peaks will be hitting 105dB SPL — and that's a lot, given that most pop tracks are heavily compressed and hover close to 0dBFS all the time! Hence the Katz K-system monitoring scheme, which keeps the acoustic reference point the same (85dBC) but reduces the headroom margin. Heavily compressed pop music would use the K12 meter and thus produce peaks at around 97dB SPL instead (85+12)!
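The arithmetic behind those figures can be written out in a couple of lines of Python. This follows the reasoning above, in which a signal sitting a given number of dB below full scale is calibrated to the reference SPL; the function name is hypothetical.

```python
def peak_spl(reference_spl_db, headroom_db):
    """SPL reached at 0dBFS when a level headroom_db below full scale gives reference_spl_db."""
    return reference_spl_db + headroom_db

print(peak_spl(85, 20))  # 105: -20dBFS aligned to 85dB SPL
print(peak_spl(85, 12))  # 97: K-12 scale, -12dBFS aligned to 85dB SPL
```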
It's important to understand that these reference SPL standards are only really suitable for a large-ish room — and if you aim for these levels in a small home studio, you'll find yourself being pinned to the back wall! So, for example, a lot of TV sound control rooms use a lower level (79dBA) as a reference because it suits the size of room and the home listener's levels — as well as the peak SPL capability of nearfield monitors — rather better. This is a more appropriate ball-park level for a typical home studio in a small room. Don't get too hung up on the exact SPL figure, though: unless you are specifically trying to match known calibrated reproduction levels elsewhere (as they do in cinemas and cinema mix rooms, for example) the actual monitoring level is not critical — though lower is generally better, and while 85dB SPL C-weighted is an appropriate maximum exposure level for a large professional mix room it is, in my experience, too high for a small room. Hugh Robjohns
Some Listening Suggestions
If you want to get a clear idea of how the side-effects of heavy loudness processing can impinge on your listening enjoyment, you could do worse than subjecting yourself to some of the most notorious casualties of the 'loudness wars'. Metallica's Death Magnetic has probably been the industry's most recent bête noire, and the track 'The Day That Never Comes' is fairly representative of the album. Although many people have complained about the grittiness of the distortion introduced by the extravagant digital clipping on this master, this is actually more forgivable to my ears than the lack of low-end power; not only does the bottom octave of the bass guitar spectrum appear to have been murdered with EQ to save on headroom, but the kick also loses low end as a side-effect of the clipping. Not a particularly appealing listen, whichever way you slice it.
However, for my money, the mastering on Red Hot Chili Peppers' Californication a decade earlier is even less defensible, because not only does this share the same kind of LF slimness, but it also allows its clipping to sully cleaner, steady-state sounds that are much less good at masking the distortion artefacts — the vocals on 'Scar Tissue', for instance, really suffer. (It's also worth pointing out that this record is mostly in mono, which is another means by which some producers seek to maximise the apparent loudness of their masters.) In my opinion, an album like Linkin Park's Hybrid Theory (the track 'Forgotten', for instance) strikes a better balance between the pros and cons, delivering a still very respectable comparative loudness in return for less clipping grit and more solid LF foundations.
It's by no means just rock records that are playing this game, of course, because many dance records have just as much grungey stuff in the mix to mask the effects of clipping distortion at the high end. The Chemical Brothers' classic 'Block Rockin' Beats' and Dizzee Rascal's nailed-to-the-endstops club hit 'Bonkers' are both cases in point, using prominent distortions and synth resonances to mask the unrelenting stream of uncorrelated left/right clipping-distortion 'sprinkles'. Again, however, it's the lack of LF kick-drum punch that detracts most strongly from the effectiveness of these mastering jobs for me, and something like Deadmau5's 'Ghosts 'n' Stuff' feels like a better compromise between punch and level in this regard (unless ringtones are all you care about!). Swapping out the clipping for limiting can help to reduce distortion issues, but is by no means a panacea if you're after hard-edged drum sounds, because in my experience it can tend to subjectively 'blunt' each hit unsuitably, leaving a rounder sound more along the lines of that in Lady Gaga's 'Born This Way'.
Outside noise-rich styles, distortion does become a bigger no-no, however. In urban styles, a common trick seems to be to make the most of clipping on the drum peaks while using compression and/or limiting to keep periodic waveforms from hitting the clipping ceiling, a speciality of Serban Ghenea, the mix engineer responsible for tracks such as Katy Perry's 'California Gurls'. However, there's also an art in firmly controlling the amount of headroom-munching sub-100Hz low end, while nonetheless maintaining a decent sense of warmth in the frequency balance. I've always considered Britney Spears' 2003 hit 'Toxic' something of a masterclass in this regard, and it bears comparison with a track like Pussycat Dolls' 'Takin' Over The World', which deliberately sacrifices loudness so that it can make your subwoofer break a sweat.
Where the style heads more 'Radio 2', with less prominent drum levels and cleaner overall textures that provide little cover for clipping artifacts, sophisticated dynamics processing has to take on more of the heavy lifting if loudness remains important. Most X-Factor singles provide ample demonstration of this kind of approach, and the main side-effect I look out for here (assuming you don't treat the processor like a fuzz box) is the ducking of drum hits under important musical accents — for example, the kick drum which starts off the second chorus of Matt Cardle's 2010 UK Christmas number one 'When We Collide'. Mike Senior
When Is Loud Too Loud?
Bar a couple of exceptions (such as inter-sample peaks), the unpleasant side-effects that can be introduced by excessive loudness processing need to be assessed by ear. Good monitoring is a prerequisite, and remember that decent headphones can convey far more detail than budget speakers of the same price. Here are the key things to listen out for...
COMPRESSION: Excessive compression without much limiting doesn't generally cause the mix to sound distorted or harsh, but it does rob drum-hits of clarity, and the sound tends to become very congested, with no air for the individual instruments and voices to breathe. Even judging what is meant by excessive isn't always easy, because artifacts like gain pumping can be used deliberately, so a lot comes down to listener experience and comparison with similar commercial material. Just remember, when mixing, that commercial material has already been mastered, so to make comparisons you really need to insert temporary mastering plug-ins to get an idea of how your mixes stack up. Even then, it's often helpful to attenuate a commercial mix by a few dB, because you'll otherwise be trying to chase an unrealistic loudness target at this stage of the project.
LIMITING: Limiting can sound quite benign when used just to trim a few dBs from peaks, but excessive limiting causes unpleasant artifacts, emasculating snappy snares and turning kicks into dull thuds. Music needs space to breathe, and while much of this comes from a good arrangement, you still need to exercise restraint with dynamic processing.
DISTORTION & ENHANCEMENT: Although a small amount of distortion can add warmth to the lows and sparkle to the highs, pushing it too far results in a lack of sonic focus, a blurring of bass sounds and an aggressive edge that makes you want to turn down the volume! If you've applied enough distortion that you can actually hear it (fuzziness and crackling on peaks), you've gone way too far.
CLIPPING: The most intense form of distortion is hard clipping. Research by broadcasters back in the 1930s showed that if the period of clipping is below around 1ms the human ear doesn't notice the distortion, which is why clipping is sometimes useful to sharpen drum sounds. However, the brain's ability to pick up on clipping depends not only on the duration of the clipped event, but also on how close together these clipped events are and on the relationship between the fundamentals and the distortion artifacts. The closer the clipped events, the more noticeable they become. As discussed in the main article, anharmonic distortion artifacts can be detected at far lower levels than harmonic distortion artifacts.
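If you want to check how long a master actually spends pinned to the ceiling, a few lines of Python can list the duration of each clipped run. This is only a sketch: the function name, the tolerance and the test tones are hypothetical, and the 1ms figure quoted above is a rule of thumb rather than a pass/fail threshold.

```python
import numpy as np

def clipped_runs_ms(x, sample_rate, ceiling=1.0, tolerance=1e-4):
    """Duration in milliseconds of each run of consecutive samples at the ceiling."""
    at_ceiling = np.abs(x) >= ceiling - tolerance
    # Pad with False so that every run has both a rising and a falling edge.
    padded = np.concatenate(([False], at_ceiling, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return (ends - starts) * 1000.0 / sample_rate

sample_rate = 48000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 100 * t)

# The same 100Hz tone clipped gently and clipped hard.
gentle = clipped_runs_ms(np.clip(1.02 * tone, -1.0, 1.0), sample_rate)
heavy = clipped_runs_ms(np.clip(2.0 * tone, -1.0, 1.0), sample_rate)

print(f"gentle: longest clipped run {gentle.max():.2f}ms")
print(f"heavy: longest clipped run {heavy.max():.2f}ms")
```

The spacing between runs matters as well as their length, so a fuller version might also report the gap between the end of one clipped event and the start of the next.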
Mixes are occasionally deliberately clipped to squeeze out a little more loudness, and some limiters are designed to allow a very short period of clipping before they act. This seems a strange tactic: unacceptable and unnecessary (if you get the rest of it right) trade-offs are being made for very little reward. It is all too easy to create harsh-sounding mixes in this way — there's a fine line between the invisible control of peaks and the onset of a desire to turn the volume down. Paul White