We all know music is getting louder. But is it less dynamic? Our ground-breaking research proves beyond any doubt that the answer is no — and that popular beliefs about the 'loudness war' need a radical rethink.
“Why Music Sounds Worse”. “Fans Complain After Death Magnetic Sounds Better on Guitar Hero Than CD”. “Everything Louder Than Everything Else”. “Even Heavy-Metal Fans Complain That Today's Music Is Too Loud!” “Dynamic Range Day heralds new movement against loudness.” “The Death of High Fidelity”... In the press and on the Web, the backlash is growing against the 'loudness war', the practice of trying to make recordings sound as loud as possible, so they are perceived as 'hotter' than rival releases. According to articles like these, unreasonable mastering practices and, more specifically, the abuse of brickwall limiters, have put music in jeopardy. Modern productions lack subtlety, and sacrifice quality for level. Bob Dylan, in a 2006 interview, went as far as stating that “You listen to these modern records, they're atrocious, they have sound all over them. There's no definition of nothing, no vocal, no nothing, just like — static.”
But is Dylan's remark just a replay of the quarrel between the ancients and the moderns? It would not be the first time the old guard despises what the new generation does. True, many sound engineers have joined the cause of “more dynamic” music. But are they speaking out for what is objectively better, or are they simply voicing their preference for a particular style of sound? My research aims to answer this question. We'll find out whether recent music is really louder, and whether it's really less dynamic. We'll also consider the hypothesis that loudness may be a stylistic marker for specific recent music styles, instead of being a bad habit only motivated by despicable commercial reasons. Finally, we'll take a close look at Metallica's notorious Death Magnetic, and see why so many people claim it doesn't sound good.
Is Music Really Louder Now?
Yes it is, and there is no doubt about that. Let's take a large number of best-selling and/or very well received 'pop' music pieces recorded and produced between 1969 and 2010, normalise them so they peak at 0dB full scale, and measure their RMS value. Then let's sort all the values according to the year of release of the track to which they correspond. The first diagram, left, shows the experiment's outcome, and it is indeed spectacular! The red line shows the RMS median value for each year, and the rectangles give an indication of the distribution: the darker the rectangle, the more pieces showing such a level. There is, without question, a constant growth in average levels between 1982 and 2005, and today's records are roughly 5dB louder than they were in the '70s.
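As a rough illustration of the measurement described above, the following Python sketch peak-normalises a signal and reports its RMS level in dBFS. The function names and the synthetic sine-wave input are assumptions made for the example; this is not the code used for the analysis in this article.

```python
import math

def normalise_peak(samples):
    """Scale the signal so that its largest absolute sample sits at 0 dBFS (1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # digital silence: nothing to scale
    return [s / peak for s in samples]

def rms_dbfs(samples):
    """RMS level in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)

# Hypothetical input: one second of a quiet 440 Hz sine at 44.1 kHz.
sine = [0.25 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
normalised = normalise_peak(sine)
level = rms_dbfs(normalised)
print(f"RMS after normalisation: {level:.2f} dBFS")
```

A normalised sine wave lands at about -3dBFS RMS, whatever its original level, which is why normalising first makes tracks comparable.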
Admittedly, measuring the signal's RMS value only gives information about the 'electrical' or 'physical' content of the audio file, not a measure of loudness as we perceive it. For that, we evaluate the 'integrated loudness', as defined by the EBU 3341 normative recommendation. As seen on the second diagram to the left, in the context of our corpus of songs such a measure is highly correlated to the signal's RMS value, and the two graphs are very similar to each other. This second set of results confirms the first one.
Let's repeat the experiment using other criteria. For instance, one criterion commonly used to describe the dynamic behaviour of a piece of recorded music is the 'crest' factor. Put simply, the crest factor is the difference between the RMS level and the peak level over the course of the song. Intuitively, it measures the amplitude of the emerging 'peaks' in the audio stream. It's considered a good marker of the amount of dynamic compression that was applied to the music: more compression generally means a lower crest factor. Some professionals consider good handling of the crest factor as the cornerstone of successful mastering. Also, still generally speaking, the lower the crest factor, the louder the music.
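The definition above translates almost directly into code. This is a minimal sketch of the plain crest factor (peak level minus RMS level, in dB), not of the 'simplified' variant used for the diagrams; the two test signals are hypothetical.

```python
import math

def crest_factor_db(samples):
    """Peak level minus RMS level, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("crest factor is undefined for digital silence")
    return 20 * math.log10(peak / rms)

# Hypothetical test signals, 1000 samples each.
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
square = [1.0 if (n // 50) % 2 == 0 else -1.0 for n in range(1000)]

crest_sine = crest_factor_db(sine)
crest_square = crest_factor_db(square)
print(f"sine:   {crest_sine:.2f} dB")
print(f"square: {crest_square:.2f} dB")
```

The square wave, which spends all its time at peak level, has the lowest crest factor a signal can have; anything with emerging peaks scores higher.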
The third diagram on the first page shows the evolution of a measure that's analogous to the crest factor. Based on the same 4500 tracks, this simplified crest factor is shown falling by 3dB since the beginning of the '80s, reinforcing the suspicion that the increase in loudness we've been witnessing since the '90s was brought about by dynamic compression. You'll see that the evolution of the crest factor can be divided into three stages. First, from 1969 to 1980, the crest factor increases, probably due to the improvement of studio gear in terms of signal-to-noise ratio and dynamic transparency. From 1980 to 1990, the crest factor remains relatively stable. Then, from 1990 to 2010, the era of the loudness war, the crest factor is dramatically reduced.
Finally, another relevant and helpful descriptor is the proportion of samples in a piece of recorded music that are close to 0dBFS once the piece is normalised. A high density of very loud samples suggests that the master recording has been allowed to clip, or that a lookahead brickwall limiter such as the Waves L-series has been employed. The fourth diagram traces the density of peak samples in the same 4500-track corpus. The first two diagrams show that music has got louder; the third indicates that this evolution is probably due to dynamic compression; and this illustration shows that such compression is probably applied via digital brickwall limiters.
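The high-level sample density can be sketched as follows. The -1dBFS threshold is borrowed from the 'proportion of samples above -1dBFS' descriptor named later in the article, and the hard clipper is only a crude, hypothetical stand-in for a real lookahead limiter.

```python
import math

def high_level_density(samples, threshold_db=-1.0):
    """Fraction of samples at or above threshold_db once the peak is at 0 dBFS."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0.0
    threshold = peak * 10 ** (threshold_db / 20)
    return sum(1 for s in samples if abs(s) >= threshold) / len(samples)

def hard_clip(samples, ceiling):
    """Crude stand-in for a brickwall limiter: flatten everything above the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

sine = [math.sin(2 * math.pi * n / 100) for n in range(10000)]
clipped = hard_clip(sine, 0.5)

density_sine = high_level_density(sine)
density_clipped = high_level_density(clipped)
print(f"untouched sine: {density_sine:.2f}")
print(f"clipped sine:   {density_clipped:.2f}")
```

Flattening the tops of the waveform piles samples up against the ceiling, so the density rises sharply once the signal is normalised again.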
What Is The Dynamic Range Of A Piece Of Music?
This is a surprisingly difficult question to answer. Intuitively, we feel that dynamic range ought to measure how 'variable' or 'mobile' the music level is. Let's try to give this intuition some substance. The first diagram on the previous page compares the evolution of the signal's RMS value for extracts from two songs: 'Fuk' by Plastikman, and 'Smells Like Teen Spirit', by Nirvana. Apparently, the level of 'Smells Like Teen Spirit' is more mobile than that of 'Fuk'. This is no surprise, considering that Plastikman's music is minimalist techno, whereas Nirvana's productions often feature soft verses and loud choruses.
However, the results change radically if we perform the analysis using an analysis window of 100 milliseconds instead of two seconds. Over the long term, Plastikman's music is more stable in terms of RMS levels — but in the short term, as you can see from the second diagram, it appears to feature more variations in level, because of its loud, dry drums. So if we want to establish a measure of 'level mobility', we need to think about what time scale to employ.
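The effect of the analysis window is easy to reproduce on synthetic material. In this sketch, the 'dry drums' signal, the low sample rate and the function name are all assumptions chosen to keep the example short; non-overlapping windows are used for simplicity.

```python
import math

def rms_curve_db(samples, rate, window_s):
    """RMS level in dB for successive, non-overlapping windows of window_s seconds."""
    size = max(1, round(window_s * rate))
    curve = []
    for start in range(0, len(samples) - size + 1, size):
        block = samples[start:start + size]
        mean_square = sum(s * s for s in block) / size
        curve.append(10 * math.log10(max(mean_square, 1e-12)))  # floor avoids log(0)
    return curve

# Hypothetical 'dry drums' signal: 0.1 s loud, 0.1 s quiet, repeated for 8 s.
RATE = 1000  # low sample rate keeps the example fast
drums = [(1.0 if (n // 100) % 2 == 0 else 0.1) * (1 if n % 2 == 0 else -1)
         for n in range(8 * RATE)]

long_curve = rms_curve_db(drums, RATE, 2.0)
short_curve = rms_curve_db(drums, RATE, 0.1)
for name, curve in (("2.0 s", long_curve), ("0.1 s", short_curve)):
    print(f"{name} window: level swings over {max(curve) - min(curve):.1f} dB")
```

Seen through two-second windows the signal looks perfectly steady; seen through 100-millisecond windows, the same signal swings by 20dB.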
There is also the question of how to actually compute this level mobility: how to get a numerical value that could be a measure of 'dynamic range'. Conceivably, we could measure the overall vertical amplitude of the RMS curve corresponding to a music piece for a given time scale, by summing the amplitude of each vertical movement. Intuitively, it makes perfect sense: looking again at the top diagram on the second page of this article, on which the blue curve looks more mobile than the red one, the blue curve's overall vertical amplitude is greater than the red one's. (Mathematically, this would amount to summing the absolute value of the RMS curve's derivative.)
In practice, however, this method proves to be unreliable. Amongst other problems, an isolated peak in an otherwise flat RMS curve would distort the measure, giving a false impression of significant RMS mobility. A better method, similar to the one used by the EBU to evaluate loudness range, consists of dealing with the RMS variability instead of its mobility. Instead of directly evaluating an 'RMS mobility', we compute the distribution of RMS values encountered during the analysis. Such a distribution is shown on the third diagram of the group I've been referring to. Then we measure the 'spread' of the distribution curve using a trick similar to the 'interquartile range method' in descriptive statistics: the spread of the curve ignores the top five percent and the bottom 10 percent of values. We can see that for an analysis window of two seconds, 'Smells Like Teen Spirit' has a greater RMS spread than 'Fuk'.
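To make the difference between 'mobility' and 'variability' concrete, here is a small sketch that applies both to a flat RMS curve containing one isolated peak. The linear-interpolation percentile is an assumption; other percentile conventions would give slightly different figures on real material.

```python
def total_variation(curve):
    """'Mobility': sum of the absolute level changes between successive windows."""
    return sum(abs(b - a) for a, b in zip(curve, curve[1:]))

def percentile(ordered, pct):
    """Percentile of an already sorted list, using linear interpolation."""
    position = (len(ordered) - 1) * pct / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])

def rms_spread(curve, low_pct=10, high_pct=95):
    """'Variability': distance between the 10th and 95th percentiles of the curve."""
    ordered = sorted(curve)
    return percentile(ordered, high_pct) - percentile(ordered, low_pct)

# Hypothetical RMS curve in dB: completely flat, apart from one isolated peak.
flat_with_spike = [-20.0] * 100
flat_with_spike[50] = -5.0

mobility = total_variation(flat_with_spike)
variability = rms_spread(flat_with_spike)
print(f"mobility:    {mobility:.1f} dB")
print(f"variability: {variability:.1f} dB")
```

The mobility measure counts the spike twice (once on the way up, once on the way down), whereas the percentile spread discards it as an outlier.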
Let's change the time scale again and measure this RMS 'spread' with RMS values every 0.1s. The outcome of the experiment is shown in the fourth diagram, and again the results are reversed: the spread for 'Fuk' is greater than it is for 'Smells Like Teen Spirit'. Suppose that we now repeat the same experiment for a variety of analysis windows. The result is shown on the last diagram of the same group. Interestingly, level variability for 'Smells Like Teen Spirit' is always greater, except for windows below 0.18 seconds, where the drum parts in 'Fuk' show a decisive influence.
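The reversal described above can be imitated with two synthetic signals: a 'percussive' one that alternates loud and quiet every 0.1 seconds, and a 'sectional' one that alternates soft and loud passages every four seconds. Both signals, the sample rate and the helper names are hypothetical, and the percentile convention is again an assumption.

```python
import math

def rms_curve_db(samples, rate, window_s):
    """RMS level in dB for successive, non-overlapping windows of window_s seconds."""
    size = max(1, round(window_s * rate))
    curve = []
    for start in range(0, len(samples) - size + 1, size):
        block = samples[start:start + size]
        mean_square = sum(s * s for s in block) / size
        curve.append(10 * math.log10(max(mean_square, 1e-12)))
    return curve

def spread_db(curve, low_pct=10, high_pct=95):
    """Distance between two percentiles of the curve (linear interpolation)."""
    ordered = sorted(curve)
    def pick(pct):
        position = (len(ordered) - 1) * pct / 100
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])
    return pick(high_pct) - pick(low_pct)

RATE = 1000  # low sample rate keeps the example fast

def tone(amplitude_at):
    """16 seconds of an alternating +/- signal whose RMS equals its envelope."""
    return [amplitude_at(n) * (1 if n % 2 == 0 else -1) for n in range(16 * RATE)]

percussive = tone(lambda n: 1.0 if (n // 100) % 2 == 0 else 0.1)
sectional = tone(lambda n: 0.25 if (n // (4 * RATE)) % 2 == 0 else 1.0)

results = {}
for window_s in (0.1, 0.5, 2.0):
    results[window_s] = (spread_db(rms_curve_db(percussive, RATE, window_s)),
                         spread_db(rms_curve_db(sectional, RATE, window_s)))
    print(f"{window_s} s: percussive {results[window_s][0]:.1f} dB, "
          f"sectional {results[window_s][1]:.1f} dB")
```

With short windows the percussive signal shows the greater spread; with long windows the ranking flips, just as it does for the two songs.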
What is shown in the fifth diagram is a very good candidate for a measure of 'dynamic range' of a piece of music. Suppose now that instead of dealing with the signal's RMS, we deal with a measure of perceptual loudness, such as the one mentioned in the ITU recommendation BS 1770: we would now be dealing with 'loudness range'. This is, in fact, the basis of how the EBU defines 'loudness range' in their EBU Tech 3342 document, as explained in the 'EBU Measure Of Loudness Range' box.
There remains the question of whether one should use such a term as 'dynamic range' at all: there is no official definition for it, and it may be confused with the dynamic range of a recording medium, which is basically the difference between the highest and lowest level it can handle. During the course of this article, therefore, I won't talk about 'dynamic range' in relation to a piece of music. Instead, I will be using 'RMS variability', or more generally 'dynamic variability'. The term 'dynamic range' will be reserved for the measure of signal-to-noise ratio of a recording medium. I will use the term 'loudness range' in strict reference to the EBU 3342 document, and the term 'loudness variability' in other cases involving loudness instead of RMS.
Has Loudness Range Decreased?
Here's where things get surprising. We can prove beyond any doubt that the 'loudness war' has not decreased the loudness range, as defined in EBU 3342! Nor has it reduced level variability or loudness variability in any way. Music from the last decade seems to exhibit as much dynamic variability as music from the '70s or the '80s. Let's substantiate this assertion.
As we saw above, descriptors such as RMS level, integrated loudness, simplified crest factor, and proportion of samples above -1dBFS show spectacular evolution from the beginning of the '90s until sometime near 2005. This is the effect of the loudness war. So surely the EBU's loudness range measure should do the same? As shown on the first diagram of the group on page 179, it doesn't. What we see is that loudness range appears to be decreasing from 1969 to 1980, then stabilises until 1991. After 1991, instead of going down as expected, it follows a rather inconclusive evolution, and certainly doesn't decrease in any clear manner.
As we also saw above, the density of high-level samples in the audio signal rises spectacularly after the beginning of the '90s. This indicates increasing use of compression, and, more particularly, digital brickwall limiters, which in turn raise the overall level of the music corpus we're dealing with. But can the use of such limiters be linked to a diminution in loudness range? Let's answer that question by displaying EBU 3342 values versus high-level sample density — in other words, by plotting loudness range versus the amount of limiting applied. This is what is displayed in the second diagram, which shows extremely clearly that the answer is no. The increasing amount of limiting performed during the loudness war era didn't decrease the observed loudness range in any way.
This is not to say that processing audio with a brickwall limiter will not reduce its loudness range. As we'll see later in the article, it does. The observation here is just that from the analysis of actual records, the loudness war did not result in any obvious reduction in the loudness range of music.
Still, 'loudness range' as defined by EBU 3342 deals with time scales near and above three seconds. Let's see what happens using other analysis windows. For that, let's evaluate the gated RMS variability based on 0.05 to 12.8s-long windows. And to be even more specific, let's modify the evaluation of RMS variability so that it singles out the respective influence of each time scale. This way, we will be able to see whether the loudness war reduced level variability at any time scale. The results for both experiments are shown in the third diagram. Not only do they corroborate the previous findings, they also go much further, showing that the loudness war has had no clearly identifiable influence on level variability at any scale. This is quite a drastic conclusion: contrary to what one can often read on the Internet, the loudness war did not cause any reduction in level variability. There is as much level variability now as there was in the '70s or '80s.
In order to confirm these findings, I asked Dr Damien Tardieu, signal processing specialist at IRCAM in Paris, to perform similar analyses on a totally different music corpus: 20,000 songs randomly selected from the EMI catalogue. Admittedly, the albums in this catalogue are referenced via copyright dates, so the analyses will be made a bit less reliable by compilations gathering older tracks under a more recent copyright, or by remastered editions. However, what we need here is a general estimation of a global phenomenon, so we can afford a slight margin of error. The fourth and fifth illustrations on the previous page show the evolution of loudness range measured according to EBU 3342, as well as the density of very loud samples corresponding to this corpus. They show that loudness range doesn't decrease after 1990, even though limiting gets much more drastic. There is no doubt about it: contrary to general belief, there has been no obvious decrease in loudness range due to the loudness war, and brickwall limiters have not reduced the loudness range in music production.
So What's Going On?
As we saw earlier, the amount of compression/limiting used in mastering drastically increased between 1990 and 2000. Yet at the same time, and even though limiting may in many cases reduce the loudness range of a piece of music (see 'Loudness Range & Limiters' box), it isn't possible to observe an overall reduction in loudness range in productions. How can we resolve this apparent contradiction?
The first possibility is that mastering engineers may actually have been reasonable after all, only applying an amount of limiting that hasn't led to obvious loss of loudness range. This, as shown in the 'Loudness Range & Limiters' box, is theoretically possible, since the audio material's RMS variability may show a certain amount of resilience to limiting. I don't believe this is the case, though. Significant limiting can be measured or observed on the waveform, and can easily be heard: attacks are modified in a very specific way, everything seems to be more dense, more solid, and often brighter. Having listened to a very large number of tracks from the corpus I used for this article, it's obvious that a large proportion of recent tracks are limited in quite a heavy manner.
There remains only one solution I can think of: the loudness range of the music prior to mastering or even mixing has been increasing at the same time as compressing/limiting has been getting more drastic. In other words, the source material has more initial variability, and is more resilient to limiting. This is borne out by stylistic changes in music during the era of the 'loudness war'. The beginning of the '90s, which corresponds to the beginning of the loudness war, witnessed the emergence of mass-audience rap artists, and rap music typically has sparse production with very loud kick and snare parts, which increase level variability at very small scales (0.1s or so). Around the same time, metal music evolved into 'nu metal', which integrated elements of funk and rap, and with it more percussive elements. On a slightly larger time scale, patterns at the end of musical phrases also evolved around the beginning of the '90s. Whereas many hits from the '80s would transition from one musical phrase to another using a mellow tom roll, hip-hop producers from the '90s preferred drastic 'cuts' in the sound, which may be liable to increase level variability at scales near 0.5s.
On a still wider time scale, related to the structure of songs, one could put forward the idea that modern productions use contrasts in level, where older pop songs might have employed key or chord changes to delineate different song sections. It's quite common to hear rap or even R&B tracks where the verses are so minimalist it's difficult to even extract a chord sequence from them, while at the same time, the chorus is buried under dense vocal harmonies and/or lavish tonal keyboard parts, which increase the RMS level quite a bit. 'Lollipop' by Lil Wayne or 'Gangsta's Paradise' by Coolio are reasonably good examples, and so is, to a certain extent, 'Single Ladies' by Beyonce. In productions like this, level variation is being used to create a structure for the song.
To illustrate the point, it's interesting to compare two very different songs from different eras: the Beatles' 'Come Together' (1969), and Lady Gaga's 'Telephone' (2010). The top image overleaf shows RMS analysis for the two songs. The white lines indicate the song's structural limits as annotated by ear. The two checkerboard-like diagrams show the self-similarity matrices for the RMS. In such self-similarity representations, the lighter squares indicate parts that are different from each other in terms of level, whereas darker squares indicate parts of similar levels. This comparison is a case in point: the large-scale level variations are greater in 'Telephone', and very much synchronised to the song's structure. This is a single example, but it helps provide a plausible explanation for the idea that large-scale RMS variability prior to mastering might be greater in the case of more recent music.
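For readers unfamiliar with self-similarity matrices, here is a minimal sketch of how one could be built from an RMS curve. The article does not specify which distance was used for the diagrams, so the absolute level difference in dB is an assumption, as is the four-window example curve.

```python
def self_similarity(levels_db):
    """Matrix of absolute level differences between every pair of windows."""
    return [[abs(a - b) for b in levels_db] for a in levels_db]

# Hypothetical RMS curve: two quiet verse windows followed by two loud chorus windows.
curve = [-18.0, -18.0, -8.0, -8.0]
matrix = self_similarity(curve)
for row in matrix:
    print(" ".join(f"{value:5.1f}" for value in row))
```

Small values (dark squares when plotted) mark pairs of windows at similar levels; large values (light squares) mark pairs whose levels differ, which is what produces the checkerboard when sections alternate.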
Can Limited Music Have Musical Dynamics?
Definitely. But the way musical dynamics are expressed may change. Imagine you're listening to some music. You want it louder. You walk to the volume control, and simply raise the volume. By doing so, you increase the signal's RMS, increase its peak level, and leave its crest factor untouched. We'll call that the 'first loudness paradigm'. Suppose now that you've got a region in Pro Tools that peaks at 0dBFS. You can't raise its volume in the traditional way, or it's going to distort. But you can insert a limiter, and lower its Threshold slider. By doing so, you still increase the signal's RMS, but this time its peak level remains stable and its crest factor gets reduced. That's what we'll call the 'second loudness paradigm'.
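The two paradigms can be compared numerically. In the sketch below, the 'limiter' is a deliberately crude assumption: it adds gain equal to the amount by which the threshold is lowered, then hard-clips at full scale, which is far simpler than what a real lookahead limiter does. The function names are hypothetical.

```python
import math

def peak_db(samples):
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    return 10 * math.log10(sum(s * s for s in samples) / len(samples))

def crest_db(samples):
    return peak_db(samples) - rms_db(samples)

def turn_up(samples, gain_db):
    """First paradigm: plain gain, as with a volume control."""
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples]

def limit(samples, threshold_db):
    """Second paradigm (crude stand-in): add -threshold_db of gain, clip at full scale."""
    gain = 10 ** (-threshold_db / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

source = [math.sin(2 * math.pi * n / 100) for n in range(1000)]  # peaks at 0 dBFS
louder = turn_up(source, 6.0)
limited = limit(source, -6.0)

for name, signal in (("source", source), ("volume +6dB", louder), ("limited -6dB", limited)):
    print(f"{name:>12}: peak {peak_db(signal):5.2f}  RMS {rms_db(signal):5.2f}  "
          f"crest {crest_db(signal):4.2f}")
```

Plain gain moves peak and RMS together and leaves the crest factor alone; the clipped version gains RMS while its peak stays pinned at 0dBFS, so its crest factor shrinks.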
When Wagner writes an orchestral crescendo, he uses the first paradigm, by adding more instruments. But, using limiters, you can create a crescendo that employs the second paradigm. The difference in terms of resulting waveform is shown in the top image opposite: Mike Oldfield uses the first paradigm at the end of the first part of Tubular Bells, while the second is used in Trent Reznor's 'Closer'.
To get a more precise idea of the difference between both paradigms, let's take six crescendos from six different recordings, three of which use the first paradigm and three the second. Let's analyse them in terms of RMS, peak level and crest factor. The result of this analysis is shown on the second diagram, right. The first graph shows that all crescendos are based on an increase in RMS level. The second graph clearly distinguishes the tracks that use the two paradigms: in the case of the second, the peak level is constant. The third graph shows the crest factor systematically decreasing in the second-paradigm crescendos, but suggests that in the first-paradigm ones, there is no link between crest factor and loudness.
It could be argued that crescendos using the second paradigm are not 'pure' dynamic events: the louder the music gets, the more the limiter is allowed to change the signal, and the more it will modify the original timbre. But is the same not true of traditional crescendos? Performing a crescendo on a single violin note will not only change its level, it will change its timbre. And most orchestral crescendos incorporate additional instruments as they develop. The combination of the two factors results in a much more drastic change to timbre than any brickwall limiter could ever cause.
The Case Of Death Magnetic
Metallica's most recent album has become a cause célèbre for opponents of current mastering practices. As far as I can tell, the main problem with Death Magnetic is a collision between the way it has been mastered and its guitar sound. The very aggressive mastering simply is not suited to Metallica's production style, which dates back to the '80s and relies heavily on solid, distorted guitars. To sum it up, the result is music that's generally stable, and at the same time features very low crest-factor values. From a perceptual point of view, this translates as 'compact all the time'.
Diagram 1 from the group on the final page shows a distribution of the 4500 simplified crest-factor values corresponding to the corpus we've been using for the article, along with the values for the tracks from Metallica's Master Of Puppets and Death Magnetic. Analysis of other Metallica albums such as ...And Justice For All or the 'Black' album shows crest-factor values similar to those of Master Of Puppets. Looking at this diagram, we can see not only that all the tracks from Death Magnetic exhibit crest-factor values that are considerably lower than those of 'normal' Metallica albums, but that those values are simply extremely low compared to any music from the corpus.
Such crest-factor values are comparable to what can be found on tracks from Kanye West's My Beautiful Dark Twisted Fantasy, or 50 Cent's Get Rich Or Die Tryin'. Those are stylistically loud urban music albums with really strong percussive elements that articulate the writing, and are better suited to low crest-factor values than Metallica's constantly buzzing guitars. They are also comparable to tracks from MGMT's Oracular Spectacular or Congratulations, two albums with a sound so distinctive that a constant use of the second loudness paradigm and/or dynamic compression artifacts is not a problem at all. But Metallica's 'classic' sound simply doesn't easily allow for sonic extravaganza.
Diagram 2, from the same group, shows Death Magnetic's RMS variability in comparison to that of Master Of Puppets, as well as two other albums with low crest-factor values: My Beautiful Dark Twisted Fantasy and Congratulations. This is where the real trouble begins. Not only does Death Magnetic sound very 'compact' because of its low crest-factor values, but it's also very stable (low RMS variability). Which means it's exaggeratedly compact... all the time. Diagram 3, from the same group, sums that up, by showing how unusual such a combination of low crest-factor values and reduced EBU 3342 loudness range is. It's comparable to no more than three songs from MGMT. Even the sometimes incredibly compressed My Beautiful Dark Twisted Fantasy can't compete: it retains much more contrast than Death Magnetic. And though it's roughly as stable as the music of Dagoba, an industrial metal band with death metal vocals who specialise in spectacularly loud, compact and thick productions, Death Magnetic is way more compressed. In my opinion, that does it: you don't want traditional, mainstream metal to sound more compact than purposely extreme industrial/death metal. Or if you do, then you've got to change the music itself, to build in more contrast, so it can afford or even benefit from so much compression.
Is The Loudness War A Problem?
It's easy to find people, documents, web pages and so on that unanimously blame the loudness war for damaging music. Many of them also link the loudness war to a reduction in “dynamic range”, though they usually don't explain what dynamic range might be. Examples of such articles can be found online at http://lakefieldmusic.com/the-loudness-war-stops-here-high-dynamic-range-audio-recordings, http://dynamicrangeday.co.uk/about/, on Wikipedia (http://en.wikipedia.org/wiki/Loudness_war#Dynamic_range_reduction), and even in the respected scientific magazine IEEE Spectrum (http://spectrum.ieee.org/computing/software/the-future-of-music). However, we've seen during this article that the loudness war actually didn't result in any reduction in the closest well-defined descriptor there is to “dynamic range”, which is loudness range as defined by the EBU 3342 technical document. Neither is it possible to ascertain any decrease of dynamic variability at any scale.
So what's the problem with the loudness war? Obviously, limiting does something 'wrong' with the signal, otherwise people wouldn't be complaining so much — even though they apparently point at the wrong signal descriptor.
To answer that question properly, it may be useful to adopt a point of view generally used in image processing, where it's possible to analyse a photograph or any picture in terms of luminance distribution. Photoshop does that in a dialogue called 'Levels'. To evaluate such a distribution, an algorithm makes an inventory of all the pixels in the image, and sorts them according to their luminance. This results in a distribution graph that shows if the picture, as a whole, includes predominantly light, medium or dark areas, and to which degree. The same process can be followed with audio files: we take an inventory of all the samples from a song, and sort them according to their absolute level. As shown on the image overleaf, the resulting distribution curve can teach us many things.
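The audio equivalent of the 'Levels' histogram can be sketched in a few lines. The 1dB bin width, the -96dB floor used for silent samples and the five-sample input are assumptions made purely for illustration.

```python
import math
from collections import Counter

def level_distribution(samples, floor_db=-96):
    """Count samples per 1 dB bin of absolute level (nearest whole dB)."""
    counts = Counter()
    for s in samples:
        level = 20 * math.log10(abs(s)) if s != 0 else floor_db
        counts[max(floor_db, round(level))] += 1
    return counts

# Hypothetical five-sample 'song'.
samples = [1.0, -0.5, 0.5, 0.05, 0.0]
histogram = level_distribution(samples)
for level in sorted(histogram, reverse=True):
    print(f"{level:>4} dBFS: {'#' * histogram[level]}")
```

Run on a whole track, the counts trace a distribution curve of the kind discussed here, and a pile-up in the top bins is the signature of clipping or brickwall limiting.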
Look at the mean distribution curve for songs produced in 2007. It peaks at a higher level than the mean curve for 1967 songs. This means the songs are generally louder in 2007. Then look at the 'widths' of both curves: they're comparable, which basically means that something closely related to dynamic variability hasn't changed between 1967 and 2007. Now look at the little indentation at the right of the 2007 curve: songs from this year feature a density of high-level samples that's unnaturally high: level distribution suddenly stops following Gauss's normal distribution near the high levels. Compare the shapes of the two curves: it looks like the blue one was literally 'pushed' towards the right. This shows the result of brickwall limiting.
To go on with the comparison with images, it's as if, for the last 20 years, all pictures in books and magazines have been getting brighter and brighter. There are still deep blacks, the contrast remains intact, but all images look brighter. This is illustrated with the Tower Bridge pictures on the image. It's as if everything these days is supposed to look 'flashy', even though common sense suggests there are some images that shouldn't look flashy at all, in any situation. This is all the more true in the case of audio content, for which 'brighter' doesn't simply mean a higher density of lighter pixels. It also means reduced crest factor, envelope modifications, use of the second loudness paradigm and, in the worst cases, distortion. Common sense suggests that although there is nothing wrong with these characteristics as such, they shouldn't be on virtually all records.
In the end, it's all about style. Reduced crest factor values bring a 'compact' aspect to the sound; Waves describe it as a “heavily in-your-face signal that rocks the house” on their MaxxBCL page. It may be suited to your kind of music, or it may not. You might want to remain 'soft' on purpose. If you're doing heavy techno music, though, 'compact' is probably a good idea. Similarly, the two loudness paradigms described earlier each have a very distinctive 'flavour', and you may prefer one or the other. Do you want every loud attack modified by a compressor/limiter? It might be a good idea in many cases, but it might prove disastrous in others. Do you want to reduce the loudness range of your music without changing anything else? Then you're probably better off with volume automation than with a limiter, since we saw that loudness range is naturally resilient to a certain amount of limiting.
The important thing in this matter is to know what you're doing, and why, according to what sound you want. Some specific tools can also help, such as the TT Dynamic Range Meter (see www.dynamicrangemetering.com/free-downloads, although this really measures the crest factor of the signal and not any kind of 'dynamic range'). And if you like compression anyway, but you fear that Mr Bob Dylan wouldn't approve of your sound because it's too “modern”, and resembles “static”, don't worry. He's probably not listening.
The EBU Measure Of Loudness Range
In December 2010, the EBU released the Tech 3342 document, as a part of the loudness recommendation EBU R128. It gives very precise guidelines for measuring 'loudness range', a descriptor that may very well become a standard for the measure of the dynamic variability of audio content, so it's worth taking a few minutes to study in detail what is in fact a measure of the 'three-second window, gated K-weighted RMS variability' of audio content. Let's break that down.
The analysis window length is three seconds, sampled every second. It means that this measure concerns dynamic phenomena more than three seconds in length. Thus, at one extreme, it will not take into consideration percussive sounds. At the other, loudness variations due to structural changes may not be clearly visible: they can be masked by variations happening at smaller scales. It's a compromise that was chosen by the EBU.
Instead of looking at RMS values, the measurement protocol looks at loudness values as defined in ITU-R BS 1770. This measure of loudness is simple: take the original file, EQ it, and then evaluate its RMS. The filter used in that case is quite basic, as shown in the diagram. It may come as a surprise that the ITU uses such basic filtering to define the difference between RMS and loudness, but as they put it, “for typical monophonic broadcast material, a simple energy-based loudness measure is similarly robust compared to more complex measures that may include detailed perceptual models”. The ITU calls such a filter 'K-weighting', and gives 'LKFS' as a loudness unit. At this point, the descriptor we're dealing with is a sequence of loudness values, which, on a side note, corresponds to “short-term loudness” as defined in EBU 3341. Though those values are measured in LKFS, the EBU favours the acronym 'LUFS' (Loudness Units relative to Full Scale) in that case.
This sequence of values is now gated. There are two successive gating processes. The first one, 'absolute gating', excludes from the measurement all values below -70LKFS, and is supposed to ensure that silence and background noise are not wrongly included in the measurement. The second gating process is called 'relative'. Once rid of the very soft parts of the signal, a mean loudness is evaluated. Relative gating will now exclude all loudness values more than 20dB below the mean loudness. If the mean loudness after absolute gating is, say, -15LKFS, then all values below -35LKFS will be removed from loudness range evaluation. This relative gating is used to remove 'atypical' parts of the signal. At this point, the descriptor we're dealing with is a sequence of 'three-second window, gated K-weighted RMS' values.
And now for the crucial part: the loudness range evaluation. It is done by computing the variability of this sequence of 'three-second window, gated K-weighted RMS' values, using the statistical method described above, and illustrated by diagrams three and four in the group on the previous page. As such, we're really in the presence of a 'three-second window, gated K-weighted RMS variability', and the unit for it is LU (Loudness Unit).
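The gating and spread steps described in this box can be sketched as follows, starting from a sequence of short-term loudness values that are assumed to have been measured already (K-weighting and windowing are left out). Averaging in the energy domain to obtain the mean loudness, and the linear-interpolation percentiles, are assumptions; Tech 3342 itself is the reference for the exact procedure.

```python
import math

def loudness_range(short_term, abs_gate=-70.0, rel_gate=-20.0, low_pct=10, high_pct=95):
    """Loudness range in LU from a sequence of short-term loudness values."""
    # Absolute gating: drop silence and background noise.
    kept = [v for v in short_term if v >= abs_gate]
    if not kept:
        return 0.0
    # Mean loudness of what is left, averaged in the energy domain (assumption).
    mean = 10 * math.log10(sum(10 ** (v / 10) for v in kept) / len(kept))
    # Relative gating: drop values more than 20 LU below that mean.
    ordered = sorted(v for v in kept if v >= mean + rel_gate)

    def pick(pct):
        position = (len(ordered) - 1) * pct / 100
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])

    # Spread between the 10th and 95th percentiles of the gated values.
    return pick(high_pct) - pick(low_pct)

# Hypothetical short-term loudness values, one per second.
values = [-80.0, -50.0, -20.0, -18.0, -16.0, -14.0, -12.0, -10.0]
lra = loudness_range(values)
print(f"loudness range: {lra:.1f} LU")
```

In this example the -80 value falls to the absolute gate and the -50 value to the relative gate, so only the six remaining values contribute to the spread.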
If you want to know more, you can find EBU 3341 (loudness measure) at http://tech.ebu.ch/webdav/site/tech/shared/tech/tech3341.pdf. EBU 3342 (loudness range measure) is at http://tech.ebu.ch/docs/tech/tech3342.pdf. ITU BS 1770 (K-weighting) is at www.itu.int/rec/R-REC-BS.1770-0-200607-S/en. It was revised in early 2011, the link for this more recent version being www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-2-201103-I!!PDF-E.pdf.
Loudness Range & Limiters
Limiters reduce loudness ranges, don't they? Well, yes — and no. In fact, this issue is much more complex than it seems. Imagine you've got an audio file that is normalised: you can't add any more gain without getting distortion. Using a limiter or a compressor on such a file will nevertheless add gain to its content: the RMS levels will be increased. This adds dynamic range to the medium: instead of being, in the case of a 16-bit file, 96dB, it will increase to perhaps 100 or 105 dB. On the diagram to the right, this additional available dynamic range is illustrated by the grey rectangle. From that point of view, limiters don't decrease the loudness range, they increase it.
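The claim that limiting a normalised file raises its RMS level can be checked with a toy experiment. The Python sketch below uses a hard clipper as a crude stand-in for a brickwall limiter, which is an assumption: a real lookahead limiter applies a smooth gain reduction instead of clipping. The function names and the test signal are made up for illustration.

```python
import math

def rms_db(samples):
    """RMS level of the samples in dB relative to full scale."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

def crude_limit(samples, threshold_db):
    """Boost by -threshold_db, then hard-clip at full scale."""
    gain = 10.0 ** (-threshold_db / 20.0)
    return [max(-1.0, min(1.0, x * gain)) for x in samples]

rate = 8000
# Normalised test signal: a decaying 100 Hz tone peaking near full scale.
signal = [math.exp(-3.0 * n / rate) * math.sin(2 * math.pi * 100 * n / rate)
          for n in range(rate)]
limited = crude_limit(signal, -6.0)

rms_gain_db = rms_db(limited) - rms_db(signal)
peak_after = max(abs(x) for x in limited)
```

The peak stays at full scale while the RMS level goes up, though by a little less than the 6dB of applied gain, since the loudest samples are held back.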
The idea that a compressor or limiter might expand the available dynamic range is interesting, but not new. Many decades ago, engineers would compress the signal between the microphone and the recorder in order to increase the available dynamic range of the recording medium, so that its then low signal-to-noise ratio was less of a problem.
The diagram shows the RMS analysis for three files: an original one, normalised but not limited, and the same file limited using a threshold of -6dB, then -12dB. Let's focus on the difference between the original file and the -6dB one. As far as the low levels are concerned, the -6dB file gains 6dB of RMS. But the high levels are limited, so that the RMS gain for the high levels is only 5dB. This amounts to an RMS variability decrease of 1dB. Let's lower the threshold slider to -12dB: the low levels gain another 6dB, but the high levels only 3dB. This corresponds to a further RMS variability decrease of 3dB, or a decrease of 4dB overall. So yes, from that point of view, limiters do decrease the loudness range: in this case, by approximately 4LU.
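The arithmetic above can be laid out step by step. This short Python sketch simply restates the gains quoted in the text; the variable and function names are hypothetical.

```python
# Gains in dB at each limiter step, as quoted in the text.
steps = [
    {"threshold_db": -6, "low_gain": 6.0, "high_gain": 5.0},
    {"threshold_db": -12, "low_gain": 6.0, "high_gain": 3.0},
]

def variability_loss(step):
    """RMS variability lost when low levels gain more than high levels."""
    return step["low_gain"] - step["high_gain"]

losses = [variability_loss(s) for s in steps]
total_loss = sum(losses)
```

The first step costs 1dB of variability and the second a further 3dB, which gives the overall figure of 4dB.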
However, a 1dB loss in RMS variability is a very small amount. The threshold below which limiting really begins to affect the signal depends on the music you're processing. The second diagram shows the evolution of RMS variabilities at different scales for three pieces of music. Notice how the pop/rock music piece on the right shows RMS variabilities that are more resilient to limiting than those of the two other pieces, which are opera and jazz. This is especially true at the smaller time scales: in that particular case, the limiter's threshold had to be set to at least -6dB to get a noticeable decrease in RMS variability. This might very well be caused by the presence of a loud, very prominent kick drum part in this piece, which may indicate that the higher the initial RMS variability, the greater its resilience to limiting. From that point of view, high variabilities are not easily reduced. This initial resilience is another argument for the contention that limiting doesn't automatically mean a reduction in loudness range, especially if the initial material is highly variable.
Remastering & Limiting
Many albums from before the digital era have been remastered. As an example, let's focus on the Cure's discography. Since 2004, each of their pre-1990 albums has been remastered and released with extra material. Diagram 1 from the group below compares the original editions with the remastered ones in terms of RMS level. The 'Deluxe' editions are indeed louder than the original ones, and their RMS level is generally 5dB above that of the original editions. That being said, they're not as loud as the albums released after 1995. On a side note, notice how recent Cure albums are definitely victims of the loudness war: between Wish and Wild Mood Swings, there is a sudden jump of 6dB so that Cure albums, generally less loud than the current trend, exhibit as much level as everyone else.
Let's focus on Pornography, originally released in 1982. The waveform capture on the same image compares the waveforms of the original and remastered editions of the entire album. Obviously, the 2005 remaster relies heavily on digital lookahead brickwall limiters. Is that good or bad? I personally enjoy listening to both editions. From a more objective point of view, let's focus on the highlighted part of the waveform, which corresponds to the end of 'A Strange Day'. On the original edition, just before the short pause, we can see a slight decrescendo, followed by a short crescendo. Readers who know the song will agree that these loudness variations are very relevant to the actual musical content (song climax and then pause). In the original edition, those loudness variations use the first loudness paradigm as described in the main text. Now, look at the same part on the waveform corresponding to the remastered edition. The loudness variations are now of a very different nature, and that may not be such a good idea. In my opinion, this may be the main danger of remastering albums from before the digital era: if one is not cautious, it raises the density of very high-level samples, reduces the crest factor, and turns the first loudness paradigm into the second.
Records from notably famous and venerable bands such as the Beatles or Pink Floyd are often remastered several times, to the point where it becomes difficult to find a reference version for any of their albums. Let's take Dark Side Of The Moon, for example. Diagram 3 shows high-level sample density for five of its releases: each and every one is mastered or remastered differently. Even the two editions labelled 'Original Master Recording' are not the same, probably because one is a vinyl record and the other a CD.
In the context of the loudness war, there is one question that comes to mind: do those remasters respect the original 1973 edition? Diagram 3 from the image below gives some answers. The 1981, 1989 and 1992 editions show an overall amount of limiting that's comparable to what could be found on 1973 records, according to the findings presented at the beginning of this article. The 2003 edition is more problematic: its limiting is comparable to that of a 1995 album. As far as the 2007 edition is concerned, things are not so clear: 'Eclipse' seems to have been limited, or at least compressed quite a lot, but the other tracks show a very reasonable high-level sample density. Listening to each edition and looking at the waveforms refines the analysis. The 1981, 1989 and 1992 versions sound very much 'old Pink Floyd-ish', with exclusive use of the first loudness paradigm. By contrast, the 2003 edition isn't convincing in that regard. The left and right front channels of this 5.1 remaster are heavily limited, with frequent use of the second loudness paradigm. It sounds like Pink Floyd on an FM radio. The fact that the surround channels don't share that flaw doesn't make up for it. As for the 2007 edition, it's an interesting case, and in my opinion seems to have been handled particularly smartly. 'Eclipse', for instance, is loud, much louder than the original. On the other hand, isn't 'Eclipse', as the finale of the album, supposed to be loud? Other songs are remastered in a different way. Plus, even for 'Eclipse', there isn't any use of the second loudness paradigm, though a look at the waveform shows that we're almost at the limit between the two paradigms. This suggests that not all legendary albums are abused by their rights owners: the 2007 edition of Dark Side Of The Moon shows real respect and understanding for the music, and may very well succeed in striking a nice compromise between the original album colour and more contemporary tastes.
About The 4500 Tracks
Much of this article is based on analysis of a corpus of recorded music compiled from albums that achieved serious commercial and/or critical success. The main references are: Wikipedia's best-selling albums page (see http://en.wikipedia.org/wiki/Best_selling_albums), chart archives from Billboard.com (www.billboard.com/#/charts/hot-100), and the 'best ever albums' web site (see www.besteveralbums.com). Additionally, when an artist is mentioned repeatedly on besteveralbums.com, the complete discography may be included. This is, for example, the case for Radiohead, Nirvana, Pink Floyd and U2. Each album from the corpus was verified as featuring mastering that could realistically have been performed at the time of the initial release — so if, for instance, a recording from 1970 displayed obvious digital brickwall limiting, it was rejected as being a remaster. Songs from compilations were referenced according to their original release date, not the compilation date, and checked for obvious remastering.