
Q. What's the dynamic range of the channels in my DAW?

As far as I am aware, Logic's processing engine, with which it does all of its internal effect calculations and summing, runs at 32 bits. I record at 24-bit so, theoretically, the 32-bit audio engine should give about 48dB of headroom over the dynamic range of a 24-bit signal (each extra bit represents 6dB).

To test this, I put a sound file of a sine wave on a mono track and pushed up the channel fader. I then put an oscilloscope plug-in on the master output, just in case any distortion was too subtle for me to hear. Even with the channel fader 6dB into the red, the sine wave was perfect and undistorted, as long as I kept the master fader down and didn't clip the output. I then added four Gain plug-ins to the sine wave channel, and pushed them up to +24dB, +24dB, +24dB and +22dB respectively. These, plus the channel fader's +6dB, gave a +100dBFS output. However, once again, if the master fader didn't clip, the sine wave was untouched. How can this be? How can a channel be +100dB over full scale and not clip the signal? Is Logic processing all the audio in 64 bits internally?

Paul Bissell

Left: In fixed-point processing, there is a generous but fixed amount of headroom above the 24-bit source signal, and a very low but fixed noise floor. Right: In floating-point processing, the 24-bit input signal is effectively placed within an almost infinite dynamic range, and its level is then adjusted as necessary by its exponent against the nominal gain scale.

Technical Editor Hugh Robjohns replies: This isn't something restricted just to Logic. Pretty much all computer DAWs and large-format digital consoles work in the same way, and it's all due to the use of an approach referred to as 'floating point' maths.

The largest word length we use for capturing audio and moving it around between equipment is 24 bits, a number chosen essentially because it provides a theoretical dynamic range of about 144dB. That is usefully greater than the dynamic range of the average human ear (the nominal threshold of hearing is given as 0dB SPL and the point where the blood starts spurting horizontally from the ear in Monty Python-esque style is around 130dB SPL). So a 24-bit word length is more than adequate for capturing and conveying audio signals, since it can cope in theory with stuff quieter than we can hear, and louder than we can stand.

However, when you start processing and combining signals, they have a habit of getting louder, so inside a mixing desk or computer DAW it is inevitable that you will need a greater word length to cope with the extra headroom needed to combine loud signals, but also to provide a very low system noise floor so that the mixing of many signals doesn't bring with it a lot of noise. There are two common approaches to this: 'fixed point' and 'floating point'. The former tends to be used in products that rely on dedicated hardware chips to perform much of the signal processing (Yamaha's digital consoles, for example), while the latter tends to be used in software-based systems such as large-format professional consoles and computer DAWs.

Fixed-point processing is much as you have described; 32 bits are allocated for most of the number-crunching processes, providing an internal dynamic range of some 192dB. That can be arranged to provide a lot of headroom above the 24-bit input signals, and equally a very low noise floor below them. In many cases, more complex calculations (such as for EQ) are done to higher resolution than 32 bits (so-called double or treble precision), but the result is then reduced back to 32 bits for onward processing. If properly engineered, this fixed-point approach works well in practice, but if you try hard you will eventually find the headroom limits!
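The figures quoted so far all follow from the rule of thumb that each bit is worth about 6dB. As a rough illustration only (the function name `dynamic_range_db` is made up for this sketch and has nothing to do with any DAW's code), the arithmetic looks like this in Python:

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of a fixed-point word: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (24, 32):
    print(f"{bits}-bit word: {dynamic_range_db(bits):.1f} dB")

# The eight extra bits of a 32-bit fixed-point engine over a 24-bit signal.
extra_db = dynamic_range_db(32) - dynamic_range_db(24)
print(f"extra range from 8 more bits: {extra_db:.1f} dB")
```

How that extra 48dB or so is split between headroom above the signal and noise floor below it is a design decision for the manufacturer; the sketch only shows the total available.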

The floating-point approach tackles the problem from a different direction, and I'm sure you'll remember the idea from school maths lessons. Instead of writing a large number as, say, 12,345,600,000,000, we can write it as 1.23456 x 10^13 (10 to the power of 13). The first part of the number (the 1.23456) is called the mantissa, and the second part (10^13) is called the exponent. The exponent describes how far to move the decimal point to get the full number, hence the term 'floating point'. The obvious advantage to us of this notation is that it makes very big numbers much easier to write down, but it also makes the maths easier when multiplying numbers, as you have to do when changing the gain or volume of a digital signal.
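For anyone who wants to see the school-maths version in action, here is a minimal Python sketch. The helper `to_scientific` is a hypothetical name invented for the example, it works in base 10 to match the text (computers use base 2), and it does not handle zero:

```python
import math

def to_scientific(value: float) -> tuple[float, int]:
    """Split a non-zero number into (mantissa, exponent) with value = mantissa * 10**exponent."""
    exponent = math.floor(math.log10(abs(value)))
    mantissa = value / 10 ** exponent
    return mantissa, exponent

mantissa, exponent = to_scientific(12_345_600_000_000)
print(f"{mantissa:.5f} x 10^{exponent}")

# Multiplying two numbers: multiply the mantissas, add the exponents.
gain_mantissa, gain_exponent = to_scientific(2_000)
product_mantissa = mantissa * gain_mantissa
product_exponent = exponent + gain_exponent
print(f"{product_mantissa:.5f} x 10^{product_exponent}")
```

The second half shows why multiplication is convenient in this form: the exponents simply add, so a large gain change mostly amounts to changing the exponent.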

In most practical applications, floating point is still used within a 32-bit framework, but with 24 bits allocated to the mantissa and eight bits allocated to the exponent. If you do the maths you'll find such an approach provides the utterly ludicrous theoretical dynamic range of around 1500dB, which means you will never run out of headroom inside the processing, and never lose the signal in the noise floor.
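The 1500dB figure can be roughly reproduced by counting how many times the eight-bit exponent lets you double or halve a value. The Python sketch below is a back-of-envelope estimate, assuming every one of the 256 exponent codes is usable; the common IEEE 754 single-precision format reserves a couple of those codes for special values, so the real figure is slightly lower. The names are illustrative only.

```python
import math

DB_PER_DOUBLING = 20 * math.log10(2)  # about 6.02 dB per factor of two

# Eight exponent bits give 2**8 = 256 possible power-of-two scalings
# (an upper-bound assumption; some codes are reserved in real formats).
exponent_bits = 8
approx_range_db = (2 ** exponent_bits) * DB_PER_DOUBLING
print(f"approximate floating-point range: {approx_range_db:.0f} dB")

# Doubling a value (+6 dB) leaves the mantissa alone and bumps the exponent.
# math.frexp splits a float into (mantissa, power-of-two exponent).
mantissa_before, exponent_before = math.frexp(0.75)
mantissa_after, exponent_after = math.frexp(0.75 * 2)
print(mantissa_before, exponent_before)
print(mantissa_after, exponent_after)
```

The last two lines show the same mantissa with the exponent one step higher, which is the sense in which the signal's level is adjusted 'by its exponent'.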

So, as you found, you can increase or decrease the level to the most ridiculous extremes inside a floating-point system, and as long as you restore the gain to something more appropriate to feed the output converter's dynamic range, you will not suffer from the noise or clipping that a more conventional fixed-point or analogue system would. Which is very impressive.
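The reader's experiment can be imitated in a few lines of Python. This is only a sketch under stated assumptions: it is not Logic's engine, Python's own numbers are 64-bit so the hypothetical helper `to_float32` rounds to 32-bit after each step to approximate a 32-bit float path, and `hard_clip` is a deliberately crude stand-in for a stage with no headroom above full scale.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def db_to_gain(db: float) -> float:
    """Convert a gain in dB to a linear multiplier."""
    return 10 ** (db / 20)

def hard_clip(x: float, limit: float = 1.0) -> float:
    """Crude model of a stage that cannot exceed full scale (assumption)."""
    return max(-limit, min(limit, x))

# One cycle of a full-scale sine wave (peak = 1.0, i.e. 0dBFS).
sine = [to_float32(math.sin(2 * math.pi * n / 64)) for n in range(64)]

# +24 +24 +24 +22 +6 = +100 dB of gain, then -100 dB at the 'master fader'.
boosted = [to_float32(s * db_to_gain(100)) for s in sine]
restored = [to_float32(b * db_to_gain(-100)) for b in boosted]
max_error = max(abs(r - s) for r, s in zip(restored, sine))
print(f"peak while boosted: {max(boosted):.0f} x full scale")
print(f"worst-case error after restoring: {max_error:.2e}")

# The same signal if it had been clipped at full scale before the master fader.
clipped = [hard_clip(b) for b in boosted]
print(f"peak if clipped first: {max(clipped)}")
```

The restored sine differs from the original only by rounding at the level of the last mantissa bit, whereas the clipped version has already lost its shape and no amount of pulling the fader down afterwards will bring it back.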

Floating-point maths isn't quite the perfect audio saviour that it might appear, though, and really shouldn't be seen as a handy excuse not to bother with traditional gain-structuring practices. The mantissa is still restricted to 24 bits, which imposes some limitations on ultimate resolution when mixing multiple signals together, and that probably lies behind the raging arguments about quality differences when mixing 'inside the box' as opposed to outside. But I hope this somewhat long-winded answer has explained why you can crank a signal up to +100dBFS and still get it back without it having been clipped.
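The resolution limit of a 24-bit mantissa is easy to demonstrate. In the Python sketch below (again an illustration, with the hypothetical helper `to_float32` rounding to 32-bit float, and the -150dB level chosen purely for the example), a very quiet signal survives perfectly well on its own but disappears when it is summed with a full-scale one:

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

loud = to_float32(1.0)                 # a full-scale sample
quiet = to_float32(10 ** (-150 / 20))  # a sample 150 dB below full scale

# On its own, the quiet sample is represented without trouble.
print(f"quiet sample alone: {quiet:.3e}")

# Summed with the loud sample, it falls below the mantissa's resolution.
mixed = to_float32(loud + quiet)
print(f"loud + quiet: {mixed!r}")
print(f"quiet sample survived the sum: {mixed != loud}")
```

The exponent gives enormous range, but at any one moment the detail available within a sum is set by the mantissa of the largest value involved, which is one reason sensible gain structure still matters.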