Q. What's the difference between floating- and fixed-point systems?

By Hugh Robjohns

Could you clarify the difference between floating- and fixed-point 32-bit operation in the digital domain? I know that floating-point systems allow for data to be handled at word lengths above 24-bit, which are then dithered back down. Does it also result in a greater dynamic range?

SOS Forum Post

Technical Editor Hugh Robjohns replies: Accurate digital audio capture and reproduction requires, at the very most, 24-bit resolution. The reasoning behind this is that a 24-bit signal has a theoretical dynamic range of 144dB, which is greater than the dynamic range of the human ear, so, in theory, a 24-bit system can record sounds slightly quieter than those we can hear and reproduce sounds louder than we can stand. There is therefore no need for A-D/D-A converters to work at resolutions higher than 24-bit.
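The 144dB figure follows from the usual rule of thumb that each bit of a linear PCM word is worth about 6dB. A minimal sketch of that arithmetic is shown below; the function name `dynamic_range_db` is ours, chosen for illustration, and the calculation ignores dither and converter noise.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of a linear PCM word, in dB: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24, 32):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
# 32-bit: 192.7 dB
```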

High-end digital consoles like the Sony DMX R100 use 32-bit floating-point processing, giving them almost limitless headroom.

However, when it comes to processing sound within a digital system, there needs to be some headroom to accommodate the fact that adding two 24-bit numbers together can produce a result which can only be described using 25 bits, and adding 30 or 40 such numbers together can produce something even bigger. At the other end of the scale, the mathematical calculations involved in complex signal processing like EQ generate very small 'remainders', and these have to be looked after properly, otherwise the EQ process effectively becomes noisy and distorted. The natural solution is to allocate more bits for the internal maths, hence 32-bit systems.
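The growth in word length when mixing can be checked with a few lines of integer arithmetic. The sketch below assumes signed (two's-complement) samples and a worst case in which every channel sits at positive full scale at the same instant; `signed_bits_needed` is an illustrative helper, not part of any console's firmware.

```python
FULL_SCALE_24 = 2 ** 23 - 1  # largest positive value of a signed 24-bit sample

def signed_bits_needed(value: int) -> int:
    """Smallest two's-complement word length that can hold a non-negative value."""
    return value.bit_length() + 1  # magnitude bits plus one sign bit

for channels in (1, 2, 32, 40):
    total = channels * FULL_SCALE_24  # worst-case sum of full-scale samples
    print(f"{channels:2d} channels -> {signed_bits_needed(total)} bits")
#  1 channels -> 24 bits
#  2 channels -> 25 bits
# 32 channels -> 29 bits
# 40 channels -> 30 bits
```

Real programme material rarely peaks on every channel at once, so this is an upper bound rather than a typical figure.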

Fixed-point systems use the 32 bits in the conventional way to provide an internal dynamic range of about 192dB. Systems that use fixed-point 32-bit processing (like the 0-series Yamaha desks) usually arrange for the original 24-bit audio signal to sit close to the top of that 32-bit processing number to provide a lower noise floor and slightly greater headroom for the signal processing. (Incidentally, a 192dB SPL is roughly equivalent to two atmospheres' pressure on the compression of the wave and a complete vacuum on the rarefaction.)
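To illustrate the idea of seating a 24-bit sample near the top of a 32-bit word, here is a rough sketch. The split of the eight spare bits (two above the audio for headroom, six below it for low-level detail) is purely an assumption for the example and is not taken from any particular desk; the names `HEADROOM_BITS`, `to_internal` and `to_output` are likewise hypothetical.

```python
WORD_BITS = 32
AUDIO_BITS = 24
HEADROOM_BITS = 2  # hypothetical choice; real desks may allocate differently
FOOTROOM_BITS = WORD_BITS - AUDIO_BITS - HEADROOM_BITS  # 6 bits below the audio

def to_internal(sample_24: int) -> int:
    """Place a signed 24-bit sample in the 32-bit word, leaving footroom below it."""
    return sample_24 << FOOTROOM_BITS

def to_output(value_32: int) -> int:
    """Round the internal value back to 24 bits and clip to the 24-bit range."""
    rounded = (value_32 + (1 << (FOOTROOM_BITS - 1))) >> FOOTROOM_BITS
    return max(-(1 << 23), min((1 << 23) - 1, rounded))

full_scale = (1 << 23) - 1

# Headroom: four full-scale channels still fit in a signed 32-bit word.
mix = sum(to_internal(full_scale) for _ in range(4))
print(mix < (1 << 31))       # True

# Footroom: halving a 1-LSB signal keeps information internally,
# whereas the same operation on a bare 24-bit integer loses it.
print(to_internal(1) // 2)   # 32
print(1 // 2)                # 0
```

The trade-off is visible in the constants: every bit given to headroom is a bit taken from the space below the signal, which is why the signal sits close to, rather than at, the top of the word.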

Floating-point systems also use 32-bit numbers, but organise them differently. Essentially, they keep the audio signal in 24-bit resolution, but use the remaining bits to denote a scaling factor. In other words the 24-bit resolution can be cranked up or down within a colossal internal dynamic range so that, in effect, you can never run out of headroom or fall into the noise floor — there is something like 1500dB of dynamic range within the processing, if the maths is done properly.
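The layout can be inspected directly. The sketch below assumes the common IEEE 754 single-precision format (one sign bit, an 8-bit exponent and 23 stored fraction bits, with an implied leading 1 that brings the precision to 24 bits); the article does not name a format, so treat that as our assumption. `float32_fields` is an illustrative helper.

```python
import math
import struct

def float32_fields(x: float):
    """Split x, stored as IEEE 754 single precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # scaling factor, stored with a bias of 127
    fraction = bits & 0x7FFFFF      # 23 stored bits; the leading 1 is implied
    return sign, exponent, fraction

# Attenuating by 1024 (about 60 dB) changes only the exponent;
# the fraction bits, and so the resolution, are untouched.
print(float32_fields(0.75))         # (0, 126, 4194304)
print(float32_fields(0.75 / 1024))  # (0, 116, 4194304)

# Ratio of the largest to the smallest normalised value, in dB.
largest = (2 - 2.0 ** -23) * 2.0 ** 127
smallest_normal = 2.0 ** -126
range_db = 20 * math.log10(largest / smallest_normal)
print(f"{range_db:.0f} dB")         # 1529 dB
```

That result is consistent with the "something like 1500dB" quoted above.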

Most high-end consoles and workstations employ floating-point maths because (if properly implemented) you can get better performance and quality in the computations. Most budget/low-end consoles and DAWs use fixed-point processing because it's easier and faster, and can be implemented in hardware more easily.

 

Published January 2004