Q. What’s the best way to downsample?

Hello SOS team, and thank you for the best education I could hope for! I want to ask what method you consider to be the best for downsampling. I have started working at 24-bit and 96kHz and am noticing the benefits in quality, but I’m confused about the best method for getting back to CD quality (16-bit, 44.1kHz). I have no problem understanding dithering 24-bit audio to 16-bit, but am less clear about downsampling.

Before attempting to master (once the mix is done and in stereo format) I take the file and downsample from 96kHz to 44.1kHz, but there is a definitely noticeable degrading of the high end when I’ve done this. I have used various tools for this, from Cubase’s built-in options to Voxengo R8Brain (apparently one of the best), but nothing seems to have worked. Ultimately, I’ve resorted to mastering my track and dithering to 16-bit, then simply recording my converter output (outputting at 24-bit, 96kHz) back into the computer at 16-bit, 44.1kHz. I seem to have achieved the best results this way — if it sounds good, then it is good, as they say.

As the converters in my audio interface are not of the highest quality, though, I can’t tell if this is doing more harm than good. Could you please shed some light on this and let me know how to get this part of the process right?

Justin Shardlow, via email

Although it’s not the only measure of a sample-rate converter’s (SRC) quality, these plots show a clear difference in the aliasing artifacts caused by the SRC in Cubase versions 4-8 (top) and the linear-phase SRC in Voxengo’s R8Brain Pro. The latter clearly performs better.

SOS Technical Editor Hugh Robjohns replies: When you get into the back-room mechanics, asynchronous sample-rate conversion (whether it be software- or hardware-based) is inherently a fairly complex subject. In essence, though, it all comes down to a virtual reconstruction of the full audio waveform from the source digital samples, and then calculating the amplitude values of that waveform at the specific moments in time when the output samples are required (at whatever new rate). This same core process applies regardless of whether it is upsampling or downsampling, although the latter requires an additional stage of low-pass anti-alias filtering to comply with the new Nyquist limits (allowing nothing through above half the sample rate).
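For readers who like to see the idea in code, here is a minimal sketch of a 96kHz to 44.1kHz conversion using SciPy's `resample_poly`, which combines the interpolation and the anti-alias low-pass filter in one polyphase step. It is only an illustration of the principle, assuming NumPy and SciPy are installed; it is not the algorithm used by Cubase or R8Brain, and the function name and 1kHz test tone are made up for the example.

```python
import math

import numpy as np
from scipy.signal import resample_poly

SRC_RATE = 96_000   # source sample rate in Hz
DST_RATE = 44_100   # target (CD) sample rate in Hz


def downsample_96k_to_44k1(x):
    """Convert 96 kHz audio to 44.1 kHz (hypothetical helper name)."""
    # 44100/96000 reduces to the rational ratio 147/320.
    g = math.gcd(SRC_RATE, DST_RATE)
    up, down = DST_RATE // g, SRC_RATE // g
    # resample_poly upsamples by `up`, low-pass filters at the lower of
    # the two Nyquist limits, then keeps every `down`-th sample.
    return resample_poly(x, up, down)


# One second of a 1 kHz test tone at half of full scale.
t = np.arange(SRC_RATE) / SRC_RATE
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)
y = downsample_96k_to_44k1(tone)
print(len(tone), "samples in,", len(y), "samples out")
```

The quality of a converter built this way depends almost entirely on the low-pass filter; the SciPy default is a general-purpose design and would need a longer, steeper filter to approach the better dedicated SRCs.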

If you think about it, this is exactly the same process as when you send audio out through your interface’s D-A converter and back through its A-D converter. The D-A converter reconstructs the analogue waveform, and the A-D re-digitises it, via the appropriate anti-aliasing filter, at the new sample rate. If your A-D has a 16-bit mode you could perform both the word-length reduction and dithering as part of the same process, too.

Hardware and software asynchronous sample-rate converters (SRCs) employ some fairly complex mathematical processes, obviously, but when performed correctly this is a very mature and highly accurate science. It is a fact, for example, that good SRC processes easily outperform the best converters in terms of dynamic range and distortion.

So, is the analogue process doing more harm than good? Theoretically, yes it is, because a good SRC should maintain a greater dynamic range and lower distortion. But that’s the theory, and modern converters are superb these days — as you say, if it sounds good, it is good!

Regarding software SRCs, though, as you’ve discovered not all are designed equally well. In fact, some are positively atrocious. You can compare and contrast a wide range of different software SRCs at As it happens, the SRC algorithms in Cubase (versions 4-8) are fairly poor, as the Sweep and Tone plots on that web site reveal very clearly in the obvious aliasing patterns and spectral detritus. I am a little perplexed, though, at your comments about the R8 algorithm because that is, indeed, one of the better SRC algorithms on the market (as the plots also indicate very clearly).

I don’t know which version of R8 you’re using, but neither the free nor the Pro version suffers from aliasing or other artifacts. So, assuming the software is working correctly, there are a number of possible reasons for the HF degradation you’ve experienced. The most obvious first place to look is the newly imposed Nyquist roll-off, although this should be well above the hearing of most people, and the A-D converter in your analogue conversion process should be applying a similar roll-off, too.

So if it’s not the presence of the roll-off, could it be something to do with the filter type? The filters in most modern A-D converter chips are not quite as steep as those in decent software SRCs, and most actually allow some aliasing because they are only about -6dB at the Nyquist frequency. In contrast, most decent SRC algorithms have much better filters that really do stick to the rules (Cubase doesn’t, but R8 does!).
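The difference between the two filter philosophies can be sketched numerically. The example below designs two illustrative FIR low-pass filters at 96kHz: one whose half-amplitude (-6dB) point sits right on the new 22.05kHz Nyquist frequency, and one whose cutoff is pulled down so that the response is already deep in the stop-band by 22.05kHz. The tap count, the 20.5kHz cutoff and the Kaiser window setting are assumptions chosen for the demonstration, not the specifications of any real converter chip or of R8Brain.

```python
import numpy as np
from scipy.signal import firwin, freqz

FS = 96_000        # source sample rate in Hz
NYQ_OUT = 22_050   # Nyquist frequency of the 44.1 kHz target


def gain_db_at(taps, freq_hz):
    """Magnitude response of an FIR filter at one frequency, in dB."""
    _, h = freqz(taps, worN=[freq_hz], fs=FS)
    return 20 * np.log10(np.abs(h[0]))


# 'Relaxed' design: the window-method cutoff is the half-amplitude
# point, so this filter is only about 6 dB down at the new Nyquist.
relaxed = firwin(255, NYQ_OUT, fs=FS)

# 'Strict' design (assumed values): cutoff moved down to 20.5 kHz with
# a Kaiser window, so the transition band ends below 22.05 kHz.
strict = firwin(255, 20_500, window=("kaiser", 10.0), fs=FS)

relaxed_db = gain_db_at(relaxed, NYQ_OUT)
strict_db = gain_db_at(strict, NYQ_OUT)
print(f"relaxed filter at 22.05 kHz: {relaxed_db:.1f} dB")
print(f"strict filter at 22.05 kHz:  {strict_db:.1f} dB")
```

Anything the relaxed filter lets through just above 22.05kHz folds back below it as aliasing, which is the compromise described above; the price of the strict design is that its roll-off starts a little lower in frequency.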

Another possibility could be the time-domain aspect of the filter. The Pro version of R8Brain allows the selection of linear-phase or minimum-phase filters, which you may perceive as having slightly different characters. Linear-phase filters introduce a small amount of pre-ringing (which can’t happen in the analogue world), while minimum-phase filters do not. But again, D-A and A-D filters are usually linear-phase, with the attendant pre-ringing.
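To make the pre-ringing point concrete, the following sketch compares where the main peak falls in the impulse response of a linear-phase filter and of a minimum-phase filter derived from it. Everything before the main peak is pre-ringing. The filter parameters are assumed for illustration and have nothing to do with R8Brain's actual designs; note also that SciPy's homomorphic method returns a shorter filter whose magnitude response only approximates the square root of the original, so this compares time-domain shape rather than matched frequency responses.

```python
import numpy as np
from scipy.signal import firwin, minimum_phase

# Linear-phase low-pass at 96 kHz (assumed, illustrative parameters).
linear = firwin(255, 20_500, window=("kaiser", 10.0), fs=96_000)

# Minimum-phase counterpart. The result is roughly half the length and
# its magnitude approximates the square root of the original response.
minimum = minimum_phase(linear, method="homomorphic")


def samples_before_peak(taps):
    """Number of impulse-response samples preceding the main peak."""
    return int(np.argmax(np.abs(taps)))


linear_pre = samples_before_peak(linear)
minimum_pre = samples_before_peak(minimum)
print("linear-phase: peak preceded by", linear_pre, "samples")
print("minimum-phase: peak preceded by", minimum_pre, "samples")
```

The linear-phase response is symmetrical, so half of its ringing arrives before the main peak, whereas the minimum-phase response puts its peak almost at the start and rings only afterwards.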

Finally, I wonder if the issue you’re experiencing is actually related to signal level, rather than high-frequency content, since you’re working with ‘mastered’ material. This could be our old nemesis, the dreaded inter-sample peak! If you have normalised the signal so that peak samples hit 0dBFS, it is possible for peaks in the waveform to rise above the height of those existing samples.

As I’ve explained, SRC processes calculate the precise amplitude of the waveform in between any existing samples, and any inter-sample peaks could be higher than 0dBFS, resulting in clipping distortion and aliasing. In contrast, your analogue conversion process probably manages to avoid this clipping issue because of a slightly lower analogue signal level into the A-D. So it might be worth leaving a few decibels of headroom on your 96kHz file before running it through R8’s SRC process, and seeing if that helps.
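A textbook worst case shows how large an inter-sample peak can be. In the sketch below, a tone at a quarter of the sample rate is given a 45-degree phase offset so that every sample lands below the crest of the waveform, and it is then normalised so that the samples read 0dBFS. The true peak is estimated by 4x oversampling, which is an assumed, simplified approach rather than a standards-compliant true-peak meter; the function name is hypothetical.

```python
import numpy as np
from scipy.signal import resample_poly

FS = 96_000
n = np.arange(4096)

# Tone at FS/4 with a 45-degree offset: each sample sits at 0.707 of
# the real crest, so the waveform peaks between the samples.
x = np.sin(2 * np.pi * (FS / 4) * n / FS + np.pi / 4)
x /= np.max(np.abs(x))   # normalise so the sample peaks hit 0 dBFS


def true_peak_dbfs(signal, oversample=4):
    """Estimate the reconstructed waveform's peak by oversampling."""
    dense = resample_poly(signal, oversample, 1)
    # Drop the ends, where the interpolation filter has not settled.
    edge = len(dense) // 8
    core = dense[edge:-edge]
    return 20 * np.log10(np.max(np.abs(core)))


sample_peak_db = 20 * np.log10(np.max(np.abs(x)))
true_peak_db = true_peak_dbfs(x)
print(f"sample peak: {sample_peak_db:+.2f} dBFS")
print(f"estimated true peak: {true_peak_db:+.2f} dBFS")
```

Real programme material rarely overshoots as far as this contrived tone, but heavily limited masters can still exceed 0dBFS between samples, which is why a few decibels of headroom before the SRC is a sensible thing to try.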