There is now a bewildering array of audio options inside most PC audio recording packages, and if you understand the reasoning behind them you can get a bit (or even a few bits) more quality out of your hard drive audio. Martin Walker explains.
Until recently, unless you could spend a lot of money on a high‑end audio recording system, you would be unlikely to worry too much about software compromising your audio quality — the limiting factor was far more likely to be the budget A‑D and D‑A converters used in many soundcards. However, now that budget audio has improved it's far more likely that choosing the correct software settings will result in audible improvements. Also, now that more and more people are buying CD‑R drives, and burning their own one‑off CDs for duplication, software settings become even more important — once you create a CD, every bit is transferred faithfully on to the final product, so the more bits you can get on there in the first place, the better.
For A Few Bits More...
Although it's easy to understand the signal path in high‑end packages which maintain a 24‑bit path throughout (from A‑D conversion before recording, through hard disk storage, to D‑A conversion on playback), the situation can be a lot more confusing in mixed systems. For instance, several cheaper PC soundcards now offer converters with 20 bits, and with 24‑bit internal resolution — but what exactly are the advantages if the audio is still stored on your hard drive in 16‑bit form?
Many PC applications also offer a choice of working resolutions when running — Wavelab, for instance, has options for 8‑, 16‑, 20‑, or 24‑bit Preferred Playback Resolution (whatever the number of bits in the file being played), and when running DirectX plug‑ins inside Sound Forge you can switch between 8‑, 16‑, and 24‑bit processing. But if you have 16‑bit data files on your hard drive, what's the best choice?
When it comes to DirectX plug‑ins that are primarily intended for mastering (such as the L1 Ultramaximiser from Waves), the permutations increase even further, since you may be offered additional choices for dithering and noise shaping (see 'All About Digital Recording' in the June '98 issue of SOS for a full description of these). While most people can see the point of these options when preparing 8‑bit multimedia files from 16‑bit recordings, or mastering to CD at 16‑bit resolution from 24‑bit files, how about when you're processing 16‑bit files that remain 16‑bit? Are these options valid when you're maintaining the same resolution, and if they are, what settings do you choose? Also, if you've already carried out some processing on a file, can you use dither and noise shaping more than once if you need to further edit your audio? Let's see.
Most people understand that, in order to maintain high audio quality, internal mathematical calculations need to be carried out at a high resolution. This minimises rounding errors that accumulate and give rise to grainy artifacts at low audio levels (such as the end of reverb tails and fade‑outs). Normally, when a system is working with 16‑bit audio data the internal resolution used for audio processing will be 24‑bit or even 32‑bit. There's understandable confusion when applications are said to contain 32‑bit‑compatible code. This refers to the way computer data is addressed, and not audio data. Having 32‑bit code doesn't mean that you're dealing with 32‑bit audio data — the two are quite separate. However, the internal processing resolution used by software is a fundamental choice of the developer, and you would expect that it would be set to an optimum value and left alone.
The problem is that every time you process a 16‑bit audio file, further rounding errors are created — with each operation the losses accumulate, and the fidelity of your audio degrades a tiny bit more. For example, if you EQ a 16‑bit file the calculations may be carried out internally at 32‑bit resolution, but when you click the OK button your data emerges as a 16‑bit file. If you then add some reverb to the sound, another set of 32‑bit calculations is carried out, followed by truncating (chopping off the extra bits) to a final 16‑bit file. Finally, you normalise the file to bring its peak value to the maximum digital value — yet another set of calculations, followed by more rounding errors. Although each process has been carried out accurately for the optimum sound, the final audio quality has been compromised.
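The cumulative loss described above can be sketched in a few lines of Python. This is only an illustration: simple gain changes stand in for EQ, reverb and normalising, and the function names are invented for the example rather than taken from any audio package.

```python
import math

def truncate16(x: float) -> int:
    """Chop off everything below one 16-bit step (rounds toward minus infinity)."""
    return math.floor(x)

def process_stepwise(samples, gains):
    """Apply each gain as a separate edit, storing a 16-bit result after every one."""
    out = list(samples)
    for gain in gains:
        out = [truncate16(s * gain) for s in out]
    return out

def process_combined(samples, gains):
    """Apply all gains at full internal precision, truncating once at the end."""
    total = math.prod(gains)
    return [truncate16(s * total) for s in samples]

quiet_samples = [3, 5, -7]      # very low-level audio, where the damage shows first
gains = [0.5, 2.0]              # two edits that should cancel out exactly

stepwise = process_stepwise(quiet_samples, gains)   # [2, 4, -8]
combined = process_combined(quiet_samples, gains)   # [3, 5, -7]
```

The two edits ought to leave the audio untouched, yet the stepwise version has moved every sample by one step, because the half-steps created by the first edit were thrown away before the second one ran.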
There are normally two ways to minimise this problem: either you carry out all intermediate processes at a higher resolution of 20, 24 or 32 bits (converting back to 16‑bit audio only at the final stage), or you apply dithering at each stage, which converts the low‑level rounding errors into a steady hiss (which can be made less obvious by 'shaping' the noise so that it occurs at frequencies to which the ear is less sensitive). However, many dithering systems are not designed to be used more than once; they're intended to be used as the final process in the audio chain, just before mastering. If high levels of carefully tailored noise have already been added, adding yet more may cause audible problems at high frequencies. So the best option is to try to ensure that your audio stays at a higher resolution throughout editing, reducing it with noise‑shaped dither only at the final stage, before saving it at 16‑bit resolution.
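To see why dither helps, here is a minimal Python sketch of the idea, assuming plain triangular (TPDF) dither with no noise shaping. It is not the algorithm used by any of the packages mentioned here, and the function names are made up for the illustration. The test signal is a steady level of 0.3 of one 16-bit step, which simple rounding loses completely.

```python
import math
import random

def quantise(x: float) -> int:
    """Round to the nearest whole step of the 16-bit scale."""
    return math.floor(x + 0.5)

def tpdf_dither(rng: random.Random) -> float:
    """Triangular noise spanning roughly +/-1 step: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def reduce_plain(signal):
    """Bit reduction with no dither."""
    return [quantise(x) for x in signal]

def reduce_dithered(signal, rng):
    """Bit reduction with dither noise added before rounding."""
    return [quantise(x + tpdf_dither(rng)) for x in signal]

rng = random.Random(1998)        # fixed seed so the run is repeatable
quiet = [0.3] * 20000            # a level well below one 16-bit step

plain = reduce_plain(quiet)
dithered = reduce_dithered(quiet, rng)

mean_plain = sum(plain) / len(plain)            # 0.0: the signal has vanished
mean_dithered = sum(dithered) / len(dithered)   # close to 0.3: it survives as an average
```

The dithered output is noisier sample by sample (it jumps between -1, 0 and 1), but the original level is still in there underneath the hiss, which is exactly the trade being made.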
Here's A Batch I Made Earlier
If you want to apply more than one process to any 16‑bit file, you ideally need to carry out the intermediate stages at a higher bit resolution. There are various ways to accomplish this. Batch Converters are normally used to apply the same set of processes to a number of files, and are used a lot by multimedia musicians who need to convert files between Mac and PC formats, or change CD‑quality audio into the best sounding set of 8‑bit, 11kHz files for a game or other multimedia title. It's boring work, and once you establish the best sequence of normalisation, bit‑reduction, and dithering options, you can point the Batch Converter at hundreds of files and leave it to get on with the nitty‑gritty.
Likewise, if you know exactly what editing stages you need to apply to a single file, you can also use a batch process, so that all editing stages are part of the same set of calculations. This should result in the final audio signal having better quality than if each process was applied individually, since the audio will stay at the higher internal resolution during the entire process.
However, you normally need to audition the audio before committing yourself to what may be a lengthy procedure. Fortunately, real‑time batch processing is available, and two good examples are the Sound Forge Audio Plug‑in Chainer, and the Wavelab Master Section. These both allow multiple processes to be applied to any file in real time, so that you can hear the results before you commit yourself to writing the edited file to your hard drive.
A particularly elegant approach is that of Wavelab, whose Master Section allows up to six processes to be used (each occupying one of the available 'slots'), followed by a Dithering Processor (see Figure 1). Normally, all of the processes occur in real time (subject to enough CPU power being available), and all you need to do to maximise audio quality is to ensure that the best resolution is being used during any intermediate editing, by setting up all temporary files with a higher resolution of 20 or 24 bits. However, when you want to Apply the Master Section processes permanently to a file, and then save it, it works as a batch processor (the Batch Processor menu option has more options, and is an extension of this).
The latest version of Wavelab carries out the processes as one complex set of calculations, by treating small chunks of the file separately in turn (applying every chosen process), and saving these directly to the final file. This has two big advantages: it's carried out much faster than if the entire file had to be written to disk after each stage, and because no temporary file is created the audio stays at the internal 32‑bit resolution until the final save.
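The chunk-by-chunk approach can be sketched roughly as follows. This is an assumption about the general technique, not Wavelab's actual code: the names are hypothetical, the processes are simple memoryless gain changes, and a real EQ or reverb would also need to carry its internal state from one chunk to the next.

```python
import math
from typing import Callable, Iterator, List

Process = Callable[[List[float]], List[float]]

def chunks(samples: List[int], size: int) -> Iterator[List[int]]:
    """Yield the file a small block at a time."""
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

def clamp16(value: int) -> int:
    """Keep a value inside the 16-bit range."""
    return max(-32768, min(32767, value))

def render(samples: List[int], chain: List[Process], chunk_size: int = 4) -> List[int]:
    """Run every process on each chunk at high resolution, then store 16-bit once."""
    out: List[int] = []
    for block in chunks(samples, chunk_size):
        work = [float(s) for s in block]       # promote to the internal resolution
        for process in chain:
            work = process(work)               # no 16-bit storage between processes
        out.extend(clamp16(math.floor(v + 0.5)) for v in work)
    return out

def halve(block: List[float]) -> List[float]:
    return [v * 0.5 for v in block]

def double(block: List[float]) -> List[float]:
    return [v * 2.0 for v in block]

result = render([3, 5, -7, 100, 200], [halve, double], chunk_size=2)
# result == [3, 5, -7, 100, 200]
```

Because nothing is written out between the two processes, the half-steps survive and the audio comes back unchanged.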
However, despite being able to listen to six or more processes in real time, you're still likely to need other, more basic, editing for your audio files. Apart from topping and tailing (to remove unwanted material before the first note starts, and after the final one has died away), most other editing processes will ideally need a higher than 16‑bit resolution. Most PC digital audio recording packages have various options in their Settings or Preferences menus, and these are the secret to getting the best results during basic editing.
Cool Edit Pro
When creating new files, Cool Edit Pro can edit at up to 48kHz, with 8‑, 16‑, or even 32‑bit resolution. Most people working with 16‑bit data tend to initially select 16‑bit options, but this is not the best solution. If you're working with Cool Edit Pro and prefer to stick at 16‑bit during your editing, you should enable the 'Dither Transform Results' option, so that you retain as much dynamic range as possible when each 32‑bit Transform is reduced to a 16‑bit result. However, as Syntrillium themselves say, this will add a small amount of noise at each stage, although this is still preferable to simply lopping off the extra bits.
A better solution is to use the option in the Data page of Settings, to 'Auto‑convert all data to 32‑bit on opening'. All subsequent editing will then be carried out at 32‑bit resolution, but it is up to the user to convert to a 16‑bit format, after editing is finished but before saving the file. You do this in the Edit/Convert Sample Type window (see Figure 2), which provides a more than comprehensive selection of dither and noise‑shaping options. Even if you only have a soundcard capable of 8‑bit playback, you can edit other file formats by choosing the 'Play 16‑bit files as 8‑bit' option in the Settings section.
There are also two relevant settings when multitrack recording: Playback Mixing can be either 32‑bit or 16‑bit, when combining the tracks for monitoring purposes before sending them to the soundcard. The default is 32‑bit, but 16‑bit can be used on slower hard drives. Mixdowns can also be 16‑bit or 32‑bit, depending on whether further editing is likely to take place.
Sound Forge can work with any 16‑bit file, up to a 96kHz sample rate, but there are few internal options to worry about. Inside any plug‑in you can select 8‑, 16‑, or 24‑bit processing by right‑clicking on the current value shown by 'CPU %'. This determines the resolution of the data both entering and leaving the plug‑in, although most plug‑ins will operate internally at an even higher resolution. Sound Forge operates internally with a 16‑bit resolution, but the 24‑bit CPU% setting becomes valuable when using the Audio Plug‑in Chainer — the chainer will then pass 24‑bit data between each chosen plug‑in, to maintain the best audio quality between each process. At the output of the chainer the data returns to its normal 16 bits, but of course you can use noise‑shaped dither as the final stage, to ensure that you squeeze the last drop of quality into these 16 bits (a dynamic range equivalent to 18 or 19 bits is claimed for some dither algorithms).
For many people, Wavelab is a perfect partner to Cubase VST, since its comprehensive range of audio treatments is the final icing on the audio cake, both for off‑line treatments of VST tracks, and for mastering to CD after the mixdown is finished. VST tracks are normally 16‑bit, apart from the export function (see later), but Wavelab can handle a variety of sample formats: 8‑, 16‑, 20‑, 24‑ and 32‑bit, and from below 11kHz right up to 96kHz.

Whatever the number of bits in the audio file, there are various choices to be made in the Preferences section. In the Audio Card page you can select from four Preferred Playback Resolutions: 8‑, 16‑, 20‑, and 24‑bit. Whatever the format of your audio data, Wavelab converts it to the chosen resolution 'on the fly,' before sending it to the soundcard. This is useful if, for instance, you want to edit 24‑bit data, but your soundcard only supports 16‑bit playback. If you have the luxury of a soundcard that supports 24‑bit playback, you can hear how the data will sound when dithered to 16 bits. However, if you try to play back at a resolution not supported by your card, you will get an error message to this effect.
In the File window you can select between three types of temporary file: 16‑, 24‑, and 32‑bit. Steinberg only recommend using 16‑bit temporary files where speed and disk space are crucial. If you ever plan to export 24‑bit files, the 24‑bit or even 32‑bit options will be preferable, but even for general‑purpose 16‑bit work, 24‑bit temporary files will maintain audio quality when performing more than one edit. The 32‑bit option is only useful if any of the temporary files is likely to generate levels greater than 0dB, since it avoids clipping. This is unusual, so for most purposes the 24‑bit option will be the best choice.
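The clipping point can be illustrated with a small Python sketch. It assumes that the 32‑bit temporary format is floating point (which is what lets it hold levels above 0dB) and that the 24‑bit format is a fixed‑point integer; the function names are invented for the example.

```python
import math

FULL_SCALE_24 = 2**23 - 1      # largest positive 24-bit integer sample: 8388607

def store_24bit(x: float) -> int:
    """Store in a fixed-point temporary file: anything over full scale is clipped."""
    rounded = math.floor(x + 0.5)
    return max(-FULL_SCALE_24 - 1, min(FULL_SCALE_24, rounded))

def store_float(x: float) -> float:
    """Store in a floating-point temporary file: levels over full scale survive."""
    return x

peak = 8_000_000               # a peak just under 24-bit full scale
boosted = peak * 2.0           # a 6dB boost pushes it over 0dB

via_24bit = store_24bit(boosted) * 0.5   # 4194303.5: the peak was clipped on the way
via_float = store_float(boosted) * 0.5   # 8000000.0: the original peak is recovered
```

Only an intermediate edit that overshoots full scale, followed by one that brings the level back down, shows any difference, which is why the 24‑bit setting is enough for most work.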
Incidentally, if you have two or more hard drives it's well worth placing the temporary files on a different drive to your main audio data. When processing, the source file will be read, and then a temporary file written. If both are on the same drive the heads will be constantly swapping between two positions on the same drive. By using a second drive for the temporary file it's possible, in some cases, to double processing speed.
Apart from ensuring that you get the maximum possible level before clipping whenever you record a track, to ensure the widest possible dynamic range, there are rarely many options that concern basic audio quality in MIDI + Audio sequencers (unless you have a more expensive system with options for 20‑ or 24‑bit recording). The internal processing resolution may be high (Cubase VST, for instance, uses 32‑bit floating point), but since both input and output signals are normally 16‑bit there are seldom any decisions to be made. People do routinely normalise their recorded tracks to bring them to similar levels, but this won't improve their quality.
The only basic choice is likely to be that of sample rate, but for most purposes 44.1kHz is the best option. The 48kHz option may be required if you're importing tracks digitally from DAT or ADAT, and some people do use 32kHz if they have borderline systems (to gain a few more tracks), but this should only be done if absolutely necessary, as any transfer to CD later will require a sample‑rate conversion anyway. Few applications allow mixed sample rates, so this tends to be a set‑and‑forget decision.
The only time when you're likely to have a choice of bit resolution is when exporting audio during final mixdown (if this is provided as an option). Cubase has this, as do the new versions of Logic Audio Gold and Platinum. In Cubase, you can access this either by clicking on the Create File button of the Master Section, or by selecting the Export Audio File function from the File menu (see Figure 3). Here there are various choices to be made: final sample rate, whether to save in Mono, Stereo Split (a pair of mono files), or Stereo Interleaved format, and whether to include any of the real‑time Automation and Effects that you have added.
The only other option here is Resolution, and the choices are 8‑, 16‑ or 24‑bit. You would choose 8‑bit only for multimedia applications, where audio quality is not as important as small file size. If you want to do any further work on the exported file inside VST, the 16‑bit option is the one to choose, since Cubase can only currently work with 16‑bit files. However, when mixing many 16‑bit audio tracks, the resolution of the mixdown file will be much higher.
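As a rough guide to how much higher, the worst‑case word length of a straight sum of tracks can be worked out as below. This is a simplified sketch with a made‑up function name: it ignores fader, pan and effect calculations, all of which add further low‑order bits of their own.

```python
def mix_word_length(track_bits: int, n_tracks: int) -> int:
    """Bits needed to hold the sum of n full-scale tracks without clipping."""
    extra_bits = (n_tracks - 1).bit_length()   # ceil(log2(n_tracks)) for n_tracks >= 1
    return track_bits + extra_bits

one_track = mix_word_length(16, 1)       # 16
eight_tracks = mix_word_length(16, 8)    # 19
twenty_four = mix_word_length(16, 24)    # 21
```

So even a modest eight‑track mix can produce results that no longer fit in 16 bits, which is why a 24‑bit export is worth having if the file is going to be processed further.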
The highest resolution, 24‑bit, would be ideal if you anticipated applying any further processing, perhaps before final mastering. This would be the most suitable setting, for instance if you wanted to later import your audio into Wavelab, since you would then maintain highest audio quality until the very final stage of saving a 16‑bit master file to DAT or CD‑R.
Most plug‑ins do just that — plug in and go, without needing any user adjustments other than the creative ones providing the effect. However, the Waves Native Power Pack, which is widely used on the PC (as well as the Mac), comes with extensive options for dithering, notably in the L1 Ultramaximiser. All the normal rules apply, but there are several special things to note. The IDR (Increased Digital Resolution) system is claimed to be one of the few that have truly random noise added during the dither process, so that it can be used several times without causing problems. Despite this, if you are using the NPP inside Wavelab, it's sensible to make sure that you use either the NPP dither options, or those of the Wavelab Master Section, but not both. Internal processing is carried out at 24‑bit resolution, but of course it is the supporting application that determines the resolution of what goes in, and what comes out at the other end.
Multitrack Mixdown To Stereo
When you're editing a single track inside a MIDI + Audio sequencer, you will probably have a different set of priorities from when you're mixing down every track to a single stereo file. However, some tweaks are easier to do while still at the multitrack stage. The first thing you should do is to listen through carefully on both loudspeakers and headphones. Any clicks, glitches and hums should be noted, and their causes narrowed down — it will be easier to edit these out of a single track than a combined mixdown. They can be carefully edited out before the mixdown is carried out, but you may be able to sort the problem out and re‑record the offending track. Unavoidable hums and hisses may be reduced in a variety of ways, from notch filtering right through to adaptive noise‑print restoration, using DirectX plug‑ins.
When you do the final digital stereo mixdown, it's normally best to use the same sample frequency as the final intended product, even though most modern software has resampling options to change between different formats. For instance, if you want to burn a CD, recording at 44.1kHz is best — yes, you may get a gnat's whisker more top‑end response by initially recording at 48kHz, but no conversion process is perfect, and what you initially gain will be lost when you down‑sample later. The only situation in which a higher resolution is worth having is if you have the option to record the final track at 24‑bit (as you do in Cubase VST). If this track is going to be further edited, 24‑bit should give you a better final result, even though you dither down to 16‑bit at the end.
By the way, if you intend to have a fade‑out on your track, leave this until the pre‑mastering stage. Although it's possible to add global audio fade‑outs in many MIDI + Audio sequencers, it's safest to wait until you have carried out any global EQ and level tweaks, and then you'll get the cleanest result.
When you have a final stereo mixdown track recorded, it can still be tricky knowing where to begin. Don't immediately start by normalising the entire track, since there may yet be other edits to be made that later change levels, making this superfluous or even undesirable.
The first operation is normally topping and tailing, to remove any superfluous data from before the start of the first note, and after the final note. However, before you do this, listen to these two parts of the track carefully. Since they are probably the most exposed areas, any hiss or hum may be more noticeable than in the rest of the track. If you have left a second or more of this background noise after the final note, and have some noise reduction software (included in Cool Edit Pro, and also available as plug‑ins from Sonic Foundry and Steinberg, amongst others) you can get a noise print from this, and then use it to treat just the initial and final few seconds of the track. As long as you're careful that no changes of timbre are evident at the joins, this can clean up tracks very well.
Once this has been done, and topping and tailing carried out, you need to consider whether any other corrective EQ, compression, or overall treatments such as enhancement are needed. If you like to compare the track to commercial releases in a similar style, you may want to add a little EQ. Plug‑ins like Steinberg's new FreeFilter (reviewed in the July '98 PC Notes) make this process easier, by allowing you to directly compare the two tracks and generate a filter response for correcting your track to sound more like the commercial one, but there is still no substitute for a good pair of ears.
Before you start applying effects one at a time, remember the improvements available if you use a batch process. The final treatment will probably be to ensure an optimum level, either using normalising or peak limiting, to bring the overall level up without compromising the sound (see the 'On the Level' box for more details).
Having got the sound exactly as you want it, you can now add a fade‑in or fade‑out, but here it's advisable to make sure that you're using 24‑bit temporary files if available, since you want to keep these higher resolution calculations intact before applying the combined batch effects with dither. There's some dispute about whether these fades should be done before or after final level tweaking. Logically, the fade is best left until every other process has been carried out, but if you want to use dithering as the final stage in a batch, it is probably best to do the fade first. However, if you are intending to burn a CD in Sonic Foundry's CD Architect, you can leave the fades till later, since this program allows you to create them 'on the fly' during the CD burn, complete with dither.
The Final Bits
Here's a summary of all the procedures mentioned so far:
- During recording, try to get the highest possible signal level without clipping, to make the most of the available dynamic range.
- Any editing operation on a 16‑bit audio file that changes its level (including fades and EQ) will generate a result with more than 16 bits.
- Always use batch processing if available — you can still normally audition the combined effect in real time before committing yourself, but you will get 24‑bit or even 32‑bit resolution for the entire set of calculations.
- If, for some reason, you can't use batch processing, select 24‑bit or 32‑bit temporary files if available, to maintain highest audio quality during your intermediate editing stages.
- Use 16‑bit temporary files during editing only if your PC struggles with the extra overhead of writing 24‑bit data, or you are running out of space on your hard drive.
- Always leave any dithering and noise shaping until the final stage, and unless your software manual states otherwise, don't attempt to add more dither to an already dithered file.
Armed with these tips, and with the reasoning behind them, you can now add a bit more spit and polish to your digital audio, and you should emerge with a clearer, more detailed sound.
From The Outside In
Many modern soundcards have 18‑bit or 20‑bit converters, and although some, like the Event Darla and Gina, allow 20‑bit recording as well as 16‑bit, others still only offer 16‑bit recording. Even with these, however, the higher‑resolution converters are still likely to give you a cleaner, quieter 16‑bit signal than true 16‑bit converters would.
Some proprietary dithering, like Sony's Bitmapping process, takes place at the A‑D stage during recording, when using converters with more than 16 bits. Dither is added before the signal is saved onto the DAT tape, which preserves more of the dynamic range of the original signal coming from the A‑D converter. Once on tape, this improvement can be heard when played on any other machine, since it is part of the recording itself. The reason that Sony provide a switch for the process is that further editing may cause problems with an already dithered signal. If you need to treat your DAT recording in an editing package, Bitmapping should be switched off, and dither applied at the final stage of editing, as normal.
If your D‑A converters are 20‑bit, and the internal path is 24‑bit, the hardware will normally provide its own dithering to give the best possible 20‑bit signal when playing back 24‑bit files. If you're only recording 16‑bit analogue data to your hard drive, you can still use a 24‑bit internal path during multitrack and mastering work when a higher resolution is being generated, and for importing and exporting digital data.
On The Level
Although there are many useful tools available within the latest digital audio editors to adjust levels, there's no substitute for getting the maximum possible dynamic range into your WAV file recordings in the first place. If your peak levels are 6dB below clipping, you're only using 15 of the available 16 bits during recording.
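The arithmetic behind that figure is simple: each bit is worth about 6dB (more precisely 20 × log10(2), or roughly 6.02dB), so every 6dB of unused headroom costs you one bit. A short Python sketch, with a function name chosen just for this illustration:

```python
import math

DB_PER_BIT = 20 * math.log10(2)    # about 6.02dB for each bit of resolution

def bits_in_use(headroom_db: float, word_length: int = 16) -> float:
    """Approximate number of bits exercised when peaks sit headroom_db below clipping."""
    return word_length - headroom_db / DB_PER_BIT

at_full_scale = bits_in_use(0.0)    # 16.0
six_db_down = bits_in_use(6.0)      # roughly 15
twelve_db_down = bits_in_use(12.0)  # roughly 14
```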
Normalising scans the file for the highest peak, then increases the level of the whole file so that the peak is at the maximum possible digital level. This doesn't increase the dynamic range: all it does is make the audio louder. However, unless you have already applied compression or limiting, normalising may still leave the audio at a comparatively low level, since there will be a few short transient peaks that are significantly higher than the remainder of the material. There are several mastering tools that can provide a much greater average level, such as the Waves L1 Ultramaximiser, Spectral Designs' Loudness Maximiser, Emagic's Audio Energiser, and the Peak Master included with Wavelab 2.0.
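In outline, normalising looks something like the following Python sketch. It is a bare‑bones illustration rather than the routine from any particular editor, and the function name and default full‑scale value are assumptions made for the example.

```python
import math

def normalise(samples, full_scale=32767):
    """Scale the whole file so that its highest peak just reaches full_scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)                   # silence: nothing to scale
    gain = full_scale / peak                   # one gain value for the entire file
    return [math.floor(s * gain + 0.5) for s in samples]

louder = normalise([100, -200, 50])
# the -200 peak now sits at -32767, and everything else has risen by the same ratio

stray_peak = normalise([100, 120, 32000])
# one short transient at 32000 leaves almost no room: the gain is only about 1.02
```

The second call shows the limitation mentioned above: a single stray transient sets the gain for the whole file, however quiet the rest of the material may be.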
What these plug‑ins do is to raise the overall level by a chosen amount (to increase the perceived volume), while altering the sound as little as possible. Most of the audio will simply have its level increased, and only the short peaks that exceed the chosen threshold will be treated. Some of you may even have attempted a similar thing by hand, using a pencil tool to round off the tops of a few stray peaks that clipped during an otherwise perfect recording. Even a normalised file, whose peaks are already at digital maximum, can be treated in this way, to further increase level, with minimal audible changes to the sound. Essentially, this is peak limiting, but these plug‑ins have the advantage over hardware devices of being able to look ahead in the waveform to anticipate and shape signal peaks in a way that produces the bare minimum of audible artifacts.
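The look‑ahead principle can be sketched very crudely in Python. This is emphatically not how the Waves L1 or any of the other plug‑ins named above work internally: the function name and parameters are hypothetical, and a real limiter smooths its gain changes over time, whereas this one simply turns the level down a few samples before a peak arrives and releases instantly afterwards.

```python
def maximise(samples, boost, ceiling=32767.0, lookahead=4):
    """Raise the overall level, pulling the gain down just ahead of any peak over ceiling."""
    boosted = [s * boost for s in samples]
    out = []
    for n in range(len(boosted)):
        window = boosted[n:n + lookahead + 1]   # this sample plus the few just ahead
        peak = max(abs(v) for v in window)
        gain = min(1.0, ceiling / peak) if peak > 0 else 1.0
        out.append(boosted[n] * gain)
    return out

# Steady material at a level of 1000 with one stray transient at 30000
samples = [1000] * 10 + [30000] + [1000] * 10
out = maximise(samples, boost=2.0)
# out[0] and out[11] are simply doubled to 2000.0, the transient at out[10] is
# held at the ceiling, and the samples just before it are turned down in advance
```

Most of the file is left alone apart from the level increase, and only the region around the peak is treated, which is the behaviour described above.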