Almost every recording musician has a great take somewhere that's unrepeatable and marred by hiss, hum or other audio gremlins. Wouldn't it be great if there was some affordable way of restoring such a recording to its former glory? JANET HARNIMAN COOK discovers that there is...
Audio renovation was, until recently, the domain of expensive high‑end systems such as those from Cedar and Sonic Solutions, but the last year or so has seen the introduction of high‑quality budget audio restoration software for the PC, most notably as plug‑ins for the Steinberg WaveLab, Sonic Foundry Sound Forge and Creamware TripleDAT PC editing and recording applications. This impressive new technology is capable of restoring and enhancing damaged or badly recorded audio material which would previously have been considered beyond salvage, and it can be used to perform a wide range of functions, including the removal of hiss from cassette and open‑reel tape recordings; the elimination of spikes, glitches and drop‑out from DAT and other digital sources; the eradication of background noise such as mains hum and system noise from mixers and outboard; and the reduction of surface noise and crackle from ancient, distressed vinyl LPs.
A word of caution is in order, however: although this new technology is surprisingly effective, inevitably some audio material will prove to be irrecoverable: audio restoration is a mixture of art and science, and it is better to aim for acceptable results rather than to expect perfection. Generally, most listeners tend to ignore low‑level hiss or the occasional small pop but take exception to unnatural sounds, such as the flange‑like artifact noise that can be introduced during digital processing. However, in situations where intelligibility rather than audio fidelity is the main criterion — such as when cleaning up telephone conversations and conference recordings — the presence of artifact noise could be unimportant.
Just as the removal of layers of old discoloured varnish from an oil painting often reveals unsuspected detail of line and colour, so too the stripping away of noise from an audio recording unmasks timbral and spatial qualities, and it is fascinating, when listening to successfully renovated audio, to realise that the recording sometimes sounds better than it did when it was originally made!
Unwanted noise comes in two main forms:
- Broadband Noise (also called Wideband Noise) is of a relatively constant nature and is present throughout the recording.
- Impulse Noise occurs unpredictably and is of a short, sudden character.
Broadband noise consists of unwanted background interference, such as that produced by tape hiss, mains hum, noise from PC hard drives and fans, machine noise from camera and tape machine motors and air conditioners, and system noise from mixers, outboard, synths and dirty guitar amplifiers. Most noise‑reduction applications take a sample, or 'noise print', consisting of a region of 'silence' of between 0.5 and two seconds that is representative of the background noise to be removed (the most convenient regions are usually those preceding the start or following the end of the main audio material). The noise reduction algorithm analyses the amplitude and frequency components of the selected region using a mathematical Fast Fourier Transform (FFT — see below), and then extracts the unwanted noise using multi‑band filtering. This is computationally very demanding and can be slow even on fast PCs, but it is usually highly effective, so your patience is rewarded!
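For the technically curious, the heart of this process can be sketched in a few lines of Python (using the numpy library). This is a deliberately simplified illustration of noise‑print spectral subtraction — the function name and parameters are my own invention, and real de‑noisers add overlapping windows, envelope smoothing and per‑band thresholds:

```python
import numpy as np

def noise_print_subtract(signal, noise_print, frame=256):
    """Toy noise-print subtraction. The magnitude spectrum of a 'silent'
    region is estimated once, then subtracted from the spectrum of each
    frame of the programme material; cleaned frames are rebuilt with the
    original phase. Only the core idea of a commercial de-noiser."""
    noise_mag = np.abs(np.fft.rfft(noise_print[:frame]))
    out = np.array(signal, dtype=float)
    for start in range(0, len(signal) - frame + 1, frame):
        spec = np.fft.rfft(signal[start:start + frame])
        # Subtract the noise magnitude, never letting a bin go negative
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out
```

Everything above the noise floor survives; anything at or below it is pulled down towards silence — which is exactly why over‑aggressive settings start eating the programme material too.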
In the Sound Forge Noise Reduction module, the sample size and number of frequencies analysed are determined by the Fast Fourier Transform size: higher values provide greater resolution and finer detail, but take longer to process. A reduction amount of around 20dB is a good starting point, but it's worth experimenting on a small portion of audio to see which value gives the best results: lower values will remove less noise, while higher values carry the risk of generating audible artifacts and of removing part of the source material. Often a multi‑pass approach works better than single‑stage processing: rather than applying 30dB of noise reduction in one go, three separate passes of 10dB each may produce cleaner results.
The noise print analysis is displayed as an amplitude/frequency graph containing envelope points, or nodes. These are adjustable individually or as a group, and their function is to raise or lower the amount of noise removed at the node frequency. The overall level of noise reduction applied is reflected by the distance of the envelope above the noise graph, with signal below the envelope being treated as noise; by default, the envelope points are generated 6dB above the noise print. If, like me, you use noise reduction regularly on your recordings, it's worth leaving a region on the session tracks from which to take the noise sample. If you cannot find a suitable section of the audio file from which to take a noise print, most applications include presets for common noise‑reduction tasks, and the DART [Tracer Technologies noise‑removal software, reviewed January 1997] DeHiss module operates using a standard noise print.
The main limitation of off‑line noise print‑based systems is that they operate on the assumption that characteristics of the noise print are common to the entire soundfile. If this is not the case, these systems may fail to provide a uniform renovation quality. To overcome this problem, the real‑time applications — Steinberg's Denoiser and Creamware's Osiris — use a different approach, with a read‑ahead adaptive algorithm filter that recalculates the noise content of the soundfile in real time.
When using noise reduction, special care should be taken to avoid inadvertently removing subtle low‑level audio signal components and so throwing the baby out with the bath water! Reverb tails are especially vulnerable, and can be removed along with noise if long algorithm envelope release times are used; conversely, very high attack values may add unwanted artifacts, making your vocalist sound like an entry into the Eurovision Gargling Contest.
Impulse noise may be defined as any suddenly occurring disturbance that affects a recording, and includes pops and clicks from microphones; short scratches on vinyl disks; poor tape splicing, spikes and crackles caused by noisy potentiometers and switches on studio equipment; and glitches and dropouts on digital recordings, caused by faulty DAT recorders or corrupted computer soundfiles. Many low‑amplitude clicks and pops may be removed with noise‑reduction utilities, but to minimise the audibility of impulse noises we must look to dedicated click‑removal processing. Clicks appear as sharp, jagged or high‑sloped spikes on the waveform, and these are identified by the de‑clicker algorithm, which then applies processing to these areas, leaving the surrounding audio in its original condition. The click‑removal algorithm mutes the section of damaged audio, and interpolates audio material from either side to fill in the space left by the offending click.
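A toy version of this detect‑and‑interpolate idea can be expressed in Python. This sketch repairs only single‑sample spikes, with a threshold I've chosen arbitrarily — commercial de‑clickers use far more sophisticated detection and resynthesis over longer damaged regions:

```python
def declick(samples, threshold=0.5):
    """Toy de-clicker: a click is a sample that jumps steeply away from
    BOTH of its neighbours; replace it by linear interpolation across
    the gap, leaving the surrounding audio untouched."""
    out = list(samples)
    for i in range(1, len(out) - 1):
        if abs(out[i] - out[i - 1]) > threshold and abs(out[i] - out[i + 1]) > threshold:
            out[i] = 0.5 * (out[i - 1] + out[i + 1])  # fill the muted sample
    return out
```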
In practice, click removal is often very tricky and the results are variable — sometimes it may even prove more effective to remove the worst clicks manually! When attempting to repair dropouts, de‑clickers can only correct dropouts of less than 60 samples — slightly more than 1ms at 44.1kHz sample rate — and, in practice, copying across a similar section of audio from elsewhere in the recording may yield a more satisfactory result. Occasionally, better results can be obtained by reverse time‑processing the damaged audio prior to de‑clicking, and this technique can be especially useful for the removal of impulse sounds with a slow attack and sharp decay. To try this, first reverse the whole soundfile; next, apply de‑click processing, and then reverse‑process again to restore the original form. Bear in mind, though, that reverse time‑processing may produce weird and undesirable results when combined with filtering; as with so much of audio renovation, it's a question of experimentation. De‑click utilities should be used before applying other filtering techniques, such as broadband noise removal and low‑pass filtering, as these processes can inhibit click detection; the exception to this rule is when the audio has large amounts of very low‑frequency (under 30Hz) content. This sub‑bass should be removed prior to de‑clicking, to avoid introducing unpleasant low‑frequency artifacts.
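The reverse‑processing trick itself is simple enough to express in a couple of lines of Python (the function names here are purely illustrative):

```python
def declick_reversed(samples, declick_fn):
    """Run any de-click function on the time-reversed audio, then flip
    the result back -- an impulse with a slow attack and sharp decay
    presents a sharp attack to the detector once reversed."""
    return declick_fn(list(samples)[::-1])[::-1]
```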
Click removal can also be used creatively — for example, to reduce harshness in digital recordings, or to soften the attack of brass and percussive instruments — but over‑zealous application of de‑clicking techniques will impart a dull, lifeless quality to recordings, as it will remove the attack transients upon which human hearing relies when identifying individual sounds. This said, de‑clickers can be an invaluable studio resource when used with caution.
Should original analogue master tapes be lost or damaged, there may be no alternative but to transcribe the vintage audio from vinyl disk. Noise reduction and de‑clicking will play an essential part in the restoration process, but they can only achieve limited results, due to the particular nature of vinyl surface crackle. De‑crackling technology is highly complex, and of the applications featured in this article, only Creamware TripleDAT's Osiris includes a De‑crackler module — though the Steinberg De‑clicker plug‑in does a very creditable job and features settings for processing vinyl and shellac disks.
There's a further source of impulse noise that needs to be tackled differently: DC (Direct Current) Offset problems are caused by mismatches between digital recording equipment — usually soundcards and DAT machines — and give rise to audible clicks at splice and boundary points in soundfiles. Most audio editors include DC Offset detection and correction facilities, and it is easy to check whether a soundfile has DC Offset problems by zooming in on a silent section of the waveform: when DC Offset is present, the waveform will appear displaced (offset) from the zero axis. If DC Offset is detected, it will be necessary to apply DC Offset correction to every recording made with that equipment, since the offset is an inherent and constant characteristic of the hardware used to make the recording.
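Conceptually, DC Offset correction is nothing more than recentring the waveform on the zero axis. A minimal Python sketch, assuming the offset really is constant, as described above:

```python
def remove_dc_offset(samples):
    """A DC offset displaces the whole waveform from the zero axis by a
    constant amount; subtracting the mean recentres it. (Editors may use
    a gentle high-pass filter instead, which also tracks a drifting
    offset.) Returns the corrected audio and the offset that was found."""
    offset = sum(samples) / len(samples)
    return [s - offset for s in samples], offset
```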
If a recording was made at a low volume level, normalisation can be used to optimise the signal without inducing clipping, whilst maintaining relative volume levels in the soundfile. The Normalisation process first involves scanning the soundfile to determine the peak volume level. The difference between the peak level and the onset of digital clipping (>0dB) defines the maximum amount of gain that can be applied during normalisation, although it is a matter of some controversy as to whether the true optimum signal level for digital audio files is 0dB or ‑0.1dB.
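The two‑step normalisation process — scan for the peak, then apply one uniform gain — can be sketched as follows (the function name and the 1.0‑equals‑digital‑full‑scale convention are my own):

```python
def normalise(samples, target_peak=1.0):
    """Scan the file for its peak, then apply a single linear gain so
    the peak just reaches the target (1.0 standing for 0dB full scale;
    use 10 ** (-0.1 / 20) for a -0.1dB ceiling). One uniform gain
    preserves all relative levels within the file."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)          # silence: nothing to normalise
    gain = target_peak / peak
    return [s * gain for s in samples]
```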
It's often necessary to correct any discrepancies in volume that might exist between different regions, to ensure a consistency of level throughout the recording. Gain changes should be matched carefully, so that level correction is a single‑pass process; this reduces the chances of introducing distortion and artifacts by over‑processing. Similarly, adjustments to the stereo balance may be required. It is not uncommon for the voices of inexperienced vocalists or voice‑over readers to fade out at the end of a phrase; this is caused by poor breath control and/or movement away from the microphone. The answer here is to apply compression to the whole file to even out the overall level and, if necessary, manually add volume changes on the quiet syllables to compensate for the lack of level. If needed, also apply de‑essing to correct sibilance, and gate out breath pops and any other extraneous noise. Compression lowers the dynamic range of audio material by reducing the level of the loudest components and then raising the overall gain, and if a recording seems thin, adding compression will make the sound smoother and more solid.
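The effect of compression on levels can be illustrated with a crude static gain curve in Python. Real compressors smooth their gain changes with attack and release times rather than acting on each sample independently, so treat this purely as a sketch of the level arithmetic:

```python
def compress(samples, threshold=0.5, ratio=4.0, makeup=1.0):
    """Static compression curve: the portion of the level above the
    threshold is scaled down by `ratio`, then make-up gain lifts the
    whole signal back up. Polarity of each sample is preserved."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(makeup * (level if s >= 0.0 else -level))
    return out
```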
Once gain processing is complete, any timbral adjustment can be made using EQ and enhancement. However, if you do not have access to the specialist noise reduction applications discussed earlier, the filters in your audio editor are invaluable: low‑pass filters can be employed to remove high‑frequency disturbance such as hiss, high‑pass filters can remove boom and rumble, and notch filters can be tuned to attenuate problem frequencies that occupy narrow bandwidths. However, in order to maintain consistent levels, it's better to apply EQ for noise reduction before carrying out gain correction. Some types of noise can be particularly tricky to remove because of their often wide frequency range and changing dynamics, especially noise from traffic and noise introduced via the off‑line recording chain (from the mixer, outboard, synths and connecting cables). Noise from the recording chain is typically present throughout the whole recording, but is masked by the louder passages, and is most audible during quiet sections and at the start and end of the recording. The constituents of this type of noise are not constant, and vary as the different sources of the noise change level during the mix; consequently, processing with noise print‑based techniques will be only partially successful, and the best approach is to treat problem areas individually with gating, expansion and EQ.
Sometimes, rogue frequencies may be difficult to identify precisely. The spectrum analysis utilities provided by many audio editing applications facilitate this task by providing a graphical analysis of audio in the frequency domain (across its frequency spectrum). The time‑to‑frequency‑domain analysis of audio signals used by audio spectrum analysis (and noise reduction) utilities is known as the Fast Fourier Transform (FFT). This derives from work in the early 1800s by the French mathematician Joseph Fourier, who showed that any periodic signal can be recreated by adding together a series of harmonic sine waves. The three most common forms of spectral analysis display are two‑ and three‑dimensional graphs and the sonogram — both of which present an off‑line static snapshot of the audio — and the real‑time bargraph. The Steinberg WaveLab Frequency Analysis takes the form of a 3D graph and displays time and frequency across the two base axes, with amplitude represented on the vertical. This results in a snazzy 'mountains and valleys' view of the audio.
The Sound Forge Spectrum Analysis plug‑in suite includes a sonogram display: a 2D frequency/time graph in which amplitude is represented by colour intensity. Sonogram images present distinctive spectral patterns of sound and, given skilled interpretation, individual sound sources can be distinguished — for example, off‑camera audio of a street riot can be analysed visually to separate speech from traffic noise or gunshots. TripleDAT features a good example of the real‑time bargraph display amongst its measurement instrument set. The audio signal is displayed as easy‑to‑read multiple frequency bands arranged across the horizontal axis, with the amplitude represented by the height of the columns, which dance in real time as the signal passes through the processor.
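All of these displays are built on the same frequency‑domain analysis. A naive DFT — mathematically equivalent to the FFT, just far slower — can be written in a few lines of Python to show what each bin of such a display measures:

```python
import cmath

def spectrum(samples):
    """Magnitude spectrum via a naive DFT: bin k measures the strength
    of the component at k cycles per analysis window. Handy for locating
    a rogue frequency before tuning a notch filter to it."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2 + 1)]
```

Feed it a window of audio and the tallest bin points straight at the dominant frequency — which is exactly what the bargraph and sonogram displays draw, many times a second.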
It's often easiest to repair a section of damaged audio by substituting a section from elsewhere in the recording. Whilst it is possible to do this in a stereo audio editor, it's much quicker to use an editor with multitrack facilities and employ the processing muscle of the computer to make virtual splices, by switching in real time between audio regions on adjacent tracks. Imagine a song with a damaged chorus, which we'll replace with another, undamaged chorus from elsewhere in the recording.
- First, load the soundfile containing the damaged chorus to track 1.
- Add cue points or markers to define the boundaries of the damaged section that is to be replaced — these will act as virtual splice points. The two markers should be positioned at musically significant points — the first downbeat of a bar usually works well — and, to avoid creating a glitch, they should be placed at zero‑crossing points in the waveform.
- Create a new region containing the replacement audio and add this to the adjacent track 2.
- Zoom in and align the two regions using the cue points.
- Add a ‑70dB cut to the start of track 2 to mute the unwanted audio, and at the first marker point bring up the level of track 2 to 0dB and attenuate that of track 1 to ‑70dB.
- At the second marker, reverse the procedure, restoring the volume level of track 1 to 0dB and cutting that of track 2 to ‑70dB. This should give a clean splice, though you may need to adjust the relative levels to suit the piece. If the results are unsatisfactory, try applying crossfades at the splice points.
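In code, the virtual splice above amounts to a region replacement with a short crossfade at each boundary. A simplified Python sketch, assuming both takes are already aligned sample‑for‑sample as in the zoom‑and‑align step (the names and fade length are illustrative):

```python
def splice_replace(track1, track2, start, end, fade=32):
    """Replace track1[start:end] with the same region of track2, using a
    short linear crossfade at each boundary in place of hard level
    switches, so no glitch is created at the splice points."""
    out = list(track1)
    out[start:end] = track2[start:end]              # straight replacement
    for i in range(fade):
        g = i / fade                                # gain ramps 0 -> 1
        out[start + i] = (1 - g) * track1[start + i] + g * track2[start + i]
        out[end - fade + i] = g * track1[end - fade + i] + (1 - g) * track2[end - fade + i]
    return out
```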
To conclude, here's a description of an audio restoration session I conducted earlier this year, after being contacted by the well‑known Indian classical dance specialists Sheila Cove & Nick Proctor. The project was to renovate and transfer to CD rare recordings of Sheila's teacher, the respected Bharat‑Natyam guru Bala Sundari. The recordings were consumer DAT transcriptions of second‑generation reel‑to‑reel copies dating back to the early 1970s, recorded live in India on what I guess was, by modern standards, a fairly primitive mono tape recorder. The pieces featured solo and ensemble vocals with flute, violin, tanpura and various percussion including mridangam (a double‑headed barrel drum).
The first procedure in the renovation process was to transfer the recordings digitally from DAT to their own folder on the PC hard drive. As the original recordings were made in mono and then copied to the stereo DAT format, I converted the files to mono, which halved the amount of hard disk space needed and also halved the time taken by subsequent processing tasks. Next I removed the DC Offset, and then, as the final destination of the recordings was Red Book audio CD, the 48kHz tracks were converted to 44.1kHz.
The biggest problems were with the tape noise and rumble, but fortunately there were sufficient 'silent' regions from which to take noise samples, and two passes with Sound Forge Noise Reduction produced spectacular improvements that brought the recordings back to life, showing not only a significantly greater clarity and tonal richness in the solo vocal and the accompaniment, but also revealing for the first time details of the original room acoustics, the presence of background voices, and the faint spill of traffic from the busy street outside the dance studio in which the recording had taken place. The noise reduction processing was applied to the recordings individually, after which the tracks were topped and tailed to remove unwanted space from the start and end of each. After backing up the files I then listened critically to each recording, noting the regions that required further attention (for the most part these represented discrepancies in volume level), and defined these by adding markers to the waveform display. Then the Q10 equaliser from the Waves Native Power Pack was called in to filter out the worst of the traffic rumble, minimise the recording's slightly boomy room acoustic and, finally, enhance the presence by lifting the lower mid‑range frequencies. To complete the session, the tracks were batch‑processed into stereo and passed through the Waves L1 Ultramaximizer to ensure optimum levels, and then written to Red Book audio CD using WaveLab 1.6.
You need a biggish PC and/or a lot of patience to run these applications! The reference system provided glitch‑free real‑time processing in stereo at 44.1kHz in both WaveLab and Osiris, and consisted of an Intel Pentium 200 with 48Mb RAM running Windows 95, with a 2Mb PCI graphics card, 2.5Gb hard drive space, Plasmon CDR 4240, Turtle Beach Pinnacle and Creamware TripleDAT soundcards. PCs with lower‑powered processors than a Pentium 166 will struggle to run the more demanding applications in real time, and if you have this kind of machine, you'll have to be content with slower off‑line processing. PCs equipped with SX or Cyrix 586 processors will experience difficulties in running most noise‑reduction software, because of the weak floating‑point processing capabilities of these CPUs.
- SOUND FORGE PLUG‑INS
The two Sound Forge plug‑ins featured in this article are system‑specific — they can only be used with Sound Forge and are off‑line (not real‑time) applications, although they do feature a preview function. The Noise Reduction plug‑in is superb value and includes de‑noise, de‑click and vinyl restoration modules; the Spectrum Analysis plug‑in provides a well‑featured 2D monochrome spectrum graph and a sonogram.
Sound Forge £329; Noise Reduction Plug‑in £149; Spectrum Analysis Plug‑in £89.
- STEINBERG DENOISER AND DECLICKER PLUG‑INS
These processors offer classy performance and the joys of real‑time processing. They are available for WaveLab on the PC, and as TDM versions for Macintosh applications. WaveLab itself includes a very snazzy, rainbow‑hued 3D frequency analysis module.
Plug‑ins £299 each; WaveLab £399.
- OSIRIS AUDIO RESTORATION SUITE
This powerful, impressive application bundle for Creamware TripleDAT and Creamware Masterport provides system‑specific, top‑quality real‑time processing modules for spectrum analysis, de‑noise, de‑click, the only available budget de‑crackler, plus an exciter and low‑frequency enhancer. Why Osiris? Well, in Egyptian mythology the god Osiris was murdered and dismembered by his wicked brother Set. The widow of Osiris, the goddess Isis, tracked down all the bits and reconstructed her hubby, just as using the Osiris Audio Restoration package can help you bring back to life audio material once thought beyond revival. Neat or what?
Osiris £379; TripleDAT system £1290; Masterport £599.
- TRACER TECHNOLOGIES DART
Tracer Technologies' DART is the only stand‑alone noise‑reduction application featured here, and produces good results, but lacks real‑time preview and can be rather cumbersome in use. A more user‑friendly DirectX/ActiveMovie version is planned for release this year, with an expected price of an affordable £49.
- WAVES NATIVE POWER PACK
The Rolls‑Royce of PC effects processors/mastering tools. As a DirectX/ActiveMovie plug‑in, NPP runs in Sound Forge 4.0a, WaveLab 1.6, Cakewalk Pro Audio 6 and the forthcoming Cubase VST. In addition to the Q10 equaliser, the pack also features the TrueVerb reverb, the L1 Ultramaximizer limiter, the C1 compressor/gate, the S1 stereo imager and the WaveConvert soundfile utility.
It's best practice to back up all your material after each stage of restoration, so that you'll be covered should the need arise later to revert to a previous version. The amount of drive space needed for a full CD project can be considerable — especially if it involves restoration processing — and, unless you have oodles of hard drive space, it may be worth considering archiving backups to external data storage media. Backup to DAT can be slow but is otherwise adequate, although if the soundfile is saved as an audio track it will be necessary to re‑record it to the PC as a WAV file. It's here that the speed and convenience of removable media comes into its own: Iomega Jaz drives are a type of hard drive with a fast data transfer rate, and use rewritable, removable disk cartridges that can store up to a gigabyte of data; recordable CD (CDR) is also very useful, with a maximum 680Mb capacity, but recording large amounts of data can be a little slow, and CDR is a write‑once/read‑many (WORM) medium.
To avoid introducing distortion, aliasing and artifact noise, it is important to process audio material in the correct sequence:
- 1. Remove DC Offset.
- 2. Sample Rate Conversion.
- 3. Noise Reduction.
- 4. Normalisation.
- 5. Gain correction.
- 6. Stereo rebalancing.
- 7. Equalisation.
- 8. Dynamics processing.
- 9. Reverb processing.
- 10. Limiting.