
All About Digital Audio: Part 6

Making Connections & Digital Clocking By Hugh Robjohns
Published October 1998

Figure 1: During replay each machine uses its internal clock, but on recording refers to the incoming clock associated with the digital audio data.

PART 6: In the final instalment of our series on the techniques and technology of digital audio, Hugh Robjohns contemplates plugging it all together. This is the last article in a six‑part series.

We have at last reached the fun and frustrating part of digital audio: plugging it all together! Having considered the fundamentals of digital audio, and the technology of the various tape and disc formats, we can now think about interconnections and constructing an all‑digital signal path. The key to a successful and trouble‑free setup is in the clocking, an often neglected element, but let's start at the beginning.

To interconnect equipment, we need an interface and, in true audio industry tradition, rather than getting to grips with one established standard, we have to cope with at least nine! The first and most basic (and some would say the most accurate and reliable!) is Sony's SDIF2 interface, a dual‑channel plus separate clock format (see Common Digital Interfaces box), but we also have AES‑EBU, S/PDIF, Toslink, and Yamaha's Y2 format. These are all stereo or two‑channel interfaces. For multichannel applications, the MADI, ADAT and TDIF1 interfaces are the most common.

Needless to say, none of these interfaces are directly interchangeable with any of the others. With the exception of interconnecting AES‑EBU and S/PDIF interfaces (which can usually be persuaded to work to a useful degree with little more than a suitable 'bodge lead'), digital interface conversion needs a lot more than a cable with suitable plugs on each end. Channel coding, voltage levels, bit sequences, numbers of bits, auxiliary data, status flags and control data all have to be translated or transcoded, requiring semi‑intelligent signal processing in a dedicated format converter such as Otari's UFC24 or Spectral's Translator for multichannel applications, or something like Audio Design's stereo digital format converter. It is also worth bearing in mind that this data involves frequency components of many megahertz (not dissimilar to television signals) and so needs rather different care and attention from conventional analogue audio signals.


In any practical installation, it is highly likely that several different interfaces will have to be used: perhaps S/PDIF or AES‑EBU for stereo sources, and TDIF or ADAT Optical for multichannel devices. It is sensible to purchase equipment with compatible interfaces to minimise the number of format conversions which have to be made, and to invest in decent and appropriate cabling for digital signals. It also pays to keep cable lengths as short as practicable but, most important of all, work out a stable and reliable clocking arrangement with the best master reference you can afford, because this defines the potential quality of the entire system.


Figure 2: A word clock howlround! The workstation is configured to use the DAT as its clock source which is fine until the DAT is switched to record...

Clocking — the correct application of word clock signals — is the most crucial aspect of any digital installation, and an area all too often neglected. With the increasing number of affordable digital sound desks, and the growing trend towards digital audio workstations, interfacing equipment digitally is now commonplace. However, I have encountered many systems which are unreliable, constantly need to be reconfigured, produce random clicks and splats, and suffer timing or synchronisation problems with MIDI or timecode‑based devices.

Plugging digital audio equipment together may seem as trivial as interconnecting analogue equipment, and sometimes it is, but there are some critical underlying technicalities waiting to catch out the uninformed.

In transferring digital audio between equipment, the receiving equipment has to be able to interpret the data correctly and part of that involves understanding where each data bit belongs in a complete sample, which sample belongs to which channel, and what the precise sampling rate is. The key to this interpretation is the word clock signal — a metronome if you like, which beats out the data timing and enables each bit to be understood correctly. The more elaborate digital interfaces incorporate word clock within a single data stream as part of the channel coding, while other systems rely on separate cables to convey clocking signals.

A Simple System

Figure 3: Multiple sources connected to a digital mixer must share a common clock source. Machines recording the output of the desk can make use of their digital inputs as a clock reference during playback.

Perhaps the simplest system to consider would be a DAT recorder connected to a digital audio workstation via S/PDIF or Toslink interfaces. The DAT is used to load material into the workstation for editing, and to record the finished mix from it. Since the channel coding of S/PDIF and Toslink incorporates a word clock signal appropriate to the audio data, word clock routing is effectively built in to the interconnections. When loading material into the workstation the latter can synchronise itself to the incoming clocks from the DAT, and when dumping a mix out to the DAT, clocks from the workstation are present with the data to lock the DAT. All nice, simple, and probably completely automatic.

There are potential snags, however. Workstations typically have numerous options hidden away in menus somewhere. There will almost certainly be a section determining the word clock source which will include an internal reference, one or more external digital inputs, perhaps a dedicated word clock input, and maybe an automatic mode. The last uses the internal clock during replay and when recording via internal A‑D converters, but selects the clock derived from a digital input when recording from external sources.
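The behaviour of such an automatic mode can be sketched in a few lines of Python. This is only an illustration of the logic described above: the mode names ('auto', 'internal', 'digital_input', 'word_clock') and the function itself are hypothetical and not taken from any particular workstation.

```python
def select_clock(mode: str, recording: bool, source_is_digital: bool) -> str:
    """Return the clock reference a workstation would use.

    mode              -- the menu setting (hypothetical names): 'internal',
                         'digital_input', 'word_clock' or 'auto'
    recording         -- True when recording, False during replay
    source_is_digital -- True when the recording source is an external
                         digital input rather than the internal A-D converters
    """
    if mode != "auto":
        # A fixed setting is obeyed whatever the machine is doing,
        # which is exactly how the problems in this article arise.
        return mode
    if recording and source_is_digital:
        # Recording from an external digital source: follow its clock.
        return "digital_input"
    # Replay, or recording via the internal converters: use our own clock.
    return "internal"


print(select_clock("auto", recording=False, source_is_digital=False))  # internal
print(select_clock("auto", recording=True, source_is_digital=True))    # digital_input
```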

However intelligent these automatic modes may be, an inappropriate selection will result in clicks, splats, or worse! Consider a workstation permanently configured to use only its internal clock. During editing, mixing and replay all is well, but when loading data from the DAT there will be occasional clicks and splats — often erroneously put down to errors and glitches on the DAT recording. The problem here is that although both DAT and workstation have their internal clocks set to the same nominal value, say 44.1kHz, they are very unlikely to be running at exactly the same rate. One might be 44101Hz and the other 44100Hz. Even with this timing difference between the machines, many thousands of data samples will be correctly transferred without any obvious problems at all. However, as the timing drifts further apart eventually some data will be misinterpreted or lost altogether, resulting in an occasional click or splat. Depending on the audio material these data errors may go unnoticed — slow sustained piano music or a steady 1kHz tone is the best test of such problems. Indeed, I would strongly recommend recording a test tone through any newly configured digital interfacing to prove the clocking arrangements are working correctly.
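To put a number on that drift, the short Python sketch below works out how quickly two free‑running clocks slide apart by one whole sample. It assumes the simplest possible case, with no buffering or resynchronisation in the receiver; real equipment may drop or repeat samples in its own way, so treat the result as the rate at which the mismatch accumulates rather than a guaranteed click rate. The function names are purely illustrative.

```python
def seconds_between_slips(source_hz: float, receiver_hz: float) -> float:
    """Time taken for two free-running clocks to drift apart by one sample."""
    difference = abs(source_hz - receiver_hz)
    if difference == 0:
        return float("inf")  # perfectly matched clocks never slip
    return 1.0 / difference


def slips_per_hour(source_hz: float, receiver_hz: float) -> float:
    """Number of samples gained or lost over an hour of transfer."""
    return abs(source_hz - receiver_hz) * 3600


# The example from the text: one machine at 44101Hz, the other at 44100Hz.
print(seconds_between_slips(44101, 44100))  # 1.0
print(slips_per_hour(44101, 44100))         # 3600
```

In other words, a mismatch of just 1Hz accumulates a whole sample of error every second, which is why a steady test tone shows the problem up so readily.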

A more dramatic clocking error, but one which is almost as common, is to force the workstation to permanently clock from the S/PDIF input connected to the DAT. Loading material into the workstation is fine, and provided the DAT is left switched on during editing and mixing operations, they will work satisfactorily too. The problem comes when dumping a completed mix back to the DAT. The workstation looks to the DAT for its clocking reference, but during recording from a digital source, the DAT automatically clocks from its digital input, which is the workstation. So each machine tries to clock from the other, and the pair end up in a mad spiral of digital chaos!

At best, the entire system will crash, hopefully with helpful warning indications on the two machines. Alternatively, there could be an ear‑splitting howl as the digital data is grossly misunderstood — but at least these two situations quickly draw attention to the problem. The worst scenario is that the clock rate might gradually rise or fall until some range limit is reached by one or other piece of equipment. This is a nasty result as it is surprisingly easy to miss and you may only become aware of the problem later on when replaying the DAT, only to find it seeming to run too fast, slow, or not at all!
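The mutual‑clocking trap described above is easy to check for on paper before anything is plugged in. The sketch below is a hypothetical planning aid, not a feature of any real equipment: it assumes you write down, for each device, which other device it takes its clock from (or None if it runs on its internal clock) and then looks for a loop.

```python
from typing import Dict, List, Optional


def find_clock_loop(clock_source: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return the devices forming a clocking loop, or None if every chain
    ends at a device running on its own internal clock (source None)."""
    for start in clock_source:
        seen: List[str] = []
        device: Optional[str] = start
        while device is not None:
            if device in seen:
                # We have come back to a device already visited: a loop.
                return seen[seen.index(device):]
            seen.append(device)
            device = clock_source.get(device)
    return None


# Loading from DAT: the DAT replays on its internal clock and the
# workstation follows it. No loop.
loading = {"DAT": None, "workstation": "DAT"}
print(find_clock_loop(loading))  # None

# Dumping a mix with the workstation still locked to the DAT: the DAT now
# clocks from its digital input, so each machine follows the other.
dumping = {"DAT": "workstation", "workstation": "DAT"}
print(find_clock_loop(dumping))  # ['DAT', 'workstation']
```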

Digital Mixers

Figure 4: In an ideal installation, a master word clock source is distributed to all equipment via distribution amplifiers. Tidy, functional, reliable and hard to get wrong!

If this extremely simple setup makes you think twice about the complexities of clocking, how bad can it be to try to build a larger digital system around a digital desk like Yamaha's 03D, with numerous digital sources like CD, DAT, outboard A‑Ds, workstation and ADAT machines? Well, the simple answer is that it could be a complete nightmare, but with a little care and planning, the system can be made to work perfectly reliably without too much difficulty.

The basic problem when a digital mixer becomes involved is in ensuring each source is synchronised correctly with the other sources and the mixer itself. In order to mix two or more signals together, their data must arrive at the mixer at exactly the same time — bit for bit — and that means they must all be clocked from the same master.


Most manufacturers of multitrack digital recorders recommend using their recorder as the master clock source (ie. set to internal clock) to a digital mixer, and there will be a facility within the mixer to select its reference word clock in much the same way as the workstation described earlier. Options will include an internal clock, an external reference word clock input, or any of the digital inputs (usually in pairs). Once the mixer is clocked, the other devices in the system can be assigned word clocks.

Stereo recorders connected to the mixer's digital output will automatically receive the correct clocking via their S/PDIF, AES‑EBU or Toslink interfaces. Digital sources such as outboard A‑D converters and CD players will need separate word clock feeds, usually derived from one of the word clock output sockets (normally a BNC connector) on the digital mixer.

And this brings me to a couple of practical points to beware of. Not all DAT machines (even supposedly professional models) can be word clocked from their digital inputs during replay, and very few CD players are equipped with an external clock input. The DAT problem can easily be checked by replaying a known good recording or tone whilst listening very carefully for occasional clicks or splats; if there are none, the DAT is being clocked correctly. (You can prove the point by disconnecting the digital input, when occasional but regular clicks should be heard.) If you do not happen to have a CD player which can be externally clocked, DO NOT be tempted to make it the master system clock reference (in the deluded hope of avoiding the problem!).

The majority of domestic and semi‑pro CD players have relatively unstable and inaccurate clocks, and although acceptable in a stand‑alone situation you are asking for major trouble using a £250 CD player as the reference for a multi‑thousand pound digital recording setup!

There are two realistic solutions to the CD problem. Either invest in a sample‑rate converter which can be clocked from the desk to provide accurately timed data from a free‑running CD player, or connect the analogue outputs from the CD player to the analogue inputs of the mixer (or an external A‑D converter suitably referenced to the master word clock). Think of analogue interconnection as a 'universal sample rate converter' because, with the quality of modern A‑D and D‑A converters, the analogue route is usually the quickest, easiest, cheapest, and most pragmatic solution, and no one will know it's not digital if you don't tell them! One or other of these techniques would also be required if you wished to be able to varispeed a digital source, or had to accommodate a wacky sampling rate.

You may feel a glowing sense of satisfaction from configuring a sensible clocking system with the digital multitrack as the master clock source, but life may still be far from perfect. Most digital multitracks won't record from their digital inputs without an external clock source, so you may have to continually reselect clocks between the desk during recording and the multitrack during mixing (so what happens when overdubbing?).

But even if you are prepared to put up with the hassle of that, there is the problem of jitter associated with unstable clocks to think about (see box). I have come across precious few pieces of digital equipment with internal clocks sufficiently capable of jitter‑free operation to allow working above 16‑bit resolution (and some barely allow that!). Clock your 20‑bit recorder from most 'budget' digital desks and the resulting conversion to analogue for monitoring will rarely exceed 16‑bit resolution, no matter how many bits the converter claims to employ! So, we need a better clocking solution if you want your investment in a digital desk to be capable of taking full advantage of 20‑bit (and higher) resolution recorders.

Master Clocks

Figure 5: Cable capacitance can introduce uncertainty in clock timing on S/PDIF data streams.

The key to a happy life with any digital setup is to invest in a decent master word clock generator, distribute it to every digital machine (which will accept external reference clock signals), configure the system once, and get on with the far more pleasurable business of making and recording music. A master word clock generator and the associated distribution amplifier may initially seem an expensive luxury, but they will more than pay back in terms of better system reliability, ease of use, and higher quality recordings, as well as allowing easy future upgrading.

Master clock generators are available to various accuracies defined as a frequency stability in parts per million. A Grade 1 (AES definition) clock has a long term accuracy of +/‑1ppm (part per million) and a Grade 2, 10ppm. For comparison, the IEC specifications for domestic digital equipment are: Level I (highest accuracy) +/‑50ppm, Level II (normal) +/‑1000ppm, and Level III requires a calendar — hence my earlier comments about not referencing a digital system to a domestic CD player! Some well‑known digital mixers are specified as employing only IEC Level II internal clocks, so watch out! For most applications a Grade 2 master clock would be perfectly sufficient, provided you make sure it can accept a more accurate external reference (such as stable video) to allow upgrading for 22‑ or 24‑bit recorders.
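To see what these tolerances mean in practice, the Python sketch below converts a parts‑per‑million figure into the range of frequencies a nominally 44.1kHz clock could actually be running at. The tolerance values are the ones quoted above; the function names are illustrative only.

```python
from typing import Tuple


def max_error_hz(nominal_hz: float, tolerance_ppm: float) -> float:
    """Largest permitted deviation from the nominal rate, in hertz."""
    return nominal_hz * tolerance_ppm / 1_000_000


def frequency_window(nominal_hz: float, tolerance_ppm: float) -> Tuple[float, float]:
    """Lowest and highest rates a clock of this tolerance may run at."""
    error = max_error_hz(nominal_hz, tolerance_ppm)
    return nominal_hz - error, nominal_hz + error


# Tolerances as quoted in the text.
TOLERANCES_PPM = {
    "AES Grade 1": 1,
    "AES Grade 2": 10,
    "IEC Level I": 50,
    "IEC Level II": 1000,
}

for name, ppm in TOLERANCES_PPM.items():
    low, high = frequency_window(44100, ppm)
    print(f"{name:<13} +/-{ppm:>4}ppm  {low:.4f}Hz to {high:.4f}Hz")
```

At 44.1kHz a Level II clock is allowed to be as much as 44.1Hz away from nominal, whereas a Grade 1 clock stays within about 0.044Hz.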

Despite their importance, master clock units are surprisingly rare in the pages of pro‑audio distributors' catalogues. However, suitable units are available from specialist manufacturers including Aardvark, Audio Design and Probel. HHB Communications and The UK Office are among the few distributors who supply and advise on suitable units.

Practical Installation

It is really a matter of common sense. Avoid 'daisy‑chaining' word clock from one piece of equipment to another as unacceptable delays can build up towards the end of even short chains. Instead, use star‑distribution, ideally with proper digital distribution amplifiers providing buffered word clocks to each piece of equipment.

With all equipment switched to reference from this master word clock everything will be accurately clocked and stable, and it will not be necessary to change the clocking source on the mixer for recording, mixing or overdubbing ever again! Furthermore, changing from 44.1 to 48kHz is simply a case of resetting the master clock source — everything else should update automatically.

The Future

I hope you have found this series on the technology and techniques of digital audio informative. The analogue audio industry started around 100 years ago and will no doubt be with us for a while yet. However, I find it quite amazing to think that digital audio has become such a dominant feature of the industry in under 20 years. There are still debates over how to improve things, and not everyone likes the nature of digital recording, but it is certainly here to stay and by understanding the underlying technology, you will be able to use it to the full.

Common Digital Interfaces

It is really not necessary to know much about the technical detail of the common interfaces to use them effectively, but should you feel the need, The Digital Interface Handbook is very comprehensive (Rumsey/Watkinson, Focal Press, ISBN 0‑240‑51333‑9).

The most common stereo or dual‑channel interfaces are SDIF‑2, the trio of AES‑EBU related formats, and Yamaha's Y‑series. SDIF (Sony Digital InterFace) is the oldest and simplest, using three BNCs to carry data — one for each channel and the third for word clock. The word clock connector from this interface is retained on virtually every digital machine. Both the mono and stereo Yamaha Y‑interfaces employ an 8‑pin DIN socket with separate pins carrying balanced word clock and balanced digital audio. There is also a facility to enable data transmission only when a suitable destination is detected. The two versions are compatible (Y2‑Y1: left only to a mono socket; Y1‑Y2: mono on the left channel only to a stereo socket).

The AES‑EBU trio share similar structures, but differ just enough to make interconnection between the formats unreliable! The AES‑EBU system uses XLRs and carries a balanced signal encoding data, word clock, and lots of signalling and auxiliary data. S/PDIF is the domestic version which runs as a smaller, unbalanced signal on phono connectors. In fact, the domestic version is better defined than the professional AES‑EBU system. For example, it can carry CD and DAT track information where the professional system cannot. Not all XLR connectors labelled 'Digital' are AES‑EBU format — many just carry S/PDIF data balanced for transmission along longer cables! A short phono‑to‑XLR lead will usually allow functional interconnections to be made between pro and semipro gear.

Toslink (Toshiba Optical Link) is an S/PDIF signal causing an LED to flash on and off very quickly. The light passes down a length of plastic pipe and at the receiving end an opto‑sensor regenerates the S/PDIF signal. Although Toslink avoids electrical interference and ground loop problems, it is not a panacea — cheap optical leads can introduce a lot of jitter to the data signal and only short runs are practicable.

The most common multichannel interfaces are the optical ADAT system, which is a re‑engineered version of Toslink carrying eight channels, and the TDIF format for the DA88 family of machines using 25‑way D‑Sub connectors. MADI (Multichannel Digital Audio Interface) is used on big open‑reel digital multitracks and carries up to 56 channels. It is usually connected as a single BNC‑terminated cable for audio data with a second for word clock, but optical versions are also available.

The Jitter Problem

Jitter is not really a problem in a properly engineered system, but can create havoc in a poor installation. Jitter refers to the digital equivalent of wow and flutter — tiny timing variations in the data stream or clocking signals which cause samples to be encoded or decoded at the wrong time. The effect is not dissimilar to taking incorrect sample measurements at the right time, and produces quantising errors which typically manifest as high frequency noise, and unstable or vague stereo imaging.

Jitter can be caused in two ways: either by an erratic clock generator, or by passing a data stream through inferior or excessively long cables (including fibre‑optics). The first cause is obvious, the second applies to channel coded interfaces like the AES‑EBU and S/PDIF where the word clock is embodied within the data stream. Cable capacitance reduces high frequencies resulting in rounded corners and sloping edges to the original square wave signals. Since the word clock timing is defined by the midpoint of the edges between data bits, any move away from vertical creates timing uncertainty and thus jitter.
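A rough way to picture the second mechanism is sketched below in Python. It rests on simplifying assumptions that are not part of the original explanation: the edge is treated as a straight ramp, and a small disturbance voltage (noise or interference) is assumed to shift the point at which the edge crosses its midpoint. The timing error is then the disturbance divided by the slope of the edge. The signal swing, disturbance level and rise times are illustrative values only.

```python
def edge_timing_error(disturbance_volts: float,
                      swing_volts: float,
                      rise_time_s: float) -> float:
    """Timing shift of the midpoint crossing for a straight-line edge.

    disturbance_volts -- assumed noise or interference on the signal
    swing_volts       -- peak-to-peak swing of the data signal
    rise_time_s       -- time the edge takes to cover the full swing
    """
    slew_rate = swing_volts / rise_time_s  # volts per second
    return disturbance_volts / slew_rate


# Illustrative figures: a 0.5V swing with 10mV of disturbance.
for rise_ns in (20, 100):
    error_s = edge_timing_error(0.010, 0.5, rise_ns * 1e-9)
    print(f"rise time {rise_ns}ns -> timing error {error_s * 1e9:.2f}ns")
```

On these assumptions the timing uncertainty grows in direct proportion to the rise time, so a cable that makes the edges five times slower makes the jitter five times worse.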

Since the effect of jitter is similar to quantisation errors, it follows that jitter can be a limiting factor to bit resolution: the higher the jitter, the lower the resolution. A 20‑bit recorder clocked from a jittery source may only be able to resolve 15 or 16 bits — the rest will effectively be carrying noise! Hence the importance of a very stable master clock in high‑resolution systems.
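The link between jitter and resolution can be estimated with a widely used textbook approximation, which is not given in the original text and is offered here only as a sketch. For a full‑scale sine wave of frequency f sampled with random timing jitter of RMS value t, the best achievable signal‑to‑noise ratio is roughly -20 x log10(2 x pi x f x t) decibels, and an ideal N‑bit converter manages about 6.02N + 1.76dB. The Python below simply combines the two; the function names are illustrative.

```python
import math


def jitter_limited_snr_db(signal_hz: float, jitter_rms_s: float) -> float:
    """Approximate SNR ceiling set by random sampling jitter alone,
    assuming a full-scale sine wave input."""
    return -20 * math.log10(2 * math.pi * signal_hz * jitter_rms_s)


def effective_bits(snr_db: float) -> float:
    """Resolution of an ideal converter with the same SNR."""
    return (snr_db - 1.76) / 6.02


# Worst case in the audio band: a full-scale 20kHz tone.
for jitter_ps in (10, 100, 1000):
    snr = jitter_limited_snr_db(20_000, jitter_ps * 1e-12)
    print(f"{jitter_ps:>5}ps jitter: {snr:6.1f}dB, "
          f"about {effective_bits(snr):.1f} bits")
```

Under these assumptions, around 100ps of jitter is already enough to hold a 20kHz full‑scale tone to roughly 16 bits, which gives a feel for how stable the clock has to be before a 20‑bit converter can deliver what it promises.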

Jitter can be removed by re‑clocking the data to a more accurate word clock, and in any case is of little concern within a digital data chain provided it is not excessive. It only becomes a problem when signals are converted to or from analogue (during recording and auditioning) when the timing errors convert to pseudo‑random amplitude errors with potentially audible effects.