All About Digital Audio: Part 4

Digital Tape Recording Formats

By Hugh Robjohns
Published August 1998

Figure 1: In a video machine, the tape is arranged to wrap around the head drum for about 270 degrees of its circumference to ensure there is always at least one head in contact with the tape for a continuous recording.

In the fourth instalment of our series on the techniques and technology behind digital audio, Hugh Robjohns looks at digital tape recording formats. This is the fourth article in a six‑part series.

In the previous parts of this series we have examined the essential building blocks of a digital system: sampling, quantising and error correction. This month, we look at some of the practicalities of recording and replaying digital audio data streams, starting with digital tape formats.

The first decision facing the designer of a digital tape recorder is whether to employ a stationary‑head transport (like a conventional analogue recorder), or a rotary‑head one (like a video machine). The former is mechanically simple and therefore relatively cheap, but has a low head‑to‑tape speed that means either a low data transfer rate, or a very high density of data on the tape (or both!). Rotary‑head machines have high head‑to‑tape speeds and therefore high data transfer rates, with lower data density on tape reducing the demands on the tape medium itself.

In the early days of digital recording, it was hard enough just getting the A‑D and D‑A stages to work properly, so it made sense to redeploy existing video recorder technology rather than develop a bespoke digital recorder from scratch. Thus, the professional CD mastering recorders used off‑the‑shelf video transports (professional three‑quarter‑inch U‑matic machines) to store digital audio data encoded as black and white dots in a standard video picture format.

However, there are a few digital tape systems that have employed the stationary‑head concept — most notably the Sony and Studer DASH‑format multitrack recorders (more on this later in this article). Mitsubishi also produced an excellent 32‑track stationary‑head recorder for a while, but ceased production several years ago. Another stationary‑head format that can be found occasionally (often in the reduced section of High Street hi‑fi outlets) is the Philips DCC — Digital Compact Cassette (see the box elsewhere in this article for more on this format).

Rotary‑Head Systems

Figure 2: The tape and transport arrangement of a typical DAT machine.

In a rotary‑head recorder, the tape is partially wrapped around a rotating drum that contains a number of recording/replay heads. As the drum is angled very slightly off‑vertical, the tape spirals around it and the heads trace a shallow diagonal stripe across its width. The drum rotates very quickly (around 1500rpm in the case of a video machine), which means that the head‑to‑tape speed and the data transfer rate are both very high. However, the linear speed of the tape is relatively slow, so the narrow recorded stripes lie parallel to each other, which in turn makes for economical usage of tape. See Figure 1 for a typical video head‑drum arrangement.

This approach works very well, but using video machines to record digital data is a case of serious over‑engineering! For a start, a video signal needs a signal‑to‑noise ratio of around 30dB, whereas a digital system only needs about 10dB. Second, the tape has to wrap completely around a video head‑drum to allow continuous recording whereas digital signals are inherently discontinuous. Thus, the tape loading and wrapping mechanisms can be much simpler. Third, because video systems record audio as a linear track, the transports have to be built to minimise wow and flutter — an irrelevance in a digital recorder. These 'refinements' (together with other simplifications described below) allow a bespoke rotary‑head digital recorder to be built more cheaply than a conventional video transport.

Digital Audio Tape

Figure 3: The DAT tape track format.

DAT, which appeared in 1987, was one of the first systems to use dedicated rotary‑head digital recorders. The most obvious difference between DAT and video transports is that the tape wrap around the head‑drum is only 90 degrees, as opposed to the 270 degrees of a video mechanism (see Figure 2). Not only does this considerably simplify the lacing procedure, it also allows the tape to be spooled against the heads with minimal friction — a serious problem with the longer wrap of video machines. Also, the fixed heads in a video recorder, which are needed for full tape erasure, audio recording and control tracks (see below) are not required in a digital machine, saving more costs. But the head drum still contains at least two heads that take turns to record digital data whenever they come into contact with the tape (for a quarter of each drum revolution).
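
The effect of the shorter wrap on head contact time can be sketched with a little arithmetic. The example below is only an illustration: the 2000rpm drum speed is an assumption (it matches the figure of three drum revolutions in 90 milliseconds quoted later in this article), the heads are assumed to be evenly spaced, and the function name is made up for the purpose.

```python
def head_contact(wrap_degrees: float, heads: int, drum_rpm: float):
    """Return (ms per drum revolution,
               ms each head spends on the tape per revolution,
               fraction of the time that at least one head is on the tape).

    Assumes evenly spaced heads; drum_rpm is an assumed value, not a
    figure taken from the text at this point.
    """
    rev_ms = 60_000.0 / drum_rpm
    per_head_ms = rev_ms * wrap_degrees / 360.0
    # With evenly spaced heads, coverage cannot exceed 100 percent.
    coverage = min(1.0, heads * wrap_degrees / 360.0)
    return rev_ms, per_head_ms, coverage


# DAT-style transport: 90-degree wrap, two heads, assumed 2000rpm drum.
print(head_contact(wrap_degrees=90, heads=2, drum_rpm=2000))

# Video-style transport: 270-degree wrap, two heads, 1500rpm drum.
print(head_contact(wrap_degrees=270, heads=2, drum_rpm=1500))
```

With a 90‑degree wrap and two heads, a head is on the tape for only half of the time, which is why the audio data has to be written in bursts rather than as a continuous stream.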

However, not having static heads does introduce significant problems. Video machines align their replay heads with the recorded tracks on the tape through two means: a 'control track' which marks the start of each stripe, and a 'guard band' of erased, empty tape which separates adjacent stripes to minimise crosstalk between them. The control track makes sure that each head starts in the right place, but if the tape speed is slightly wrong the head will wander off course and the signal level will fall as it enters the guard band. Thus, the signal level is used as a measure of the correct tape speed and head tracking.

DAT machines don't have control track or erase heads, and there is no guarantee that the tape will be fully erased (if an old tape is being re‑used, for example). Erasure is not a prerequisite for digital tape recording as the magnetic medium is fully saturated as North‑South or South‑North magnetic elements during recording — a new recording simply overwrites the old one. However, this means that there won't be empty guard bands between tracks, so some other method of locating and maintaining the replay heads on the correct tracks is necessary.

A relatively wide track, about 20µm (micrometres) wide and 24mm long, is laid down across the tape during recording by the first head in the drum. As the second head comes around, it records its stripe so as to partially overlap the first one, guaranteeing that every part of the tape is recorded and that no previous recording survives. The data stripe left on the tape ends up being about 13.6µm wide (see Figure 3).

Since the record/replay heads are around 20µm wide, but the recorded stripe is only 13.6µm wide, the heads will obviously overlap adjacent tracks during replay, potentially leading to considerable crosstalk. However, a clever technique called Azimuth Recording is used to overcome this. One head is twisted 20 degrees to the left and the other 20 degrees to the right creating a 40‑degree azimuth error between them — and also between alternate recorded tracks. Although each head can replay its own recorded stripe, it picks up very little crosstalk from the adjacent tracks, obviating the need for guard bands and allowing much greater packing density on the tape. However, a mechanism is still required to align the heads in the right place to start with.
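
Why the azimuth offset suppresses crosstalk can be illustrated with the standard textbook approximation for azimuth loss, which is not taken from this article: the loss in dB is 20·log10(|sin x / x|), where x = π·w·tan(θ)/λ, w is the width of track being read, θ is the azimuth error and λ is the recorded wavelength. In the sketch below, the 3.2µm overlap assumes a 20µm head centred on a 13.6µm stripe, and the two wavelengths are purely illustrative values.

```python
import math


def azimuth_loss_db(read_width_um: float, azimuth_error_deg: float,
                    wavelength_um: float) -> float:
    """Approximate azimuth loss in dB (0 means no loss, negative means attenuation).

    Uses the textbook sinc-shaped approximation; real heads and tapes
    will deviate from it.
    """
    x = (math.pi * read_width_um
         * math.tan(math.radians(azimuth_error_deg)) / wavelength_um)
    if x == 0:
        return 0.0
    ratio = abs(math.sin(x) / x)
    if ratio == 0:
        return float("-inf")  # exact null of the sinc response
    return 20.0 * math.log10(ratio)


# Assumed overlap on to each neighbouring stripe: (20 - 13.6) / 2 micrometres.
overlap_um = (20.0 - 13.6) / 2

# Illustrative wavelengths only: one short (data-like), one long (tone-like).
for wavelength_um in (0.7, 24.0):
    loss = azimuth_loss_db(overlap_um, 40.0, wavelength_um)
    print(f"wavelength {wavelength_um}um: {loss:.1f} dB")
```

Under these assumptions, short wavelengths picked up from a neighbouring stripe are heavily attenuated, whereas long wavelengths pass almost unaffected.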

Not all of what goes on to tape when you make a digital recording is digital audio — various other bits of data are recorded too. The stereo audio data is recorded in the central portion of the stripe and is bounded on each side by a sequence of tones called ATF (Automatic Track Following). Subcode data is also recorded between the ATF and the outer edges of the stripe (more on this in the 'Subcodes' box).

ATF signals comprise a repeating sequence of relatively low‑frequency tones, with the lowest at 130kHz. Such tones are not affected by the severe head azimuths, and remain equally detectable to both heads. The tone cycle repeats every four recorded tracks (in other words, every two revolutions of the head‑drum) and the sequence is used to align the relevant head with the appropriate tape stripe, acting in much the same way as a video machine's control track. ATF signals are found on both sides of the audio data in the stripe, so should the head wander off course, they help to control the tracking.

The use of azimuth recording, overlapping tracks and ATF sequences increases the recorded data density on the tape considerably, allowing more than two hours of digital data to be stored on a relatively short length of tape. The success of these techniques has led to their becoming common in several other digital audio and video recorder formats.

Off‑Tape Monitoring & Editing

Figure 4: Confidence replay is provided by a second pair of heads on the head‑drum with separate encoding and decoding circuits.

In analogue tape recorders, off‑tape monitoring is essential, as it is the head/tape boundary which causes most audio quality problems — dirty heads, incorrect bias and so on. Listening to off‑tape playback is therefore the only way of knowing whether the recording is OK. However, in digital recorders, the tape simply stores data and, major dropouts excepted, the tape is unlikely to be the cause of quality problems. In fact, most problems originate in the A‑D conversion, such as signal levels being too low or too high. As all DAT machines provide monitoring through the complete A‑D/D‑A chain, true quality monitoring is always provided anyway.

Nevertheless, one of the biggest operational problems with a DAT recorder is knowing when it is really recording, simply because you can't see the tape going around very easily! Consequently, off‑tape monitoring is provided on some professional and semi‑pro models to give the operator the confidence that the machine is really recording. Off‑tape monitoring is achieved by a second pair of heads in the head drum which are staggered in height relative to the first pair, typically by the equivalent of six tracks (three drum revolutions). In these 'four‑head' machines, the data to be recorded passes through the record circuits, which add the error protection, interleave the data (see last month's part for more on this), and sort out the channel‑coding structure, before being committed to tape (see Figure 4). This process typically takes about 45 milliseconds.

The replay circuits take about the same time to decode the data, perform the necessary error correction and so forth. The six‑track stagger between the record and replay heads adds a further 90 milliseconds, so the total delay between input and off‑tape monitoring is about 180 milliseconds, which is enough to give that familiar double beat when switching between source and tape!
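
The delay budget can be checked with a few lines of arithmetic. This is a minimal sketch: the 45‑millisecond processing times and the six‑track stagger come from the text, while the 2000rpm drum speed is inferred from the statement that three revolutions take 90 milliseconds, and the function name is hypothetical.

```python
def confidence_delay_ms(encode_ms: float, decode_ms: float,
                        stagger_tracks: int, heads_per_rev: int,
                        drum_rpm: float) -> float:
    """Total input-to-monitor delay for off-tape (confidence) replay."""
    rev_ms = 60_000.0 / drum_rpm
    # Each head writes one track per revolution, so six tracks with
    # two heads per pair corresponds to three drum revolutions.
    stagger_ms = stagger_tracks / heads_per_rev * rev_ms
    return encode_ms + stagger_ms + decode_ms


total = confidence_delay_ms(encode_ms=45, decode_ms=45,
                            stagger_tracks=6, heads_per_rev=2,
                            drum_rpm=2000)
print(f"off-tape monitoring delay: {total} ms")
```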

Last month, I mentioned the problem of dropping in to record on a digital recorder because of the disruption to the interleaving structure. One convenient way around this problem is to make use of the four heads in the drum, but in the reverse order — that is, employ replay ahead of recording (see Figure 5). In this case, data can be retrieved from the tape, decoded, de‑interleaved and error‑corrected, before being presented to a digital crossfader in normal sequential form. Initially, the crossfader passes the original data straight back to the tape through the record circuits which re‑encode and interleave it.

Since the replay and record circuits each take 45 milliseconds to process the data, and the timing between the heads amounts to 90 milliseconds, the data should arrive back on the tape in exactly the same place as it started. The record head can thus be activated at the start of any stripe without regard to the interleave structure. The crossfader allows new audio material to be punched in and out smoothly without gaps or glitches. When the original data is being recorded back on to the tape over itself, the record head can be deactivated — a very elegant solution!
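
The crossfader's role in a drop‑in can be sketched as below. This is only an illustration of the principle, assuming a simple linear fade on decoded sample values; the function name, the fade shape and the fade length are all assumptions and do not describe any particular machine.

```python
def punch_in(original, new, start, end, fade):
    """Return the stream sent back to the record circuits.

    Outside the drop-in, the original (pre-read) samples pass through
    unchanged. The new audio fades in over `fade` samples from `start`,
    is held until `end`, then fades out over `fade` samples.
    """
    if len(original) != len(new):
        raise ValueError("original and new must be the same length")
    if end < start + fade:
        raise ValueError("drop-in is shorter than the fade time")

    out = []
    for n, (old_sample, new_sample) in enumerate(zip(original, new)):
        if n < start or n >= end + fade:
            gain = 0.0                        # pass the original straight through
        elif n < start + fade:
            gain = (n - start) / fade         # fading in the new material
        elif n < end:
            gain = 1.0                        # fully punched in
        else:
            gain = 1.0 - (n - end) / fade     # fading back to the original
        out.append((1.0 - gain) * old_sample + gain * new_sample)
    return out


tape = [1.0] * 20      # stand-in for decoded audio read from tape
source = [0.0] * 20    # stand-in for the new input signal
print(punch_in(tape, source, start=5, end=12, fade=4))
```

Because the output equals the original wherever the gain is zero, re‑recording it over itself changes nothing on the tape, which is why the record head can be switched in or out at any stripe boundary.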

Most four‑head machines allow the heads to be configured either for confidence replay or for gapless drop‑ins but, in theory, there is no reason why a machine should not have six heads, providing both pre‑reading and confidence replay simultaneously.

Many of the concepts and technology developed for DAT are used in other rotary‑head recording formats, two more of which are described in the 'ADAT & DTRS' and 'Nagra D' boxes elsewhere in this article. Due to the similarities between DAT and these other formats, the boxes highlight only the major differences.

Stationary‑Head Formats

Figure 5: Glitch‑free drop‑ins can be made if the DAT machine has four heads to allow pre‑reading of data from the tape.

It might seem that rotary‑head recorders rule the roost as far as digital tape recording is concerned, but there is a finite limit to the maximum data transfer rate that can be achieved, and until recently, it was not sufficient for a 24‑ or 48‑track recorder. However, stationary‑head machines with suitable tape moving quickly can be designed with sufficient transfer rates, and these have therefore become the dominant format for 24 and 48‑track applications.

There are only three common stationary‑head digital tape formats: PD, DASH, and DCC (see the separate box for more on DCC), and only one of those is likely to be around into the next millennium. Pro‑Digi (PD) was Mitsubishi's digital multitrack format, a range of machines offering 2, 16 and 32 tracks on quarter‑inch, half‑inch and 1‑inch tapes respectively. The 32‑track machine uses 10 physical tracks for every eight audio channels (the two extra tracks carry some of the error protection data). Thus there is a total of 40 tracks for the digital audio, plus two auxiliary data tracks, two cue tracks and a timecode track, making 45 tracks in all!
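
The track count for the 32‑track machine follows directly from those figures, as the short sketch below shows. The function name is made up, and applying the same ten‑for‑eight ratio to other PD machines would be an assumption, since the text only gives it for the 32‑track model.

```python
def pd_physical_tracks(audio_channels: int,
                       tracks_per_group: int = 10,
                       channels_per_group: int = 8,
                       aux_tracks: int = 2,
                       cue_tracks: int = 2,
                       timecode_tracks: int = 1):
    """Return (digital audio tracks, total tracks) on the tape."""
    groups, remainder = divmod(audio_channels, channels_per_group)
    if remainder:
        raise ValueError("channel count must be a multiple of the group size")
    digital = groups * tracks_per_group
    total = digital + aux_tracks + cue_tracks + timecode_tracks
    return digital, total


print(pd_physical_tracks(32))
```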

Although these 32‑track machines are still widely used, particularly in the US, they are no longer being manufactured: Sony's dominance of the digital multitrack market prompted Mitsubishi's early withdrawal.

Sony's DASH‑format machines (Digital Audio Stationary Head) have become standard in many professional studios and are available in 2‑track, 24‑track and 48‑track versions. The first machines appeared in 1987, with a 2‑track machine using quarter‑inch tape and a 24‑track machine using half‑inch tape, unfeasible though this may sound! Two‑track machines are not very common, but one of their marketing strengths is that they can cope with physical tape editing by way of razor blades and sticky tape, thanks to an elaborate interleaving structure. Provided that edits are at least 1.5 inches apart, the error correction can rebuild enough of the missing data to allow a smooth crossfade over the edit.

The earliest DASH machines used conventional ferrite head assemblies whose physical size limited the maximum number of tracks to 24. However, the newer machines use 'thin‑film' heads (made in the same way as integrated circuits), which have allowed the introduction of 48‑track machines. The standard multitrack DASH machine manages to record 24 digital tracks, two analogue guide tracks, a timecode track and a control track on half‑inch tape running at 30ips. The analogue cue tracks are on the outer edges of the tape where they protect the delicate digital tracks, and the timecode and control tracks are in the most stable region of the tape in the centre. The 48‑track version slots an additional 24 tracks in between the original 24 (see Figure 6). The latest versions of the machine also allow 24‑bit recording through a built‑in bit‑splitting system that shares the 24‑bit audio data channels across multiple tracks.
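
The general idea of bit‑splitting can be sketched as follows. The actual scheme used in the machines is not described here, so this is a hypothetical arrangement in which the top 16 bits of each 24‑bit sample go to one 16‑bit track and the remaining eight bits go to another; the function names are invented for the illustration.

```python
def split_24(sample: int):
    """Split a signed 24-bit sample into (top 16 bits, bottom 8 bits)."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    unsigned = sample & 0xFFFFFF          # two's-complement bit pattern
    return unsigned >> 8, unsigned & 0xFF


def join_24(high16: int, low8: int) -> int:
    """Rebuild the signed 24-bit sample from its two parts."""
    unsigned = (high16 << 8) | low8
    # Restore the sign if the 24-bit sign bit is set.
    return unsigned - (1 << 24) if unsigned & 0x800000 else unsigned


for sample in (0, 1, -1, 123456, 8_388_607, -8_388_608):
    high, low = split_24(sample)
    assert join_24(high, low) == sample
    print(sample, "->", (high, low))
```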

Next month, the series will continue with a look at disc‑based recording formats.

Subcodes & Timecodes

Figure 6: The 48‑track DASH tape track layout. Studer have also joined Sony in manufacturing DASH machines.

With any digital recorder, there is always a lot more to record than just audio data. In the case of DAT, auxiliary data is grouped into two sets of subcodes: one set is recorded within the audio, and the other is recorded on the two outer edges of each stripe where it can be over‑recorded without affecting the audio data.

Auxiliary information bound in with the audio includes information such as Audio or Data modes, Emphasis status, Sampling Rate, Normal or Long‑play modes, and Copy Prohibit — all information that relates specifically to the audio. The separate subcode section carries the Start, Skip and End IDs, Program Numbers, and the various date and time stamps. Of these, A‑time (Absolute) commences at zero at the start of the tape and counts sequentially throughout the recording. P‑Time (Programme) is intended to provide a track timer from each Start ID. R‑Time (Relative) was intended for an alternative user timer, but has now been supplanted with timecode data.

While Fostex were the first to introduce SMPTE timecode on their D20 DAT machine, Sony subsequently introduced a more sophisticated system that has now become the international standard. Timecode in any frame rate is transposed into a DAT internal frame rate timecode (33.333 fps), complete with a phase marker that logs any drift between the incoming code and the sampling rate. On replay, the user can determine the output frame rate for the timecode, which is then regenerated from the DAT internal timecode with exactly the same relationship to the audio as during the recording.
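
The principle of transposing timecode into the DAT frame rate can be sketched as below. This is a simplified illustration and not the real on‑tape format: it treats the 'phase marker' as nothing more than the fractional part of a DAT frame, and the function names are hypothetical.

```python
from fractions import Fraction

# 33.333... frames per second, i.e. one DAT frame every 30 milliseconds.
DAT_FRAME_RATE = Fraction(100, 3)


def to_dat_frames(tc_frames: int, fps):
    """Convert a frame count at `fps` into (whole DAT frames, phase).

    `phase` is the leftover fraction of a DAT frame (0 <= phase < 1).
    """
    seconds = Fraction(tc_frames) / Fraction(fps)
    exact = seconds * DAT_FRAME_RATE
    whole = exact.numerator // exact.denominator
    return whole, exact - whole


def from_dat_frames(whole: int, phase: Fraction, fps) -> Fraction:
    """Regenerate the frame count at any chosen output frame rate."""
    seconds = (whole + phase) / DAT_FRAME_RATE
    return seconds * Fraction(fps)


# One second of 25fps code, stored and then replayed as 25fps and 30fps.
whole, phase = to_dat_frames(25, 25)
print(whole, phase)
print(from_dat_frames(whole, phase, 25), from_dat_frames(whole, phase, 30))
```

Exact fractions are used so that no rounding error builds up; a rate such as 29.97fps would be better passed as Fraction(30000, 1001) than as a float.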


The DCC tape track layout.

ADAT & DTRS

The DTRS tape format used in the Tascam DA38/88/98 modular digital multitracks (as well as the Sony PCM800) is essentially an overgrown DAT tape. It uses Hi‑8 video tape as the medium, and most of the physical dimensions are scaled up, but otherwise it is fundamentally the same as a four‑head DAT machine configured for pre‑reading. Like most digital recorders, only one data stripe is recorded by each head in the head‑drum and the eight audio channels (plus timecode) are all multiplexed within that data stream. Consequently, tapes must be 'formatted' before use (or all eight tracks recorded at once) so that the multiplexing structure is complete on the tape.

Overdubbing involves retrieving the multiplexed data stream from the tape with the pre‑read heads, de‑multiplexing and error‑correcting to produce eight discrete sequential channels. One or more channels can then be replaced with new audio material before re‑encoding, multiplexing and recording the entire re‑combined data stream back on to the tape in the same physical position as it started.
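
The read, replace and rewrite cycle can be sketched as below. This is a minimal illustration that assumes a simple sample‑by‑sample interleave of the eight channels and ignores error correction and timecode altogether; the real multiplexing structure on tape is more involved, and the function names are hypothetical.

```python
def multiplex(channels):
    """Interleave equal-length channels into one stream, sample by sample."""
    return [sample for frame in zip(*channels) for sample in frame]


def demultiplex(stream, channel_count):
    """Recover the discrete channels from an interleaved stream."""
    return [stream[index::channel_count] for index in range(channel_count)]


def overdub(stream, channel_index, new_audio, channel_count=8):
    """Replace one channel and return the complete re-combined stream."""
    channels = demultiplex(stream, channel_count)
    if len(new_audio) != len(channels[channel_index]):
        raise ValueError("new audio must match the channel length")
    channels[channel_index] = list(new_audio)
    return multiplex(channels)


# Eight channels of three samples each; channel c holds 10*c, 10*c+1, 10*c+2.
tape_channels = [[10 * c + t for t in range(3)] for c in range(8)]
tape_stream = multiplex(tape_channels)

new_stream = overdub(tape_stream, channel_index=2, new_audio=[-1, -2, -3])
print(demultiplex(new_stream, 8))
```

The whole stream is rewritten even though only one channel has changed, which is why the data must land back on the tape in the same physical position as it started.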

The ADAT format is much the same, but adds analogue auxiliary and timecode tracks. It also employs the much larger S‑VHS medium. DTRS seems to have become the de facto standard in television and film dubbing, as well as in music‑for‑film applications. However, the ADAT is the undisputed champion of many professional, semi‑pro and home music studios. DTRS is restricted to 16 bits per channel (although bit‑splitting boxes are available to allow a smaller number of higher‑resolution channels), whereas the latest generation of the ADAT family (the new M20, for example, which will be reviewed in SOS shortly) provides 20‑bit per channel recording.

Nagra D

Nagra, the Swiss manufacturers of highly regarded analogue open‑reel tape recorders for the music, television and film industries, were sceptical about the reliability and ruggedness of DAT, and so they developed their own rotary‑head recorder, the Nagra D. Using quarter‑inch open‑reel tape, the machine provides four audio tracks with up to 24‑bit resolution per channel, plus timecode, and is also capable of recording two tracks at 96kHz sample rates.

The machine is physically large (transportable rather than portable) and relatively expensive, but it has acquired a reputation for reliability and superb sound quality. It is commonly used for high‑resolution music recordings, as well as for feature film and big‑budget television location recording duties.

One interesting aspect of the Nagra D that other machines have yet to copy is the way it allows quality control and fault tracking by logging every recording and any replay tape faults in a 'table of contents' at the head of the tape.

Digital Compact Cassette

Digital Compact Cassette (DCC) is a domestic stereo format intended to replace the analogue compact cassette, based on a stationary‑head recorder. The basic idea was conceived at the same time as DAT in the mid‑'80s, when the two formats were referred to as R‑DAT and S‑DAT (Rotary head and Stationary head respectively). Sony's original S‑DAT proposals and Philips' DCC differ only in the details.

The tape speed is the same as that of an analogue cassette (1.875ips) which is far too slow to allow sufficient data transfer rates for a single track per audio channel. Instead, a 4:1 data reduction system is used (known as PASC — Precision Adaptive Sub‑band Coding) to minimise the amount of data that is then shared between eight parallel tape tracks. A ninth digital track carries auxiliary and subcode information such as timings and track titles (see the diagram, right).
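
The scale of the numbers involved can be estimated with some simple arithmetic. The sketch below assumes 16‑bit stereo audio at a 48kHz sampling rate (values not stated in this box) and ignores the error‑protection and channel‑coding overhead that the real tape tracks also have to carry; the function name is made up.

```python
def dcc_rates(sample_rate_hz: int = 48_000, bits: int = 16,
              channels: int = 2, reduction: int = 4,
              data_tracks: int = 8):
    """Return (linear PCM bit/s, data-reduced bit/s, bit/s per tape track)."""
    pcm = sample_rate_hz * bits * channels
    reduced = pcm / reduction
    return pcm, reduced, reduced / data_tracks


pcm, reduced, per_track = dcc_rates()
print(f"linear PCM: {pcm} bit/s")
print(f"after 4:1 reduction: {reduced:.0f} bit/s")
print(f"audio payload per track: {per_track:.0f} bit/s")
```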

The DCC head block incorporates three sets of heads: nine 185µm‑wide digital record heads; nine 70µm‑wide digital playback heads; and two 600µm‑wide analogue playback heads. The recording heads are more than twice the width of the playback heads so that alignment between machines is less critical, and the analogue heads allow standard compact cassettes to be replayed in the same machine.

Like the analogue cassette, the DCC is double‑sided, but cannot be physically turned over. Instead, DCC machines either have two separate head assemblies, one for each side, or one assembly that inverts for side B.

DCC, although incorporating some very advanced technology, is unlikely to succeed in the domestic marketplace because the consumer has become used to the instant access afforded by CDs and Minidiscs. For some, DCC's credibility is also marred by the use of data reduction, despite the fact that it actually sounds extremely good.