There has been a lot of speculation, hype, misinformation, and general confusion about the next generation of high-quality audio replay media. Most of us have probably been waiting expectantly for something along the lines of the Digital Versatile Disc, or DVD, but the extraordinary level of politics between the various hardware and software companies involved, and the difficulties of ensuring copyright protection, seem to have slowed progress on an audio-only version of DVD to a snail's pace. As a medium for pre-recorded videos and computer games, DVD may already be gaining ground -- especially in Japan and America -- but there is still little sign of an agreement on a dedicated high-resolution audio format.
Even if an audio-only DVD agreement were reached speedily, there would still be a number of very important and practical issues to address. Perhaps the key one of these is the non-trivial problem (for the retailing industry) of having to maintain a double inventory of music titles (existing Red Book CDs alongside new DVDs) during the transitional phase -- which could easily continue for many years.
Although DVD players may be designed to replay Red Book CDs, thus providing backwards compatibility with existing music collections, any new DVD audio titles can't be played in conventional CD players, so new music releases would have to be
These and many more potential problems inherent in the introduction of a new music format are widely recognised, of course -- not least by an International Steering Committee (ISC) set up by the Recording Industry Association of America (RIAA), the Recording Industry Association of Japan (RIAJ), and the European International Federation of the Phonographic Industry (IFPI). This steering committee has already specified that any new high-density audio disc should provide the means for copyright protection and anti-piracy measures, should be capable of storing data and video as well as audio, should support both stereo and multi-channel surround sound formats, and should be compatible at a practical level with existing CD players. This last point is the biggest problem for DVD, but it's essential because there are something like 500 million CD players around the world and already 10 billion compact discs!
Although this seems a challenging set of criteria, Philips and Sony announced last year their joint intention to develop a 'next-generation' music carrier to replace the CD, and their proposal addresses every point in the ISC's requirements, including compatibility with Red Book CD players. The proposed new music carrier combines 44.1kHz/16 bit audio with a high-resolution (and multi-channel) audio format using Sony's proprietary Direct Stream Digital system. Sony and Philips have christened their new disc format the Super Audio Compact Disc (SACD) and have been able to demonstrate every aspect of the new technology in near-production prototypes.
The fundamental idea behind the SACD has been to combine much of the technology already developed for the DVD format (in which Sony and Philips both played significant parts) with that of conventional CDs, producing a hybrid disc that will play perfectly in conventional CD players, as well as offering enhanced-quality replay in future DVD-derived players (see Figures 1 and 2).
As both diagrams show, the SACD is a dual-layer disc, manufactured in much the same way as a dual-layer DVD. Two polycarbonate substrates, each 0.6mm thick, are glued together to form a disc which is identical in size to a conventional CD. The upper substrate is injection moulded (a process usually referred to as stamping), with the pits and flats representing standard Red Book CD audio data. The lower (high-density) layer is moulded with much finer 'pits' and flats, allowing it to carry substantially more data in the same physical space (thereby providing the necessary capacity for very high-resolution audio, multi-channel audio, data, graphics, and video).
After the separate injection-moulding processes, the upper disc substrate is sputtered (coated) with a highly reflective metal layer, a protective lacquer, and eventually the screen printing for the disc artwork and title -- all in exactly the same way as a conventional CD. Meanwhile, the lower layer is sputtered with a special semi-transmissive coating (which lends a slightly golden sheen to the finished disc). It is this very special layer which enables the standard CD laser to see the Red Book layer, but allows the DVD laser to see the high-density layer, so it's absolutely crucial to the entire operation.
Most conventional CDs use aluminium as the reflective metal layer, but with the special semi-transmissive layer in front of it, the overall laser reflectivity can fall below the minimum 70% figure demanded by the Red Book CD specification. Suitable alternative metals include gold, silver and copper, with gold being the preferred choice because it is very stable (although relatively expensive). Silver is currently being used on some CD-Rs, to good effect, and copper is also proving its worth on video CDs at the moment. Currently, all four metals are undergoing testing to determine which is most suitable for the hybrid disc format and a final decision has yet to be taken.
Following stamping and sputtering, the two half-discs are brought together such that the information side of the high-density layer is bonded to the plain side of the CD layer (see cut-away diagram), and so sits in the centre of the completed assembly. An optically transparent UV bonding process cements the two halves together, and the finished disc is then tested, packed, and shipped.
Experimental small-quantity production runs have shown that the cost of manufacturing these hybrid discs is only slightly higher than the manufacturing cost of dual-layer DVDs at present, with similar yield and production cycle times. Philips expects production costs to fall to the same (or even lower) rates as dual-layer DVDs with normal production volumes, and all the necessary technology to manufacture SACDs is available and working now. Indeed, several major pressing plants are already successfully producing dual-layer DVDs and could easily fabricate SACDs on the same equipment.
An SACD player will contain two laser diodes -- a standard one (780 nanometre, infra-red) to play existing Red Book CDs, and another (of 650nm, visible-red light) to read the high-density layer. The semi-transmissive layer is designed to be effectively transparent to the 780nm infra-red laser, only reducing light transmission by a few percent, so the laser beam of standard CD players will be able to focus on the Red Book layer and from the reflected light (or lack of it) extract the encoded 16-bit/44.1kHz audio data in the normal way, completely unhindered by the presence of the high-density layer.
However, roughly 25% of the 650nm visible red laser light is reflected by the semi-transmissive layer separating the high- and normal-density substrates, and although this doesn't sound like much it is apparently sufficient for suitable players to reliably extract the data contained on the high-density pressing.
The high-density layer provides around seven times the storage capacity of a conventional CD (an increase from around 650MB to 4.7GB), due
To be able to 'see' these smaller pits, the laser has to operate with a shorter wavelength (650 nanometres, as opposed to 780) and the objective lens in the pick-up's optics has to have an increased Numerical Aperture (NA) of 0.6 (as opposed to the standard CD player optics of 0.45). This increased NA reduces the system's tolerance to discs which are not completely flat, but the problem can be overcome by reducing the thickness of the disc, which is why the DVD format was designed with 0.6mm substrates. However, a 0.6mm disc is not sufficiently robust on its own, so two 0.6mm discs have to be glued together to retain the rigidity of a conventional disc, but with the data stored at the centre of the disc, so that it's only 0.6mm away from the lens, instead of the 1.2mm of a standard disc (once again, see Figure 2).
Besides reducing the physical dimensions of the stamped pits, several other techniques are employed to help increase the data capacity of the high-density layer. For example, by reducing globally the acceptable error margins in the production of the high-density substrate (compared with a conventional CD), substantial gains can be made in data capacity -- and most modern production plants are perfectly able to meet the tightened tolerances required, as are the majority of current disc-replay mechanisms. Further gains come through improved error-correction strategies which have been developed to take advantage of the increased processing power available in today's generation of decoder chips (as compared with what was available at the original launch of the CD format in the early '80s).
The last degree of increased capacity comes from the fact that data is allowed to start at a slightly smaller radius from the centre of the disc than a conventional CD (24mm instead of 25mm). The table in Figure 3 shows the specifications of the two types of disc side by side for comparison purposes, and the chart in Figure 4 shows exactly where SACD's seven-fold gain in storage capacity over standard CDs comes from.
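As a rough cross-check of those figures, the sketch below estimates how much of the capacity gain could be attributed to the optics alone. It assumes, as a simplification that is not taken from the Sony/Philips material, that the focused spot diameter scales with wavelength divided by numerical aperture, and that data density scales with the inverse square of that spot size. Whatever gain is left over would then have to come from the other measures described above (tighter tolerances, better error correction, and the smaller starting radius).

```python
# Rough breakdown of the capacity gain, using figures quoted in the text.
# Assumption (not from the article): data density scales with the inverse
# square of the spot size, and spot size scales with wavelength / NA.

CD_WAVELENGTH_NM, CD_NA = 780, 0.45
HD_WAVELENGTH_NM, HD_NA = 650, 0.60

CD_CAPACITY_BYTES = 650e6   # "around 650MB"
HD_CAPACITY_BYTES = 4.7e9   # "4.7GB"


def relative_spot_size(wavelength_nm, numerical_aperture):
    """Relative focused-spot diameter; only ratios are meaningful here."""
    return wavelength_nm / numerical_aperture


linear_gain = (relative_spot_size(CD_WAVELENGTH_NM, CD_NA)
               / relative_spot_size(HD_WAVELENGTH_NM, HD_NA))
optical_gain = linear_gain ** 2                    # gain in areal density
total_gain = HD_CAPACITY_BYTES / CD_CAPACITY_BYTES
other_gain = total_gain / optical_gain             # left for other measures

print(f"Spot size shrinks by a factor of {linear_gain:.2f}")
print(f"Density gain from optics alone:  {optical_gain:.2f}x")
print(f"Total capacity gain:             {total_gain:.2f}x")
print(f"Gain left for other techniques:  {other_gain:.2f}x")
```

On these assumptions the optics account for a little over a third of the overall gain, which is consistent with the text's point that several techniques contribute.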
Sony and Philips' proposal for the high-resolution element of the Super Audio CD involves Sony's own Direct Stream Digital audio format (see 'Direct Stream Digital' box). This proprietary system uses a 1-bit data stream at a very high sampling rate and a slightly greater data rate than the more widely known 24-bit/96kHz high-resolution systems, over which Sony claims a number of significant advantages. Amongst these are a very wide audio bandwidth and a dynamic range in excess of 120dB -- a figure chosen as being better than the capabilities of even the very best analogue recording console. From the engineering point of view, DSD also offers simplified A-D and D-A conversion processes, improved accuracy of equalisation and dynamics processing, and greatly reduced digital processing delays.
The sampling rate employed in current DSD prototypes is 2.8224 MHz (64 times higher than normal CDs) -- a figure chosen partly to allow relatively simple sample-rate conversion from DSD recordings to conventional multi-bit PCM (Pulse Code Modulation digital recording). The DSD rate can be synchronously converted to any of the current industry-standard sampling rates -- 32, 44.1, or 48kHz -- in a process performed by Sony's Super Bit Mapping system (see 'Super Bit Mapping Direct' box).
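The relationship between the DSD rate and the standard PCM rates can be checked with a few lines of arithmetic. The sketch below only computes the ratios involved, using Python's standard `fractions` module; it says nothing about how Super Bit Mapping actually performs the conversion, and the function name is purely illustrative.

```python
from fractions import Fraction

CD_RATE_HZ = 44_100
DSD_RATE_HZ = 64 * CD_RATE_HZ   # 2,822,400 Hz, i.e. 2.8224 MHz


def conversion_ratio(target_hz):
    """Exact ratio of the DSD sampling rate to a target PCM rate."""
    return Fraction(DSD_RATE_HZ, target_hz)


ratios = {hz: conversion_ratio(hz) for hz in (32_000, 44_100, 48_000)}

for hz, ratio in ratios.items():
    print(f"{hz / 1000:>5} kHz: {ratio.numerator}:{ratio.denominator}")
```

Each target rate comes out as a ratio of small whole numbers (64:1 for 44.1kHz, 441:5 for 32kHz, and 294:5 for 48kHz), which is what makes a synchronous conversion practical.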
Sony have advocated using DSD for master recordings and archiving for some time, with their arguments centring on the system's ability to capture and preserve the ultra-sonic elements of music. Although it's not directly audible, there's growing evidence to suggest that the audio spectrum above 20kHz plays a far greater role in perceived quality than was first thought, and it has also been linked with improving imaging accuracy and the overall naturalness of recordings. The DSD system also provides benefits in the form of increased dynamic range equivalent to, or greater than, that of a 20-bit system, with all the advantages of low-level linearity that such resolution entails.
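To put the '20-bit' comparison in context, the sketch below applies the usual textbook formula for the theoretical signal-to-noise ratio of an ideal multi-bit quantiser driven by a full-scale sine wave (roughly 6.02 dB per bit plus 1.76 dB). This is a general PCM rule of thumb rather than anything specific to DSD, and real converters fall short of these ideal figures.

```python
import math


def ideal_pcm_dynamic_range_db(bits):
    """Theoretical SNR (dB) of an ideal N-bit quantiser, full-scale sine."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 20, 24):
    print(f"{bits}-bit PCM: {ideal_pcm_dynamic_range_db(bits):.1f} dB")
```

The ideal 20-bit figure works out at about 122dB, which sits comfortably with the 'in excess of 120dB' dynamic range claimed for DSD.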
A further level of technology is applied to the data stored in the high-density layer of the Hybrid Disc -- a 'lossless' data-reduction strategy. Current data-reduction systems, like Sony's ATRAC, the various MPEG systems, and Dolby's AC3 are all 'lossy', which means that they remove some data for ever. These systems all use psychoacoustic principles to decide which data may be permanently removed, on the basis that they are judged to represent 'inaudible' elements of the sound -- always a controversial subject!
Developed by Philips, and originally intended for computer data applications, Direct Stream Transfer (DST) provides a 2:1 reduction in the audio data in a totally lossless way -- on replay, the original data can be reconstructed in full, with no errors or omissions. In its application in the SACD, the DST system has been optimised for use with audio signals, which typically have a far greater level of repetition and predictability in them than would computer data.
The use of Direct Stream Transfer is important to the hybrid disc proposal, as it allows sufficient capacity within the high-density layer to store not one but two complete 74-minute versions of audio material. The intention is to combine a stereo DSD track and a 6-channel surround DSD track -- plus various data, text, graphics, and even video signals, all on the single high-density layer!
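A back-of-envelope calculation shows why data reduction is needed at all. The sketch below assumes that every channel runs at the full 2.8224Mbit/s DSD rate for the whole 74 minutes, that 1GB means 10^9 bytes, and that the entire 4.7GB layer is available for audio; none of these assumptions is confirmed by the proposal, and the extra data, text, and graphics are ignored.

```python
DSD_BITS_PER_SECOND = 2_822_400      # per channel
PLAYING_TIME_S = 74 * 60
LAYER_CAPACITY_BYTES = 4.7e9         # assumed fully available for audio
DST_RATIO = 2.0                      # the 2:1 figure quoted in the text


def raw_bytes(channels):
    """Uncompressed DSD storage for the given channel count, in bytes."""
    return channels * DSD_BITS_PER_SECOND * PLAYING_TIME_S / 8


stereo = raw_bytes(2)
surround = raw_bytes(6)
total_raw = stereo + surround
after_dst = total_raw / DST_RATIO
required_ratio = total_raw / LAYER_CAPACITY_BYTES

print(f"Stereo, uncompressed:    {stereo / 1e9:.2f} GB")
print(f"6-channel, uncompressed: {surround / 1e9:.2f} GB")
print(f"Both, uncompressed:      {total_raw / 1e9:.2f} GB")
print(f"Both, after 2:1 DST:     {after_dst / 1e9:.2f} GB")
print(f"Ratio needed for 4.7GB:  {required_ratio:.2f}:1")
```

On these assumptions the stereo version alone would fit without any data reduction, but the two versions together come to roughly 12.5GB. A flat 2:1 reduction leaves about 6.3GB, so either the average ratio achieved on real music would need to be nearer 2.7:1, or some of the assumptions above do not hold for the final format.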
Copyright piracy and illegal duplication is seen as a major problem by the record companies, and any new music carrier would have to have extensive copyright protection measures for it to be accepted by the record industry. From their point of view, the ideal system would involve some kind of non-removable watermarking system providing copyright data encrypted within the disc, which would be interrogated by every replay machine. Only suitably coded discs would be accepted and played. This kind of system could
Embedding various forms of 'pilot tones' or 'pseudo-random' copyright data within the audio signal itself has often been tried in the past, but most systems have fallen at the first post, either because 'golden-eared' listeners have been able to detect their presence, or it has been too easy to strip the copyright data out. So the important question is "How can copyright data be embedded in the format in such a way that it does not interfere with the high-resolution audio, yet remains virtually impossible to strip out or duplicate?"
Sony and Philips' very clever solution to the problem is actually a form of 'Digital Watermarking' which stores the required copyright data as a modulation of the width of the injection moulded 'pits' on the disc substrate itself. As such, it is virtually impossible to replicate the copyright data without the specially designed (and carefully licensed) glass-mastering equipment used to make the original disc stampers.
Another clever feature of the system is that the modulation of the pit widths can be synchronised on consecutive turns of the disc so that faint visible patterns can be formed on the disc itself, perhaps displaying recognisable words or graphics -- true watermarking in the conventional sense!
The newly developed technology to encode this copyright protection data has been called Pit Signal Processing (PSP), and it works by modulating the power of the laser used to record data onto the glass master at the pressing plant. If laser power is increased, the size of the focused beam increases, and so too does the width of the resulting mark. However, if that were all the process involved, the length of the mark would also be affected, and since its length is critical to the meaning of the encoded audio data, some further cleverness is required!
When the PSP equipment is installed in a pressing plant, a temporary feedback loop is introduced between the PSP unit and the quality-checking system inspecting finished discs. This feedback loop effectively 'teaches' the PSP unit how to optimise its laser modulation parameters to take into account the vagaries of the specific injection-moulding process used at the plant. The result of the teaching phase is that the lengths of marks cut into the substrate are controlled extremely accurately, ensuring that the required width modulation to encode the copyright information does not introduce unwanted length variations.
There's a very welcome spin-off from this increased precision in the lengths of the marks. The 'Eight-to-Fourteen' modulation (EFM) used to encode audio data onto a CD master allows the pits (and the spaces in between) to vary in length in integer values between 3 and 11 units. However, slight errors in length, while not sufficient to represent a different integer value (and therefore misrepresent the data representing the size of an audio sample), cause jitter -- variations in the data timing. The normal tolerances of glass mastering and injection moulding can easily create such variations in data timing, which can potentially cause blurring of the stereo image and an increase in high-frequency noise. Indeed, this is believed to be one of the reasons why different batches of the same CD can appear to sound different, or why the same CD can sound different when replayed in CD machines from different manufacturers (which may be more or less susceptible to disc jitter). The very tight control on pit length exercised by the PSP system actually reduces pressing-induced jitter by at least a factor of two, and potentially much more. So the introduction of the PSP system could mean that discs manufactured by different presses will become much more consistent, and the aspects of audio quality affected by the injection moulding process could be improved considerably.
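The link between pit-length accuracy and jitter can be illustrated with a toy model. In the sketch below, each measured pit or land length (in channel-bit units) is snapped to the nearest legal EFM run length between 3 and 11, and the leftover error stands in for timing jitter. The deviation values are invented for illustration, and a real player recovers timing with a clock-recovery circuit rather than by rounding lengths.

```python
import math

MIN_RUN, MAX_RUN = 3, 11   # legal EFM run lengths, in channel-bit units


def decode_runs(measured_lengths):
    """Snap each measured length to the nearest legal run length.

    Returns the decoded run lengths and the residual error of each one.
    """
    runs, errors = [], []
    for length in measured_lengths:
        nearest = min(MAX_RUN, max(MIN_RUN, round(length)))
        runs.append(nearest)
        errors.append(length - nearest)
    return runs, errors


def rms(values):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


ideal = [3, 7, 11, 4, 5]
deviation = [0.12, -0.08, 0.10, -0.15, 0.05]   # hypothetical pressing errors

loose = [i + d for i, d in zip(ideal, deviation)]       # normal tolerances
tight = [i + d / 2 for i, d in zip(ideal, deviation)]   # errors halved

loose_runs, loose_errors = decode_runs(loose)
tight_runs, tight_errors = decode_runs(tight)

print("Decoded runs identical:", loose_runs == tight_runs == ideal)
print(f"RMS length error, normal tolerances: {rms(loose_errors):.3f}")
print(f"RMS length error, tightened control: {rms(tight_errors):.3f}")
```

Both sets of lengths decode to the same data, which is the point made above: small length errors do not corrupt the audio samples, but they do show up as timing variation, and halving the length errors halves that variation in this model.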
As well as providing a visible 'watermark' on the playing surface of the disc itself, the width modulation of pits can be used to store all manner of irremovable identification data: the country of origin, mastering house and pressing plant codes, the glass master matrix number, disc ISRC catalogue numbers, and so forth.
Working prototype players fitted with copyright-protection systems have already been demonstrated with a selection of suitably encoded experimental discs. If a correctly encoded disc is inserted, the player behaves perfectly normally, but when a non-encoded disc is inserted, the machine plays the disc for a few seconds while looking for the copyright data. When none are found, replay is stopped and the disc automatically ejected.
The Sony and Philips proposal is that only the high-density layer of the hybrid disc should be encoded with copyright protection data of this form (essentially so that the legal replay of older Red Book discs is not affected), and that all hybrid players be equipped with suitable PSP detection circuitry by law.
Every aspect of the Sony and Philips hybrid disc proposal is currently realisable. Most of the elements have already reached production maturity, and the remaining ones are in advanced prototype form. With suitable support, it would appear that the hybrid disc could hit the shops in a very short time indeed... so what's stopping this from happening?
Both the consumer and the retailer would surely support the hybrid disc, almost by default from the point of view of compatibility with current Red Book players and the advantages of a single inventory. Indeed, hybrid discs could probably be introduced to the market with few consumers even noticing any difference! The digital watermarking and sophisticated copyright protection strategies would surely be enthusiastically supported by the record companies and licensed pressing plants, and doubtless the reduction in mould-induced jitter would be a marketing advantage as far as the golden-eared hi-fi fraternity are concerned.
The only potential stumbling block I can see in the hybrid SACD proposal is the implicit adoption of Sony's Direct Stream Digital data encoding. Since its launch (an AES paper was presented on the basics of the system in 1991), DSD has maintained a relatively low profile, yet during the time of its development the whole professional and consumer audio industry has been persuaded of the advantages of 24-bit/96kHz PCM systems. Indeed, there are already a large number of A-Ds, D-As, DASH recorders, signal processors, and mixing consoles which can accommodate 24-bit PCM signals at 44.1 and 48kHz sampling rates. There are even quite a few digital tape and disk recorders available now, as well as one or two converters, which offer 96kHz sampling (or even 192kHz in some cases).
However, DSD actually has a larger data rate than 24-bit/96kHz (2.8224Mbits/s as opposed to 2.304Mbits/s) and therefore potentially greater fidelity. DSD also has some claimed significant advantages, such as avoiding the need for the decimation and oversampling filters required in any conventional PCM format. There's still some experimental and development work to be done in signal processing 1-bit data streams, to allow, say, the design and construction of a suitable digital mixer, but pioneering work is currently being carried out to solve these problems (not least by the Advanced Technologies Division of Sony Broadcast & Professional). All the indications are that by taking a radical approach to the problem, everything we have come to expect of an analogue or multi-bit digital mixer is perfectly achievable with DSD audio data -- and potentially with greater accuracy than either of the existing technologies (see 'DSD' box).
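The two data rates quoted above are per-channel figures and follow directly from word length multiplied by sampling rate, as this short check shows (the function name is illustrative only):

```python
def bit_rate(bits_per_sample, sample_rate_hz):
    """Per-channel data rate in bits per second."""
    return bits_per_sample * sample_rate_hz


dsd_rate = bit_rate(1, 64 * 44_100)    # 1-bit stream at 2.8224 MHz
pcm_rate = bit_rate(24, 96_000)        # 24-bit words at 96 kHz

print(f"DSD:           {dsd_rate / 1e6:.4f} Mbit/s per channel")
print(f"24-bit/96kHz:  {pcm_rate / 1e6:.4f} Mbit/s per channel")
print(f"DSD carries {dsd_rate / pcm_rate:.3f} times as much data")
```

The DSD stream therefore carries about 22.5% more data per channel, although a higher data rate on its own is only a loose indicator of potential fidelity.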
Perhaps the industry will be swayed when professional DSD equipment becomes available, but there is already a lot of enthusiasm for the system from the growing number of studios and engineers who have experience of the prototype systems currently 'doing the rounds' in the UK and Europe.
The first commercial CD release of DSD-recorded (and SBM Direct-converted) material was an album featuring the guitar and flute of Joe Beck and Ali Ryerson, called Alto (dmp CD-521). At the Super Audio CD presentation at last year's AES convention in New York, an invited audience auditioned the original DSD recordings of tracks from this album, and compared them with 24-bit/96kHz, 16-bit/44.1kHz, and the SBM-Direct 16-bit/44.1kHz versions.
Under the inherent limitations of time and familiarity, I was unable to spot any reliable differences between DSD and 24-bit/96kHz, both of which gave pin-point imaging and had very natural, believable acoustics. However, the comparison between DSD and the simple reduction to 16-bit/44.1kHz revealed very obvious differences, particularly in a huge loss of air and space within the acoustic environment, and an 'aggressive' character to the overall sound quality.
The SBM-Direct version also suffered much of the absence of space and air in the recorded acoustic environment, but probably not to quite the same degree, and there was none of the hardness associated with the simple PCM reduction. These subjective observations are entirely in keeping with what could theoretically be expected of the demonstration, and I was left with the impression that DSD is at least as good as 24-bit/96kHz, and probably better. Indeed, many of the professional engineers I've talked with who have been able to compare live sound through a desk with DSD and 24/96 recordings suggest DSD is slightly better. The fact that the SBM-Direct system appears to do such a good job of getting much of the apparently enhanced resolution of the original DSD recording onto a conventional CD format is also noteworthy.
In the past I've gone on record as being highly sceptical of the high-sampling rate lobby, simply because the few examples of 96kHz equipment I had auditioned failed to convince me of any significant aural benefits. However, after a number of recent demonstrations involving purpose-designed, state-of-the-art equipment, I'm now happy to change my mind and admit that there is life above 20kHz! Moreover, it really does seem to make a difference to the music and the illusion of reality if that difference can be captured properly.
Although both 24-bit/96kHz and DSD appear to be able to achieve an extra degree of realism in music recordings, and therefore advance the cause of high-resolution audio recording, only time will tell which format will be adopted by the industry as the standard for the new millennium. My opinion is swaying in favour of the DSD approach, mainly because of its simpler engineering principles and the fact that it is offering new opportunities beyond the confines of the traditional PCM world we have become so used to.
If DSD can win the high-resolution argument, the Super Audio CD has a very rosy future ahead of it, and I think we'll all benefit from that situation. On the other hand, if DSD is not adopted by the professional audio industry, and the audio-only DVD becomes a success, double inventories and incompatible discs will be the unattractive future of consumer music formats!