Danish company TC Electronic have been at the forefront in developing new and more realistic reverb algorithms. Most recently, they've been putting their energies into overcoming the particular problems associated with artificial reverb for surround sound. Thomas Lund, TC's Programme Manager, explains their approach to Paul White.
TC Electronic have just launched a high‑end, multi‑channel reverb processor that uses elaborate ray‑tracing and wavefront techniques to allow both sound sources and listeners to be positioned within a virtual room generated from the dimensions and reflective characteristics of a real concert hall. Operating at up to 96kHz, the System 6000 is aimed at music, film/post‑production, broadcast and mastering applications with an emphasis on surround. Apparently the System 6000's algorithms are based on a multi‑channel version of TC's VSS reverb technology developed for the M3000 (VSS‑5.1). This generates five sets of early reflections per sound source along with five uncorrelated diffuse reverb fields. Four‑band dynamics processing algorithms are also available to run on the System 6000, as well as more conventional effect algorithms such as Chorus, Phaser, EQ and Delay.
At the heart of the system is a 2U rackmounting mainframe containing the DSP core, PSU and I/O options, all controlled by a desktop remote controller known as the TC ICON. The ICON features a touch‑sensitive colour display plus motorised faders, and because the standard TCP/IP communications protocol is used, the systems may be networked or even controlled remotely via the Internet. The DSP‑6000 card provides eight channels of digital I/O (through four pairs of AES‑EBU) and word clock input; the mainframe can also house up to three 24/96 analogue 2‑channel I/O cards and/or additional digital I/O.
One of the key features of the System 6000's reverb design is that several sound sources can be placed at different positions, and while natural reverb tends to dilute the ear's ability to resolve location, the TC system actually seems to enhance it. This is particularly effective when mixing for surround. At the moment, it isn't possible to change the sound source positions dynamically, because of the astronomical amount of real‑time processing power this would take, but as we'll see later, some dynamic elements are planned for the near future. On a trip to TC's headquarters in Denmark, I was fortunate enough to be able to speak with Thomas Lund, TC's Programme Manager, who has been closely involved throughout the evolution of the System 6000 project.
How does the TC approach to reverb generation differ from that used by other companies?
"TC reverbs traditionally haven't downplayed the role of early reflections. We believe in first creating a model powerful enough to render natural rooms, with all their good and bad aspects. We like ugly also! From there on it's easy to scale the number of reflections down for particular rooms, if you want them to have less identity or to exaggerate their better characteristics, for example, but we have enough processing power and memory to use a maximal model, so we can also get quite close to nature.
"Enhanced localisation is another reason for our focus on source‑specific early reflection patterns. Power‑panning relying on the creation of phantom images produces very variable and uneven localisation results outside a narrow listening sweet spot. We believe this is a big problem for the proliferation of the 5.1 concept. Localisation is one of the means to directly engage and influence the listener emotionally, and without improvements in that area, the incentive to put up with more speakers and new reproduction equipment may not exist for the average consumer. In home theatres the 5.1 setup has to add significantly to the illusion presented by the picture to be a justified purchase."
In real life, an early reflection is never just a clean echo — it is spectrally filtered by the wall material and smeared by diffusion. Did you always take account of this?
"No, not always. We would have liked to do it even in our old reverbs but there simply wasn't enough processing power available. The early models used filtering, pan and level of the individual taps in the pattern. From Rev 3 in the M5000 we added diffusion and from M3000 upwards, we included more and more directivity and more sophisticated diffusion to the patterns while also adding to the number of reflections."
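The pre‑diffusion model described here, where each early reflection is an individually delayed, levelled and panned tap, can be sketched roughly as follows. This is an illustrative Python fragment only: the tap values are invented, constant‑power panning stands in for the full directivity model, and the per‑tap wall filtering is omitted for brevity, so this is not TC's actual pattern.

```python
import numpy as np

def render_taps(x, taps, sr=48000):
    """Render an early-reflection pattern as delayed, levelled, panned
    taps. Each tap is (delay_ms, level, pan), pan 0 = left, 1 = right;
    constant-power gains keep loudness even across positions."""
    max_d = int(max(t[0] for t in taps) * sr / 1000)
    out = np.zeros((2, len(x) + max_d))
    for delay_ms, level, pan in taps:
        d = int(delay_ms * sr / 1000)
        g_l = np.cos(pan * np.pi / 2)   # constant-power pan law:
        g_r = np.sin(pan * np.pi / 2)   # g_l^2 + g_r^2 == 1
        out[0, d:d + len(x)] += level * g_l * x
        out[1, d:d + len(x)] += level * g_r * x
    return out

# A click through three invented reflection taps.
click = np.zeros(100)
click[0] = 1.0
pattern = [(12.0, 0.7, 0.3), (19.0, 0.5, 0.8), (27.0, 0.4, 0.5)]
er = render_taps(click, pattern, sr=48000)
```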
Is diffusion like a mini reverb around each delay tap?
"You could see it like that. We have several layers of diffusion now, but in the real world, you don't go from the direct sound, through early reflections and then different layers of diffusion — everything happens progressively. It's a constantly evolving, chaotic process, so there are many shades of grey. You can't even identify the different stages in most real rooms. We have made a model where we can go through some of these different shades of grey — not all of them — to give us a smoother, more natural reverb."
TC Electronic and Lexicon choose to generate the early reflections and the later chaotic reverb tail using entirely different processes, whereas in real life the late decay is the result of chaotic interaction between the early reflections and the room boundaries. What is the advantage of doing it your way, rather than the original Schroeder/Moore approach of recirculating the early reflections via networks of comb‑ and all‑pass filters?
"Actually, the reason was to get rid of as many unpleasant artifacts as possible and still maintain precise control. Generating the reverb tail from the early reflections can result in there being some correlated components between the channels, which produce problems when you hit the mono button. Even if you feed the later reverb from the early part of the early reflections, it's still not possible to make it completely chaotic while maintaining controllability. What we're trying to do is create the illusion of a real or super‑natural room. We believe the most manageable way is how we do it now because this is a model that lets us create reflection patterns with a very high directional resolution, not limited to the five or seven speakers it's meant to be reproduced through. This makes it highly scalable.
"We have some chaotic behaviour built into the mathematics of the later part of our reverb, so we don't have to use modulation, but both the late‑reverb and early‑reflections generators are fed from the dry signal. We have to be very careful to make the early and late parts of the reverb blend well together, and that's one of the areas where the process of fine‑tuning the variables within the algorithms comes in. Even though the end user has a limited number of parameters available, our engineers have over 3,500 of them to control in this algorithm. A lot of programming goes into making these parameters interact in a meaningful way and form a first meta‑layer consisting of roughly 300 parameters. The user interface consists of a second meta‑layer manipulating the first."
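For reference, the classic Schroeder topology mentioned in the question above, with parallel comb filters recirculating the signal followed by series all‑pass filters, can be sketched in a few lines. This is a minimal mono illustration using textbook delay lengths; it is the approach TC moved away from, not the System 6000's method.

```python
import numpy as np

def comb(x, delay, feedback):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (feedback * y[n - delay] if n >= delay else 0.0)
    return y

def allpass(x, delay, gain):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n-D] + g*y[n-D].
    Flat magnitude response; increases echo density."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + xd + gain * yd
    return y

def schroeder_reverb(x):
    # Parallel combs with spread delay lengths build the decaying tail...
    comb_delays = [1557, 1617, 1491, 1422]
    wet = sum(comb(x, d, 0.84) for d in comb_delays) / len(comb_delays)
    # ...then series all-passes smear it without colouring the spectrum.
    for d, g in [(225, 0.7), (556, 0.7)]:
        wet = allpass(wet, d, g)
    return wet

impulse = np.zeros(44100)   # one second at 44.1kHz
impulse[0] = 1.0
tail = schroeder_reverb(impulse)
```

Recirculation like this is exactly where the correlated components Lund mentions come from: the same energy keeps passing through the same loops.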
I would imagine that a further advantage of generating the early and late reflections separately is that you can process them separately, or even delay the late reflections with respect to the early reflections if artistic needs dictate?
"Yes, and that's a feature many users like. We've had a lot of discussions in‑house about how we wish to define pre‑delay — when we have a strong focus on early reflections, the traditional way of defining pre‑delay is that you don't have any response from the room simulator for a given number of milliseconds (the pre‑delay time). In our case, we have two parameters, a conventional pre‑delay and also a reverb delay that acts only on the diffused field. We have a quite flexible model that can create around 100 different reflections per source from something like 18 different directions, then these are processed to fit the reproduction system so it sounds right over a five‑speaker or seven‑speaker system. Also, we can downplay certain reflections, such as those in the median plane, while emphasising the lateral ones or vice versa."
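The two delay parameters described above can be illustrated with a toy impulse‑response assembler: pre‑delay shifts the whole room response, while reverb delay shifts only the diffuse field relative to the early reflections. This is a hypothetical helper for illustration, not the System 6000's parameter set.

```python
import numpy as np

def assemble_response(early, diffuse, pre_delay_ms, reverb_delay_ms, sr=48000):
    """Combine early-reflection and diffuse-field impulse responses.

    pre_delay_ms     delays the entire simulated room response;
    reverb_delay_ms  additionally delays only the diffuse field.
    """
    pre = int(pre_delay_ms * sr / 1000)
    rev = int(reverb_delay_ms * sr / 1000)
    out = np.zeros(pre + rev + max(len(early), len(diffuse)) + 1)
    out[pre:pre + len(early)] += early                    # early pattern
    out[pre + rev:pre + rev + len(diffuse)] += diffuse    # diffuse field
    return out

early = np.array([1.0, 0.0, 0.5])     # toy early-reflection taps
diffuse = np.array([0.2, 0.2, 0.2])   # toy diffuse tail
ir = assemble_response(early, diffuse, pre_delay_ms=10, reverb_delay_ms=20)
```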
Unlike yourselves, other manufacturers sometimes add modulation to make the decay more chaotic, but in real life, modulation occurs only when either the listener or sound source is in motion.
"Our algorithm decay is smooth and chaotic without it being necessary to add any modulation, because for an instrument like a piano that doesn't move, modulation would not be natural. If you want to add modulation for artistic reasons, however, you can do that in two different ways. We can modulate the room or the diffusion parameters of the room, which creates a slight pitch‑shift. We can also modulate the early reflections if we want to, but we generally modulate the diffused part of the reverb. With the M3000, we added another modulation which can be applied at the end of the chain, and this we call space modulation. This can be used to simulate a natural room where there is some air circulating, which will modulate the acoustic properties to some extent. This uses very slow LFOs and small modulation depths, but it gets very obvious if it's pronounced. All these modulation options are available within our multi‑channel algorithm, though we've added some new modulation functions for the System 6000. In the M3000 there are 12 different LFOs and random generators so we can modulate cells individually, or we can modulate them with a given phase relationship between them to produce a change in the stereo imaging. In the System 6000, we've taken this idea further. The user doesn't get to control all these modulators individually but instead, we have preset combinations that affect the stereo imaging in different ways. The user only has to pick the style of modulation and the amount. For example, you may want to emulate a good old reverb from the early '80s that has a chorused type of sound that suits a Fender piano. At other times, you may want to work with a grand piano that doesn't need modulation. The modulation between channels can either be correlated or uncorrelated and in the System 6000, we can have up to 12 modulators for each of five audio channels if we want them. Alternatively, we can distribute the same modulators between all the channels.
"In some scenarios, in order to add more width in the late part of the reverberation, many of the algorithm's parameters are identical for each channel. It's only tiny differences in the reverb cells and the way they're built that makes them uncorrelated, and the longer the decay, the more uncorrelated the sound becomes."
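The 'space modulation' idea mentioned earlier, where slowly circulating air subtly varies the room's acoustics, can be caricatured as a delay line whose length is swept by a very slow, shallow LFO. The sketch below is illustrative only: the rate and depth values are invented and this is not TC's implementation.

```python
import numpy as np

def space_modulation(x, sr=48000, rate_hz=0.3, depth_ms=0.5, base_ms=10.0):
    """Read from a delay line swept by a slow, shallow LFO, producing
    the slight pitch movement of air circulating in a room."""
    n = np.arange(len(x))
    delay = (base_ms + depth_ms * np.sin(2 * np.pi * rate_hz * n / sr)) \
            * sr / 1000
    read = n - delay                         # fractional read positions
    return np.interp(read, n, x, left=0.0)   # linear-interpolated tap
```

Running several such modulators per channel, correlated or uncorrelated between channels, is what lets the imaging be moved as a whole or smeared, as described above.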
Location Is Everything
I've noticed that in real situations, strong early reflections can actually impede accurate localisation, whereas with your system, the reverb seems to reinforce the position of the dry sound. How does that work?
"The multi‑directional model accounts for much of that, and it seems that just having a number of reflections rendered from different directions, then making them a good fit to the speaker system, does a lot of the work. We get a lot of free information from our ray‑tracing and wavefront‑simulation approach, and this is further refined when the algorithms are fine‑tuned by hand. Some of our tuners are amazing: for instance, the talented opera singer Ulrik Heise has a wonderful ear for how it feels actually to be in a particular room. We constantly learn a lot from their refinements of our more scientific models. Sometimes we will exaggerate what they find, whereas at other times we can make the model very close to being in the real space. It can be an advantage to improve on the natural perception of localisation, because in a 5.1 setup, it's not acceptable to have a very narrow sweet spot."
You've pre‑empted my next question, because I'd like to know how you can avoid the situation where somebody sitting closer to one of the speakers gets a completely misleading set of audio cues?
"Actually, we create some of the reflections taking into account that not all the listeners are in the same spot. In some of the examples we have tried, even with the listener positioned outside the circle of speakers, there's still some kind of impression of where we want the sound to be coming from. Perceptual refinement of physical models can be turned into DSP code to give a sound engineer more artistic freedom than power‑panning alone does.
"Because human judgements can be taken directly into account in DSP‑based localisation and room modelling, we often find the results to be more convincing than the intuitive approach of placing multiple microphones in a real room. Imaging and room properties may actually sound more natural than if a certain event is subjected to the compromises of distant miking, imperfect rooms, arrays of microphones and finally the end listener's speaker configuration. Localisation can include directions not readily obtainable using discrete microphone techniques, and room colour and geometry is subject to engineer manipulation, leaving space for artistic freedom beyond the constraints of natural environments. We also have different filters that allow the reflections to be treated to make them sound more natural or more abstract, and we can also limit the amplitude of reflections to avoid the situation where somebody is sitting close to a speaker carrying a very strong reflection. Each of the reflections includes very comprehensive EQ parameters so we have a lot of control beyond the diffusion if we need it. It's actually quite a complicated process, so in some of the algorithms where there are a lot of reflections, some compromises have to be made to stay within the limits of the available DSP power. We've created what we think is a rather maximal model so we can scale it down slightly when we need to."
I said at the outset that you were limited to creating a number of discrete virtual sound sources, each with its own pattern of early reflections, but is there any way you can introduce virtual source movement?
"As it stands, each algorithm can take in four discrete source positions, so with a full System 6000 frame, that means a maximum of eight different positions. We believe that later this year, we'll be able to optimise the algorithms to the point where we can get 12 discrete positions at the same time. Another thing that will be coming out later in the year is a dynamic model. It will allow free movement of up to eight sources at once per engine using the touch screen, joysticks or whatever, but to conserve processing power, the algorithm will have a less comprehensive early‑reflection model."
Is the impression of localisation in your multi‑source model due entirely to the pattern of early reflections, or does the late reverberation also play a part?
"The early reflections are a big part of it, of course, but even when we turn off the early reflections, the late reverb still provides some directional information. We have a reverb feed matrix where you can feed the uncorrelated, diffused field from different delays; there are five delays per source. There are also five different equalisers and five level controls, so that we can use the precedence effect to add more spatial cues. It doesn't give much of a localisation effect compared with that of the early reflections, but it helps."
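The feed matrix described here can be sketched as per‑channel delays and levels on the diffuse‑field sends, so that the earliest‑arriving channel biases the perceived direction via the precedence effect. This is a hypothetical illustration; the five per‑feed equalisers are omitted and the delay and level values are invented.

```python
import numpy as np

def diffuse_feed(source, delays_ms, levels, sr=48000):
    """Feed one dry source into N diffuse-field inputs, each with its
    own delay and level; the shortest delay arrives first and so
    dominates perceived direction (precedence effect)."""
    feeds = []
    for d_ms, lvl in zip(delays_ms, levels):
        d = int(d_ms * sr / 1000)
        feeds.append(np.concatenate([np.zeros(d), lvl * source]))
    n = max(len(f) for f in feeds)
    return np.stack([np.pad(f, (0, n - len(f))) for f in feeds])

src = np.array([1.0, 0.5])
# The channel nearest the virtual source gets the shortest delay.
m = diffuse_feed(src, delays_ms=[0, 5, 5, 10, 10],
                 levels=[1.0, 0.7, 0.7, 0.5, 0.5])
```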
In a real room, by the time the late reverberation develops, it is so chaotic that there is little if any directional information left, so presumably this is another one of your techniques for giving nature a helping hand?
"Yes it is, and of course the early reflections are what really create the impression of the type of room. In a small room, there is no separate perception of early reflections and late reverb — you just get a sense of coloration. Presently we are at a stage with 5.1 room simulation where we present the user with a number of what we believe to be useful tools to make people look at digital room simulation in new ways."
Pitch Shifting For Surround Sound
"We have developed a System 6000 package for pitch‑shifting, both stereo and surround, designed for film and TV as well as music. One of the requirements is to be able to do 24‑ to 25‑frame transfers at a high audio quality, though we can also do larger amounts of pitch‑shift with things like delay and feedback to create special effects. In the stereo version, the algorithm can either run as true stereo or as dual mono where there are two different splice regimes, which allows the pitch‑shifting to be optimised for each track, but at the expense of phase integrity. Alternatively, the two channels can be processed as stereo to maintain the phase between the two channels.
"In 5.1 mode, the LFE (low‑frequency effects) channel would often be processed separately, because absolute sub‑bass phase integrity isn't that important if the signal is not also used in the main channels. This allows us to optimise the algorithm used in this channel for processing low‑frequency material."
Do you still use an intelligent cut‑and‑splice system like most other pitch‑shifters?
"Yes, the principle is the same, but we have more processing power to help the software find good splicing points. Any pitch‑shifting introduces a small processing delay, so we are able to tell the user exactly what that is so that they can account for it. There are two main parameters that affect the way the splice 'brains' work. One sets the amount of variation you want to allow in the size of the splicing window and the other sets the maximum splice length — a typical splice is around 20ms long, but it has to be related to the lowest frequency you're processing. If you're going down to 50Hz, 20ms could be appropriate. A very rhythmic splicing pattern will sound annoying, but by setting the algorithm parameters to match the material, better results can be achieved."
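The cut‑and‑splice principle can be shown in its most naive form: overlap‑add roughly 20ms windows to stretch the signal in time, then resample it back to the original duration, which scales the pitch. This sketch has none of the intelligent splice‑point searching discussed above, which is precisely what separates a crude shifter from a good one, so treat it as a baseline illustration only.

```python
import numpy as np

def splice_pitch_shift(x, ratio, sr=48000, splice_ms=20):
    """Naive splice pitch shifter: time-stretch by overlap-adding
    Hann-windowed grains, then resample to restore the duration."""
    w = int(splice_ms * sr / 1000)     # splice window length
    hop_out = w // 2                   # synthesis hop (50% overlap)
    hop_in = int(hop_out / ratio)      # analysis hop; stretch = ratio
    win = np.hanning(w)
    n_frames = (len(x) - w) // hop_in
    y = np.zeros(n_frames * hop_out + w)
    for i in range(n_frames):
        y[i * hop_out:i * hop_out + w] += win * x[i * hop_in:i * hop_in + w]
    # Resampling the stretched signal scales the pitch by `ratio`.
    t = np.arange(0, len(y) - 1, ratio)
    return np.interp(t, np.arange(len(y)), y)

sr = 48000
tone = np.sin(2 * np.pi * 200 * np.arange(24000) / sr)   # 0.5s at 200Hz
shifted = splice_pitch_shift(tone, ratio=2.0, sr=sr)     # one octave up
```

Because the grains here are spliced blindly, arbitrary material would show exactly the phase discontinuities and doubled transients the interview goes on to discuss.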
A common problem with basic pitch‑shifters is that transients are often doubled as they are used twice when pitch‑shifting upwards. In listening tests with your unit, the sound seemed very smooth with no sign of this problem. Do you use any special safeguards against this happening?
"Yes. The basic splicing algorithm is optimised for polyphonic signals, but you can give it other priorities. Also, the crossfade between consecutive splices is very short compared to most pitch‑shifters. With older systems, where the best splicing point couldn't be predicted so accurately, you had to use a longer crossfade to hide it, and that caused other artifacts."
We've talked about matching the splice points to maintain stereo integrity, but how does it work when you want to process a full surround signal?
"We have different ways of weighting the channels. Normally the system weights the left/right and front centre speakers 3dB above the surrounds when it works out the best splice points, but you can change that if you need to. You might only want to look at the centre speaker (as that's where most of the dialogue is) or you may want to look mainly at the left/right front. There's also an unlocked mode where the centre speaker is treated independently — in some cases the centre speaker will be all dialogue while the music will be carried by the left and right speakers, so you don't care so much about absolute phase relationship between the dialogue and the music. The main thing is that the user can choose."
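The channel weighting described above can be sketched as a splice‑point search that scores each channel's waveform similarity and weights the scores before summing, with the fronts roughly 3dB (a factor of about 1.41 in amplitude) above the surrounds. This is a hypothetical two‑channel toy, not TC's actual search.

```python
import numpy as np

def best_splice_point(tail, nxt, weights, max_lag):
    """Pick one splice offset for all channels of a surround stream.

    `tail` is the (channels, samples) grain being spliced out of;
    `nxt` holds candidate material. Each channel's similarity is a
    dot product, weighted per channel before summing."""
    n = tail.shape[1]
    scores = []
    for lag in range(max_lag):
        seg = nxt[:, lag:lag + n]
        scores.append(sum(w * float(np.dot(c, s))
                          for w, c, s in zip(weights, tail, seg)))
    return int(np.argmax(scores))

# Toy usage: both channels carry a sine that lines up at offset 7,
# with a front channel weighted 3dB above a surround channel.
k = np.arange(64)
grain = np.stack([np.sin(2 * np.pi * k / 32)] * 2)
cand = np.stack([np.sin(2 * np.pi * (np.arange(80) - 7) / 32)] * 2)
lag = best_splice_point(grain, cand, weights=[1.41, 1.0], max_lag=16)
```

Changing the weights is what lets the engineer make the search follow the dialogue in the centre, the front pair, or everything at once.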
5.1 Multi‑band Dynamics
Because the System 6000 is a DSP‑based processing engine, it can be programmed to do things other than reverberation, such as dynamics control: "Reverb is clearly the number one priority, but because it is a multi‑channel, 96kHz processor, we can also run additional algorithms. We have done multi‑band compression for a while, so we have a multi‑band 5.1 expander/compressor already. There are different ways to control the side‑chains, but as with stereo, it's important that the channels can be linked to prevent image shifts. The multi‑band system is followed by brick‑wall limiters, so if you only need sample‑accurate limiting, you don't have to employ compression or gain optimisation at all. As supplied, the system will handle reverb with both stereo and surround dynamics available as add‑on packages. There's also EQ and a 5.1 toolbox to perform calibration tasks, down‑mix, bass management and so on, so the software will be able to undertake a number of mastering functions, and it's likely that we will add other mastering and restoration software, such as denoising."
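The multi‑band‑then‑brick‑wall signal chain Lund describes can be sketched in miniature: split the signal into bands, compress each, sum, then clamp the result so no sample exceeds a ceiling. The sketch is mono and two‑band with invented thresholds, and the final stage is an instantaneous clamp rather than a proper look‑ahead limiter, so it only illustrates the ordering of the stages, not the System 6000's processing.

```python
import numpy as np

def one_pole_lowpass(x, fc, sr):
    """First-order low-pass; the high band is the residual x - low."""
    a = np.exp(-2 * np.pi * fc / sr)
    y = np.zeros(len(x))
    for n in range(1, len(x)):
        y[n] = (1 - a) * x[n] + a * y[n - 1]
    return y

def compress_band(x, threshold, ratio, sr, attack_ms=5.0, release_ms=100.0):
    """Feed-forward compressor with a one-pole envelope follower."""
    atk = np.exp(-1.0 / (attack_ms * sr / 1000))
    rel = np.exp(-1.0 / (release_ms * sr / 1000))
    env = 0.0
    out = np.zeros(len(x))
    for n in range(len(x)):
        level = abs(x[n])
        coeff = atk if level > env else rel
        env = coeff * env + (1 - coeff) * level
        if env > threshold:
            gain = (threshold + (env - threshold) / ratio) / env
        else:
            gain = 1.0
        out[n] = gain * x[n]
    return out

def multiband_then_limit(x, sr=48000, ceiling=0.9):
    low = one_pole_lowpass(x, 300.0, sr)    # crude two-band split
    high = x - low
    y = compress_band(low, 0.5, 4.0, sr) + compress_band(high, 0.5, 4.0, sr)
    # Brick-wall stage: clamp so no sample exceeds the ceiling
    # (a real limiter would add look-ahead gain and a release curve).
    return np.clip(y, -ceiling, ceiling)
```

In a linked 5.1 version, the side‑chain envelope would be shared across channels so the gain moves together and the image doesn't shift, as the interview notes.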