MONITORS DEMYSTIFIED
Part 1: Compromise and Approximation
Published in SOS October 2000
Technique : Theory + Technical
If you stop to think about it (though few people ever do), monitors are a fantastically important part of our studios. You use these lumps of chipboard, plastic, paper and glue on every recording you make, and they colour your sound more fundamentally than pretty much anything else. Perhaps the lack of consideration they are normally given is because monitors are so simple to use. They rarely need adjustment, have no software, no internal memory, no processor, no user manual (well, at least, not one you can find), never need an update and probably haven't even got a power socket.

Once a month or so you'll probably see, and no doubt even read, a monitor review in this very magazine. Typically, the reviewer will describe the design of the product, perhaps write a little about the engineering propaganda being disseminated by the manufacturer and then move on to describe some of the subjective audible characteristics that the monitor 'imprints' on recordings.

This imprinting process is analogous to another situation involving a piece of equipment many of us have in our studios, which is also called a monitor, rather confusingly: the screen on your computer. If you do any graphic design and/or computer-based illustration, you'll know how easy it is to be caught out by the colour distortions that screens add to everything you draw. It's especially frustrating if you also use a printer. The screen and printer colours are very unlikely to match and neither will be nominally 'accurate'. However, in the case of colour-matching, there are calibration procedures, software patches and set-up routines that, if you have the time and patience, can be used to minimise the difference between input and output. There's nothing of this kind with monitor speakers, however, even though the basic problem (matching an input signal with an output) is similar.

So why are speakers different? How is it that a 'simple' pair of speakers can imprint such audible characteristics on music?
What are the mechanisms at play, and given the existence of these mechanisms, how do speaker designers decide how their products should sound? Do the specifications that manufacturers publish have any value? And perhaps most importantly, is it possible to be smarter in our choice and use of monitors so that we can minimise, or at least better understand, the contribution they make to the sound of our recordings? Well, yes it is, and with only the analysis of a typical, smallish nearfield monitor for a safety net (and no, I'm not going to tell you which one it is, that wouldn't be fair), I'm going to make an attempt over the next couple of issues to shed a little light on the dark, mysterious world of monitor speakers and what horrible things they might be doing to your lovingly crafted music.

The Art Of Compromise

It's no fun being a speaker designer: your entire professional career is devoted to making the least awful sounds you can out of beautiful ones. Give speaker designers the perfectly recorded sound of a Stradivarius, a Martin, a Steinway or a voice and deep inside they'll know the very best they can do is compromise and approximate (slice a speaker designer and, like Blackpool rock, you'd probably find those two words printed right through the middle). The compromises start as soon as a new product is conceived and they arrive from three directions: first, the inconvenient fact that our ears are cleverer at listening than speakers are at speaking; second, the more inconvenient fact that users tend not to have unlimited space or cash; and third, the even more inconvenient fact that, as yet, nobody has developed a technique for telling the laws of physics who's boss. However, because they are the fundamental villains of the piece, I'll start this month with the laws of physics.

Relatively speaking, music is a wide-bandwidth beast with a wide dynamic range.
A speaker is limited in both bandwidth and dynamic range (the latter perhaps to a far greater degree than many realise). When speaker designers try to widen both the bandwidth and the dynamic range of their products, the laws of physics exact a heavy penalty and the listener hears it happening. What follows is an explanation of the classic low-frequency bandwidth versus dynamics trade-off, illustrated by a few acoustic measurements from the typical nearfield monitor I borrowed to write this piece.

Pass The Port

Like many speakers we can all name (there are pictures of several dotted around this piece in case your memory needs jogging), our No-Name Acoustics nearfield comprises a couple of drive units mounted on the front surface of an 'airtight' box. The box is there to suppress the output from the rear of the bass driver so that, at low frequencies, it doesn't cancel the output from the front. In addition to this (and providing somewhere for you to put your soft-toy studio mascots), the box also fundamentally defines the low-frequency limit for the system, as the 'stiffness' of the air inside makes it harder and harder for the drive unit cone to move as the output frequency falls (the cone displacement required to generate constant acoustic power rises steeply as you move down the frequency spectrum, roughly quadrupling for every octave).
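To put a rough number on that last point, here is a minimal sketch in Python. It assumes an idealised rigid piston that is small compared with the wavelength, for which the excursion needed for constant acoustic output scales with the inverse square of frequency. The function name and the 200Hz reference are purely illustrative, and a real driver in a real box will deviate from this.

```python
def relative_excursion(freq_hz, ref_hz=200.0):
    """Cone excursion needed at freq_hz for the same acoustic output as at
    ref_hz, expressed as a multiple of the excursion at ref_hz.

    Assumption: ideal small piston, so excursion is proportional to
    1 / frequency squared. This ignores the box, the port and the driver's
    own limits.
    """
    return (ref_hz / freq_hz) ** 2


# Step down one octave at a time from the (illustrative) 200Hz reference.
for freq_hz in (200.0, 100.0, 50.0, 25.0):
    factor = relative_excursion(freq_hz)
    print(f"{freq_hz:5.0f} Hz needs {factor:4.0f}x the excursion required at 200 Hz")
```

Under this assumption each octave down asks for four times the cone travel, so three octaves down the cone has to move 64 times as far for the same output.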
Here's why not.

Firstly, the extension of low-frequency bandwidth produced by reflex loading comes at the expense of dynamic accuracy, as the amplifier now not only has to control the movement of the drive unit cone but also the air in the tube.

Secondly, reflex loading causes the system to display rapid change of phase with frequency, and if you express phase change as time you can see that a reflex-loaded speaker effectively adds a delay to low frequencies. That time delay can be expressed as a distance (ie. the speed of sound multiplied by time; more on this later) with the result that when speaker designers choose reflex loading they are also effectively choosing to move the bass player's fundamentals back three metres or so. And you thought kicking him was the only way to do that...

Thirdly, as soon as you increase the sound level to a point where the air passing through the port becomes turbulent (as any substance does eventually when put through a pipe at a certain speed; it's those dastardly laws of physics again), the air becomes a non-linear mess. At best, the port will make some odd farting noises; at worst, it effectively stops working at all. However, you can make the air continue to flow in a linear, non-turbulent fashion at higher sound levels by designing the exit surface of the port in a flared shape. Unsurprisingly, the manufacturers of the No-Name Nearfield have taken exactly this approach (see the 'Load Of Balls?' box on page 194 for another way of reducing port turbulence).

Fourthly, with or without generous flaring on entry and exit, a reflex port is fundamentally non-linear. The acoustic impedance (ie. resistance to movement) of the big wide world on the outside of the box is obviously not the same as that inside the box, so the flow dynamics as the air in the port rushes outward are not the same as when it rushes inwards.
As a result, the average air pressure inside the box will drift away from the nominal atmospheric pressure outside, and the bass driver's voice coil will take on an average offset away from its nominal rest position. This offset can be relatively innocuous and result only in a slight increase in low-frequency harmonic distortion, but if the driver happens to have non-linearities in its magnet system that cause an offset in the same direction, it can be completely disastrous as the cone suddenly 'locks out' at one end of its travel. I've seen this happen in a reflex-loaded bass guitar cabinet fitted with a very well-regarded American bass driver. It's quite an impressive trick, and boy does it quieten the bass player...

Grasping The Graphs

That gives you an idea of the theoretical compromises involved in reflex loading. Now let's see how they affect the No-Name Nearfield in practice, by examining its frequency response graphically.
The black curve on Figure 2 shows the impulse response of the No-Name over the same frequency range: in other words, a plot of the excursion of the speaker cone against time when a fast click or impulse is put through it. If you're surprised that the movement of the cone is expressed on the left-hand axis in Volts, don't be: speakers work because a voltage put through a moving-coil driver creates movement, after all. The negative voltages simply indicate that the cone is behind its usual rest position (which is equal to 0V). Now you know what the graph means, can you see how much difficulty the speaker has in stopping after the impulse is put through it? The characteristic frequency of the ringing overhang is equal to the resonant frequency of the reflex port. If a bass player, say, were to play a note at that frequency and stop instantaneously (we can all dream...) the speaker would add that resonant tail to his playing (in fact he wouldn't need to be playing a note at exactly the port resonance, just somewhere near it would do). The No-Name Nearfield actually has a pretty well-behaved and well-damped port, but in extreme circumstances, where a highly resonant port has been chosen, the port resonant frequency overhang can begin to affect the accuracy with which a speaker reproduces pitch. It can make the pitch of things sound slightly hazy, if not definitely out of tune. And you thought it was that bass player trying his fretless...

Again, there's a green overlay in Figure 2 showing the time-domain behaviour of the No-Name with the port blocked up. Notice how much better the cone stops moving?

Figure 3 is a graph of delay in milliseconds plotted against frequency in Hertz. The black curve illustrates the added time delay of the No-Name at low frequencies (this is known as the 'group delay' and is phase change expressed as time, as explained earlier). Again, a green overlay shows what happens without the port.
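For readers who like to see 'phase change expressed as time' worked through, here is a minimal Python sketch. It is not a model of the No-Name: it uses the common textbook simplification that a sealed box behaves like a second-order high-pass filter and a reflex box like a fourth-order one, and it assumes Butterworth alignments with the same 45Hz cutoff for both, which a real speaker with its port blocked would not keep. The function names are hypothetical, and the 330 metres per second figure is simply the speed of sound used elsewhere in this article.

```python
import cmath
import math

SPEED_OF_SOUND_M_PER_S = 330.0  # figure used in the article


def butterworth_highpass(order, cutoff_hz):
    """Return a function giving the complex response of an idealised
    Butterworth high-pass filter of the given order (an assumed stand-in
    for a sealed box, order 2, or a reflex box, order 4)."""
    w0 = 2.0 * math.pi * cutoff_hz
    # Left-half-plane Butterworth poles, evenly spaced on a circle of radius w0.
    poles = [w0 * cmath.exp(1j * math.pi * (2 * k + order + 1) / (2 * order))
             for k in range(order)]

    def response(freq_hz):
        s = 1j * 2.0 * math.pi * freq_hz
        h = complex(1.0)
        for pole in poles:
            h *= s / (s - pole)
        return h

    return response


def group_delay_s(response, freq_hz, step_hz=0.01):
    """Group delay = minus the slope of phase against angular frequency,
    estimated here with a small central difference."""
    ratio = response(freq_hz + step_hz) / response(freq_hz - step_hz)
    phase_change_rad = cmath.phase(ratio)
    angular_step = 2.0 * math.pi * (2.0 * step_hz)
    return -phase_change_rad / angular_step


test_freq_hz = 45.0  # illustrative choice, equal to the assumed cutoff
sealed_delay_s = group_delay_s(butterworth_highpass(2, 45.0), test_freq_hz)
ported_delay_s = group_delay_s(butterworth_highpass(4, 45.0), test_freq_hz)
extra_delay_s = ported_delay_s - sealed_delay_s
extra_distance_m = extra_delay_s * SPEED_OF_SOUND_M_PER_S

print(f"sealed model: {sealed_delay_s * 1000:.1f} ms")
print(f"ported model: {ported_delay_s * 1000:.1f} ms")
print(f"extra delay:  {extra_delay_s * 1000:.1f} ms, or {extra_distance_m:.1f} m of air")
```

With these assumed values the fourth-order 'ported' model arrives several milliseconds later than the second-order 'sealed' one at the test frequency, which is the same kind of gap the black and green curves show. Multiplying that extra delay by the speed of sound gives the equivalent distance.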
Comparing the black line with the green in Figure 3, and taking a specific example, you can see from the graph that a bottom E at 44Hz with the port occurs over 10 milliseconds later than without it. Since the speed of sound is roughly 330 metres per second, this delay is equivalent to moving bottom E back over three metres (as mentioned earlier). However, the real-world specific audibility of such low-frequency time delay effects is a subject of much discussion and argument among speaker folk and hi-fi reviewers. I sit firmly on the fence, and I've included the group delay graph primarily to illustrate the complexity of the issue (and so that I could do that gag about the bass player's fundamentals). There are so many factors that influence the perception of low-frequency performance that it's pretty much impossible to identify one and be sure of its guilt. I think there's little doubt though that the choice of a low-frequency response that seriously distorts the dynamic, temporal and even pitch information present in the signal, all in the name of a little more bandwidth, is not the right one for a monitoring tool.

Perhaps the most interesting measurement on the low-frequency end of the No-Name Nearfield, especially in the context of a speaker's dynamic range failings, is illustrated in Figure 4. This graph of output level against frequency shows the differing frequency response at two different (but fairly low) drive levels. This time the green curve has nothing to do with the presence of the port or not; it simply represents the response of the ported No-Name at a higher drive level than the black curve. This shows the compression introduced predominantly by the port, as the two curves should be the same. The difference is only about 1dB from 30 to 60Hz, but then the green curve's drive level wasn't particularly high either. At higher levels I would expect the port compression effect to quickly become more obvious.
And if the No-Name didn't have a reasonably well-flared port exit I'd expect the port compression to become, if not specifically audible, then certainly a significant influence on the way compression was used in a mix.

Now, you probably read that paragraph and felt cheated because I said it was interesting. Bear with me; here's why. The No-Name was lent to me by its owner (brave man), and one of his subjective feelings about the product is that it sounds a little 'bass-light' until it's being driven reasonably hard. Now that looks to be at odds with Figure 4, which shows the low-frequency level decreasing as the product is driven harder. So what's going on? Well, it's hard to say exactly without carrying out a lot more measurements, but based on my experience with other monitors, I am in a position to speculate why the readings I have taken seem at odds with the owner's carefully-formed, long-term opinion.

Firstly, it's well known that, far from being flat, the frequency response of the human ear is level-dependent at both frequency extremes. Secondly, the actual situation is far more complex than can be revealed by one or two measurements. It's quite possible, for example, that after initially falling with increasing level, the response of the No-Name develops a resonant peak as its port starts to misbehave more seriously. Or perhaps the resonant low-frequency energy that the speaker begins to generate as other compression effects and non-linearities begin to unfold is perceived as the missing bass.

Whatever the underlying mechanisms for the subjective feeling, measurements of the No-Name reveal that it has a pretty well-behaved and sensible low-frequency response. So, rather than drive the speakers into non-linear behaviour in order to feel they are working properly, it might be better to change their position in the room (nearer to a rear wall or corner) so that they sound 'right' at lower levels, which produces more linear behaviour.
In turn, this 'linear' solution is likelier to result in monitoring that more accurately reflects the recording and the mix.

Tips For The Ported

All of the above might read like a character assassination of any speaker with a reflex port, but that's really not my intention. Thoughtfully and carefully conceived reflex loading (and I'd include the No-Name in that category) can work well, but the technique by its very nature has characteristics that it pays to be aware of. So, if you have a pair, how best to work with reflex-loaded monitors? Maybe try a few of the following little experiments.
Have a good critical listen to some simple low-frequency material recorded at different levels. Maybe record a bass guitar or keyboard piece especially for the purpose. Does the quality and character of the sound change alarmingly with level? If it does, you can bring this knowledge to bear when you record, or more likely when you mix.

Put a sock in it, literally. If you have a pair of passive reflex-loaded monitors you'll do no harm just having a listen to how they behave with the ports blocked. The measurements on the No-Name Nearfield illustrate the fundamental change that blocking the port can bring about on the low-frequency behaviour of the system. If you have a bass problem on a mix, blocking the port is as useful as trying a different monitor or even room, perhaps more so, because you're only changing one variable. Don't, however, just listen to bass level: there's bound to be less bass with the port blocked. Listen instead to how the bass character changes with level, and to how clearly you can identify the pitch of the bass notes in each case, with port blocked and unblocked.

Next month, we'll explore more deep and dark speaker mysteries, this time further up the bandwidth, including the complete tosh that is the typical 'frequency response' curve. Now, where's my anorak got to...?