Loudspeakers are traditionally designed for either monitor or hi-fi applications, but is there actually any difference? And just how suitable are hi-fi speakers for nearfield monitoring in the project studio? In part one this month, we examine measured responses.
"Can I use my hi-fi speakers as nearfield monitors?" It's a question often asked of Sound On Sound, but it's hard to answer, unless you're the kind of person that's satisfied with the reply 'it depends'. To be a little more precise (but not a lot), it depends on the type of material you're recording, your aspirations for it, the specific hi-fi speakers you have (the term can cover a multitude of interpretations) and the recording environment you're working in.
Of course, at one level, you can use any old speakers as monitors, nearfield or otherwise, but whatever you use, you can guarantee one thing: the characteristics and idiosyncrasies of your chosen monitors will colour the sound of everything you record and mix. And if you've used speakers that demonstrate extremes, they may well leave the results of all your efforts sounding far from as intended when your music is played back on any other system.
So monitoring is a vital part of the recording chain — but that doesn't answer the hi-fi speakers question. Even at the end of this two-part feature the answer is likely to be a shade of grey rather than black or white. But through a technical analysis and investigation of four speakers — two that carry the 'hi-fi' label and two the 'pro' — I'm going to try and discover a little about the typical differences between them. And perhaps that question won't then be quite so hard to answer.
The four quivering, apprehensive guinea-pig speakers are, in the 'hi-fi' corner, the B&W DM303 and Wharfedale Diamond 8.2, and, in the 'pro' corner, the Dynaudio BM5 and the KRK K‑Rok. Both the B&W and the Wharfedale were launched recently, and both have been pretty well reviewed in the hi-fi press and very widely distributed. The Dynaudio and KRK have been established in the market for much longer (and indeed both have been reviewed in Sound On Sound before — see KRK SOS June 1995 and Dynaudio SOS June 1996 respectively).
The longer-established nature of the 'pro' models perhaps reflects the less rapid turnover of new models in the 'pro' sector, but both are, similarly, widely used products from well-respected manufacturers. In fact, there are some strong similarities between all four. They're all of similar physical size, and are all passive two-way systems. They all feature 6.5-inch bass/mid drivers and one-inch dome tweeters. And they all use reflex bass loading, a design technique used to increase the perceived amount of bass a speaker can produce (for more on this, see part one of my previous article on loudspeakers in the SOS October 2000 issue).
However, there's one significant difference between our guinea pigs — price. The 'hi-fi' speakers will set you back around half the cost of the 'pro' monitors. Perhaps this explains why the nearfield question comes up so regularly in the SOS Forum?
The archetypal studio nearfield monitor, Yamaha's NS10M, is well known to have (put politely) a 'characterful' tonal balance. The general consensus seems to be that it is uneven through the mid-range and too bright at the top (hence the commonly employed trick of hanging tissue paper over the tweeters to calm it down). So if the Yamaha's balance is questionable, what should we be looking for in the perceived tonal balance of a nearfield monitor?
In two words, we should be looking for the 'neutral average': neither too bright nor too dull. If the aspirations we have for our work are that it should sound tonally acceptable on the widest range of systems out there, from hi-fi systems costing many thousands to AM radio (or its low-bit-rate MP3 equivalent), then the perceived tonal balance of our nearfield monitor, which will be used up close and probably with its back to the wall, should be as close to the 'population average' as possible. The same is probably true of a hi-fi speaker, because in balance terms, the hi-fi speaker is nothing but a monitor that's used a few stages down the line (i.e. by the consumer of the music you're mixing).
A neutral tonal balance, however, is not the same thing as a flat axial frequency response — this is a common mistake. The perceived tonal balance of a speaker is the combination of the direct sound from the drivers and reflected sound from nearby surfaces, and, as speakers are directional devices (especially at high frequencies), a flat axial frequency response doesn't mean that a loudspeaker will necessarily sound that way. But we can work backwards, and predict generally how the axial response might look for different types of loudspeakers if they are to sound neutral. Firstly, a hi-fi speaker that's positioned perhaps four metres away from the listener and is mounted a little away from the wall on a floor stand should have a subtly different response shape compared to a nearfield monitor that's maybe one to two metres away and mounted on a wall. Where the nearfield should demonstrate a slightly up-shelved response at a few hundred Hertz combined with a slow roll-off at either frequency extreme, the hi-fi speaker, unless it has been specifically balanced for use against a wall, should probably be closer to flat.
Secondly, if a nearfield monitor is to be used in a small room, where strong reflections from the side walls will reach the ear within a few milliseconds, the shape of the horizontal off-axis response is vital too. Wild variations between on-axis and off-axis responses are well known to result in perceived tonal imbalances, so if the monitor is not to sound unnaturally coloured, its off-axis response should be as close to a gently down-tilted version of the axial response as possible. This is one very good reason why landscape-mounted nearfield monitors tend to be a bad idea: the horizontal off-axis response from laterally adjacent drivers will almost certainly display major discontinuities through the region where the drivers' outputs overlap.
But, as I'm probably too fond of writing, there's more to a speaker than its simple frequency response, and there are a few other characteristics that a loudspeaker that aims for accuracy ought to possess. And while we'll concentrate this month on frequency-response issues, next month we'll look at power-handling and compression effects.
The term 'nearfield monitor' was an invention of the early '80s. It just about predates the explosive rise of the home and project studio and was originally the term applied to auxiliary monitors that sat on the meterbridge in large commercial studios, and were supposed to reflect the sound of typical home audio or TV speakers.
One speaker originally defined the breed: the Auratone 5C. The Auratone was, and is, little more than a five-inch 'full-range' driver screwed into a small cube-shaped enclosure. It had little pretence to audio accuracy or wide bandwidth, and was simply intended to provide a reference for the likely sound of recordings when reproduced on an AM radio, or via a TV. So the Auratone was not really a 'nearfield' in the sense that we understand the term now, but it did set a precedent for auxiliary monitors, and prepared the ground for the second nearfield icon — the Yamaha NS10M.
The early '80s also saw the rise of freelance 'celebrity' engineers, and I suspect it was one or two of these, carrying a few items of favoured gear from studio to studio, that first introduced the NS10M to the world. As studios began to realise that equipping with favoured gear helped to attract the celebrities, NS10Ms began to pop up everywhere, taking up a position on the meterbridge next to the Auratones. The role of the NS10M, however, was not to mimic the lo-fi performance of a TV speaker, but to offer a level of performance and sound that reflected that of a domestic hi-fi. In fact, I believe the Yamaha was derived from a domestic hi-fi product, which, in the context of the question asked at the start of this article, is perhaps significant. But the NS10M had something more going for it. Probably by accident, it displayed a pretty characterful tonal balance and this perhaps helped it become the nearfield benchmark, as material mixed on NS10Ms sounded 'wrong' on anything else. The balance of the NS10Ms also resulted in many a discussion about the exact brand of tissue paper that should be draped over the tweeter in order to dull the balance a little. So despite becoming the industry-standard nearfield monitor, the NS10M has always provoked derogatory mutterings about its sound and tonal balance. What's more, the niche it opened up was soon crowded by countless competitors.
We now live in different times. The huge studios, if not quite heading the way of the dinosaurs (there'll always be a role for recording spaces the size of tennis courts, and mixers that could do with a Burger King at the halfway point), have long been under threat from small-scale recording spaces and control rooms. And being very much smaller than of old, the typical control room now has little space for vast main monitors. These days, the nearfield has had a promotion. More often than not, it's now out on its own, the top dog.
The frequency-response measurement of each of our guinea-pig speakers shows scant evidence of subtle tailoring to suit different roles. The curves are primarily dominated by the particular strengths and weaknesses of the drivers used, the size and proportions of their front panels and the low-frequency roll-off shapes chosen. I say primarily dominated because, particularly in the design of the hi-fi speakers, there does tend to be a certain amount of balance-tweaking, engineered to help ensure that a speaker reviews well and sounds competitive in the retail environment. But there's certainly no obvious split here between 'hi-fi' and 'pro' in terms of frequency response.
Figures 1 to 4 illustrate the on-axis and 30-degree off-axis frequency response of each of the four speakers. Figures 1b to 4b additionally show a 'waterfall' plot indicating how each speaker's response decays with time. The waterfall is a 3D representation of the speaker's response to a wide-bandwidth signal that stops instantaneously. The Y-axis is level in dB, the X-axis is frequency from 200Hz to 20kHz, and time runs from back to front on the Z-axis (measurement constraints limit the length of time window available as frequency falls). The waterfall plots therefore illustrate how good the speakers are at switching off, and the Z-axis plot for a notional 'perfect speaker' would be empty. Any signals within the plot occurring after zero time represent the decay of mechanical or acoustic resonances. This is output that the speaker adds to the intended signal, colouring the sound and effectively degrading the signal-to-noise ratio.
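For readers who want to see how a waterfall of this kind can be derived, the following is a minimal sketch of a cumulative spectral decay calculation. It is not the measurement system used for the figures here. The impulse response is synthetic (an ideal click plus a hypothetical decaying 1kHz resonance), the 8kHz sample rate is an assumption chosen to keep the example small, and the plain rectangular truncation stands in for the shaped windows a real analyser would probably use.

```python
import cmath
import math

def dft_magnitude_db(samples, n_bins):
    """Magnitude in dB of the first n_bins DFT bins of a real sequence."""
    n = len(samples)
    out = []
    for k in range(n_bins):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        out.append(20 * math.log10(max(abs(acc), 1e-12)))
    return out

def waterfall(impulse_response, n_slices, step):
    """Cumulative spectral decay: each later slice ignores `step` more
    samples at the start of the impulse response."""
    n = len(impulse_response)
    slices = []
    for s in range(n_slices):
        start = s * step
        windowed = [0.0] * start + list(impulse_response[start:])
        slices.append(dft_magnitude_db(windowed, n // 2))
    return slices

# Hypothetical test signal, not data from any of the four speakers.
FS = 8000   # sample rate in Hz (assumed)
N = 256     # impulse response length in samples (assumed)
ir = [0.0] * N
ir[0] = 1.0  # the 'perfect' part: a click that stops instantly
for i in range(N):
    # the imperfect part: a resonance at 1 kHz that rings on
    ir[i] += 0.2 * math.exp(-i / 40.0) * math.sin(2 * math.pi * 1000 * i / FS)

slices = waterfall(ir, n_slices=4, step=16)
res_bin = round(1000 * N / FS)  # DFT bin nearest 1 kHz
for s, spectrum in enumerate(slices):
    print(f"slice {s}: {spectrum[res_bin]:.1f} dB at 1 kHz")
```

Once the click has passed out of the analysis window, the only thing left in the later slices is the ridge at the resonant frequency, which is the sort of feature the waterfall plots make visible.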
The curves were generated through a combination of direct acoustic measurement and, below 150Hz, prediction and synthesis from the measured electroacoustic parameters (see the box on measuring low frequencies at the end of this article for more on this subject). The vertical scale of each diagram shows sound pressure level in dB measured at a distance of one metre, and the curves are calibrated for a nominal 1W input into 8Ω. Each curve reveals some interesting characteristics of the speaker in question.
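The '1W into 8Ω' calibration can be put into numbers with a couple of lines of arithmetic. The sketch below assumes a purely resistive load, which a real speaker is not, and the 88dB sensitivity figure is a hypothetical value for illustration rather than a measurement of any of the four speakers. The function names are my own.

```python
import math

def drive_voltage(power_watts, impedance_ohms):
    """RMS voltage that dissipates power_watts in a purely resistive load."""
    return math.sqrt(power_watts * impedance_ohms)

def spl_at_power(sensitivity_db, power_watts):
    """SPL at 1 m for a given input power, starting from a 1 W / 1 m
    sensitivity figure and ignoring any compression effects."""
    return sensitivity_db + 10 * math.log10(power_watts)

volts = drive_voltage(1.0, 8.0)
print(f"1 W into 8 ohms needs about {volts:.2f} V RMS")

# Hypothetical 88 dB sensitivity, driven with 10 W
print(f"{spl_at_power(88.0, 10.0):.1f} dB SPL at 1 m")
```

This is why sensitivity is often quoted for a 2.83V input: it is the voltage that corresponds to a nominal 1W into 8Ω.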
Figure 1, the graph of the B&W DM303, is probably the tidiest basic frequency response of the four and, along with that of the Dynaudio BM5, has a shape that should be pretty well suited to nearfield use in a small control room. The response is characterised by an early but gentle bass roll-off and well controlled, wide dispersion — the tweeter is only 6dB or so down at 20kHz. The B&W also demonstrates a definite lump in its response between 4kHz and 8kHz, however. This will be audible, particularly as emphasis on sibilance and cymbals, and is in a critical region as far as balance perception is concerned. It will also tend to give the speaker an explicit, detailed kind of sound; I wonder if it may be an intentional part of the 'voicing' of a hi-fi product.
The B&W's tidy frequency response is accompanied by a similarly good waterfall plot (see Figure 1b). There's very little to note, which suggests that, apart from the enclosure panel resonance effects that all these speakers will display, the DM303 will make little contribution of its own to the sound. For a hi-fi speaker, the DM303 is a fine nearfield monitor, at least in terms of its frequency response.
Figure 2, the graph of the KRK K‑Rok, could hardly be more different. Two features dominate. First, the bass-response shape chosen is far more extended than that of the B&W, but it is bass extension at the expense of a resonant peak at 70Hz and consequently of transient behaviour. The K‑Rok will play 70Hz for many milliseconds after a signal around that frequency has stopped. In a small control room, with the speakers positioned close to a wall, bass on the K‑Roks is likely to be significantly emphasised. Certainly, the added low-frequency bandwidth will mean that bass signals all but inaudible on the B&W will be heard, but its resonant nature is likely to result in bass-light mixes.
The second interesting feature of the K‑Roks' response is a sharp notch at 750Hz — more on this in the following paragraph. The rest is relatively tidy, with good dispersion control up to 7kHz, although with a fast decay thereafter.
The KRK's waterfall plot, Figure 2b, is also quite a contrast to that of the B&W. The response notch at 750Hz is revealed as a strong and persistent resonance. The resonance is almost certainly the result of a mechanical mismatch between the bass/mid driver cone and its rubber surround, and may be audible as a distinctive mid-range character — one that we'd all be likely to try to suppress during recording or mixing. There's also more general 'hash' in the K‑Rok's waterfall, suggesting a generally higher 'noise floor' and less ability to resolve low-level detail. In terms of frequency response, it's hard to see where the 'professional' K‑Rok is a better nearfield monitor than the 'hi-fi' DM303.
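To give a feel for what 'many milliseconds' of ringing means, here is a rough sketch based on the textbook behaviour of a simple second-order resonance, whose envelope decays as exp(-πf₀t/Q). The Q values below are hypothetical, since I haven't quoted a Q for either the 70Hz or the 750Hz resonance, and a real driver or enclosure resonance may not be this well behaved.

```python
import math

def decay_time(f0_hz, q, drop_db):
    """Seconds for the envelope of a second-order resonance at f0_hz
    to fall by drop_db, given its quality factor q."""
    # Envelope is exp(-pi * f0 * t / Q); solve for the requested drop.
    return q * drop_db * math.log(10) / (20 * math.pi * f0_hz)

# Hypothetical Q values, for illustration only
for f0, q in [(70.0, 0.7), (70.0, 1.5), (750.0, 10.0)]:
    ms = 1000 * decay_time(f0, q, 20.0)
    print(f"{f0:.0f} Hz, Q={q}: about {ms:.1f} ms to fall 20 dB")
```

The point of the sketch is the proportionality: for a given frequency, doubling the Q doubles the time the speaker carries on playing after the signal has stopped.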
Figure 3, the plot of the Wharfedale Diamond 8.2, demonstrates another neat frequency response. However, an upper-bass to lower-mid emphasis, while not making it a bad speaker, perhaps rules out the 8.2 for monitoring duties if it is likely to be positioned close to a rear wall. Speakers demonstrate an effect analogous to the proximity effect with microphones. If a speaker is positioned close to a solid boundary (for example the wall behind), its natural tendency towards omnidirectional dispersion at low frequencies and narrow dispersion at higher frequencies will mean that only lower frequencies will be reflected forward and add to the perceived output. So a speaker such as the Wharfedale that already has an emphasis below a few hundred Hertz will begin to sound tonally unbalanced. And a neutral perceived tonal balance is one of our vital criteria for a monitor. The Wharfedale also has a discontinuity in its off-axis response at 12kHz that suggests a not entirely well-behaved tweeter. Generally the 8.2's down-tilted balance would probably result in over-bright mixes.
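The boundary effect described above can be sketched with a simple image-source model: the wall behind the speaker is treated as a second, delayed copy of the source. This is a deliberately crude illustration, assuming a listener far enough away that the extra path length doesn't reduce the level of the reflection. The `reflect` parameter is my own stand-in for how much of the speaker's output actually reaches the wall (near 1.0 at low frequencies where the speaker is omnidirectional, smaller higher up), and the 0.3m spacing is a hypothetical figure.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at roughly room temperature

def boundary_gain_db(freq_hz, distance_m, reflect=1.0):
    """On-axis level change caused by a wall distance_m behind the source.
    reflect is the relative strength of the rearward radiation (0 to 1)."""
    delay = 2.0 * distance_m / SPEED_OF_SOUND   # extra travel time via the wall
    phase = 2.0 * math.pi * freq_hz * delay
    re = 1.0 + reflect * math.cos(phase)
    im = -reflect * math.sin(phase)
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-12))

d = 0.3  # hypothetical distance from the drivers to the rear wall, in metres
for f in (20, 50, 100, 200):
    print(f"{f:>4} Hz: {boundary_gain_db(f, d):+.1f} dB")
print(f"first cancellation near {SPEED_OF_SOUND / (4 * d):.0f} Hz")
```

At the lowest frequencies the reflection arrives nearly in phase and adds up to 6dB, which is the emphasis that makes an already bass-heavy speaker sound unbalanced against a wall.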
The Wharfedale's waterfall plot has no obvious dominant problems, but is still less good than that of the exemplary B&W. There's evidence of some cone/surround mismatch problems around 1kHz but the energy decays quickly, and, although it's likely to add some character, it's relatively benign.
The final frequency response curve, Figure 4, is that of the Dynaudio BM5, and again there are a couple of features to note (see below). Firstly, I've included two 30-degree off-axis curves to reflect the fact that the BM5's tweeter is not placed symmetrically in the cabinet. The differences between the two off-axis curves are perhaps not as significant as they might look, as the mechanism of off-axis measurement (rotation of the speaker on a turntable) moves the tweeter either closer to or further from the microphone in addition to changing its angle. The sharp dip in the response at just over 4kHz is not a measurement artifact, however, and reveals that the integration of woofer and tweeter is less good on one side of the speaker. BM5s should be used with their tweeters facing inwards, but they are then at risk of coloration effects from tonally diverse side-wall reflections. Another feature of the BM5 is the sharp lump and dip in response between 4kHz and 5kHz. This feature, one that I've seen demonstrated by other Dynaudio bass/mid drivers, is, I suspect, a consequence of their unusual external voice-coil construction.
The response discontinuity shows up on Figure 4b, the waterfall plot of the BM5, but in contrast to the K‑Rok's resonance, the energy decays very quickly, so it's likely to be reasonably benign — although not completely inaudible. The BM5 waterfall also shows some evidence of a driver problem in the 1kHz region (maybe you're getting the message that speaker designers face a continual struggle against cone/surround resonance in the 1kHz region), and this could add some audible character.
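The turntable effect mentioned in connection with the BM5's two off-axis curves is simple geometry, and can be sketched as follows. The sketch assumes the rotation axis lies in the plane of the front baffle on the cabinet's centre line, with the microphone on axis at one metre. The 3cm tweeter offset is a hypothetical figure, not a measurement of the BM5.

```python
import math

def tweeter_distance(offset_m, angle_deg, mic_distance_m):
    """Distance from microphone to a tweeter mounted offset_m to one side
    of the rotation axis, after the cabinet is turned by angle_deg.
    Positive angles swing the tweeter towards the microphone."""
    theta = math.radians(angle_deg)
    x = offset_m * math.cos(theta)   # sideways position after rotation
    y = offset_m * math.sin(theta)   # movement towards the microphone
    return math.hypot(x, mic_distance_m - y)

OFFSET = 0.03  # hypothetical tweeter offset from the centre line, in metres
near = tweeter_distance(OFFSET, +30.0, 1.0)
far = tweeter_distance(OFFSET, -30.0, 1.0)
print(f"tweeter to mic: {near:.3f} m one way, {far:.3f} m the other")
print(f"path difference: {100 * (far - near):.1f} cm")
```

With these assumed numbers the two 30-degree positions differ by about 3cm of path length, which is a sizeable fraction of a wavelength at a few kilohertz, so part of the difference between the two curves comes from the measurement geometry rather than from the speaker alone.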
So, in terms of frequency response measurement, there's no obvious split between 'pro' and 'hi-fi' among these four speakers. In fact, if anything, they fall into two rather different groups, with the B&W and Dynaudio offering a balance appropriate for nearfield use in small rooms (coincidentally, the Dynaudio and B&W have low-frequency response shapes so similar they could almost be a pair), while the Wharfedale and KRK offer something more suited to listening at a greater distance in larger rooms. The B&W is also the best in terms of resonant behaviour.
In Part 2, I'll be giving the guinea pigs a harder time, by looking at some of the power-handling and compression issues that might sort the 'pro' from the 'hi-fi'. So while 'hi-fi', in the shape of the B&W, looks to have a lead at this stage, don't count any chickens just yet!
It is notoriously difficult to measure the low-frequency response of a loudspeaker with any accuracy. With wavelengths measured in many metres and room effects dominating, the chances of a measurement microphone capturing any truly reliable data are remote — even in an anechoic chamber. However, the electroacoustic parameters of a speaker that define its low-frequency characteristics (box volume, driver compliance, magnetic flux density, and so on) can be derived from its input impedance, then analysed in terms of classical analogue filter theory, and finally turned into a frequency-response prediction. With care in measurement and analysis, predictions can be accurate to within 0.5dB, which is far more accurate and reliable than readings taken in all but the largest anechoic chambers. The low-frequency sections of the response curves published here were generated via such a technique and then spliced at 150Hz onto directly measured frequency-response data. Although the splicing technique doesn't produce definitively accurate results, in the case of similarly dimensioned speakers like these four, it is accurate enough for valid comparison.
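As a flavour of what 'classical analogue filter theory' means here, the sketch below compares the roll-off of a second-order high-pass (the textbook model for a sealed box) with a fourth-order high-pass (the textbook model for a reflex box). It is only an illustration. It assumes a Butterworth alignment and a hypothetical 60Hz corner frequency, whereas a real prediction would use the actual filter shape derived from each speaker's measured parameters.

```python
import math

def highpass_db(freq_hz, corner_hz, order):
    """Magnitude in dB of an nth-order Butterworth high-pass filter."""
    ratio = corner_hz / freq_hz
    return -10.0 * math.log10(1.0 + ratio ** (2 * order))

CORNER = 60.0  # hypothetical -3 dB frequency in Hz
print(" freq   sealed (2nd)   reflex (4th)")
for f in (120.0, 60.0, 30.0, 15.0):
    sealed = highpass_db(f, CORNER, 2)
    reflex = highpass_db(f, CORNER, 4)
    print(f"{f:5.0f}   {sealed:9.1f} dB   {reflex:9.1f} dB")
```

Both models are 3dB down at the corner frequency, but below it the fourth-order curve falls away twice as steeply (approaching 24dB per octave against 12dB per octave), which is the price a reflex design pays for its extra bass extension.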