For more hints, tips and problem-solving visit the SOS Forum at www.soundonsound.com/forum
I think formants have something to do with speech and distinguishing one person speaking from another, but I've noticed some synths now advertise 'formant filters'. What are they for and do they have anything to do with synthesizing speech?
SOS contributor Len Sasso replies: A formant is simply a fixed resonance in a sound-producing mechanism. Although the term is most often used in the context of speech, acoustic instruments also exhibit formants, and their frequencies and relationships characterise the sound of the instrument. The formants of the human voice are derived from the shape and size of the vocal tract, which can be changed by moving the jaw, tongue, and lips. We use those changes to make different vowel sounds, but speech is much more complex than that and is beyond the reach of most synthesizers.
As the above description suggests, you might use a multi-band resonant EQ or resonant band-pass filters in parallel to create formants in a synthesizer or as an add-on effect. When a formant filter is specifically included in a synth, it is typically constructed to emulate the formants of speech and usually contains a single control for morphing through the vowel formants as well as controls for overall formant frequency and resonance. Examples of synths with such filters include rgcAudio's Pentagon, Virsyn's Tera, Clavia's Nord Modular, and even the amusing Delay Lama from Audio Nerdz.
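The parallel band-pass idea can be sketched in a few lines of Python. This is only an illustration, not the design used in any of the synths named above: the three centre frequencies are rough, assumed values for an "ah"-like vowel, the Q of 10 is arbitrary, and the band-pass coefficients follow the widely used "audio EQ cookbook" biquad form.

```python
import cmath
import math

SAMPLE_RATE = 44100.0
# Assumed, illustrative formant centres for an "ah"-like vowel.
AH_FORMANTS_HZ = (700.0, 1200.0, 2500.0)


def bandpass_coeffs(centre_hz, q, fs=SAMPLE_RATE):
    """Biquad band-pass coefficients with unity gain at the centre frequency."""
    w0 = 2.0 * math.pi * centre_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = (alpha, 0.0, -alpha)
    a = (1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha)
    return b, a


def biquad_gain(b, a, freq_hz, fs=SAMPLE_RATE):
    """Complex frequency response of one biquad at freq_hz."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs)
    z2 = z1 * z1
    return (b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2)


def formant_response(freq_hz, formants=AH_FORMANTS_HZ, q=10.0):
    """Magnitude response of the band-pass filters summed in parallel."""
    total = sum(biquad_gain(*bandpass_coeffs(f, q), freq_hz) for f in formants)
    return abs(total)
```

Evaluating `formant_response` across the audio band shows a peak near each formant centre and much lower gain in between. Morphing between vowels would amount to sliding the centre frequencies from one set of values to another.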
Beyond speech, formant filters are very effective for adding a unique and consistent flavour to a synth patch, just as formants do for acoustic instruments. This technique is useful for any type of sound, from leads to pads to percussion. In that context, using fixed formants, or ones that change in exactly the same way whenever the same pitch or percussive sound is reproduced, produces more natural-sounding results.
I recently read with great interest an Australian university web page about acoustics, suggesting that monitors could be mounted in corners if they were mounted flush behind a wood panel. Does this effectively mean putting the monitor in another box? I can't mount my monitors in the corners of the room as there's a window that goes right up to the corner, so is it possible I could mount my monitors in wooden containers and sit these up against a wall to save space? Would it stop the bass reflecting off the walls behind?
SOS Forum Post
Technical Editor Hugh Robjohns replies: There was a fashion long ago to place loudspeakers in the corner of a room in much the same way as you suggest. Basically, you put a sheet of wood the full height of the room across the corner, creating a triangular hole behind. The speaker is placed in this space, suitably supported with its baffle pushed through (and flush with) a suitable hole in the corner board, while the area behind and around the speaker is heavily damped. Obviously, the whole thing has to be very solid, well-damped and free from rattles and resonances.
The benefits of this technique are that the reflected rear wave no longer causes destructive interference, and there will be a great deal of extra bass lift. This latter point was the real reason for undertaking the technique back when speakers and amplifiers weren't very efficient — if you do it today, you'll have to re-equalise the bass end of the speaker to get a flat response.
I've heard that a digital filter is just a delay line — how does that work?
SOS contributor Len Sasso replies: While that's a gross oversimplification, the general idea is this: when you mix a delayed signal with the original, you are in effect averaging the two signals. If you mix equal amounts of the output of a one-sample delay and the original signal, each sample of the resulting signal will be the average of two consecutive samples of the original signal. Like any sound, the original and delayed signals can be represented as series of sine waves — and that one-sample delay will represent a larger offset for higher frequencies until at the Nyquist frequency (half the sampling rate) the sine waves completely cancel. In short, higher frequencies are reduced and in effect, you have a low-pass filter.
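The averaging described above can be tried directly. The following is a minimal Python sketch; the function names are made up for illustration. It mixes each sample equally with the previous one, and also gives the resulting gain at any frequency, which works out to the absolute value of cos(pi x f / fs) for this particular filter.

```python
import math


def two_point_average(samples):
    """Mix each sample equally with the previous one (a one-sample delay)."""
    previous = 0.0
    out = []
    for x in samples:
        out.append(0.5 * (x + previous))
        previous = x
    return out


def average_gain(freq_hz, sample_rate):
    """Gain of the two-point average at freq_hz: |cos(pi * f / fs)|."""
    return abs(math.cos(math.pi * freq_hz / sample_rate))
```

Feeding in a signal that alternates between +1 and -1 (a sine wave at the Nyquist frequency) gives an output of zero after the first sample, while a constant signal passes through unchanged. In between, the gain falls smoothly as frequency rises, which is the low-pass behaviour.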
A number of techniques are used to produce different filter characteristics. Inverting the original or delayed signal produces high-pass filtering. Using every other sample yields band-pass or band-reject filter characteristics, again depending on inversion. Averaging more samples (ie. more delay lines) yields different filter slopes. Filters constructed by processing only the input are referred to as non-recursive or Finite Impulse Response (FIR) filters. Adding feedback to the system produces resonances and more complex filter characteristics. Such filters are called recursive or Infinite Impulse Response (IIR) filters.
Creating filters using delay lines requires very accurate, very short delays not found in your typical hardware or plug-in synth. However, modular synths and synth construction software often provide the tools. Although the theory is daunting, just flailing around can be both fun and productive, but be careful of your ears and speakers, especially when constructing recursive filters (ie. with feedback).
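Two of these variations are easy to sketch in Python. The function names and the feedback amounts are assumptions chosen for illustration. The first function inverts the delayed sample to get a non-recursive high-pass filter. The second is about the simplest recursive filter there is: a one-pole low-pass, which has feedback but is too simple to show a resonance.

```python
def two_point_difference(samples):
    """FIR high-pass: subtract the delayed sample instead of adding it."""
    previous = 0.0
    out = []
    for x in samples:
        out.append(0.5 * (x - previous))
        previous = x
    return out


def one_pole_lowpass(samples, feedback=0.9):
    """IIR low-pass: mix a share of the previous output back into the input."""
    y = 0.0
    out = []
    for x in samples:
        y = (1.0 - feedback) * x + feedback * y
        out.append(y)
    return out
```

A constant input comes out of `two_point_difference` as zero after the first sample, so the lowest frequencies are removed. A single impulse fed to `one_pole_lowpass` keeps ringing on at ever smaller levels, which is where the term "infinite impulse response" comes from. In this sketch the feedback amount must stay below 1, otherwise the output grows without limit.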
I'm a relative newcomer to the world of multitrack recording, having only used a four-track cassette recorder in the past, and am considering buying a Fostex DMT8VL eight-track digital multitrack recorder for around £350. Could you tell me whether this would be a good choice, or is there anything else currently available on the second-hand market in this price range that might be of better quality?
Assistant Editor Tom Flint replies: My advice would be not to get the DMT8VL, especially for that price. This product was one of the first digital multitrackers to follow the cassette multitracker principle, and in its day it was quite a nice bit of kit. The internal drive was just 540MB (these days you can have an 80GB drive in some digital multitrackers), although a slightly larger drive could be installed. The DMT8VL didn't use data compression which is a good thing in terms of audio quality, but the analogue-to-digital converters won't compare well to modern equivalents.
The DMT8VL is limited by the fact that it can only record two tracks at once, whereas many newer products can record as many as 10 or more simultaneously. Another big drawback is that it has an analogue mixer section, which means that the DMT8VL has none of the automation or recall features that you would now expect on a machine like this. Other drawbacks include its small display, limited editing and undo functions, a lack of SCSI out or CD drive for file backup (you have to back up to DAT using the S/PDIF outs), and the list goes on.
The point is that these days there are loads of other options which all beat the DMT8VL hands down. For example, the Roland VS880 is also an old machine but it has a much better specification. Since its release it has had many useful upgrade options, and it is available second-hand for about £300. For about £400 you can buy a Korg D16 giving you loads of memory, editing, backup options, 16 tracks and a big display. Similarly, there are many Akai DPS16s on the market for about £500 and they're just as good as the Korg models.
A couple of years ago we published a buyers' guide which I'm sure will be of use to you. Obviously there are many newer products on the market now, but many of the products listed in the guide are now on sale for good second-hand prices, and many have been upgraded via software updates and hardware additions.
I need some advice about duplicating my own CDs for a small-run commercial release. Would I be better off going to a CD manufacturing company, or could I use the CD-R/W drive on my PC to do the job? Alternatively, would a stand-alone audio CD recorder be more suitable if I were to go down the DIY route? What concerns me is whether the CDs would be guaranteed to play back on any kind of consumer CD player. I've heard that audio CD recorders use cheaper blank discs, but are there any other advantages to using one?
I'd be grateful if you could offer some relevant information about how feasible it would be to do the job myself, including printing the CD labels, and any possible pitfalls.
SOS Forum Post
Technical Editor Hugh Robjohns replies: You have many choices when it comes to getting a small run of CDs produced. Firstly, you could have your discs pressed in a commercial pressing plant — the practical minimum number of copies is 500, although many companies suggest a minimum of 1000 CDs. Some plants will do everything for you once you supply them with a good master and artwork ideas: they'll do the PQ-coding, the artwork layout, printing, pressing, and packaging. Bear in mind that you'll probably have to provide all the relevant PRS and MCPS paperwork before the plant will stamp the discs, and some companies prefer you to work through a third-party company, although there are plenty of those advertising in the back of SOS. These companies do the donkey work and provide the factory with the finished disc master and film-ready artwork, and you with boxes of finished discs.
Secondly, you could consider professional CD-R duplication. Practical quantities range anywhere from one to 1000 copies, but this solution is most cost-effective for small runs. On-body printing can look as professional as pressed discs, but there's a risk that the CD-Rs won't play back on some older CD players, and hence result in grumpy customers. Again, you can go through third-party companies who will do the artwork and master preparation, and may even do the duplication on their own equipment.
Finally, you could indeed burn discs at home, and the practical quantity is basically as many as you have the time and patience for. Small runs of 25 are manageable in my experience, but any more gets tedious very quickly and, again, some older players won't play CD-Rs. If you opt to burn the CD-Rs yourself, you have two choices: using a computer burner or a stand-alone CD-R recorder. The computer burner will be faster, but the quality may be poorer. The CD-R recorder will be real-time only, though the quality may well be better — but there's the issue of whether the burner can use standard data CD-Rs or the slightly more expensive 'audio' CD-Rs.
When it comes to guaranteeing playback compatibility with all domestic CD players, in general, CD-Rs that have been recorded in the correct Red Book audio format and correctly 'finalised' will play quite happily on all modern CD players. Unfortunately, many older CD players simply don't have sufficiently sensitive optical pickups to cope with the lower reflectivity inherent in a CD-R disc. Obviously, the number of such machines still in use is relatively small and falling every day, so the problem will eventually go away, but it hasn't yet. Incidentally, very few CD players can cope with CD-R/W discs at the present time.
If you can print on-body with something like the Matisse or Epson inkjet printers, on suitable inkjet-compatible discs, the results are very professional indeed. Pressit-style sticky labels are always going to look a little amateurish, but are fine for one-offs and demos. Printing the booklet depends on your graphic and editorial abilities, your software, your printer and your choice of paper. Results can vary from superb to awful.
Whichever way you look at it, producing discs at home is expensive in terms of time, blanks, paper, ink, and all the rest, but is a practical route if you only want small numbers of discs at a time.
I'm confused by the terms partial, harmonic, and overtone. They seem to be used for the same thing, but I'm not sure. Also what's the difference between inharmonic and enharmonic?
SOS contributor Len Sasso replies: Periodic waveforms, which correspond to pitched sounds, can be analysed (and reconstructed) in terms of sine waves whose frequencies are whole-number multiples of the lowest. This lowest-frequency sine wave is called the fundamental, and a sine wave at a frequency of N times the fundamental is called the Nth harmonic. Another way of describing the same set of relationships is to refer to the harmonics above the fundamental as overtones. In that case, the first overtone is two times the fundamental, the second overtone is three times the fundamental, and so on. The term partial is usually reserved for naming the components actually present in the complex waveform — so in a square wave, for example, which has only odd-numbered harmonics present, the fundamental would be the first partial, the third harmonic would be the second partial, the fifth harmonic would be the third partial, and so on.
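The three numbering schemes can be laid side by side with a short Python sketch. The function name and the dictionary layout are invented for this example. It lists the components of an ideal square wave, which contains only odd-numbered harmonics.

```python
def square_wave_components(fundamental_hz, count):
    """List the first `count` components of an ideal square wave."""
    rows = []
    # Odd harmonics only: 1, 3, 5, ... numbered as partials 1, 2, 3, ...
    for partial, harmonic in enumerate(range(1, 2 * count, 2), start=1):
        rows.append({
            "partial": partial,
            "harmonic": harmonic,
            # The Nth overtone is the (N+1)th harmonic; the fundamental
            # itself is not an overtone.
            "overtone": harmonic - 1 if harmonic > 1 else None,
            "freq_hz": harmonic * fundamental_hz,
        })
    return rows
```

For a 100Hz square wave, the second entry is the second partial, the third harmonic and the second overtone, all naming the same 300Hz sine component.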
Unpitched sounds can also be analysed in terms of sine wave components, but in that case they are not always whole-number multiples of the lowest-frequency component. The components in that case are also called partials and are numbered in order of their frequency. Sometimes the partials at whole-number multiples of the lowest frequency are referred to as harmonic partials to distinguish them from other partials, which are then referred to as inharmonic partials. The term enharmonic has no relation to frequency analysis, but rather refers to the notation of the same pitch in different ways — for example, C# and Db are enharmonic spellings of the same pitch in the equal-tempered scale.
I'm using an M-Audio Audiophile 2496 soundcard with a Focusrite Penta and am about to fit the digital output board to the Penta, so I was wondering if the digital inputs and outputs on these devices will be matched for digital recording. Also, I'm currently recording with the analogue inputs and outputs, and to achieve good input levels in Cubase I have to pull the Penta's output level down to -20dB. This is with the output switch at -10dB — the Audiophile's input is +2 not +4dB. Will the digital board allow me to set 0dB levels on my Penta to get a sensible level into Cubase?
SOS Forum Post
Technical Editor Hugh Robjohns replies: If you connect the digital out of the Penta to a compatible digital in on the audio card, yes, the levels will match perfectly between the two — a signal showing on the Penta at -6dBFS will arrive in Cubase at -6dBFS as well.
With regard to your second point, this is probably an issue of headroom. If you peak signals close to 0dBFS on the Penta, the actual analogue output level will be pretty high: depending on the machine's calibration, this will be somewhere between +18 and +24 dBu. Even if you switch the analogue output level down to -10dBV, it will still be peaking between +10 and +16 dBu, which may well be as high as a semi-pro audio card can cope with. The answer, as always, is to leave enough headroom.
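To get a feel for what these figures mean as voltages, here is a small Python sketch using the standard reference levels (0dBu is about 0.775V RMS, 0dBV is 1V RMS). The function names are invented for the example, and nothing here describes the calibration of any particular unit.

```python
import math

DBU_REF_VOLTS = math.sqrt(0.6)  # 0 dBu: about 0.7746 V RMS (1 mW into 600 ohms)
DBV_REF_VOLTS = 1.0             # 0 dBV: 1 V RMS


def dbu_to_volts(level_dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REF_VOLTS * 10.0 ** (level_dbu / 20.0)


def dbv_to_dbu(level_dbv):
    """Express a dBV level in dBu (the two scales sit about 2.2 dB apart)."""
    return level_dbv + 20.0 * math.log10(DBV_REF_VOLTS / DBU_REF_VOLTS)
```

On these scales a +24dBu peak is a little over 12V RMS, and a nominal -10dBV corresponds to roughly -7.8dBu, which gives some idea of how far above nominal level a full-scale peak can sit.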
I use Sonar XL in my home studio, and everything is perfect except for one thing. When I try to join two different guitar parts I've recorded, there's almost always a strange click sound on a border between them. How can I eliminate this?
SOS Forum Post
Editor In Chief Paul White replies: Clicks are quite common when joining two pieces of audio and are caused by the discontinuity in the waveform where one section ends and the next starts. Even two low-frequency sine waves can cause a high-frequency click if they are edited in such a way as to create an abrupt step in the waveform, as sharp steps translate into a short burst of high-frequency components. It is this short burst of high frequencies that you hear as a click. Fortunately, there are a couple of ways to avoid the problem. The first is to make your edit between notes or phrases where the audio level is near to zero.
Of course this isn't always possible, with electric guitar in particular, as notes may still be sustaining when you reach the edit point or the notes may be too close together for there to be a period of silence between them. If this is the situation, you should create a short crossfade at the edit point so that you get a smooth transition between the two sections rather than an abrupt step. A short crossfade time of between 20 and 50 milliseconds is usually adequate to avoid clicks.
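For the curious, a crossfade is simple to describe in code. The Python sketch below is a plain linear crossfade on lists of samples, with invented function names. It is not a description of how Sonar or any other sequencer does the job internally.

```python
def ms_to_samples(milliseconds, sample_rate=44100):
    """Convert a fade time in milliseconds to a whole number of samples."""
    return int(round(milliseconds * sample_rate / 1000.0))


def crossfade_join(first, second, fade_samples):
    """Join two clips, overlapping them by fade_samples with a linear fade."""
    if fade_samples < 1 or fade_samples > min(len(first), len(second)):
        raise ValueError("fade must fit inside both clips")
    head = first[:-fade_samples]
    tail = second[fade_samples:]
    overlap = []
    for i in range(fade_samples):
        gain_in = (i + 1) / (fade_samples + 1)  # rises towards 1
        a = first[len(first) - fade_samples + i]
        b = second[i]
        overlap.append((1.0 - gain_in) * a + gain_in * b)
    return head + overlap + tail
```

At 44.1kHz a 20ms fade is 882 samples long. Across the overlap the first clip is turned down as the second is turned up, so the waveform moves gradually from one to the other instead of jumping.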
I used to have an analogue Mackie desk and an SPL Vitalizer, and I'd connect the SPL through the auxiliary sends and returns on the desk in the same way as an effects unit — it worked fine since I could control the amount of feed from different sources. However, I've now changed my desk to a Soundcraft 328XD, and when I connect the SPL in the same way as with the Mackie, the signal comes back out of phase and flangey. I've tried changing the phase of the signal, but that doesn't help, and since there are no channel inserts on the digital inputs, I can't find a way of using the SPL and Soundcraft together.
SOS Forum Post
Technical Editor Hugh Robjohns replies: The flanging sound is caused by the fact that you'll be mixing the original signal and the processed version together in your desk. It didn't happen with the Mackie desk because that was analogue, whereas the Soundcraft is digital.
Going out of the analogue outputs of the Soundcraft requires passing through a D-A (digital-to-analogue) converter, which will cause a processing delay of between 0.5 and 1.5 ms depending on the design of the converter. The signal then passes through the Vitalizer and back to the desk's analogue input via another A-D converter and another millisecond or so of delay.
The result is that the processed signal is delayed between one and three milliseconds with respect to the original signal, which is the most obvious delay time area for phasing and flanging. It's an inherent problem with digital desks and analogue outboard. All you can do is either avoid mixing the direct and processed signals together, or use only digital outboard gear and connect digitally to avoid unnecessary converter delays.
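Why delays of this size are so audible can be shown with a little arithmetic. When a signal is mixed with an equally loud copy delayed by a time T, cancellations (notches) appear at odd multiples of 1/(2T). The Python sketch below lists them; the function name is invented, and it assumes the direct and delayed signals are mixed at equal level.

```python
def comb_notches_hz(delay_ms, max_hz=20000.0):
    """Notch frequencies when a signal is mixed equally with a delayed copy."""
    notches = []
    k = 0
    while True:
        # Odd multiples of 1 / (2 * delay), with the delay given in ms.
        freq = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches
```

A 1ms delay puts notches at 500Hz, 1.5kHz, 2.5kHz and so on, while a 3ms delay starts at about 167Hz and spaces them every 333Hz or so. Either way the notches fall right across the most audible part of the spectrum, which is the comb-filtered, flangey sound described in the question.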
Is the measurement of 'dynamic range' the same as the signal-to-noise ratio?
SOS Forum Post
Technical Editor Hugh Robjohns replies: The signal-to-noise ratio is a technical measurement. In analogue equipment it's normally defined as the difference in level in decibels between the average level of the noise floor (hiss) and the nominal signal level, which is typically +4dBu. The level of the noise floor may be 'weighted' to take into account the sensitivity of the human ear to different frequencies (ie. A-weighted).
The dynamic range is, more often than not, a marketing device, quoting the maximum possible difference in level in decibels between the loudest thing that can be passed through the equipment, and the quietest signal that can still be heard under the noise floor. Dynamic range figures will always be greater than the signal-to-noise figures.
In the case of digital equipment, the nominal signal reference is generally taken as the 0dBFS point, and the noise floor will be defined by the dither noise at around -93dBFS for a 16-bit system. However, the dynamic range will often be claimed to be greater than this if the dither is psychoacoustically shaped. Apogee, for example, claims that signals can be heard up to 24dB below the theoretical noise floor when using its UV22 dither system. Hence, the signal-to-noise ratio might be 93dB but the dynamic range might be 120dB or so, and so the manufacturers will claim near 20-bit performance from a 16-bit recorder.
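The "near 20-bit" figure follows from the rule of thumb that each bit of resolution is worth about 6dB of range. A short Python sketch of the sums, with invented function names and using only the figures quoted above:

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB per bit (rule of thumb)


def claimed_dynamic_range_db(snr_db, audible_below_floor_db):
    """Add the claimed audibility below the noise floor to the S/N figure."""
    return snr_db + audible_below_floor_db


def equivalent_bits(dynamic_range_db):
    """Roughly how many bits a given dynamic range corresponds to."""
    return dynamic_range_db / DB_PER_BIT
```

Adding the 24dB claim to a 93dB signal-to-noise ratio gives 117dB, which is a little under 19.5 bits by this reckoning. Rounding up to "120dB or so" brings it to just under 20 bits.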
I've read a few articles about compression in previous issues, but I was wondering what kind of compressor settings you would suggest for recording rap vocals?
Editor In Chief Paul White replies: For rap, a good assertive hard-knee compressor would be best to keep the levels ruthlessly even, and these days many compressors will do that adequately — both plug-ins and hardware. A tube or opto compressor might give you a slightly more colourful sound, although you could simply add a little distortion after recording to achieve a similar result. I'd suggest a compression ratio of around 8:1, and perhaps apply a little more compression than usual by adjusting the threshold so that you get up to 10dB of gain reduction on the peaks. Use a fast attack and a release of around 200ms, and make sure you use a pop shield with the mic as rap can be pretty forceful on those plosive 'B's and 'P's — if you get popping, it's very difficult to deal with afterwards.
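To see how ratio, threshold and gain reduction relate, here is a minimal Python sketch of an idealised hard-knee compressor's level curve. The function names are invented, levels are in dB, and attack and release behaviour is ignored entirely, so it only describes what happens once the compressor has settled.

```python
def compressor_output_db(input_db, threshold_db, ratio=8.0):
    """Hard-knee curve: above threshold, level rises 1 dB per `ratio` dB in."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


def gain_reduction_db(input_db, threshold_db, ratio=8.0):
    """How many dB the compressor is turning the signal down by."""
    return input_db - compressor_output_db(input_db, threshold_db, ratio)


def overshoot_for_reduction_db(reduction_db, ratio=8.0):
    """How far above threshold a peak must be for a given gain reduction."""
    return reduction_db * ratio / (ratio - 1.0)
```

At 8:1, a peak needs to exceed the threshold by about 11.4dB to produce 10dB of gain reduction, so in practice you lower the threshold until the loudest peaks sit roughly that far above it.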