Many problems encountered when using PCs to record music are caused by MIDI or audio data not being delivered on time, or by its flow being interrupted. Martin Walker outlines some of the most common causes and symptoms, and tells you how to go about eliminating them.
Next to clicks and pops, the problems I get asked about most in emails from SOS readers concern audio and MIDI timing — and just as with random clicks and pops, the difficulty is that the cause of timing problems is often not obvious (indeed, many clicks and pops are caused by timing problems, as we shall see). If you start playing back a track and everything sounds obviously out of sync then you immediately know something's wrong and can start to track down the cause. However, what's far more likely is that your audio and MIDI tracks may slowly drift apart after a few minutes of playback, or that some MIDI notes that sound perfectly all right most of the time will occasionally be slightly ahead of or behind the beat. Often the difference is subtle, and may not even occur every time you play back your song.
We've had features on both audio and MIDI timing problems in the past, most recently in SOS March 2000, but technology marches onward to provide us with a fresh set of problem areas including software synths and USB. While it's tempting to try out every tweak that's suggested, this may not fix your particular problem, and may in some cases make other aspects of performance such as audio track count or latency worse. So, this month I'm going to explain how the timing for both audio and MIDI works on PCs, and what can interfere with it. Armed with this knowledge you should then have a much better chance of narrowing down the likely origin of a particular timing problem.
Analogue signals are converted into digital data by regularly sampling their level, tens of thousands of times a second. Most recording setups set this sampling rate at 44.1kHz, 48kHz, or possibly 96kHz, but whichever rate you choose, it's vital that it remains absolutely constant. If it varies slowly with time the playback pitch will alter, and if it varies rapidly the playback waveform will differ from the recorded one.
The clock signal that determines the sample rate in PC recording setups will usually be provided by the soundcard, and be derived from a high‑stability crystal‑controlled oscillator running at some frequency in the megahertz region. This frequency is then divided down to provide the various sample rates required. In the real world the oscillator frequency always wobbles about slightly, and this wobble is termed clock jitter. The less jitter there is, the more focussed and pin‑sharp the final stereo image will be when the data is clocked out through the D‑A converter. Conversely, as the jitter level rises the stereo image will become more smeared and confused, and low‑level resolution will suffer. This is simply because some samples arrive slightly early and others slightly late; tiny echoes and louder high‑frequency sounds such as transients are the first to be affected. More expensive soundcards tend to have lower clock jitter levels, and in many professional studios a specially designed master clock is used to provide a particularly stable frequency from which every piece of digital audio gear is clocked.
Although the clock oscillator will always carry on regardless, the digital audio waveforms it samples and plays back may get interrupted in a variety of other places inside the PC, and this is normally what causes timing problems. In the case of mono audio, even a single missing sample at 44.1kHz (just 23 microseconds long) can give rise to an audible click. Once you start mixing tracks together, a permanent timing difference of a single sample between one side and the other of a stereo recording is also audible as a subtle loss of high frequencies, and as this timing interval increases you'll hear the familiar flanging sound as different frequencies start to cancel out.
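The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration, assuming the two sides of the stereo signal carry the same material and are summed together; the function names are made up for the example.

```python
import math

SAMPLE_RATE_HZ = 44_100  # the rate quoted in the text


def sample_period_us(sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Duration of one sample in microseconds."""
    return 1_000_000 / sample_rate_hz


def first_notch_hz(delay_samples: int, sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Lowest frequency that cancels completely when a signal is summed
    with a copy of itself delayed by delay_samples."""
    return sample_rate_hz / (2 * delay_samples)


def summed_level_db(freq_hz: float, delay_samples: int,
                    sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Level at freq_hz of (signal + delayed copy), in dB relative to
    summing two perfectly aligned copies."""
    delay_s = delay_samples / sample_rate_hz
    gain = abs(math.cos(math.pi * freq_hz * delay_s))
    return 20 * math.log10(gain) if gain > 0 else float("-inf")
```

With a one-sample offset the first notch sits right at the top of the audio band and 10kHz is down by a couple of dB, which fits a subtle loss of top end. Larger offsets pull the notches down into the midrange, which is where the flanging character comes from.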
The clock data itself can only be responsible if you're using an external clock down a long or poor‑quality cable. In this case, its waveform can become degraded so badly that some oscillator cycles are missed altogether. If you are suffering clicks and pops when clocking your soundcard from an external DAT or Minidisc player via a co‑axial cable, then make sure it's a proper 75Ω digital cable (or 110Ω for AES‑EBU connections), and not one designed for audio. Digital cables are designed to carry information at much higher frequencies than audio ones, and all sorts of obscure problems can be cured by using a quality digital cable.
There are many other factors that can cause individual audio clicks and pops, often related to low‑level timing glitches, and I've covered the main causes and solutions before, most recently in 'Getting Going' in SOS July 2000. These include various Windows tweaks, the removal of unnecessary background tasks (see box on page 148), and the proper setting up of your audio drivers. These days, they can also include problems with USB audio.
Judging by my own experiences and researches, most USB problems are caused by incompatible components in some PCs. I've written in these pages before about such issues with some PCI to USB Host Controller chips, most recently as part of my 'Lap Land' feature in SOS January 2001, but until more general publicity is given to such problems, musicians will continue to buy peripherals that will never work satisfactorily with their PCs. USB peripheral manufacturers certainly know which rogue chips cause problems with their products, so a wise precaution might be to speak to their technical support staff before you buy your new MIDI or audio interface. Go into Device Manager, and then note down the manufacturer and number of your motherboard's 'PCI to USB Universal Host Controller' chip, and then check with them to see if this is on their list of verified compliant devices.
Ultimately, many of these problems should go away as older PCs with non‑compliant chips become obsolete and musicians buy newer models. However, you can still take certain steps to minimise problems. Use the latest drivers available, and if you still have problems it may be worth investigating Propagamma's generic USB‑Audio driver, which is now available in beta form for PC as well as Mac (see Paul Wiffen's Mac feature in SOS December 2000 for more details). A free 508K downloadable demo version is available, and this already supports products including Roland's UA30, the Ego Sys U2A, and Swissonic's USB Studio D.
Using good‑quality USB cables will also help, since once again we are dealing with digital transfers. Don't be tempted to run too many USB peripherals simultaneously, since they may end up battling for bandwidth — even using a USB mouse can make a difference in some circumstances.
Once we move beyond the timing‑related causes of individual clicks and pops, we enter the area of sequencer resolution. The latest versions of most MIDI + Audio sequencers offer extremely high resolution for the timing of both MIDI and audio events. Emagic's Logic Audio is renowned for its tight timing of 960ppqn (pulses per quarter note) and Steinberg's release of Cubase 5.0 raised this resolution even further to 15360 fractions per quarter note. Although this internal resolution is always used, you can choose a lower display resolution such as 7680, 1536, 480, or 384 ticks per quarter note, which makes it quicker and easier to position things on screen.
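To put those resolutions into perspective, here is a small sketch that converts ticks per quarter note into real time at a given tempo. The 120bpm tempo and the function names are assumptions chosen for the example.

```python
def tick_duration_us(bpm: float, ppqn: int) -> float:
    """Length of one sequencer tick in microseconds."""
    quarter_note_us = 60_000_000 / bpm  # one beat at this tempo
    return quarter_note_us / ppqn


def samples_per_tick(bpm: float, ppqn: int, sample_rate_hz: int = 44_100) -> float:
    """How many audio samples fit inside one tick."""
    return tick_duration_us(bpm, ppqn) * sample_rate_hz / 1_000_000
```

At 120bpm a 960ppqn tick lasts roughly half a millisecond, while a 15360ppqn tick lasts about 33 microseconds, which is not much longer than a single 44.1kHz sample.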
Some musicians expected all their MIDI timing errors to disappear once such high timing resolution was offered by sequencers, but unfortunately this only deals with part of the problem. While the start times and lengths of both audio and MIDI events are being time‑stamped to an accuracy of 1/15360th of a quarter note, they still each have many more hurdles to jump before you hear the end result, such as passing the PCI buss and surviving the interruption of the many background tasks of the Windows operating system.
As a consequence, manufacturers use software buffers to calculate and store data in advance. When available processor time does occasionally get too short for the sequencer to do more calculations, there is still enough data sitting in the buffer to trot out to the soundcard and maintain a continuous smooth flow. Such buffering benefits MIDI as well as audio calculations, and many Cubase VST users gained far better timing once the System Pre‑roll setting was more widely advertised. Found in the Synchronisation window, this controls the size of the MIDI buffer. By ensuring that the pre‑roll setting is at least as high as the audio latency, both audio and MIDI get an equal chance of surviving being interrupted by other routine tasks.
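The rule of thumb in the last sentence can be checked with a quick calculation. This sketch assumes the common approximation that buffering latency equals the number of buffers multiplied by their size and divided by the sample rate; real drivers add converter and other delays on top, and the function names are hypothetical rather than anything from Cubase.

```python
def audio_latency_ms(buffer_samples: int, num_buffers: int,
                     sample_rate_hz: int = 44_100) -> float:
    """Approximate buffering latency in milliseconds."""
    return 1000 * buffer_samples * num_buffers / sample_rate_hz


def preroll_is_sufficient(preroll_ms: float, latency_ms: float) -> bool:
    """True when the MIDI pre-roll is at least as long as the audio latency."""
    return preroll_ms >= latency_ms
```

For instance, four buffers of 1024 samples at 44.1kHz work out at about 93ms, so under these assumptions a pre-roll of 100ms would cover it while 50ms would not.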
Windows 98 currently seems to be the best operating system for MIDI timing, since as I discussed in last month's PC Musician, Windows NT caused various problems in this area. If you are already using Windows 2000, make sure that you're using WDM MIDI drivers specifically intended for Windows 2000 rather than NT 4.0 ones, since NT 4.0 drivers are also likely to give you timing problems.
However, in the case of MIDI events there's another major bottleneck: the MIDI interface itself. The clock used to march data out of a MIDI interface runs at a frequency of 31.25kHz, using a serial protocol first introduced in 1983. Ten cycles of this clock are needed to define each MIDI byte, and with three MIDI bytes needed for a typical Note On command, the total time it takes to transmit one typical MIDI message is thus nearly 1ms. Since MIDI is a serial protocol, successive messages must wait until their predecessors have been transmitted, meaning that MIDI has a very much lower timing resolution than sampled audio.
Although a single MIDI interface supports up to 16 MIDI channels, they all have to travel down the same serial cable, and so a single note played back at the same time on each of 16 MIDI tracks will emerge as a stream of events spread across roughly 16ms. In slower ambient music this spread may not be audible, but it certainly can be in high‑tempo music that uses lots of drums.
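Those serial-cable figures are easy to reproduce. The sketch below assumes every message is a full three-byte Note On (no running status) and that each byte occupies ten bit periods on the wire, namely a start bit, eight data bits and a stop bit; the function names are illustrative.

```python
MIDI_BITS_PER_SECOND = 31_250
BITS_PER_BYTE = 10  # start bit + 8 data bits + stop bit


def wire_time_ms(num_bytes: int) -> float:
    """Time needed to push num_bytes down a MIDI cable, in milliseconds."""
    return num_bytes * BITS_PER_BYTE / MIDI_BITS_PER_SECOND * 1000


def completion_times_ms(num_messages: int, bytes_per_message: int = 3) -> list[float]:
    """When each of several 'simultaneous' messages finishes arriving,
    given that they must queue up on a single cable."""
    return [wire_time_ms(bytes_per_message * (i + 1)) for i in range(num_messages)]
```

A single Note On takes 0.96ms, and the last of 16 simultaneous notes is not fully received until 15.36ms after the first one started, which is the spread described above.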
There are loads of standard tips to minimise such problems, and Paul White and I mentioned most of them in our 'Split Second Timing' feature back in SOS March 2000. The first and best advice is to make sure that you're running the very latest MIDI drivers available, whether for a dedicated interface or one built‑in to a soundcard. It doesn't make any difference to MIDI resolution whether your interface uses serial, parallel, or USB ports, or sits on a PCI soundcard.
The next thing to try is reducing the load on each MIDI interface. Strip out any unnecessary aftertouch or other real‑time data, and use any thinning algorithms in your sequencer. Many sequencers also prioritise the topmost tracks on their display, so use these for time‑sensitive drum and bass tracks, and if necessary move slower pad sounds fractionally off the beat so they're not fighting with the rhythm sounds to emerge first. You could also try giving your MIDI tracks a higher priority in your sequencer.
The best solution is to add more MIDI ports, and this is the biggest reason why multi‑port MIDI interfaces are so popular: although each separate port can theoretically run 16 MIDI channels, it's far better for timing to allocate a separate port to each of your synths, especially when large amounts of other real‑time automation data such as filter sweeps are being sent as well in between the more important note data.
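A rough sketch shows why extra ports help so much. It assumes simultaneous three-byte Note Ons shared out evenly between ports that transmit in parallel, and it ignores any delay inside the computer or interface driver; the names are invented for the example.

```python
import math

NOTE_ON_MS = 3 * 10 / 31_250 * 1000  # three bytes, ten bit periods each


def worst_case_delay_ms(simultaneous_notes: int, ports: int) -> float:
    """Time until the last note has been sent when the notes are
    spread as evenly as possible across the available ports."""
    notes_on_busiest_port = math.ceil(simultaneous_notes / ports)
    return notes_on_busiest_port * NOTE_ON_MS
```

Under these assumptions, 16 simultaneous notes take about 15ms to clear one port, under 2ms across eight ports, and under 1ms when every synth has a port to itself.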
As explained earlier, the master clock used to set the timing for both MIDI and audio events is normally one and the same. However, don't forget that once these signals start moving about inside the PC from buffer to buffer, across the PCI buss to the soundcard, and in and out of serial, parallel, or USB ports, they have the potential to interrupt each other. So for instance, audio glitches may be caused by a USB MIDI interface driver, and vice versa. You might think that since MIDI data occupies such a small bandwidth compared to audio, it will always get through unscathed. However, this isn't always the case, especially since the priority of audio tasks tends to be set much higher than MIDI: if your PC has scarcely enough time to complete its real‑time audio duties then MIDI data may be delayed. For this reason your MIDI timing might start to suffer when the number of audio tracks is increased.
In the case of audio clicks and pops, or in extreme cases the continuous audio crackling sometimes caused by USB MIDI interfaces, you could try increasing audio buffer size, decreasing the amount of MIDI data, or investing in one of the new hardware/software USB MIDI interfaces specifically linked to your sequencer (see below).
There are now two ways to bypass MIDI timing problems. The first is to use VST Instruments, which while triggered by MIDI events are actually audio signals, and therefore still emerge with sample‑accurate timing, at least on playback (see next section).
The other way to improve MIDI timing is using an interface specially designed to work with your sequencer, such as Emagic's range of AMT (Active MIDI Transmission) interfaces or Steinberg's Midex 8, which uses very similar LTB (Linear Time Base) technology. These work by sending parcels of uniquely time‑stamped MIDI data in advance, to fill a small buffer inside the interface. The interface itself sends out the multi‑port data at the appropriate time using a low‑jitter clock, without being compromised by either the computer or its USB interface.
Both manufacturers claim timing accuracy down to sub‑millisecond levels using this technique. The only disadvantage is that because the normal MIDI data stream is replaced by a combined software/hardware solution, you can only benefit from it if you have both parts. So, Logic Audio users will need to buy Emagic's Unitor8 or AMT8 interface, while Cubase VST 5.0 owners will need Steinberg's Midex 8. You can of course use both types of interface with other software, but you'll lose their unique timing advantages.
Even worse than MIDI timing uncertainties in milliseconds are potential problems of latency jitter when using soft synths, which can reach twice your soundcard latency value. As I explained in SOS November 2000, these affect software synths because they use several small audio buffers, but most software waits to accept new MIDI data until the start of the next buffer‑filling cycle. So, depending on when you play a note in relation to the buffering cycle, the timing of adjacent beats when playing in real time or recording can vary by up to twice the current latency value of your soft synth. This problem can be solved only by the soft synth developer.
VST Instruments running in Cubase VST 5.0 and Logic Audio 4.5, on the other hand, suffer the same problem when being played from a keyboard, but playback from the sequencer should have sample‑accurate timing, so a quantised sequence will remain rock‑solid.
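The mechanism behind latency jitter can be shown with a deliberately simplified model, assuming a single buffer and a synth that only looks for new notes at the start of each buffer-filling cycle. Real soft synths use several buffers, so this sketch demonstrates the effect rather than reproducing the exact figures quoted above; the function name is made up.

```python
import math


def pickup_delay_ms(played_at_ms: float, buffer_ms: float) -> float:
    """Wait between playing a note and the start of the next buffer cycle,
    which is the earliest moment the synth can act on it."""
    next_cycle_start = (math.floor(played_at_ms / buffer_ms) + 1) * buffer_ms
    return next_cycle_start - played_at_ms
```

A note played just after a cycle begins waits almost a whole buffer length, while one played just before the next cycle is picked up almost at once, so evenly played notes come out unevenly spaced. During sequencer playback the note positions are known in advance, which is why this variation need not occur there.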
When MIDI data gets interrupted too badly, occasional MIDI events may be missed out altogether or get corrupted. This can result in missing or hanging notes, erroneous pitch‑bend or modulation data, or even crashes caused by stray messages being interpreted as SysEx data. In my experience this usually occurs due to mismatched components, such as older serial or parallel interfaces being plugged in to a modern PC with much faster port settings, or modern synths sending data faster than the buffers on some early soundcard MIDI ports such as the Creative Labs AWE series can cope with. You can find more on this in SOS May 2000. Other than buying a newer interface, the only solution here is to thin out the MIDI data somehow, and avoid sending large SysEx dumps.
A few musicians running Windows 2000 have also reported really long MIDI delays of tens of seconds followed by complete recovery, which is perhaps another reason to avoid this operating system for the time being.
One of the most frustrating sorts of timing problem is the slow but inevitable drifting apart of MIDI and audio tracks, sometimes over several minutes. This gradual loss of sync either means that the tracks are free‑wheeling, or that some sync is present but it is occasionally slipping. One very avoidable cause of slow drifts between audio and MIDI is attempting to loop audio fragments such as drum parts in a sampler — it's much better to retrigger the sampler each time from your sequencer than to try to build the loop into the sample. Timing drift can, however, occur for other reasons.
Normally you should choose audio clock as your MIDI Sync Reference inside a MIDI + Audio sequencer, which means that MIDI timing will be synchronised to the same sample‑accurate clock used by the audio tracks, and therefore permanently locked to it. However, when using MME drivers there are extra settings to take into consideration. These vary from soundcard to soundcard, and using the wrong options can result in long‑term drift of MIDI and audio (see the MME Settings box on page 150 for more details).
If you have two or more soundcards, their crystal clocks will all vary slightly, and unless you lock them together the audio tracks assigned to one card will slowly drift apart from those assigned to the others. However, the start position of each audio part is determined by your sequencer, so you can minimise this problem by only using short sections of audio. The only long‑term solution is to make sure all your soundcards can be properly sync'ed together, either using proprietary links like the Marian/SEKD Sync Bus, or a digital clock such as S/PDIF. You can then designate one card as master and the remainder as slaves, which should lock to it with sample accuracy.
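To get a feel for how quickly unsynchronised cards part company, here is a small sketch. The 50 parts-per-million mismatch is purely an illustrative assumption, not a measured figure for any particular soundcard, and the function names are hypothetical.

```python
def drift_ms(elapsed_seconds: float, clock_mismatch_ppm: float) -> float:
    """Accumulated offset in milliseconds between two free-running clocks."""
    return elapsed_seconds * clock_mismatch_ppm * 1e-6 * 1000


def drift_samples(elapsed_seconds: float, clock_mismatch_ppm: float,
                  sample_rate_hz: int = 44_100) -> float:
    """The same offset expressed in samples at the nominal rate."""
    return elapsed_seconds * sample_rate_hz * clock_mismatch_ppm * 1e-6
```

With that assumed mismatch, a five-minute audio part would finish about 15ms adrift, whereas a ten-second part would slip by only half a millisecond before the sequencer places the next part afresh, which is why short sections help.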
One of the biggest enemies to consistent timing is our old friend the background task, since anything that periodically cuts in while you're running your MIDI + Audio sequencer can disrupt its timing. This may cause occasional random audio clicks and pops, or as the problem gets worse an uncertain feeling to MIDI note timing, or sudden lurches in time. In really bad cases your sequencer may stop altogether in the middle of a recording, or MIDI data will get corrupted. You can often reduce these effects by increasing audio or MIDI buffer size, but of course this will increase latency as well.
In a multi‑threaded application each task can be given a different priority, and the relative priorities of MIDI and audio can for instance be adjusted in Cubase in the Audio Setup page. However, although you can juggle these priorities depending on whether you are running lots of audio or lots of MIDI tracks, it's still far more important to remove any other unnecessary background tasks, since these tend to be what push your buffers 'over the edge'.
The best solution is to remove any unnecessary background tasks once and for all, or at least while you record and play back music. I've covered the main offenders on various previous occasions, and have now posted an updated FAQ on this subject in the PC Music FAQ section of the new SOS forum. The quickest way to find this is to do a search through the titles only for 'background tasks'.
Don't forget to disable Auto Insert Notification for your CD‑ROM and CD‑R drives, since this can provide yet another periodic interruption every few seconds that can disturb MIDI timing or cause an audio glitch. Also, as I explain in some detail in this month's PC Notes column, if you have several applications running simultaneously such as a sequencer, stand‑alone soft synth, and synth editor, these can sometimes demand their peak processing requirements simultaneously, which can also disturb timing in an even more unpredictable way.
If your soundcard provides ASIO or EASI drivers then the synchronisation settings are already optimised for you. However, many consumer cards don't have these, so you'll have to choose either DirectSound or MME drivers. DirectSound drivers tend to provide lower latency, but can only be used during playback, and not for recording. For this reason many musicians still use higher‑latency MME drivers, and here there are various opportunities for MIDI and audio to slowly drift out of sync if the settings are incorrect. By far the easiest solution is to visit the web site of either your sequencer developer or soundcard manufacturer, and discover the recommended sequencer settings for your particular card.
For example, in Cubase VST you'll find the relevant setting in the Advanced Options page of ASIO Multimedia Setup, in the Sync Reference box. There are four options here: Sample Position Output, DMA Block Output, Sample Position Input, and DMA Block Input. Most modern soundcards can use one of the preferred Sample Position options, and in this case you are free to change the number and size of audio buffers to reduce latency. However, if you have to use a DMA Block sync reference you should always set the buffer size by running the 'Detect Buffer Size' function: the wrong value will result in slippage of MIDI to audio sync, and you can check this by running the 'Check Buffers and Sync' option.
Cakewalk runs its Wave Profiler utility to automatically discover the best settings for your soundcard, but if you still get problems a detailed set of instructions is available on their web site to determine the settings manually. They also maintain a fairly comprehensive table of recommended settings for different soundcards on the site, and using these should save a lot of time if yours is on the list. Emagic's Logic Audio Device Setup performs a similar function.
Some soundcards, like Terratec's EWS64, have two completely different audio playback options, and each requires a different Sync Reference. If you're using the EWS Wave Play driver you need to select DMA Block, but if you switch to EWS Codec Play then Sample Position should be used.
By far the easiest way to test your PC audio for clicks and pops is to use a 10kHz sine wave test signal at full digital level, which you can create in most audio editors including Wavelab, Sound Forge, and Cool Edit Pro. Either create a file several minutes long or loop a shorter one, and then play this back using your audio application. Since each 10kHz cycle will only use four or five samples at 44.1kHz, even a glitch one sample long should be clearly audible as a 'tick', and I've yet to review any USB audio peripheral that can pass this playback test.
Once you're happy that your sequencer and soundcard can play back the test signal with no problems, try recording it back through an analogue input on your soundcard to check for glitching during recording. If your soundcard has S/PDIF or AES‑EBU digital I/O you can also try transferring the signal to and from a DAT recorder to check that no bits get lost en route.
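If you don't have an audio editor to hand, a test file of this kind can also be generated with a few lines of Python using only the standard library. This is a minimal sketch assuming a 16-bit mono WAV file; the function names and the output filename are arbitrary.

```python
import math
import struct
import wave


def sine_samples(freq_hz=10_000, seconds=1.0, sample_rate=44_100, peak=32_767):
    """16-bit sample values for a sine wave at full digital level."""
    count = int(seconds * sample_rate)
    return [round(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate))
            for n in range(count)]


def write_mono_wav(path, samples, sample_rate=44_100):
    """Write the samples to a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # two bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Calling write_mono_wav("test_10k.wav", sine_samples(seconds=180)) would produce a three-minute file. Keep your monitoring level low when you play it, since a full-level 10kHz tone is unpleasant and can be hard on tweeters.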
To test for short‑term MIDI timing problems, create a quantised MIDI part containing continuous hi‑hats on every 16th note (i.e. four to each beat). Copy this part to a second MIDI track, and then allocate one track to a VST Instrument like Steinberg's LM9, and the other to one of your external MIDI synths. Now pan the VST Instrument hard left, and the MIDI synth hard right, and play the song back at 120bpm, so that successive hi‑hat hits should be exactly 125ms apart. Next, record the two signals into a stereo audio track. The sample‑accurate VST Instrument will appear on the left channel, while any timing discrepancy on your MIDI synth will show up as hits in the right channel ahead of or behind those of the left. Don't forget that there will be a largely fixed delay of a few milliseconds with all MIDI synths; it's the variation in timing from one hit to the next that's important here. Of course the results will depend on how many tracks there are running on your songs, and how many synths and MIDI ports you are using. For a real‑world test, add these two tracks to a complex song, and then turn down the volume of all the other synths but leave their MIDI data active, to see how bad the timing of the external MIDI synth gets under stress.
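Once you have read the hit positions off the recorded waveform, the numbers can be summarised with a short script. This sketch assumes you have measured the onset times by hand, in milliseconds, for the same hits on both channels; the function names and the sample figures in the note below are invented for illustration.

```python
from statistics import mean


def sixteenth_interval_ms(bpm: float) -> float:
    """Spacing between successive 16th notes at the given tempo."""
    return 60_000 / bpm / 4


def timing_report(reference_ms, measured_ms):
    """Return (fixed delay, earliest deviation, latest deviation) in ms.

    reference_ms: onsets of the VST Instrument (left channel).
    measured_ms:  onsets of the external MIDI synth (right channel).
    """
    offsets = [m - r for r, m in zip(reference_ms, measured_ms)]
    fixed_delay = mean(offsets)
    deviations = [o - fixed_delay for o in offsets]
    return fixed_delay, min(deviations), max(deviations)
```

For example, left-channel onsets of 0, 125, 250 and 375 against right-channel onsets of 4, 130, 253 and 379 give a fixed delay of 4ms with hits landing up to 1ms either side of it. The fixed part can be ignored or compensated for; the spread is the figure to watch.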