Software and hardware developments have made the 'studio in your PC' concept increasingly attractive, but many prospective users are put off by concerns about sound quality, timing, and the difficulty of adapting existing sample libraries. Many of these issues need no longer cause difficulty, however, as Martin Walker explains...
It's now just over a year since I last approached the subject of cramming a complete studio into a PC (SOS October '99), but a lot has happened in that time. First of all, PC processors have got much faster: the fastest model available at that time was a 550MHz Pentium III, costing just over £500, whereas you can now get a 750MHz one for under £400. This in itself makes the possibility of running all your synths and samplers as software applications more feasible. Moreover, soundcard drivers have provided us with lower and lower latency figures, making virtual instruments feel a lot more immediate to play in real time, and the increasing availability of VST Instruments has eliminated many of the problems that musicians were experiencing when attempting to run several stand‑alone software applications side‑by‑side without conflicts.
These facts would all seem to make the 'software studio' a more realistic goal. Many musicians, however, still view the approach with distrust, pointing to the possibility of long‑term unreliability, system conflicts causing occasional pops and clicks, and questioning aspects such as sound quality when compared to equivalent hardware. This month, therefore, I thought I'd bring you up to date with all of these issues — how easy or difficult it is to use soft synths or soft samplers, how their sound quality and timing compares to the hardware equivalent and, if you do decide to make the switch, whether you can carry on using existing sample libraries.
One of the biggest factors in convincing musicians that soft synths have come of age is the arrival in force of big‑name VST Instruments. Steinberg's Model E and Native Instruments' Pro Five (both reviewed in SOS April 2000) were the first major commercial releases that made people sit up and take notice, since they successfully modelled two classic analogue synths. Although the software versions probably wouldn't fool a die‑hard analogue aficionado, and certainly don't offer the same cachet as owning the real thing, they have already broadened the sonic palette of many a PC and Mac owner at a far lower price. The trend continued with the subsequent arrival of Waldorf's PPG Wave 2.V, reviewed in SOS September 2000.
For the consumer, loading a VST Instrument into Logic or Cubase is far simpler than trying to run a stand‑alone soft synth alongside a MIDI + Audio sequencer on one PC. The last year has also, however, seen many soundcard manufacturers address one of the major problems that had plagued free‑standing soft synth applications, by developing multi‑client audio drivers. This allows those with multi‑output soundcards to allocate one pair of stereo output channels to a soft synth while at the same time devoting the others to general‑purpose audio playback. With faster PCs and improved soundcard drivers, latency has also dropped (more on this later). Overall this means that even dedicated applications without VST Instrument or ReWire compatibility are still easier to get going alongside your MIDI + Audio sequencer than they were.
Many musicians who are used to hardware synths and samplers question whether a software version can ever sound as good as 'the real thing', and however sold you are on the concept of the software studio, it isn't a lot of use unless sound quality is up to scratch. Although well‑written and correctly set up applications and soundcard drivers should eliminate the sorts of glitches that can sometimes plague audio software, there still remains the question of whether this software, even when functioning correctly, can sound as good as a hardware equivalent.
Fundamentally, software synths and DSP‑based hardware synths both work in the same way, so there is no reason in principle why the former should suffer from inferior sound quality. There are, however, reasons why this is not always the case. One possible difference between hardware and software is that a hardware synth is a dedicated device, meaning that its developers can exploit the DSP to its maximum extent: on a software synth, by contrast, developers need to leave as much processor power as possible available for other applications. When developing a cross‑platform product, developers also tend to work with high‑level modules that are easier to port between the PC and Mac, but may not be as efficient as the lower‑level code used in hardware instruments.
A common complaint about the sound quality of software instruments concerns aliasing distortion, particularly with high‑pitched notes. Aliasing is a digital artifact which usually takes the form of ringing at frequencies not musically related to the source signal. Gordon Reid provided an excellent description of its causes in September's Synth Secrets, but in essence they are due to the presence of frequencies higher than half the sampling rate (the Nyquist frequency) which 'fold over' into the audible part of the spectrum to produce non‑harmonic tones.
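To put some numbers on the 'fold over' effect, here is a minimal Python sketch of the standard folding arithmetic. The function name alias_frequency and the 5kHz test note are my own choices for illustration, and an ideal system with no anti-aliasing filter at all is assumed.

```python
def alias_frequency(freq_hz, sample_rate_hz=44100.0):
    """Return the frequency that is actually heard when a component at
    freq_hz is represented at sample_rate_hz with no anti-aliasing filter."""
    nyquist = sample_rate_hz / 2.0
    # The sampled spectrum repeats every sample_rate_hz...
    folded = freq_hz % sample_rate_hz
    # ...and anything in the upper half of that range reflects about Nyquist.
    if folded > nyquist:
        folded = sample_rate_hz - folded
    return folded


if __name__ == "__main__":
    # Harmonics of a 5kHz note: those above 22.05kHz land on unrelated pitches.
    for n in range(1, 8):
        harmonic = 5000.0 * n
        print(n, harmonic, alias_frequency(harmonic))
```

The fifth harmonic of that 5kHz note, at 25kHz, comes back at 19.1kHz, which is not a multiple of 5kHz and therefore sounds non-harmonic.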
Frequencies higher than the Nyquist frequency are normally carefully filtered out during the digitising process when recording instruments from the outside world. This function is normally provided as an integral part of every A‑D converter and means that no attempt is made to store frequencies that will be misrepresented on playback. This can break down, however, with samplers. Unless you sample the highest note you want to replay, some samples will have to be replayed by transposing existing samples upwards. This will immediately mean that their upper harmonics move beyond the Nyquist frequency and must be carefully filtered out prior to playback to avoid aliasing. This applies to both hardware and software samplers, and is nothing new. However, when even freeware software samplers are starting to appear, musicians shouldn't assume that all developers take the same amount of care to ensure low levels of aliasing distortion — some devote a lot of development time to this issue, while others may not.
The easiest way to reduce the effects of aliasing distortion is to multisample, using a different sample every few notes so that none has to be transposed very far. This also results in more realistic acoustic sampled sounds, but of course it does make each sampled instrument far larger. However, in these days of PCs with 256MB or more of RAM, this needn't be as much of a problem as it is with hardware samplers.
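The benefit of multisampling can be estimated with a little arithmetic. The sketch below assumes equal-tempered transposition (each semitone multiplies every frequency by the twelfth root of two) and a source sample recorded at the same rate it is replayed at; the function name safe_bandwidth_hz is hypothetical.

```python
def safe_bandwidth_hz(semitones_up, sample_rate_hz=44100.0):
    """Highest frequency in the ORIGINAL sample that still lands at or below
    the Nyquist frequency after transposing up by semitones_up."""
    ratio = 2.0 ** (semitones_up / 12.0)  # equal-tempered pitch ratio
    return (sample_rate_hz / 2.0) / ratio


if __name__ == "__main__":
    # Compare a sample stretched over a whole octave with one that only
    # ever has to cover a few semitones.
    for semitones in (0, 3, 6, 12):
        print(semitones, round(safe_bandwidth_hz(semitones)))
```

Transposed up a full octave, everything above 11.025kHz in the source has to be filtered out before replay, whereas a sample that is only moved three semitones keeps its content up to roughly 18.5kHz.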
A sampler also needs to combine multiple samples at different pitches in real time. To do this you need to use asynchronous sample‑rate conversion (ASRC): each note normally gets converted back to a sample at 44.1kHz (or whatever sample frequency is being used) and then mixed with the others. Unfortunately, none of the many sample‑rate conversion methods is perfect, although most software developers do take pains to suppress the audible harmonic distortion as much as they can.
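As a rough illustration of what replaying a sample at a new pitch involves, here is a deliberately simple sketch using linear interpolation. This is my own choice of method for clarity, not the one any particular sampler is known to use, and it is one of the cruder options: commercial designs would normally use longer interpolation filters to keep distortion down. The name transpose_linear is hypothetical.

```python
def transpose_linear(samples, ratio):
    """Replay `samples` at `ratio` times the original pitch by stepping
    through the source at a fractional rate and interpolating linearly
    between neighbouring values. No anti-aliasing filter is applied."""
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        # At the very end of the sample there is no next value, so hold the last.
        nxt = samples[i + 1] if i < last else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += ratio
    return out


if __name__ == "__main__":
    ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(transpose_linear(ramp, 2.0))  # an octave up: every other point
    print(transpose_linear(ramp, 0.5))  # an octave down: points in between
```

In a real sampler each voice would run something like this with its own ratio, and the results would then be summed into the output at the common sample rate.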
The real challenge is to write software algorithms that are effective without using too much processing power. However, with the raw power available in modern PCs designers can often improve on hardware designs. For instance, some hardware samplers employ two independent mono signal paths to generate a stereo signal, which makes it far more difficult to keep them phase‑locked for the cleanest sounds. Also, many low‑cost stock DSP chips used in hardware samplers use fixed‑point calculations, whereas in a software synth or sampler 32‑bit floating‑point calculations can be used for greater dynamic range and internal headroom (see my review of Cubase 5.0 in SOS September 2000 for more details on this).
In theory, then, software samplers can sound every bit as good as hardware ones — in some cases even better — but if you're going to rely on a very cheap software sampler you should listen to its sound quality carefully. After all, many musicians claim that the mixing engines on different audio sequencers sound different, and these are only combining sounds recorded at the same sampling rate. When every note has to be independently sample‑rate‑converted, it's hardly surprising that sound quality may vary more radically.
When it comes to software synths the situation is slightly different. There are two main design approaches to software synthesizers: sampled and generated. In the first case, each oscillator waveform consists of a single looped cycle placed in a wavetable, just like a source sample in a software sampler. This is then passed through further virtual filters and amplifiers, and this design can suffer from exactly the same problems as software samplers when their waveforms are transposed upwards.
This problem can be reduced by having a set of wavetables for each basic waveform, with differing shapes depending on the pitch at which they will be used. So, for instance, the wavetable used to provide a square wave would be very nearly square at low frequencies, while the square wave used a couple of octaves higher would have slightly rounded edges as the offending harmonics are removed, and the 'square wave' used at 7kHz or above would have to be almost a sine wave, since most of its harmonics would be above 22.05kHz. Apparently this is the most common method used in soft synths, providing a reasonable compromise between processing power and low distortion. However, some aliasing will always be present, particularly with high‑pitched notes.
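The idea of pitch-dependent wavetables can be sketched as follows. This assumes the textbook additive recipe for a square wave (odd harmonics at amplitudes of 1/n) and simply leaves out every harmonic that would fall at or above the Nyquist frequency; the function names and the table size are mine, and real products may well build their tables differently.

```python
import math


def square_harmonics(note_hz, sample_rate_hz=44100.0):
    """Odd harmonic numbers of a square wave at note_hz that stay below Nyquist."""
    nyquist = sample_rate_hz / 2.0
    highest = int(nyquist // note_hz)
    return [n for n in range(1, highest + 1, 2) if n * note_hz < nyquist]


def bandlimited_square_table(note_hz, table_size=2048, sample_rate_hz=44100.0):
    """One cycle of a square wave built only from the harmonics that fit."""
    harmonics = square_harmonics(note_hz, sample_rate_hz)
    return [
        sum(math.sin(2.0 * math.pi * n * i / table_size) / n for n in harmonics)
        for i in range(table_size)
    ]


if __name__ == "__main__":
    for note in (100.0, 1000.0, 7000.0, 8000.0):
        print(note, len(square_harmonics(note)), "harmonics kept")
```

At 100Hz there is room for 110 odd harmonics, so the table looks very square; by 8kHz only the fundamental survives and the 'square wave' is a pure sine.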
The other main way to produce digital waveforms is to generate them mathematically, and then only 'downsample' them to 44.1kHz just before sending them to the D‑A converters on the soundcard so that you can hear them. Examples of this approach include physical modelling synths such as Seer Systems' Reality (reviewed in SOS November 1997) and the AAS Tassman (reviewed in SOS July 2000), and some analogue soft synths like TC Works' Spark Modular. The mathematically generated waveforms themselves are usually created at a high sampling rate and thus don't suffer from audible aliasing. However, the downsampling process has to incorporate an anti‑aliasing filter to eliminate frequencies higher than half the output sample rate, and the design of these filters can vary in quality. It's possible that some low‑budget software synths might use significantly worse anti‑aliasing filters than more expensive programs or hardware synths, but there's no reason in principle why a soft synth must do this less well than a hardware equivalent.
Some synths of this design allow you to choose the internal sampling rate (NI's Reaktor, for instance, provides sample rates of up to 132kHz), but of course this increases processor overhead proportionally. If you find that you can only play up to a certain note before nasty noises cut in, you may be able to increase this by an octave by doubling the sampling rate, but this will also double the CPU overhead.
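The octave-for-double-the-CPU trade-off follows directly from the Nyquist limit, as this small sketch shows. It assumes a waveform whose character depends on a fixed number of harmonics (harmonics_needed is a made-up parameter) and takes the proportional CPU scaling described above at face value.

```python
def highest_clean_note_hz(internal_rate_hz, harmonics_needed):
    """Highest fundamental whose top required harmonic still fits below
    half the internal sampling rate."""
    return (internal_rate_hz / 2.0) / harmonics_needed


def relative_cpu_cost(internal_rate_hz, base_rate_hz=44100.0):
    """CPU load relative to running at base_rate_hz, assuming the cost
    scales in direct proportion to the sampling rate."""
    return internal_rate_hz / base_rate_hz


if __name__ == "__main__":
    for rate in (44100.0, 88200.0):
        print(rate, highest_clean_note_hz(rate, 10), relative_cpu_cost(rate))
```

With ten harmonics to preserve, the ceiling moves from 2205Hz at 44.1kHz to 4410Hz at 88.2kHz: exactly one octave, for twice the processing.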
In principle, then, carefully designed software synths can sound just as good as any other DSP‑based synth. Once again, though, the only way to be sure is to use your ears: don't assume that a freeware VST Instrument will sound as good as a £150 commercial release. Overall it seems comparatively easy to write code to generate waveforms, but considerably harder to develop high‑quality anti‑aliasing filters to keep the nasties at bay.
An important issue for many musicians considering the move from a hardware sampler to a software‑based one is whether or not their existing sample library on CD‑ROM or dedicated SCSI hard drive can still be used. Thankfully most commercial developers recognise this, and the majority of major computer‑based samplers such as Bitheadz's Unity DS1, Creamware's Powersampler, and Nemesys' Gigasampler range have built‑in facilities to extract samples and/or programs from Akai‑format CD‑ROMs, import them into the application, and save them in the appropriate new format. However, importing and converting samples before you can audition them can take some time, and is not very practical if you have a large sample library. To overcome this problem you really need a dedicated utility that can grab multiple sounds or even an entire CD‑ROM at a time, convert it to the new format, and save it on your PC hard drive for immediate use.
Gigasampler and Gigastudio owners can already do this using the integrated Sconverter utility. This lets you view Akai‑format CD‑ROMs like any other attached drive, and then convert your selections into Gig or WAV‑format files. It also provides batch facilities so that you can select any combination of programs, volumes, or partitions for simultaneous conversion.
Unity DS1 owners with large sample libraries on CD‑ROM will benefit from the Osmosis sample conversion utility from Bitheadz. This reads both Akai S1000/S3000 and Roland 760/770‑format CD‑ROMs, Zip disks, and sampler‑formatted hard drives, and can then convert their program data into Unity DS1 or SampleCell formats, and the samples into Unity DS1, AIFF, or WAV formats. Using a clean single‑window interface, it displays the contents of CD‑ROMs in a similar way to Windows Explorer (see screenshot, left), with nested folders showing partitions and the volumes within them so that you can navigate to any program or sample.
You can audition samples either in single‑shot or looped modes by clicking on the appropriate transport buttons; select any number of items for conversion using the mouse; or convert the entire disk. Various options make the process more transparent — if All Parameters is selected the Akai or Roland envelopes and LFO settings are used rather than default ones, Merge Stereo Samples combines samples ending in L and R to a single phase‑locked stereo file, and Roland De‑emphasis compensates for the high‑frequency response of some Roland samplers. Osmosis works well, and at around £125 will please many Unity DS1 owners.
For those whose PC‑based samplers use other formats, such as the SoundFonts used by the SB Live! and Emu APS soundcards and Seer Systems' Reality, or other formats like the MAP files of NI's Reaktor and STS files of Creamware's Powersampler, the best third‑party utility that I've found to date is CDxtract. Now at version 3.3, it is able to read Akai S1000/S3000, Roland S7xx, and SampleCell/PC files not only from CD‑ROMs but also from Zip drives, MO, Jaz cartridges, and in fact any hard drive formatted by one of these samplers.
Its main window is divided into three data columns and a display area. The left‑hand column shows a list of Partitions; in the middle are the Volumes contained within the currently selected Partition, and on the right is the list of Programs and Samples contained in the currently selected Volume. The column headings change to Volumes, Performances, Patches and Samples with Roland CD‑ROMs. If you select a Program the right‑hand resizable area displays its total size and the keygroup data, and clicking on the keyboard button beneath opens a further graphic window showing velocity and key splits for each multisample, along with further information including any modulation and amplitude envelopes. If you select a Sample the right‑hand area instead displays its waveform, complete with loop point. You can either auto‑play each sample when you click on it (which saves a lot of time compared with Osmosis), or click on the Play button.
You can save Samples in WAV, AIFF, or MP3 formats, and Programs can be saved along with their respective samples in SoundFont, Mesa, S5000/S6000, Reaktor, Pulsar, or Gigasampler formats. Various options exist when saving samples in WAV format: you can save only one‑shot samples, only looped ones, or just the looped portions (ideal for Sonic Foundry Acid owners). Selecting Volumes or Partitions lets you save all the Programs contained within them, and you can create an identical directory structure on your destination hard drive, and a detailed contents Report in HTML format.
Given the number of different sounds you can get on an Akai CD‑ROM, many musicians will find CDxtract valuable just for its Akai database and search functions. If you activate the Database function, the directory contents of every imported Akai CD‑ROM can be added to your database. Clicking on the Search function then opens the CDxplore window, where you can enter any text string to find matching Volume, Program, or Sample names.
CDxtract is incredibly quick and easy to use, and makes finding or converting sounds about as simple as it could be. At just 79 Euros (£48.20), the Full version is also extremely good value for money; for those who only need to save in SoundFont format, along with WAV, AIFF, and MP3, and can manage without the database, the Multimedia version is just 44 Euros (£26.85). A more expensive Publisher version is also available for those wanting to use CDxtract to convert commercial sample libraries.
Finally, another even more comprehensive conversion utility is Chicken Systems' Translator, which has an impressive list of supported formats including Akai, Emu, Ensoniq, Korg Trinity/Triton, Kurzweil, Roland, and Yamaha, along with a wide selection of software sampler destinations. It's also intended for musicians converting between hardware sampler formats, and looks extremely powerful. However, when I looked at the web site there was still no downloadable demo, and many of the formats haven't yet been implemented in the current release, so it's too early to judge. The full price of $149.95 will also put some musicians off, although cut‑down 1‑in/1‑out format versions are available at $49.95.
The other major issue which has put many people off software instruments is that of timing. Most musicians are aware of the unique problems of latency (if you're not, see SOS April 1999). The latency of a system is the delay between an input signal arriving, and the software‑monitored version being heard. In the case of soft synths and samplers, this means a delay between pressing a MIDI note and hearing the software‑generated audio output. Hardware synths and samplers also have latency, but here it is totally under the designer's control and is thus usually reduced to a negligible amount. In the case of software products it's largely down to the soundcard drivers and soft synth application, and can therefore vary a great deal.
The development of ever‑faster PC processors and hard drives, along with improved soundcard drivers, has resulted in typical latency values for soundcards with ASIO and EASI drivers dropping to under 10mS. This makes it feasible to monitor audio inputs with software effects without hearing an obvious delay, and to play software instruments in real time without them feeling 'sluggish'. Sadly, though, there is another issue that can cause sloppy timing: latency jitter. Let me explain the difference between the two, and how jitter can be prevented.
Although soundcard drivers provide very smooth continuous recording and playback of digital audio, the situation changes when you are triggering soft synths or soft samplers. With a typical buffer size of, say, 512 samples, giving a latency of about 12mS (512/44100 seconds, at 44.1kHz), the time it takes to start a new sound triggered from a MIDI signal depends on when the trigger arrives in relation to the emptying or filling of the buffer. With ASIO drivers, two buffers are used (one is filled while the other is being played back), so the shortest time it can take to react is one buffer, or 12mS, and this will happen if the MIDI signal arrives just before the changeover between the two.
However, if the MIDI trigger happens to arrive just after the latest buffer has been output, it has to wait 12mS before this new sound can be copied into the next buffer, which will itself take 12mS to be output, resulting in a latency of 24mS, double the previous value. These are the two extremes, so the delay between a MIDI trigger and the resulting soft synth or soft sampler sound will be anywhere between 12 and 24mS when the latency is 12mS. This can improve with software that uses more but even smaller buffers: Gigasampler, for instance, uses three buffers with a size of 128 samples, resulting in a soft synth latency that varies between 6 and 9mS.
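The figures above can be reproduced with a simple model, sketched here in Python. It treats the buffer sizes as counts of sample frames, which is what the 512/44100 arithmetic implies, and assumes a trigger waits between one buffer fewer than the total and the full set of buffers before it is heard. The function name is hypothetical and the model ignores any extra delay in the MIDI interface or converters.

```python
def trigger_latency_range_ms(buffer_samples, num_buffers=2, sample_rate_hz=44100.0):
    """Return (shortest, longest) delay in milliseconds between a MIDI
    trigger and the resulting sound for a simple multi-buffer scheme."""
    buffer_ms = 1000.0 * buffer_samples / sample_rate_hz
    shortest = (num_buffers - 1) * buffer_ms  # trigger just before a changeover
    longest = num_buffers * buffer_ms         # trigger just after a changeover
    return shortest, longest


if __name__ == "__main__":
    for size, count in ((512, 2), (128, 3)):
        low, high = trigger_latency_range_ms(size, count)
        print(size, count, round(low, 1), round(high, 1), "jitter", round(high - low, 1))
```

Two buffers of 512 give roughly 11.6 to 23.2mS, and three of 128 give roughly 5.8 to 8.7mS, which round to the 12 to 24mS and 6 to 9mS quoted. In this model the jitter is always one buffer's worth, so smaller buffers shrink it directly.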
Software applications can automatically take into account the fixed soundcard latency value during playback, so the entire part won't sound consistently delayed. However, latency jitter can make individual beats within a track sound out of time. Let's consider a strictly quantised percussion track where every beat should occur exactly on a quarter note. Thanks to latency jitter, the timing will actually vary: the longest time between two adjacent beats is when the first occurs at the desired time and the next is delayed by the maximum latency jitter. The shortest is when the first beat is delayed by the maximum latency jitter, and the following one is spot‑on. Unfortunately, the interval between adjacent beats can therefore vary by as much as twice the latency jitter value.
So, if your PC system and soundcard drivers are capable of reliably using a latency value of 12mS or so without glitching, this will result in a maximum timing variation of 24mS between consecutive beats — not disastrous, but certainly audible. However, some musicians use a low latency setting to achieve good real‑time performance with soft synths, and then increase it to get reliable playback with lots of simultaneous audio tracks. In this case, with a latency of say 24mS, the playback timing between regular notes in a drum or percussion track may vary by up to 48mS — nearly a tenth of a beat at 120bpm!
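For anyone who wants to try their own settings, the arithmetic behind these figures is sketched below. The function names are mine, and the jitter is taken to be equal to the nominal latency value, as in the two-buffer ASIO example.

```python
def max_beat_variation_ms(jitter_ms):
    """Worst-case spread in the interval between two adjacent beats:
    one beat spot-on and its neighbour late by the full jitter, either way round."""
    return 2.0 * jitter_ms


def fraction_of_beat(variation_ms, bpm=120.0):
    """Express a timing variation as a fraction of one beat at the given tempo."""
    beat_ms = 60000.0 / bpm
    return variation_ms / beat_ms


if __name__ == "__main__":
    for jitter in (12.0, 24.0):
        spread = max_beat_variation_ms(jitter)
        print(jitter, spread, fraction_of_beat(spread))
```

At 120bpm a beat lasts 500mS, so a 48mS spread is 0.096 of a beat.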
This latency jitter is the reason why some musicians complain about their soft synth and soft sampler tracks not being 'tight'. Unless special steps are taken, all soft synths and soft samplers can suffer from this, although there is a cure available. The solution is for the software to always delay sending the generated waveform to the buffers until the entire current buffer cycle has finished, when it can place the waveform at exactly the required position within the buffer. Unfortunately, while completely removing latency jitter, this also doubles the latency value for ASIO drivers.
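The cure can be pictured like this: rather than starting each note at the beginning of the next free buffer, the software adds the same fixed delay to every trigger and then starts the note part-way through whichever buffer that time falls in. The sketch below is only a model of that bookkeeping, with times counted in samples and a hypothetical schedule_note function; it does not describe how any particular product implements it.

```python
def schedule_note(event_time_samples, buffer_samples=512, fixed_delay_buffers=2):
    """Return (buffer_index, offset_in_buffer) at which a note triggered at
    event_time_samples should begin, using a constant delay for every note."""
    start = event_time_samples + fixed_delay_buffers * buffer_samples
    return divmod(start, buffer_samples)


if __name__ == "__main__":
    # Two triggers exactly 1000 samples apart...
    first = schedule_note(100)
    second = schedule_note(1100)
    print(first, second)
    # ...start exactly 1000 samples apart in the output, with no jitter.
    spacing = (second[0] * 512 + second[1]) - (first[0] * 512 + first[1])
    print(spacing)
```

The price is visible in the fixed_delay_buffers value: every note now waits the full two buffers, which is the doubling of latency mentioned above.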
The timing of VST Instruments is sample‑accurate during playback, since the MIDI events are time‑stamped, but currently both the VST Instruments of Cubase 5.0 and Emagic's instruments in Logic Audio 4.5 suffer from latency jitter when recording, although Logic 4.2 apparently didn't.
However, some developers already provide a choice. For instance, Martin Fay's VAZ Modular (reviewed in SOS March 2000) has a main MME Buffer Size control that you set to the lowest value that gives stutter‑free audio. You can either accept this lowest but jittery timing, or set the associated Latency Trim control to a suitable jitter value to provide higher overall latency but more stable timing. Matthias Carsten of RME has just placed a very thorough and thought‑provoking article about this whole subject in the Tech Info section of the RME web site, complete with explanatory diagrams (www.rme‑audio.com).
As long as your PC is fairly powerful, well set up, and has a high‑quality soundcard, your software sounds should emerge just as cleanly as from a rackmount synth or sampler; you should be able to use your existing sample libraries; and timing problems should not be insurmountable. If you have enough computing power then your PC can probably run a MIDI + Audio sequencer alongside your synth or sampler, although this will prove somewhat easier if you use integrated software instruments rather than stand‑alone applications.
The main limitation still seems to be processing power — or rather the lack of it — and you'll need a powerful computer to provide equivalent polyphony to a modern hardware synth. One exception is Nemesys' Gigastudio, which perhaps gives us a glimpse of what may be commonplace in a couple of years' time: by streaming sampled voices direct from your hard drive, it's possible to achieve 160‑voice polyphony with an 800MHz Pentium III or Athlon processor, as well as non‑looping samples of virtually unrestricted length. Show me the hardware sampler that can do that!
I've discussed the basics of software synthesizers and samplers several times in the past, but here are the main points again:
- Sound quality will depend on how well the developer has written the software algorithms, but ultimately the quality of the sound you hear will be determined by your soundcard's D‑A converters.
- If you have a CD‑R drive, and intend to burn an audio CD after digitally mastering your music, the quality of the soundcard may be irrelevant, since it will only be used for monitoring purposes.
- Latency (the time between pressing a note on your MIDI keyboard and hearing the sound generated) depends partly on the synth software, but far more on the soundcard driver. ASIO or EASI drivers will nearly always provide lower latencies than DirectSound ones, with MME ones being slowest and in some cases almost unplayable as a consequence. Check on the software developer's web site to see how your soundcard fares before buying software.
- Drivers dedicated to one application, such as the GSIF ones used by Gigasampler and EASI ones used by Logic Audio, should always provide lower latencies than the other choices available, since they are more closely integrated with specific applications.
If the concept of using software synths and samplers to replace hardware is new to you, I've already written quite a few features that explain specific aspects of their performance. These back issues are also available in electronic form on the SOS web site (https://web.archive.org/web/2015...):
- Using multiple soundcards (February 1999).
- Latency (April 1999).
- Connecting audio & MIDI signals inside PC music software (May 1999).
- The all‑in‑one PC studio (October 1999).
- Understanding and using multi‑client soundcard drivers (November 1999).
- Improving specific musical aspects of PC performance (February 2000).