Granular synthesis is the core technology behind the latest time-stretching and pitch-shifting algorithms, but it can also be used to generate extraordinary evolving soundscapes. We explain how the process works and show you how to get the best from the software that uses it.
The majority of software instruments use variations on the synthesis method known as subtractive synthesis. This is the sound generation method where you start with simple (yet harmonically rich) waveforms such as triangle, square, and sawtooth waves, then use volume envelopes, filters, filter envelopes, and LFOs (Low Frequency Oscillators) to sculpt the starting sound into something more musical. The reasons why subtractive synthesis is so dominant are both historical and practical. The historical reason is that most of the synthesizers that shaped the development of electronic music production (the classic analogue Moogs, ARPs, Korgs, and so on) used this scheme. Hence subtractive software synthesizers are commonly known as 'virtual analogue' instruments. These instruments are what musicians are accustomed to using, and they make characteristic sounds that have become part of the common musical sound repertoire. On a practical level, these synths are relatively easy to learn, and can be modelled in software without using a huge amount of processing power. It is probably for this last reason that subtractive synths, and straightforward sample-playback instruments, have taken such a lead in desktop music. However, as computers have become much faster, digital signal processing techniques that were once the preserve of academic labs and telephone companies are finding a strong foothold in music software.
The technique known rather grandly as granular synthesis is an extremely powerful audio manipulation system that makes it possible to adjust the speed, pitch, and formant characteristics of audio samples independently of one another, and all in real time if your computer is fast enough. However, granular synthesis principles can also create new and often spectacular shifting sounds using very basic means.
Acid, now at version 5 and under the care of Sony, was the first to make real-time pitch and time adjustment well known, and nowadays most people into computer music will have played with Ableton's tempo-warping Live software, not to mention Apple's GarageBand. Celemony's Melodyne is now arguably the purest and most sophisticated package for editing audio using granular synthesis, managing to carve out a niche alongside the mighty Auto-Tune.
Native Instruments Reaktor has always had this technology right at its heart, but focuses on the creative sound-design possibilities of the granular approach. NI's work in this area has led to the powerful time, pitch, and timbre manipulation in their Kontakt, Intakt, and Absynth packages, finally blurring the line between samplers and synthesizers. Propellerhead's Reason package also contains a granular synth, Malström, and even Fruity Loops Studio has the Granulizer.
The aim of this article is to explain the basics of how granular synthesis works (for those with an interest in these things), and also to describe some examples of it in action. For those who use software with granular synthesis technology under the bonnet — whether it is for time/pitch manipulation or sound generation — understanding how it works should shed some light on how to approach some of the more esoteric parameters like Grain Size or Smoothing.
Granular Synthesis 101
Have you ever wondered how some audio-editing programs and plug-ins can manipulate the tempo and pitch of audio independently? Normally, of course, the laws of physics tie these two parameters together: slow audio down and the pitch drops proportionally. The screenshot on the facing page shows some very close-up views of audio waveforms in Pro Tools. The top track is a very short section of a female vocal recording. The point we're looking at is part of the 'ooo' sound in the word 'you'. The second track holds the same audio clip, slowed down dramatically using Pro Tools' built-in Time Stretch plug-in. Notice how the waveform itself has not been stretched: that would cause a drop in pitch, because pitch is inversely proportional to wavelength. Instead, the Time Stretch algorithm has detected a repeating wave pattern, and simply looped it to achieve the extra length. The third track shows the original clip transposed up seven semitones by the Pitch Shift plug-in. The original waveform has been squashed horizontally (in time) to achieve an increase in pitch, so again the algorithm has had to loop the waveform, this time in order to preserve the length.
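To put rough numbers on that third track, here is a small sketch. It assumes equal temperament, where each semitone multiplies frequency by the twelfth root of two; the function and variable names are ours, not anything from Pro Tools.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift of `semitones` (equal temperament assumed)."""
    return 2.0 ** (semitones / 12.0)


ratio = semitone_ratio(7)            # roughly 1.5: the waveform is squashed by this factor
squashed_length = 1.0 / ratio        # fraction of the original duration left after squashing
to_fill_by_looping = 1.0 - squashed_length

print(f"ratio {ratio:.3f}, squashed to {squashed_length:.1%}, "
      f"{to_fill_by_looping:.1%} to be filled by looping")
```

In other words, shifting up seven semitones by squashing alone would leave the clip at about two thirds of its length, so roughly a third of the duration has to be made up by repeating cycles.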
This scheme works because, although most sounds seem to our ears to change and develop quickly, when you zoom in and look at even the most complex waveforms (like speech) you see that, in fact, many parts of harmonic and vocal sounds consist of steady periods of a repeating waveform, with short transitions in between. A little experiment makes this clearer: try saying your name really slowly and listen to the sound you make. For me that goes something like, 'sss-aah-eee-mmm-nnn'. Your voice moves from one consistent steady sound to another, except for when you get to hard consonants like 'k' or 't' (see the 'Drums & Transients' box for more on this). At the waveform level, steady sounds appear as many cycles of the same small wave shape. So, if I recorded myself saying my name at normal speed into Pro Tools, I could zoom in and painstakingly loop the waveforms during each section of the word (crossfading the edit points), and end up with something that sounded similar to me saying the word slowly, but at the same pitch. Conversely, I could speed myself up by deleting some of the cycles from each portion of the word. This is the basis of how time-stretching works.
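The manual looping and deleting described above can be sketched in a few lines of Python. This is a deliberately naive illustration: it assumes a perfectly steady tone whose cycle length (`period`) is already known, and it skips the crossfading a real editor would need. All the function names are made up for the example.

```python
import math


def make_tone(period: int, cycles: int) -> list[float]:
    """A steady 'vowel-like' tone: `cycles` repeats of one sine cycle."""
    one_cycle = [math.sin(2 * math.pi * n / period) for n in range(period)]
    return one_cycle * cycles


def stretch_by_looping(samples: list[float], period: int, repeats: int) -> list[float]:
    """Play every cycle `repeats` times: longer, but the cycle length is unchanged."""
    out: list[float] = []
    for start in range(0, len(samples) - period + 1, period):
        out.extend(samples[start:start + period] * repeats)
    return out


def shrink_by_dropping(samples: list[float], period: int, keep_every: int) -> list[float]:
    """Keep only every `keep_every`-th cycle: shorter, same cycle length."""
    out: list[float] = []
    for k, start in enumerate(range(0, len(samples) - period + 1, period)):
        if k % keep_every == 0:
            out.extend(samples[start:start + period])
    return out


tone = make_tone(period=100, cycles=8)
slow = stretch_by_looping(tone, period=100, repeats=2)     # twice as long
fast = shrink_by_dropping(tone, period=100, keep_every=2)  # half as long
```

Because each cycle keeps its original length in samples, the pitch is untouched; only the number of cycles, and therefore the duration, changes.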
Now, say that I took a few cycles of waveform from each sound in my name, and mapped them as loops onto keys in a sampler. One key would give a steady 'sssssss', another a steady 'ahhhhh', and so on. Now if I pressed each key in rapid succession it would (roughly) re-synthesize the original recording using these 'grains' of sound. If I played the sequence of keys faster, the word would be reconstructed faster, but the pitch would stay the same. Also, I could push the pitch-bend wheel to pitch up the samples, but still play the key sequence at any speed I liked. What's more, I could play back the sequence in any order, and even make the sounds overlap by holding down more than one key at a time, generating an entirely new and more complicated sound. This is how granular synthesis works.
Granular synthesis is a catch-all term for a number of different audio systems that work by using tiny snippets of sound that can be manipulated individually and are recombined to generate the final output. The majority of granular systems available use audio files/samples as their raw material. Samples are sliced up (behind the scenes) into a series of tiny sections, each usually between one hundredth and one tenth of a second (10 to 100 milliseconds) in duration. Each slice is known as a 'grain', and a sequence of grains is called a 'graintable'. If the software made up a graintable which played back all the grains extracted from a given sample in their original sequence and at the original speed, then you'd hear the original sample reproduced. If the software played the sequence back more slowly, gaps would appear between the slices, so the current slice in the graintable is usually looped. Played back more quickly, each grain overlaps with the next one, or some grains get skipped depending on how the software works. To avoid clicks and glitches, each grain is faded in and out with a volume envelope, a process known as 'smoothing'.
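For the curious, the slicing, smoothing, and recombining can be sketched as a simple 'overlap-add' loop. This is a minimal illustration under our own assumptions (fixed grain size, 50 percent overlap, a Hann-shaped fade); it is not how any particular product does it, and commercial tools align grains to the waveform far more cleverly. At slow speeds the read point creeps forward so that successive grains re-read much of the same material, which is this sketch's stand-in for looping the current grain.

```python
import math


def hann_window(size: int) -> list[float]:
    """Fade-in/fade-out envelope; at 50% overlap neighbouring copies sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]


def granular_stretch(samples: list[float], grain_size: int, speed: float) -> list[float]:
    """Overlap-add grains: the read point moves at `speed`, the write point at 1x."""
    hop = grain_size // 2                      # grains overlap by half
    window = hann_window(grain_size)
    n_grains = int((len(samples) - grain_size) / (hop * speed)) + 1
    out = [0.0] * ((n_grains - 1) * hop + grain_size)
    for g in range(n_grains):
        read = int(g * hop * speed)            # where this grain is taken from
        write = g * hop                        # where it lands in the output
        for i in range(grain_size):
            out[write + i] += samples[read + i] * window[i]
    return out


sample_rate = 44100
grain_size = 882                               # 20 ms, inside the 10-100 ms range
source = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate // 2)]
half_speed = granular_stretch(source, grain_size, speed=0.5)
double_speed = granular_stretch(source, grain_size, speed=2.0)
```

Because the grains here are not lined up with the cycles of the waveform, overlapping grains can partly cancel on pitched material; that is one reason real time-stretchers analyse the audio before choosing where to cut.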
Warping Time & Pitch With Granular Synthesis
Native Instruments' Intakt is a loop playback and manipulation tool with three 'audio engines'. As well as basic sample playback and a beat-slicing mode (handling each rhythmic event in a loop as a separate sample), Intakt has a granular Time Machine mode. To the left of the waveform display you get two knobs, both marked Tempo. The smaller one on the right gives you manual control over the speed of playback of the sample, and is great for experimenting with how granular time-stretching works and sounds. While playing the sample, if you turn this knob anticlockwise the playback gradually slows down while maintaining the original pitch. If you go to extremes, you should be able to hear what is happening: at about five percent of the original speed you can clearly hear the playback graintable stepping from one grain to the next, each grain being looped until the next one takes over.
To the right of the waveform are some controls that show up in different guises in most granular synthesis-based software. The first is the Grain Size control, which is a pop-up list of options in Intakt. Grain Size is the length of each slice of sound, determining how finely the original sample is chopped up. In Intakt, the list gives suggestions for which grain size to use to obtain the most transparent results for different types of material. There are similar parameters in the Warp section of Ableton's Live software — again, rather than a continuous Grain Size control, a list of options is provided: Beats, Tones, and Textures.
Granular Samplers & Synths
Tools such as Intakt, Melodyne, and Live use granular synthesis to edit and match the tempos, timings, and keys of recorded audio clips. A whole other breed of products uses granulated samples as the source sounds for instruments. The likes of Absynth, Malström, and Kontakt all use the familiar synth/sampler instrument structure, with sound generators being modulated and filtered. The difference is that they can all swap the usual sound-generation stage of oscillators or samples for granular synthesis engines. A detailed analysis of how this works in Malström can be found in the Reason Notes column in SOS August 2005. The same principle applies to other current granular synths. When used in granular mode, each sound source in the instrument is a granulated sample: a graintable. Some instruments allow the user to load their own sample (for example Reaktor, Kontakt, and Absynth), while others provide preset waveforms (Malström).
Looking at Reason's Malström, in the Oscillator A section (the sound-generating module of the synth), there is a small pop-up window which selects the sample (graintable) that is to be used as the starting point. In the screenshot, I've chosen Ambient Chord 2. The other parameters on the Oscillator A module should now begin to make sense. The Index slider sets the starting position for playback in the graintable, and the entire sample is mapped out along this slider. The Motion control simply sets the speed at which Malström sweeps through the graintable, and the main pitch settings transpose the sample (speed and pitch changes are, of course, independent). Finally, the Shift knob provides independent control over the formant characteristics of the sound.
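To see why position, sweep speed, and pitch can be independent, here is a hedged sketch of a graintable player. The parameter names `index` and `motion` are borrowed from the Malström panel purely as labels; the code is our own guess at the general principle, not Propellerhead's algorithm, and it leaves formant shifting out entirely. Motion decides where each new grain starts, while pitch decides how fast the samples inside each grain are read.

```python
import math


def read_interpolated(samples: list[float], pos: float) -> float:
    """Linear interpolation between neighbouring samples, wrapping at the end."""
    i = int(pos)
    frac = pos - i
    a = samples[i % len(samples)]
    b = samples[(i + 1) % len(samples)]
    return a + (b - a) * frac


def play_graintable(samples, index, motion, pitch_ratio, grain_size, n_grains):
    """Assumes index >= 0 and motion >= 0; output length ignores pitch_ratio."""
    hop = grain_size // 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / grain_size)
              for i in range(grain_size)]
    out = [0.0] * ((n_grains - 1) * hop + grain_size)
    for g in range(n_grains):
        start = index + g * hop * motion       # Motion: how fast the read point travels
        for i in range(grain_size):
            # Pitch: how quickly we step through the samples inside one grain
            sample = read_interpolated(samples, start + i * pitch_ratio)
            out[g * hop + i] += sample * window[i]
    return out


# Test tone with a 50-sample cycle, chosen so that grains line up in phase.
table = [math.sin(2 * math.pi * n / 50 + 0.3) for n in range(4000)]
normal = play_graintable(table, index=0, motion=1.0, pitch_ratio=1.0,
                         grain_size=400, n_grains=10)
octave_up = play_graintable(table, index=0, motion=1.0, pitch_ratio=2.0,
                            grain_size=400, n_grains=10)
frozen = play_graintable(table, index=1000, motion=0.0, pitch_ratio=1.0,
                         grain_size=400, n_grains=10)
```

All three results have the same duration: `octave_up` is an octave higher, and `frozen` sits on one spot in the sample for as long as you like.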
With all these controls at their zero positions, Malström behaves like an ordinary sampler, with the significant advantage that playing up and down the keyboard does not speed up and slow down the sample: it's like having a multisample map, but without having to have more than one sample. Beyond this, there is a huge amount of flexibility, and you can quickly move away from the starting point to make radically different sounds. All the controls can be modulated with Malström's LFOs, and it's the sweeping of the parameters that gives granular synthesizers their characteristically rich and 'alive' sound. Something you can do with Malström is modulate the Index control, or sample position. As we'll see when we look at Reaktor, this is one of the most valuable tools for creating deep granular sounds and atmospheres. Playing around with the graintable position and playback characteristics means that one sample can provide the material to generate a huge variety of unexpected results.
Kontakt cannot modulate the sample position (although you can create loop points), but it does give control over some other parameters that are preset in Malström's graintables. Specifically, it features controls for Grain Size and Smoothing. In Malström, with its preloaded list of graintables, the grain size has been preset to provide the most transparent response. Because Kontakt can load any audio file as its starting point, the user must set the grain size. This means that you can forget about transparency if you wish, and go for a more grungy sound. You can also modulate the grain size via an LFO or envelope. The Smoothing parameter, to recap, is the volume envelope applied to each individual grain, so it's effectively a fade-in/out control. Again, you can set this to produce a nice even response, or go for a special effect.
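As a rough picture of what a Smoothing control might do, the sketch below builds a per-grain volume envelope with an adjustable fade. The mapping of a 0-to-1 `smoothing` value onto linear fades is purely our assumption for illustration; Kontakt's actual curve is not described here.

```python
def grain_envelope(grain_size: int, smoothing: float) -> list[float]:
    """Per-grain gain curve. `smoothing` (0..1) is the fraction of the grain
    spent fading, split equally between fade-in and fade-out (hypothetical)."""
    fade = int(grain_size * smoothing / 2)
    envelope = [1.0] * grain_size
    for i in range(fade):
        gain = (i + 1) / (fade + 1)
        envelope[i] = gain                      # fade in
        envelope[grain_size - 1 - i] = gain     # fade out
    return envelope


hard_edged = grain_envelope(8, smoothing=0.0)   # no fades: prone to clicks
gentle = grain_envelope(8, smoothing=0.5)       # a quarter fading in, a quarter out
```

With no smoothing every grain starts and stops abruptly, which is where the grungy, buzzy special effects come from; longer fades give the more even response.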
The last synth I want to look at is Absynth, because it features yet another parameter, leading us towards the full implementation of granular synthesis found in Reaktor. Most of the parameters in Absynth 2 should now be familiar, but a Density setting has also been added. This sets the number of grains that can be playing back at once, which in Absynth's case can be between one and eight. All the examples we've looked at before can be likened to having a single 'play head' sweeping around the graintable in a mostly linear fashion. However, granular synthesis gets really interesting as a sound-design tool when you start firing off multiple grains simultaneously, and not necessarily in sequential order. Absynth doesn't go quite this far: its Density control just provides for varying grain overlap, which means that you can have several neighbouring grains firing at once as the graintable is played. This smooths out and thickens the sound, but inevitably adds a metallic or phasey characteristic, as you are overlapping a series of similar-sounding grains with a tiny delay between them.
Drums & Transients
Percussive sounds and drum loops pose some fairly major challenges to granular synthesis engines, especially when you want to time-stretch samples to slow them down. Granular time-stretching relies on the fact that a lot of what we hear consists of repeated cycles of small waveforms, but transients (like drum hits and hard consonants in vocals/speech) are quite different. These parts of a sound are typically short, complicated, rapidly changing waveforms. When a sample is split into grains, the transients may fall within a whole grain, or be split across several, depending on the grain size used. Neither of these situations is welcome, because when the graintable is played back slowly grains are moved apart and looped. You will probably have heard the problem this causes: drums that have been slowed down by time-stretching start to sound flammy. The same goes for vocals, with the hard consonants st-t-t-uutt-t-t-ering.
Systems that don't have any way of compensating for this problem have a very limited range within which a sample can be slowed down. If you load a drum loop into Intakt, you can slow it down and listen for when the problem starts causing noticeable degradation of the sound. Short, sharp sections of the waveform, such as rim-shots, present a particularly tough test, especially if the grain size is set manually without any intelligent analysis. My ears can detect a drum loop's rim-shot starting to 'break up' into two peaks at just two to three percent slower than original speed, and ordinary snare drums start to flam at about four to five percent down.
There are a number of ways in which the designers of a granular synthesis or time-stretching system can improve on this situation, two of which are present in Intakt. The first is to have the software analyse the sample and choose variable grain sizes. In other words, instead of relying on a user-defined grain size, the software tries to chop up each part of the sample in the most efficient way. The software makes distinctions between areas that change rapidly, and those that are more steady tones. This at least ensures that transients are not split across more than one grain. In Intakt this option is the default, with the user being able to change to a fixed Grain Size if desired.
The second way that transient handling can be improved is to use a transient detection system to ensure that the transients are preserved in their original state, at whatever playback speed (as they would be in real-life drumming or vocal performances). This means that not only must they be contained within one chunk (one grain), but that they should only be played back once instead of being looped at slower speeds. Intakt and Kontakt do something like this when you engage the TRC (Transient Copy) button. The software detects peaks and sudden changes, and interprets these as transients. A second control, TRS (Transient Size), is set manually and determines a length for these sections. During playback, the original transient sections are overlaid on the loop, with their position staying correct relative to the rest of the sample.
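Detecting 'peaks and sudden changes' can be as crude as watching for a jump in loudness from one short block of audio to the next. The sketch below does exactly that. It is a simple energy-jump detector of our own, with made-up `frame` and `threshold` settings, and should not be read as the method Intakt or Kontakt actually use.

```python
def detect_transients(samples: list[float], frame: int = 64,
                      threshold: float = 4.0) -> list[int]:
    """Return start positions of frames whose average energy is more than
    `threshold` times that of the previous frame."""
    onsets: list[int] = []
    previous_energy = None
    for start in range(0, len(samples) - frame + 1, frame):
        block = samples[start:start + frame]
        energy = sum(s * s for s in block) / frame
        # The tiny constant stops near-silence from triggering on rounding noise.
        if previous_energy is not None and energy > threshold * (previous_energy + 1e-9):
            onsets.append(start)
        previous_energy = energy
    return onsets


# A quiet bed with one loud 64-sample 'hit' starting at sample 512.
loop = [0.001] * 512 + [1.0] * 64 + [0.001] * 448
hits = detect_transients(loop)
```

Once the hit positions are known, a stretcher can copy a fixed-length chunk from each one (the job of a Transient Size setting) and play it once at the right moment, rather than looping it.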
Ableton Live has similar functionality, although it doesn't use transient detection. Time-stretching and granular settings are chosen from the sample editor window's Warp pane, and setting the audio type here to Beats tells the software to try to preserve transients. Instead of detecting these, Live uses time divisions set by the user in the Transients field, and has to assume that the drum hits land close to these. Anyone familiar with beat-slicing software, such as Propellerhead Recycle, may have spotted that this system is a best-of-both-worlds mix of techniques, preserving the original hits (as with beat slicing) but filling the gaps with time-stretched material. One problem it shares with beat slicing is that decays and reverb tails are difficult to keep sounding natural. Where available, a mixture of small grain size and large transient 'windows' often works best with drums. Without transient compensation, larger grain sizes will probably be better.
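The grid-based alternative is easy to picture in code. This sketch simply works out where the user-chosen time divisions fall in samples for a loop of known tempo; the function and its parameters are hypothetical and are not taken from Live.

```python
def beat_grid(tempo_bpm: float, divisions_per_beat: int, n_beats: int,
              sample_rate: int = 44100) -> list[int]:
    """Sample positions where hits are assumed to land.
    divisions_per_beat=4 means a sixteenth-note grid in 4/4."""
    samples_per_beat = sample_rate * 60.0 / tempo_bpm
    step = samples_per_beat / divisions_per_beat
    return [round(k * step) for k in range(n_beats * divisions_per_beat)]


# One beat of sixteenth notes at 120 bpm, with audio sampled at 48 kHz.
grid = beat_grid(120, divisions_per_beat=4, n_beats=1, sample_rate=48000)
```

Anything that falls between the grid lines, such as a swung hi-hat, will not be treated as a hit, which is the price of assuming rather than detecting.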
Sound-Design Tools
Despite all the sampling and synthesis flexibility afforded by the applications we've looked at so far, when most people think about granular synthesis they probably think of the rich shifting soundscapes generated by certain Reaktor patches. It's perfectly possible to build synths in Reaktor similar to those we've already looked at. For example, Triptonizer is not a million miles away from Malström, except that it uses envelopes to control the movement of sample position, formant, and so forth. However, as with the synths we've covered, this kind of instrument generally sweeps fairly uniformly through a graintable. For the more weird and wonderful sounds, we want to be layering up clusters of grains, introducing randomness, and getting away from thinking about the samples as a whole. The result is a composite sound known as a 'graincloud'. Reaktor has a straightforward sample synth module, and a Pitch Former (which is similar but moulds the results into a definite pitched sound), but it also has a module called Grain Cloud. If you don't have Reaktor, you can download the demo version and check out the factory instruments Grainstates and Travelizer to get an instant idea of what this module can do.
Most of Travelizer's front-panel options should now make sense. The large X-Y controller sets the sample position and the grain size. The waveform display has two vertical lines that indicate the current playback position and the grain size (Length). The panels to the left allow modulation of the pitch and graintable position, and there's a familiar Smoothing control. So what sets this apart from, say, Malström or Kontakt? Firstly, the Grain Cloud module at the heart of this instrument has a parameter called Distance which sets the rate at which grains are triggered. This means that, as the current playback position moves around the graintable, you can fire off as many or as few grains as you want. The Grain Cloud module can overlap up to 1000 grains at once, so the output signal is the composite of many tiny portions of the sampled waveform.
The final ingredient is the inclusion of Jitter inputs on Grain Cloud. These allow you to add varying degrees of random 'jumpiness' to several of the main parameters, namely Pitch, Position, Length, Distance, and Pan. Now, begin to imagine how things come to life when combining all these things: grains of sound are fired off from across the original sample, some are clustered in small recognisable sequences, while others are thrown in at random. The length of the grains and rate at which they appear and disappear is chaotic, and they smear out across the stereo field, overlap, and become a boiling swarm. A soundscape builds up that's like nothing you've heard before, yet the chaos and movement tricks your brain into thinking it might somehow be natural and not a synth. From this point you can mould and constrain the sound with all the familiar tools — filters, envelopes, and effects — to create a playable musical instrument, or just enjoy it for what it is.
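To give a feel for how a cloud is scheduled, here is a sketch that generates a list of grain 'events' from a position, a length, and a distance between grains, with a single jitter amount scattering each of them. The parameter names echo the ones in the text, but the units (milliseconds), the way jitter is scaled, and everything else about it are our own assumptions rather than Reaktor's implementation. Pitch jitter is left out to keep it short.

```python
import random


def grain_cloud_events(duration_ms: float, position_ms: float, length_ms: float,
                       distance_ms: float, jitter: float = 0.0, seed: int = 0) -> list[dict]:
    """List the grains to fire. `jitter` (0..1) randomly scales position,
    length and distance by up to that proportion, and spreads the pan."""
    rng = random.Random(seed)                  # seeded so results are repeatable

    def wobble(value: float) -> float:
        return value * (1.0 + jitter * rng.uniform(-1.0, 1.0))

    events = []
    onset = 0.0
    while onset < duration_ms:
        events.append({
            "onset_ms": onset,
            "position_ms": wobble(position_ms),
            "length_ms": wobble(length_ms),
            "pan": jitter * rng.uniform(-1.0, 1.0),   # -1 is left, +1 is right
        })
        # Never step by less than a microsecond, so the loop always finishes.
        onset += max(1e-3, wobble(distance_ms))
    return events


orderly = grain_cloud_events(1000, position_ms=500, length_ms=50, distance_ms=10)
swarm = grain_cloud_events(1000, position_ms=500, length_ms=50, distance_ms=10,
                           jitter=0.5, seed=1)
```

With a length of 50ms and a distance of 10ms, about five grains are sounding at any moment; shorten the distance and the overlap climbs quickly. Feeding each event to a grain player then gives the composite sound.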
Advanced Possibilities
Most of what we've looked at is the brand of granular synthesis that uses a chopped-up audio sample as the source of sound grains. This is because the large majority of music products available that employ granular synthesis work this way. However, this is only a partial view of what can be done. For a start, it's perfectly possible for software to use a live audio input instead of an audio file. Computers are fast enough to chop a signal into grains on the fly, then synthesize and mess with them, all in real time. This is how granular synthesis-based effects, such as Spektral Delay, KTGranulator, and many Pluggo plug-ins, work. Most real-time pitch-shifters and vocal processors are likely also to be taking a granular approach.
Mentioning Spektral Delay raises the topic of other methods of granular synthesis that have rarely seen the light of day. Everything we've looked at so far uses grains based in the time domain, but it's also possible to split up sounds by frequency and then resynthesize them, as Spektral Delay does. The next logical step will be for synths to do away with sampled or digitised audio sources altogether, and synthesize their own grains from scratch. This would be like a two-stage synthesis process, with the first stage generating an array of grains and envelopes, each probably one cycle in length (and known as a 'wavelet'), which would then be synthesized by the second stage. Something close to this could probably be built in Reaktor, using the Grain Delay module, so if you get a few months off, there's a challenge!
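As a very rough idea of what such a two-stage process might look like, the sketch below first generates one-cycle grains from scratch (a sine cycle under a smooth envelope) and then scatters them into an output buffer. It is speculative by nature: the function names, the choice of a sine, and the envelope shape are all our own assumptions.

```python
import math


def one_cycle_grain(freq_hz: float, sample_rate: int = 44100) -> list[float]:
    """Stage one: a single sine cycle at `freq_hz`, faded in and out."""
    n = max(2, round(sample_rate / freq_hz))
    return [math.sin(2 * math.pi * i / n) * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
            for i in range(n)]


def place_grains(grains: list[list[float]], onsets: list[int],
                 total_length: int) -> list[float]:
    """Stage two: mix each grain into the output at its onset position."""
    out = [0.0] * total_length
    for grain, onset in zip(grains, onsets):
        for i, sample in enumerate(grain):
            if onset + i < total_length:
                out[onset + i] += sample
    return out


grains = [one_cycle_grain(f) for f in (220.0, 330.0, 441.0)]
texture = place_grains(grains, onsets=[0, 150, 300], total_length=1000)
```

Fire thousands of these with varying frequencies and onsets, and you are building a sound with no sample anywhere in sight.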
Granular synthesis is likely to find its way into many more instruments in the future, and is perfect for those days when you're bored of the same old array of re-created analogue sounds. Not only do granular synths create dynamic, organic sounds, they have an untamed quality and often produce unexpected treats that turn into song ideas. In fact, if you produce ambient or film music, a decent granular synth can do half your job for you!