Will physical modelling continue to be at the leading edge of synthesis, or are there other methods moving up on the inside track? Paul Wiffen winds up the Synth School series with a little crystal ball‑gazing. This is the last article in a 12‑part series.
There are many lessons to be learnt from the various technologies we have examined in Synth School over the last year or so. The history of FM teaches us that a method of synthesis can go from being the be‑all and end‑all of the professional synth market to the lowest common denominator of computer video games in a relatively short time (and that despite this, Yamaha are probably making more money out of FM today than they ever did in the heyday of the DX7). The elevation of the fat analogue sound to the modern Holy Grail, when 10 years ago you couldn't give analogue‑sounding machines away, warns of the dangers of selling off old gear in pursuit of the latest sonic fashion. But perhaps the most important lesson is a general one on how the relentless development of VLSI technology driven by the computer industry (to which we are but a very small sideshow) turns today's impossibility or very expensive luxury into tomorrow's staple product (which doesn't really get anyone excited anymore).
Take additive synthesis as a classic example; it is a much more powerful technology in its only current production incarnation, the Kawai K5000, than in the infinitely more restricted non‑real‑time implementations which the Fairlights and Waveterms offered 10‑15 years ago. Even the early real‑time implementations like the K5 and the never‑released Technos Acxel caused more of a stir than something wonderful which you can now buy for around £1000. Sampling is another classic example: the early Fairlight, which turned the whole industry on its head, had lower sample quality than the most despised Soundblaster‑compatible PC soundcard. The former would have cost you £25,000+; the latter you can pick up for under a ton.
As far as physical modelling is concerned, I feel we are midway between these two extremes. Yamaha, who released the first commercially available physical modelling synth, the VL1, have now adapted that same technology to a £500 module or an even cheaper plug‑in card for their computer‑based system. Korg's OASYS, perhaps the most powerful modelling synth exhibited to date, has never been released, because the days of even megastars shelling out thousands and thousands of pounds for the first implementation of a new technology are over. This hasn't prevented the technology it contained from being extremely successful (in this country at least) in the Prophecy. Korg's current Z1 covers more territory than any other physical modelling synth, from analogue and FM‑type synthesis through to a host of string and wind instruments, but I often hear people complaining about it because it can only achieve 18‑note polyphony and 6‑part multitimbrality (PCM‑based synthesis has made people blasé about amounts of polyphony, sample memory and multitimbrality which would have seemed like science fiction five to 10 years ago).
The current state of DSP technology means that certain areas of imitative synthesis are still no‑go zones simply because of the sheer amount of DSP power required. But DSP technology is now progressing so fast that I suspect it won't be that long before all the sympathetic harmonic interactions between strings on that most complex of instruments, the grand piano, will succumb to the computational power of the microchip.
The real challenge these days for physical modelling is not the perfect recreation of acoustic instruments, or even the biggest‑sounding, most powerful analogue‑style synth ever, but making the technology easy to operate for people who have never even learnt the basics of analogue synthesis (none of whom are amongst SOS readers, I am sure). The various solutions to this, from the increasing use of dedicated front‑panel knobs, X‑Y pads and ribbon controllers, through to SysEx control by computer programs, have helped expand the market for physical modelling, but I still feel that this is just another example of 'dumbing down' technology so it can be sold. For the time being at least, the development of physical modelling seems to lie in its consolidation into more marketable versions of the technology, and its integration into workstations (see last month's sidebar on "Combining Physical Modelling with PCM"). So what other contenders are there for the Future of Synthesis?
An old chestnut which periodically turns up is the concept of resynthesis. This is the name given to a generic process whereby an analysis of the sound (usually sampled) is made in an attempt to break it down into its constituent parts, which can then be recreated piecemeal from basic building blocks. These building blocks are usually hundreds of sine waves which are used to build up the harmonic content of the sound, the sound having been analysed in the first place via a Fast Fourier Transform. Those of you who saw Duran Duran's 'The Reflex' video will have seen Fairlight displays of FFTs on its samples, usually compared to a plot of a mountain range or the seabed. The Fairlight was not the only system which could produce pretty FFT displays; they were even possible on the Lynex, the cult UK sampler of the late '80s which ran on the Atari ST. However, all these systems had one thing in common: they could produce a lovely picture from a sample, but they wouldn't let you change the harmonic content, because they couldn't actually turn the sound into its constituent harmonics, let alone convert it back to a sample.
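For modern readers with a computer to hand, the analysis stage described above is easy to sketch. What follows is a deliberately naive discrete Fourier transform in Python, a slow toy stand‑in for the much faster FFT these systems actually used, analysing an assumed 440Hz test tone:

```python
import math

def harmonic_spectrum(samples, sample_rate):
    """Measure the strength of each frequency bin in a slice of sound.
    A slow, naive DFT for illustration only; real systems use the FFT,
    which computes the same result far more quickly."""
    n = len(samples)
    bins = []
    for k in range(n // 2):
        # Correlate the sound against a cosine and sine at each bin frequency.
        re = sum(samples[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(-samples[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        bins.append((k * sample_rate / n, math.hypot(re, im) / n))
    return bins

# Analyse 512 samples of a 440Hz sine at 44.1kHz: the strongest bin
# lands on the nearest available frequency to 440Hz.
rate, n = 44100, 512
tone = [math.sin(2 * math.pi * 440 * i / rate) for i in range(n)]
peak_freq = max(harmonic_spectrum(tone, rate), key=lambda b: b[1])[0]
```

Plot the magnitudes of those bins against frequency and you get exactly the 'mountain range' displays described above.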
Because FFT analysis breaks the sound down into harmonic content, it made sense that the first systems which could attempt a reconstruction would be additive synthesisers. In fact, one of the earliest commercially available systems was a Dr T's program for the K5 that ran on the Atari. Although there were not really enough harmonics and envelopes available on the K5 to cope with really complex sounds, it would produce recognisable versions of simple sounds, which made good starting points for sound design compared to setting all the harmonics manually from scratch (in fact, if anyone out there still has a copy of this software, perhaps they would contact me via SOS, as I would love to get my hands on it once again). Of course, if someone were to do something similar for the current Kawai, the K5000, which has a much more flexible implementation of additive synthesis, this would probably get a lot closer to a useable resynthesis system.
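The reconstruction half of the process, summing sine‑wave building blocks at the levels an analysis reports, is even simpler to sketch. The harmonic levels below are hypothetical, chosen for a vaguely clarinet‑like odd‑harmonic bias rather than taken from any real analysis:

```python
import math

def additive_resynthesis(harmonic_levels, fundamental, sample_rate, n_samples):
    """Rebuild a tone from sine-wave building blocks, one per harmonic."""
    out = []
    for i in range(n_samples):
        t = i / sample_rate
        # Each harmonic is a sine at a whole-number multiple of the fundamental.
        out.append(sum(level * math.sin(2 * math.pi * fundamental * (h + 1) * t)
                       for h, level in enumerate(harmonic_levels)))
    return out

# Hypothetical spectrum: strong odd harmonics, weak even ones.
levels = [1.0, 0.05, 0.6, 0.04, 0.3]
tone = additive_resynthesis(levels, 220.0, 44100, 1024)
```

A real system like the K5 would also need an envelope per harmonic to make those levels move over time; static levels like these produce an organ‑like steady state.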
Perhaps the best resynthesis I ever heard was on the Technos Acxel, a system which originally came from Canadian academia, but which went through the initial phases of commercial marketing. It had a flexible additive structure which could assign more or fewer harmonics to each voice as required (although this meant more complex sounds had less polyphony), and at the Paris show in about 1989 they had got the resynthesis analysis working. I heard a very respectable resynthesis of a flute sound, complete with the more demanding breath component (a flute on its own wouldn't have been that impressive, as the pitched component is a fairly simple harmonic series). However, I never got sufficient hands‑on time to evaluate the potential of the system on really demanding sounds. I believe Jean‑Michel Jarre bought that unit, but the company went into liquidation shortly afterwards and very few units were actually shipped.
Over three years ago our venerable editor wrote a piece about Oberheim Electronics (now owned by Gibson) having developed a similar system in conjunction with Berkeley, Stanford, MIT and IRCAM (see the January '95 issue) under the unlikely name of G‑Wiz Labs, but we have no more recent information, so either the development process is taking longer than they thought or the project has been abandoned. Again, as its name implied (FAR — Fourier Analysis and Resynthesis), it seems to have used an FFT analysis of the source sample to set up harmonic components. One potential problem with resynthesis, the recreation of unpitched noise components, was dealt with rather elegantly by comparing the result with the original and then creating shaped noise to fill out the differences. At $10,000 plus a Macintosh, it was not cheap, but Paul's report mentioned a recognisable line from Suzanne Vega's 'Tom's Diner' being replayed at different pitches and tempos without any of the normal drawbacks of sampling. Certainly, resynthesis is one of the few systems which seems to have the potential for synthesising vocal performances.
The appeal of resynthesis is that it would have all the advantages of sampling, in that any sound which can be played into the system could be reproduced, but without the disadvantage of samples playing back at different lengths when repitched. When a resynthesis is triggered at different pitches on something like the Oberheim FAR system, the replay time would be constant and noise elements in the sound would not be repitched at all. Looping would also no longer be a problem; you would merely extend the duration of the harmonic series in the sustain phase of the sound. Of course, the repitching would not necessarily remove all the problems associated with sampling. Sounds which have been shaped by some sort of resonant chamber (human voice, bowed strings, guitars, etc) would have the harmonic boosts/dampening repitched, which introduces the Pinky and Perky/Carlsberg effect that often forces multi‑sampling. This is where physical modelling triumphs, as it splits the sound into the driver (which is usually repitched) and the modifier or resonator (which usually doesn't change).
Perhaps the ideal resynthesis system would be one which does not simply reduce each slice of sound to its constituent harmonics, but would instead look for the effect of a constant resonator in a longer sample of an instrument playing across its range, and would then recreate the harmonic spectrum of the driver separately from the resonant amplifier of the modifier. It might be referred to as 'remodelling'.
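The driver/resonator split at the heart of this 'remodelling' idea can be illustrated with a crude source‑filter sketch: a repitched sawtooth stands in for the driver, and a two‑pole resonance whose centre frequency never moves stands in for the modifier. The 800Hz formant is an arbitrary illustrative choice:

```python
import math

def sawtooth(freq, sample_rate, n):
    """The 'driver': a bright source wave, repitched per note."""
    return [2.0 * ((i * freq / sample_rate) % 1.0) - 1.0 for i in range(n)]

def fixed_resonator(signal, centre_hz, sample_rate, r=0.99):
    """The 'modifier': a two-pole resonance whose centre frequency stays
    put whatever note is played, like a guitar body or vocal tract."""
    w = 2 * math.pi * centre_hz / sample_rate
    a1, a2 = -2 * r * math.cos(w), r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x - a1 * y1 - a2 * y2  # recursive filter: output feeds back
        out.append(y)
        y1, y2 = y, y1
    return out

# Two notes an octave apart through the same resonator: the pitch moves,
# but the 800Hz formant stays exactly where it is.
rate = 44100
low = fixed_resonator(sawtooth(110.0, rate, 2048), 800.0, rate)
high = fixed_resonator(sawtooth(220.0, rate, 2048), 800.0, rate)
```

Repitching a sample, by contrast, drags the formant up and down with the note, which is exactly the Pinky and Perky effect.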
One drawback with resynthesis or 'remodelling' is that it would leave nothing for the programmer to do. Just play the sound in, let the computer do its number‑crunching and hey presto — your sound can be played back from the keyboard. Of course, if the sound has been broken down into constituent harmonics, then the levels of these could be edited or adjusted in real time for creating new sounds or adding expression, but it still reeks of the increasing dominance of factory presets and lack of user editing and personalisation of the sounds. 'Remodelling' would be better, as you could adjust the parameters of the model to make new sounds. But still I find I miss the challenge of 'pure' synthesis, where you have to be the brains and do the analysis of the sound yourself and then recreate it with the parameters available (or even make up a completely new sound).
Where Do You Want To Go Tomorrow?
So if you are interested in synthesis and sound design for its own sake, rather than having specific timbres to recreate or gigs to do with the minimum number of synths, then where are the new frontiers? Where can you rediscover the thrill of finding a new way of doing things, or even a technology to misuse or trick into doing something unique? The answer to this question, as with so many these days, seems to concern computers and the Internet. In fact, most new types of synthesis since the '80s have been developed at their theoretical and experimental stages through computers. Generally speaking, a designer/engineer had an idea, or came across a phenomenon while doing something else, which he thought had potential. The cheapest way to investigate further was to set up some computations on a generic system, ie. a computer, which could be programmed to simulate (often not in real time) the effect which would be produced when certain novel configurations and/or processes were tried. He then took this to an electronic music company and tried to persuade them to take it a stage further. This sometimes took the form of developing specific hardware fast enough to do things in real time (like Yamaha's development of John Chowning's FM) or, alternatively, adding it to an existing generic product like a sampler. A good example of the latter is Emu's addition of Transform Multiplication as part of the SE software upgrade to their Emax samplers (see 'Transforming Samples' box opposite).
So back then to our own computers and their umbilical link to the repository of human knowledge that is the Internet. Modern personal computers' CPUs are now so fast that they rival the computational power of systems that only major manufacturers or universities could afford 10 or 20 years ago. You also now have a direct link to the people in educational establishments who are trying to push back the boundaries. Lacking any other public forum in which to publish their ideas, many academics now post their ideas and sonic experiments on the Internet, just for the satisfaction of airing their concepts to a wider audience who can try their techniques out (indeed, it is difficult to see how some of these methods could be implemented in a traditional commercial synthesizer). As a result, you can get into more or less esoteric forms of sound generation at the leading edge of academia via that PC or Mac sat in the corner of your living room. One that has come to SOS's attention over the last few months is Granular synthesis, explained elsewhere in this article.
The main lesson however, is that it has never been easier to get into weird and wonderful forms of synthesis yourself. With a computer and an Internet connection, you can do your own research, download examples and descriptions and then with a sampler or generic synth you can recreate some of the things described and try them for yourself. New types of synthesis without expensive new keyboards — sounds great to me. So Synth School is not exactly coming to a close but transferring to the Internet (a sort of Open University for the new millennium). Get your search engines in gear and you can try three impossible methods of synthesis before breakfast.
And so we reach the end of the final instalment of Synth School. I have thoroughly enjoyed writing this series and I am particularly grateful to all those of you who have cornered me at trade shows or product launches and been kind enough to say how useful you have found it. Perhaps the most important message I have tried to put across is this: refuse to use factory presets and make up your own sounds using whatever tools come to hand — your music will be the better, or at least the more individual, for it. If you have been led by any of these articles to try out new ways of creating sounds (or even return to some old ones you thought you had left behind), then these articles have done their job.
Sprinkle On The Granules
There are numerous references on the Web to Granular synthesis, a method which builds timbres out of very small snippets of sound stuck together to create completely new timbres. Having been informed by various authorities (including Leon Zadorin from some Antipodean seat of learning or other: www.academy.qut.edu.au/music/new...) that the content of the granules is less important than their size and shape (or, as he put it, "human perception of frequency, duration and amplitude tends to reside within a practical minimum"), I decided to dig out my sampler and have a bash myself. As long as your sampler does not restrict the smallest loop length you can have (as some of the early Akais and Rolands did), pretty much any sampler will do. The length of these 'sonic grains' (as each small snippet is known) should apparently be less than 100 milliseconds, because anything larger than that starts to reveal the source sound.
I started by cutting and pasting a small snippet of sound (less than 1/10th of a second) to itself until I realised that way was going to take for ever. Then I realised that I could use the loop length to replay the small snippet over and over. As long as you keep the loop length very short, the granulated sound bears absolutely no apparent relationship to the source sample. To begin with I used the auto zero‑crossing feature on the Prophet 2000 to make loops with a smooth cycle crossing in them, which tended to produce very pure sounds without too many harmonics present, but then I realised that was spoiling the fun. So then I turned to the Roland S760, which doesn't automatically find zero crossings, and things got really interesting. By setting the loop points almost randomly, you get some fantastically twisted, angular timbres. I then found a way to move the fixed loop length around quickly within the sample, which made a very quick way of changing the timbre radically.
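For readers without a suitable sampler, the same experiment can be roughed out in code. This is only a sketch of what the loop trick does: random grain start points stand in for moving the loop around by hand, and the fade envelope on each grain is my own addition, playing the role the Prophet's zero‑crossing search played in avoiding clicks:

```python
import math
import random

def granulate(source, grain_len, n_grains, seed=0):
    """Pull very short 'grains' from arbitrary points in a source sample
    and butt them together to make a new, unrelated timbre."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_grains):
        start = rng.randrange(len(source) - grain_len)
        grain = source[start:start + grain_len]
        for i, s in enumerate(grain):
            # Half-sine fade in/out on each grain to avoid clicks at the joins.
            out.append(s * math.sin(math.pi * i / (grain_len - 1)))
    return out

# Source: one second of a 330Hz tone; grains of about 5ms, comfortably
# under the 100ms threshold above which the source starts to show through.
rate = 44100
source = [math.sin(2 * math.pi * 330 * i / rate) for i in range(rate)]
texture = granulate(source, grain_len=220, n_grains=100)
```

Write `texture` out as audio and the result bears the same non‑relationship to the 330Hz source as the sampler experiments described above.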
Reading further from my faceless Australian mentor, I discovered that another factor is the 'density' of the grains (ie. how much silence there is between them). So then I started to cut and paste some silence in at the end of the loop, and found that this tended to make the timbres slightly more acceptable to those of a nervous disposition. Basically, adding silence between the grains seemed to act like adding water to Scotch, making the sound more palatable to the sensitive soul. Mind you, I never got to any sounds I could have played to my mother, but then isn't that what rock & roll is all about? In these days of techno and other industrial types of dance music, this technique seems to have a lot going for it. I strongly recommend experimenting with it, if you have a sampler and a couple of hours to kill.
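The density experiment, cutting silence in after each grain, amounts to a few lines of code. The numbers below are toy values purely for illustration:

```python
def with_density(grain_stream, grain_len, gap_len):
    """Lower the grain 'density' by padding silence after each grain."""
    out = []
    for start in range(0, len(grain_stream), grain_len):
        out.extend(grain_stream[start:start + grain_len])
        out.extend([0.0] * gap_len)  # the water in the Scotch
    return out

# Four 3-sample 'grains' packed end to end, then spaced out with
# 2-sample gaps of silence between them.
packed = [0.5, -0.5, 0.25] * 4
sparse = with_density(packed, grain_len=3, gap_len=2)
```

The longer the gaps relative to the grains, the sparser and gentler the resulting texture.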
Transform Multiplication was a form of synthesis unique to Emu's Emax range of samplers, which used some heavy computational algorithms to combine two samples in a unique but time‑consuming way. The process came up with some weird and wonderful sounds ideal for futuristic timbres and sound effects, but it suffered from the same problem as many non‑real time implementations of synthesis: the process of tweaking a promising first try into a satisfying sound could take days. When a typical computation duration exceeds thirty minutes, the problem is not so much that creating a completely new sound from a set of parameters entered takes a long time (although this will deter the superficial user), but that each minute adjustment of those parameters, or to use the technical term, 'tweak', takes exactly the same time. So to refine a promising sound can be soul‑destroying, especially if you are at the experimental stage where you do not know exactly what each of the parameters will do. Changing a parameter in the 'wrong' direction or altering the 'wrong' parameter altogether means that you have sentenced yourself to another long wait just to get back in the right direction. Indeed, to become as familiar with Transform Multiplication as I am sure many of you are with the other forms of synthesis we have looked at might well take a lifetime, unless someone comes up with a real‑time implementation. Gerry Basserman, who did the demos for Emu for years, might well have reached the stage where he was confident of the effect that individual parameter changes to Transform Multiplication would have, but I suspect that there are precious few others. My experimentation with this technique often produced some fascinating results, but I never really felt like I was doing anything more than randomly combining samples, which sometimes had serendipitous results. I certainly never felt completely on top of the method.
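Emu never published the algorithm, but Transform Multiplication is often described as behaving like a convolution of the two source samples. Assuming that interpretation (the sketch below is a toy stand‑in, not Emu's actual process), a circular convolution shows why the combination is so computationally heavy: the work grows with the square of the sample length.

```python
def transform_multiply(a, b):
    """Combine two equal-length sample buffers by circular convolution:
    every sample of one is multiplied against every sample of the other,
    which is why a non-real-time implementation takes so long."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

# Convolving with a single-sample impulse returns the other sound
# unchanged; anything richer smears the two timbres into each other.
impulse = [1.0, 0.0, 0.0, 0.0]
other = [0.5, -0.25, 0.75, 0.1]
combined = transform_multiply(impulse, other)
```

At the Emax's sample lengths this naive approach means millions upon millions of multiplications per tweak, which tallies with the half‑hour computation times described above.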
However, if Emu or anyone else were to come out with a real‑time implementation of this style of synthesis, you can bet I'd be first in the queue to master the technique. Sadly, the cynic in me suspects that the market for synthesis styles which create new sounds rather than attempt to duplicate old ones is not large enough to prompt Emu or anyone else to produce the expensive hardware this would need (probably leaving physical modelling far behind in terms of the raw horsepower required). In the meantime, if you can get your hands on an Emax SE, Transform Multiplication will certainly satisfy an appetite for new weird and wonderful sounds.