You are here

Creative Convolution

Synthesizing Impulse Responses Using MathWorks MATLAB By Emmanuel Deruty
Published September 2010

In September's printed magazine, we introduced some innovative methods of recording and processing impulse responses. But if you know how to use a program like MATLAB, you can even generate them from scratch.

In the September 2010 issue's printed article on creative convolution (/sos/sep10/articles/convolution.htm), we've been recording and modifying impulse responses, but we have not discussed the idea of synthesizing them from scratch. In this on-line resource we'll be looking at doing just that. There are many programs around designed for this very purpose, such as IR designers like Impulse Modeler. Let's make it clear that our goal is not to emulate these programs. We don't want to spend a lot of time making IRs of a kind that already exists, only of a lesser quality. Instead, we want to find innovative methods for producing IRs that we wouldn't easily get otherwise.

Creative Convolution: MATLAB | September 2010 issue by Sound On Sound

Download all Hi-Res WAVs | 49 MB

What remains true in any case is that impulse responses are audio files. As such, they can be digitally synthesized by means of calculations, just as 'normal' audio files would be. What we need is a software environment that makes versatile audio synthesis possible, and in this article I'll be using MATLAB. Developed by the Mathworks company, MATLAB is a numerical computing environment and programming language with built-in audio capabilities. If you're just a bit into programming, the MATLAB language is not very difficult to use, and opens a whole range of possibilities.

Really, messing with audio files in MATLAB is easy. For instance, reading a mono audio file, normalising it, and saving it back to disk uses the following syntax:

(1) [audio_data, sampling_frequency] = wavread(name_of_file);

(2) audio_data = audio_data ./ max(audio_data);

(3) wavwrite(audio_data,sampling_frequency,name_of_file);

For a programming language with audio capabilities, it doesn't get any simpler. Learning a bit of MATLAB is a good time investment, and makes authoring original impulse responses quick and simple.

FFT And iFFT

For our first steps in IR synthesis, we're going to play with FFTs and iFFTs, with the purpose of creating ambience IRs. FFT stands for Fast Fourier Transform, an algorithm that makes it possible to calculate any audio file's frequency spectrum. FFT calculations in MATLAB are extremely easy, as you will realise when reading the actual code that was used during this experiment, which is available at the end of this article.

Let's consider preset 299 from the TC Electronics M3000 unit, 'Basement 1'. As can be seen on the screen capture below, early reflections are easily distinguished from diffuse field.

Using any audio editor, we're going to remove the diffuse field and keep only the early reflections. This makes a new, shorter impulse response, which sounds like this:

[IR: IR_299_Basement_1_Firstref.wav]

Let's import this IR into MATLAB and calculate its spectrum. Now, we're going to listen to the spectrum as if it were an audio file. In MATLAB, we take the variable that contains the spectrum, and export it as an audio file. This is what we get:

[Audio sample: 299_Basement_1_Firstref_FFT.wav].

It's a little short and high-pitched, so let's transpose it for convenience reasons.

[Audio sample: 299_Basement_1_Firstref_FFT_Pitched.wav].

This may seem unexpected, but it's clear that in the case of the TC Electronics preset 299, the spectrum of the early reflections is that of a harmonic oscillator. Now, this is a fact that has interesting consequences. It means that can follow the reverse path: we can start with an oscillator, calculate its inverse FFT, and that will lead us to a set of early reflections. Let's do that with three audio samples straight from virtual instruments in Pro Tools: Digidesign's Structure, Vacuum and Xpand.

[Audio sample: Sample_Structure_29_Orig.wav]

[Audio sample: Sample_Vacuum_Lead_54_2T_Orig.wav]

[Audio sample: Sample_Xpand1_Orig.wav]

To make them sound like the audio file we got from preset 299's spectrum in terms of pitch and duration, we're going to modify their pitch in Pro Tools using the standard AudioSuite Pitch Shift plug-in, without time correction.

[Audio sample: Sample_Structure_29_PiSh.wav]

[Audio sample: Sample_Vacuum_Lead_54_2T_PiSh.wav]

[Audio sample: Sample_Xpand1_PiSh.wav]

Now, we import these files into MATLAB, and calculate their inverse FFT (which is, in the context of the use we are making of it, the exact same thing as an FFT). This generates the following set of impulse responses (the 'Xpand1' IR has been EQed, in order to remove overwhelming bass frequencies).

[Audio example: Structure_29_PiSh.wav]

[Corresponding IR: IR_Structure_29_PiSh.wav]

[Audio example: Vacuum_Lead_54_2T_PiSh.wav]

[Corresponding IR: IR_Vacuum_Lead_54_2T_PiSh.wav]

[Audio example: Xpand1_PiSh_EQ.wav]

[Corresponding IR: IR_Xpand1_PiSh_EQ.wav]

These impulse responses are not perfect, but for such a mellow effort based on such simple principles, this method gives remarkably interesting results. This is the MATLAB code that needs to be written to get such results, in the case of mono samples and mono-to-mono IRs. Comments are written after the '%' sign.

(1) [audio_data, sampling_frequency] = wavread(audio_sample); % this imports the audio file

(2) spectrum = fft(audio_data); % calculates the spectrum (FFT algorithm)

(3) power_spectrum = real(spectrum).^2 + imag(spectrum).^2; % calculates the power spectrum

(4) power_spectrum = power_spectrum(1:floor(length(power_spectrum)/2)); % takes the first half of it

(5) power_spectrum = power_spectrum ./ max(power_spectrum); % normalizes it

(6) wavwrite(power_spectrum,sampling_frequency,IR_file_name); % exports the IR

In this example, line (2) is the central one: it uses the Fast Fourier Transform (FFT) algorithm to calculate the spectrum. Line (3) is needed so that the spectrum, originally a two-dimensional object, is converted into the power spectrum, a one-dimension object. If we were interested in producing a two-channel IR, this line would not be necessary. Line (4) is necessary in any case, since a raw spectrum is always a palindrome, like the sentence "Rats live on no evil star”. We're only interested in the first part of the palindrome, because it's got a nice decreasing volume envelope that recalls a real, acoustic IR.

In the case of stereo files, the syntax gets a bit longer. It goes as follows, and is easily understandable when compared to the code that's suited to mono files.

(1) [audio_data, sampling_frequency] = wavread(audio_sample);

(2) left_channel = audio_data(:,1); % extracts left channel

(3) left_spectrum = fft(left_channel);

(4) left_power_spectrum = real(left_spectrum).^2 + imag(left_spectrum).^2;

(5) left_power_spectrum = left_power_spectrum(1:floor(length(left_power_spectrum)/2));

(6) right_channel = audio_data(:,2); % extracts right channel

(7) right_spectrum = fft(right_channel);

(8) right_power_spectrum = real(right_spectrum).^2 + imag(right_spectrum).^2;

(9) right_power_spectrum = right_power_spectrum(1:floor(length(right_power_spectrum)/2));

(10) power_spectrum = cat(2,left_power_spectrum, right_power_spectrum); % re-joins the two channel

(11) power_spectrum = power_spectrum ./ max(max(power_spectrum)); % normalizes the result, takes peak from both channels

(12) wavwrite(power_spectrum,sampling_frequency,instru_IR); % exports the IR

Notice how it would also be possible to make mono-to-stereo, two-channel IRs from a mono file, by assigning the real part of the spectrum to the output's left channel, and its imaginary part to the output's right channel. It means that from a stereo file, we would be able to create stereo to stereo, four-channel impulse responses.

FFT, Echoes & Diffuse Field

The previous experiment didn't give any informations about the IR's actual content: we took samples of a certain duration and a certain pitch, and turned them into ambience IRs. That being said, there are many observations to be made about the process of turning samples into IRs by means of FFT.

First, let's consider the duration parameter. Samples turn into IRs half their size. If we take a one-second sample as a starting point, we will end up with an IR half a second long. In the case of ambience generation, it's a good idea to start with samples between 0.5 and 1 second long, so as to get 0.25 to 0.5-second IRs, which may be perceptually interpreted as ambiences. What happens if we choose much shorter samples? We know that if we work with samples shorter than 0.1s, thus getting 0.05s IRs, we will end up with EQ IRs instead of reverb ones. This is the result of the ear's integration time, as explained earlier in this article. What if we start with much longer samples, such as three-second-long ones?

To give an answer to this question, we must consider the process of FFT more carefully. Let's take a sine wave and calculate its spectrum using the FFT algorithm. The resulting IR will be a Dirac impulse, which, once translated into an IR, will sound like a single delay, or echo. Let's do the same with a square wave. The spectrum of a square wave is a series of Dirac impulses with decreasing amplitude. This series of Dirac impulses, once turned into a IR, will sound like a multiple echo. More generally, any harmonic sound, once translated into IRs via FFT, will result in more or less coloured echoes. The less strictly harmonic the starting sample, the more coloured the echoes.

What happens if we use a non-harmonic or noisy sample as a start for IR generation via FFT? The spectrum of a non-harmonic sample is not a series of discrete Dirac impulses anymore. On the contrary, it's continuous. If we listen to this spectrum as an audio file, it sounds like noise, and if we listen to this spectrum as an IR, it sounds like diffuse field. Most samples being a mix of harmonic and non-harmonic content, the reverb we get from the FFT of those samples used as an IR will be a mix of coloured echoes and diffuse field, whose respective importance depends on the harmonic/noise ratio of the original sample.

Now, those experiments are not difficult to implement in MATLAB. You can try them yourself, using signal generators based on the following syntax:

(1) A = sin(1:1:44100); % creates a 44100 sample, 22500Hz sine wave

(2) A = square(1:1:44100); % creates a 44100 sample, 22500Hz square wave

(3) A = rand(44100,1); % creates a 44100 sample section of white noise

Naturally, using plain square waves and white noises won't result in really interesting IRs. We want to create waveforms and noises that are a bit more complex. That's why we want to use standard audio samples, as we did when creating ambiences. Let's take longer instrumental samples this time, import them into MATLAB, calculate their spectra, export the spectra as audio files to get IRs. The following results are based on samples of oboe and trombone.

[Audio example: Example_Instrument_1_Oboe.wav]

[Corresponding IR: IR_Instrument_1_Oboe.wav]

[Audio example: Example_Instrument_2_Trombone.wav]

[Corresponding IR: IR_Instrument_2_Trombone.wav]

Indeed, those IRs are a mix of echoes, that correspond to the instrument sound's harmonic part, and of diffuse field, that corresponds to the instrument sound's noisy part. More of such IRs, based on flute, oboe, trombone and viola, can be found at http://1-1-1-1.net/pages/impulses/index.htm#instrument.

To get into more details, there is a slight twist in the making of these oboe and trombone IRs. They're not the raw IRs we get from the FFT. They're only parts from the spectrum. Why did we do that? Any instrument sound being noisier near the high frequencies, taking the high part of the spectrum as an IR (meaning the end of the audio file out from the FFT) will result in reverbs that are richer in diffuse field. Conversely the lower part of the spectrum (that's the beginning of the file out from the FFT) will result in purer delays. This is a good way to set the respective amount of echoes and diffuse field.

This shows that, more generally, we should never hesitate to 'cheat': for instance, if you find that an IR you get using this method is too echoey, or is too rich in high frequencies, open your IR in Pro Tools or Logic, and do the corrections the traditional way, with volume automation, EQ, and so on. It's also possible to shorten the IR files using a traditional time-stretch algorithm, as shown with the following example, based on the same trombone IR we've been creating before, simply shortened in BIAS Peak:

[Audio example: Example_Instrument_2_Trombone_Shortened.wav]

Pseudo-periodic Oscillators

The previous section showed that periodic oscillators, once translated into IRs via FFT, result in echoes, or delays. Admittedly, going to the extent of learning MATLAB code in order to produce basic echoes is a bit ridiculous. On the other hand, acoustic samples processed the same way give much richer results and justify the effort made, but acoustic samples are not configurable as programmed oscillators could be. Ideally, one would like to produce IRs based on configurable acoustic samples.

A possible solution to this problem would be to use pseudo-periodic oscillators. They're way more complex than periodic ones, and unlike acoustic samples, they're entirely configurable. Practical particular cases of pseudo-periodic oscillators consist in the so-called chaotic oscillators. Those mathematical models are simple and exhibit a very complicated behaviour. They're part of the chaos theory that was so fashionable between the '70s and the '90s, and their phase portrait possesses fractal properties, such as a non-integer number of dimensions. The pattern shown on the image below exhibits a dimension slightly superior to 1. Indeed, if we really want to sound cool, we can say we're going to learn how to make fractal reverbs. But, in truth, we're going to base the reverbs on the oscillator spectra, not on the oscillator themselves, and, unfortunately there's nothing fractal about those.

A portrait of the Lorenz system.That being said, implementing IRs based on chaotic dynamical systems in MATLAB is easy enough and does give interesting results. What follows is a musical example of a piano processed with a Lorenz-based reverb. This IR included post-processing in the form of dynamic compression (indeed, you can compress IRs, it's sometimes very useful). Notice the 'liquid' aspect of the reverb.

Here is the piano, in its dry state:

[Audio sample: Sample_Piano_Dry.wav]

And now the same piano, with the Lorenz-based reverb.

[Audio sample: Sample_Piano_Lorenz.wav]

There are many chaotic dynamical systems available. Let's take the so-called Duffing model, for instance. It's a two-variable system defined by the following equations:

(1) x' = y

(2) y' = ax - bx^3 -cy + Acos(wt)

It's indeed a very simple system, and it can feature extraordinarily complicated dynamics, as shown on the space phase representation below. For comparison, be aware that the space phase of a sinusoid is a plain circle.

A portrait of the Duffing system.Following are three examples of Duffing IRs taken randomly from a large list of MATLAB-generated files. Notice that except for volume normalisation and a bit of fading out, those IRs are really what is generated by MATLAB, no EQs or post-processing.

[Audio example: Example_DuffAsymAmbs_Light_3.50.wav]

[Corresponding IR: IR_DuffAsymAmbs_Light_3.50.wav]

[Audio example: Example_Duffing_8000_X1.5.wav]

[Corresponding IR: IR_Duffing_8000_X1.5.wav]

[Audio example: Example_Duffing_32000_X2.1.wav]

[Corresponding IR: IR_Duffing_32000_X2.1.wav]

Chaotic dynamical systems can provide an infinite variety of timbre in terms of impulse responses. Following is a model of the MATLAB code required to produce mono Duffing IRs. Actual code may include tweakings that could make it longer.

(A1) function [] = create_duffing() % the main function

(A2) global gamma omega epsilon GAM OMEG % declaration of global variables

(A3) gamma=0.1; omega=1; epsilon=0.25; OMEG=2; % system paramaters

(A4) length = 16000 ; % IR length settings in samples

(A5) [t x]=ode45(@A_duffing,0:2*pi/OMEG/50:length,[0 1]); % actual system calculation

(A6) X = x(:,1); % retrieval of one of the variables

(A7) XF = fft(X); % FFT

(A8) XF = XF(1:length(XF)/2); % takes the first half of FFT

(A9) X = real(XF).^2 + imag(XF).^2; % calculates the power spectrum

(A10) wavwrite(X,44100,24,'filename.wav'); % writes the file

(A11) end % closes the function

(B1) function xdot=duffing(t,x) % the system function

(B2) global gamma omega epsilon GAM OMEG % declaration of global variables

(B3) xdot(1)=-gamma*x(1)+omega^2*x(2)-epsilon*x(2)^3+GAM*cos(OMEG*t); % system equation, line 1

(B4) xdot(2)=x(1); % system equation, line 2

(B5) xdot=xdot'; % system equation, line 3

(B6) end % closes the function

Notice that in line A6, we choose one of the two system variables. In line A9, we reduce the two dimensions of the spectrum into one, by calculating the power spectrum from imaginary and real parts. We could have kept the two variables, and then for each variable, the two dimensions of the spectrum: that would have created four-channel, true stereo-to-stereo IRs.

I hope this article has encouraged you to investigate the potential of MATLAB and similar programs as sources of impulse responses. Happy experimenting!