The goal of turning musical omelettes back into eggs gets ever closer, thanks to some unlikely cloud-based DSP wizardry!
Audio engineering goes hand in hand with the tendency to be a control freak. We chafe if we can't move the musicians around on the studio floor, or tweak the settings on their amplifiers, or substitute our own mics for their favourite stage model. We close-mic so that we can choose artificial ambience later, and DI so that we aren't wedded to a particular guitar sound. We obsess about separation in order to give ourselves the maximum freedom to manipulate individual sources.
In short, there is nothing so upsetting to the modern audio engineer as the thought that something is what it is and can't be changed. Until now, though, we've always had to accept that some recordings just are like that. Records made direct to a mono or stereo master, flawed mixes made from multitracks that have since been wiped or destroyed, live recordings captured on handheld devices: all are now what they are, fixed and immune to our obsessive need to tweak.
Or are they?
The Sorcerer's Apprentice
Over the last few years, clever people at the bleeding edge of DSP development have been beavering away to undermine the idea that any recording has to be considered a finished artifact. The resulting technology has earned the name 'source separation', and allows a single recording to be broken down into separate tracks representing individual vocal or instrumental sources. These can then be reprocessed, rebalanced and remixed until we run out of time or our desire for control has finally been satisfied.
Up to now, the market leaders in this field have been Audionamix, who offer a range of tools targeting serious professionals in the worlds of music production, audio restoration, film, TV and broadcast, as well as the more affordable and largely preset-based XTRAX Stems. However, last November's AES Show in New York introduced a rival technology. AudioSourceRE (pronounced 'audio sorcery') are a company set up to commercialise the work of Irish academic Dr Derry Fitzgerald, who has spent nearly two decades studying the problem of source separation. Fitzgerald's expertise has already been put to work on projects that include Beach Boys remixes, but this is the first time that his algorithms have been made available to the general public.
Available in a full-fat Pro and a more affordable Essentials version, DeMIX is a stand-alone program that runs on Mac OS and Windows, and is authorised to an iLok account. As with Audionamix's tools, the heavy DSP lifting for source separation is performed remotely on a cloud server, so a fast Internet connection is essential. However, DeMIX does have another string to its bow in the shape of spectral editing, which is carried out locally and works fairly conventionally (see box).
Philosophically speaking, DeMIX has another key feature in common with Audionamix's approach, namely that the basic act of separating a track into two or more source elements never undermines its overall integrity. Provided you don't alter the level or pan position of those separated elements, or process them in some way, they will perfectly recombine to reproduce the original recording.
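A toy example helps show why this recombination property holds. The sketch below is not AudioSourceRE's algorithm. It assumes a made-up per-sample mask, and the function names split_with_mask and remix are invented for illustration. All it demonstrates is that if one part is a share of the signal and the other part is whatever remains, the two sum back to the original until you change a gain.

```python
def split_with_mask(signal, mask):
    """Split a signal into two parts using a per-sample mask in [0, 1].

    Hypothetical illustration: the 'vocal' part takes mask * x and the
    'backing' part takes the remainder, so the two always sum to x.
    """
    if len(signal) != len(mask):
        raise ValueError("signal and mask must be the same length")
    vocal = [m * x for x, m in zip(signal, mask)]
    backing = [x - v for x, v in zip(signal, vocal)]
    return vocal, backing


def remix(parts, gains):
    """Sum the separated parts after applying one gain per part."""
    length = len(parts[0])
    return [sum(g * p[i] for p, g in zip(parts, gains)) for i in range(length)]


mix = [0.5, -0.25, 0.75, 0.0, -1.0]
mask = [1.0, 0.5, 0.25, 0.0, 0.75]  # invented values, not a real separation
vocal, backing = split_with_mask(mix, mask)

unity = remix([vocal, backing], [1.0, 1.0])         # faders untouched
louder_vocal = remix([vocal, backing], [2.0, 1.0])  # vocal pushed up
```

With both gains at unity the remix is identical to the input. As soon as one fader moves, the result differs, and any flaws in the separation can become audible.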
Split & Polish
Beginning a separation in DeMIX is as easy as dragging in an audio file in WAV, AIFF, MP3 or FLAC format. You can then hit one of the three buttons at the top of the window that trigger different types of source separation. The most basic of these, as far as the user is concerned, is Drums, which offers only one option, labelled 'stereo smoothing'.
More novel, and more complex, is Pan, which separates sources depending on their positions within the stereo field — one of those ideas that is easy to grasp conceptually, but rather harder to implement in DSP! You can use this to split out as many as seven separate sources, the important parameter being Equal Spacing. Engaging this means that DeMIX Pro divides up the stereo field in a purely geometrical fashion, as it were, but if you switch it off, the separation algorithm attempts to locate the bands in which individual sources are strongest and sets the boundaries dynamically. (In this mode, DeMIX will sometimes return the separated tracks in an order that doesn't correspond to their apparent position across the stereo field; this, apparently, is a surprisingly difficult mathematical problem!)
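To make the Equal Spacing idea concrete, here is a minimal sketch of what dividing the stereo field geometrically could mean. It rests on several assumptions that do not come from AudioSourceRE: that pan position is estimated from the relative left and right magnitudes of each time-frequency bin, that each bin is routed wholesale to a single band, and that the helper names are hypothetical. The dynamic mode, where boundaries follow the sources, is not attempted here.

```python
def pan_position(left, right):
    """Estimate pan from magnitudes: 0.0 = hard left, 1.0 = hard right."""
    total = abs(left) + abs(right)
    return 0.5 if total == 0 else abs(right) / total


def equal_spacing_band(pan, n_bands):
    """Map a pan position in [0, 1] to one of n_bands equal-width bands."""
    return min(int(pan * n_bands), n_bands - 1)


def pan_separate(bins, n_bands):
    """Route each (left, right) bin to exactly one band; others get silence."""
    bands = [[(0.0, 0.0)] * len(bins) for _ in range(n_bands)]
    for i, (left, right) in enumerate(bins):
        band = equal_spacing_band(pan_position(left, right), n_bands)
        bands[band][i] = (left, right)
    return bands


# Four toy bins: hard left, centre, right of centre, hard right.
bins = [(1.0, 0.0), (0.5, 0.5), (0.2, 0.6), (0.0, 0.8)]
bands = pan_separate(bins, 3)
```

In this toy version every bin lands in exactly one band, so summing the bands restores the input. It also hints at why reverb and wide stereo sources are hard to separate this way: their energy is smeared across many pan positions rather than sitting in one band.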
Finally, there is Vocal separation, which can be as simple or as involved as you like. If you favour simplicity, you can perform an Automated separation, which requires only that you choose half a dozen settings, most of which are of the on/off variety. For more advanced work, and better results, you can carry out a Guided separation, of which more presently.
After you've made all your choices and clicked Accept, the file is uploaded to AudioSourceRE's server. Annoyingly, on my system at least, the dialogue box remained pinned to the front while this happened, even when I switched the focus to another application; but at least this shouldn't take too long if you have a reasonably fast broadband connection. Once uploaded, the dialogue disappears and the progress of the cloud processing is indicated by a green bar underlying the track name in the Mixer section to the right of the user interface. This is not a snappy business, and for a typical three-minute track, I found that I usually had to wait nearly three minutes before I was able to hear what DeMIX had achieved. Fortunately, it is possible to send only part of a track for separation, so it's often easiest to try out settings on a small section before choosing the right ones to apply globally.
Getting Results
Once a separation has been completed, you'll see separate faders in the mixer area for each of the elements. So, for example, carrying out a Drums separation results in two mixer channels labelled Pitched and Drums, while a Vocal separation yields Vocals, Track and optionally Vocal Reverb, if you choose to extract this to a separate track. If you chose to run the separation over just part of a song rather than its full duration, you'll also see a Not Separated track containing the material outside the section selected for processing. Each of these tracks has its own stereo pan controls, mute and solo buttons, and an entry in the 'track list' in the upper right‑hand pane of the GUI.
One quirk of DeMIX Pro's design is that separations are not undoable actions. If you decide that a separation hasn't achieved anything useful and you want to try something else, you can instead choose to merge some or all of the separated elements; and because of DeMIX Pro's philosophy of retaining the overall integrity of the signal, merging all of the separated elements does in fact yield the original unmolested track, provided you haven't tinkered with them. Oddly, though, this merged track then becomes an element in itself, so when you perform your next separation, you'll be asked whether it should be applied to the original track you dragged in, or to the merged track generated by folding up the previous separation. I found it easier to close the window and start again.
There are, too, occasions when you might want to perform further separations on the individual elements of a track. For example, if you've carried out a vocal separation, in theory, the Track component will contain all of the other instruments, so you could in turn divide this up using a Pan or a Drums separation. Again, the integrity of the source tracks is preserved through multiple generations of separations.
You Hum It, I'll Track It
The first port of call for a Vocal separation would usually be the automated algorithm, but as with all such features, this can get tripped up, whether by mis-identifying other lead instruments as voices, failing to identify sections of vocal or, most often, by the presence of more than one voice within a track. Where this occurs, you need to turn to the Melody editor.
This displays your track in a monochrome spectrogram that looks to have wandered in from a piece of medical equipment in a TV hospital drama. However, this is not a conventional spectrogram, but one that can identify musical pitch and even vibrato. Melodic content such as vocal lines appears as traces of brighter blue running from left to right along the screen, and a piano keyboard on the left-hand side helps you to pick out melodies and harmonies. Two drawing tools then let you select a particular line that you want DeMIX Pro to treat as the vocal part to be separated: Free Draw allows you to manually follow a line with the mouse, while the automated Pitch Tracker is used to lasso a likely‑looking bit of screen, whereupon DeMIX Pro will automatically snap a line to the strongest pitch it detects within this region. In practice, this works well enough that I found little need for the Free Draw tool, and on most of the test material I tried, creating a pitch map of the lead vocal was a reasonably smooth task.
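The snapping behaviour of the Pitch Tracker can be pictured with a small sketch. This is a guess at the general principle rather than a description of DeMIX Pro's internals: it assumes the display is backed by a grid of energy values per frame and pitch bin, and the function name and data are invented.

```python
def snap_to_strongest_pitch(salience, frames, pitch_bins):
    """For each frame in the region, return the pitch bin with the most energy.

    salience   -- list of frames, each a list of energy values per pitch bin
    frames     -- (start, stop) frame indices of the lassoed region, stop exclusive
    pitch_bins -- (low, high) pitch-bin indices of the region, high exclusive
    """
    start, stop = frames
    low, high = pitch_bins
    line = []
    for frame in salience[start:stop]:
        region = frame[low:high]
        strongest = max(range(len(region)), key=region.__getitem__)
        line.append(low + strongest)
    return line


# Toy data: 4 frames x 5 pitch bins. A loud bass note sits in bin 0,
# while the melody moves through bins 2, 3, 3, 2.
salience = [
    [0.9, 0.1, 0.6, 0.2, 0.0],
    [0.9, 0.0, 0.3, 0.7, 0.1],
    [0.8, 0.1, 0.2, 0.6, 0.0],
    [0.9, 0.2, 0.5, 0.1, 0.1],
]
melody = snap_to_strongest_pitch(salience, frames=(0, 4), pitch_bins=(1, 5))
whole_screen = snap_to_strongest_pitch(salience, frames=(0, 4), pitch_bins=(0, 5))
```

Lassoing only the region above the bass note yields the melody line, whereas tracking the whole screen locks onto the louder bass. That is why the tool asks you to outline a likely-looking area first.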
One limitation of the way this is implemented at the moment is that the only way to leave the Melody editor with your pitch map intact is to perform a Vocal separation. You can't save your work in the editor, or even switch to a different window and then return. AudioSourceRE are aware of this issue, and are planning to address it in an update soon.
The Acid Test
The question on everyone's lips will, no doubt, be: can DeMIX Pro really do what its makers claim? The answer is yes... at least up to a point!
Having tried out both DeMIX Pro and Audionamix's XTRAX Stems, I can report that the frustrations of cloud-based audio processing are common to both. I would have done more experimenting with different parameter settings if doing so didn't require a new separation each time, and the ensuing three-minute wait. The DeMIX Pro user experience is also slightly roughened by the user-interface niggles that are probably inevitable in any version 1.0 application. I've mentioned one or two of these already; elsewhere, actions such as merging separated tracks require that these be selected in the track list at the top right of the interface, but on my system, a track that appeared highlighted in this list wasn't always actually selected, and clicking on a track sometimes appeared to do nothing. The contrast and brightness controls in the Melody view rarely worked properly, and there's no way to stop the leftmost part of this view being obscured by the piano keyboard.
Many of these minor issues are on course to be eradicated in future updates, however, and what is much more important is that I found DeMIX Pro impressively stable for a new software product. In fact, I can't recall it crashing at all during the review period. But how successful was it at doing what it was supposed to do?
DeMIX Pro absolutely does perform source separation, and often does so to a high standard compared with what I've experienced from rival products, but there are limits to its powers. If you are hoping to be able to extract a studio-quality dry vocal from a mixed track, that goal is still some way off. What you get from a typical vocal separation, for example, is clearly identifiable as the lead vocal, and sometimes remarkably clean: but not always, and never free from artifacts. I would go as far as to say that some 'a cappella' vocals I generated from test material would have been usable in remixes, at least if underpinned by reasonably dense instrumentation. Likewise, with some material it's possible to obtain backing tracks clean enough that if you replaced the vocal, relatively few artifacts would be audible; it's perhaps not yet at the stage where you could create a release-quality recording this way, but it would be usable for karaoke and so forth. What is crucial is that you optimise the separation parameters for the application you have in mind, rather than assuming that the settings which yield the cleanest 'a cappella' will also give the most artifact-free backing track and vice versa.
Separate Paths
Speaking for myself, though, the possibility of extracting a vocal track from a source recording and placing it in an entirely new context doesn't often meet a real-world need. What's frequently more valuable is the ability to rebalance a vocal against its existing backing track, and this can usually be achieved with a very high degree of transparency using DeMIX Pro. I'd go so far as to say that I can imagine mastering engineers using DeMIX Pro to tackle balance problems that might otherwise be intractable.
Another situation that presented itself during the review period concerned a live recording of a band where the performance had been excellent in most respects, but the singer's pitching had gone awry. The stereo mix itself was pretty rough and ready, and the separated vocal that DeMIX Pro delivered wasn't much to listen to — but, remarkably, it was clean enough that I could load it into Celemony's Melodyne and apply pitch correction. Artifacts were audible when the corrected vocal was recombined with the separated backing track, but they were relatively subtle, and the improvement was definitely worthwhile. By way of comparison, I tried the same trick with Audionamix's XTRAX Stems 2, and got nothing that Melodyne or I could recognise as a vocal.
To be fair, if you do need to perform the ultimate in hand-guided, labour‑intensive vocal extraction, the additional tools available in Audionamix's flagship ADX Trax Pro SP3 might give you more control. For most of us, though, DeMIX Pro's guided Vocal separation represents a pretty good trade‑off between ease of use and quality of results, and a particular feather in DeMIX Pro's cap is its ability to extract vocal reverb separately from the dry vocal itself. It's often instructive to hear how much reverb there is in mixes that don't sound overly wet to the naked ear, and the ability to manipulate this independently of the dry vocal signal greatly helps where subtle rebalancing is needed.
Finally, it should be noted that DeMIX Pro has some unique capabilities, the most striking of which is pan-based separation. I've only encountered this previously in Leapwing Audio's CenterOne plug-in, where it is restricted to three channels. DeMIX Pro takes the idea to several new levels, and with the right source material, can yield impressive results. In fact, where the original track was mixed from dry mono sources, decisively panned and not swamped in reverb, pan-based separation can sometimes yield amazingly clean instrument tracks. It's a little bit hit or miss and, perhaps unsurprisingly, doesn't work as well with distant or confused stereo images, but this is nevertheless a seriously powerful tool — as is the spectral editing, though this is not unique to DeMIX Pro.
All in all, this is a mighty impressive first product, and one which suggests that source separation is a technology with a bright future.
Spectral Editing
DeMIX Pro's impressive source separation is obviously its headline‑grabbing feature, but the program also implements another form of DSP wizardry that is only less remarkable because we are more used to it. The View button at the top of the screen allows you to switch between the default Mixer perspective and the Melody view that is used to carry out guided Vocal separations, but it also has a third option labelled Spectral. This, it turns out, provides access to a surprisingly well-featured spectral editing window, along the lines of CEDAR's ReTouch or iZotope's RX.
As ever, this displays a coloured 'heat map' of your track, with frequency on the vertical axis and time along the horizontal axis. With a little practice, prominent musical features can be identified visually, whereupon you can use DeMIX Pro's selection tools to highlight them for editing. As well as the conventional rectangle tool that lets you lasso an area of the graph, these include a transient tool that can intelligently identify events with a predominantly 'vertical' characteristic, such as drum beats or handclaps, and a 'magic wand' that can pinpoint either individual harmonics, or notes that comprise a fundamental plus a series of overtones. There are various options for changing the behaviour of these tools, most notably the ability to change the 'threshold' of the wand tool so that it focuses more or less tightly on areas of high spectral energy.
Any selection you make acts as a constraint for the brush tool, though it's also possible to apply this without making a selection first. The brush is an ovoid eraser of user-adjustable dimensions, enabling you to surgically remove the desired material from your track. A full undo chain is maintained as you work, making it easy to step back through your actions — happily, selection is an undoable action, as complex multiple selections are often necessary.
When you've finished working in the spectral editor, you hit the Freeze button to immortalise your work. Assuming you've only worked on part of the song, three new tracks will then appear in the mixer. Sections of the source material to which no editing has been applied appear in the Not Separated track, while sections that have been edited are divided between Edits, which contains any material you've brushed away, along with any material that was selected when the Freeze button was pressed, and Track, which represents the material left over after editing. (As an example, let's suppose you'd used spectral editing to isolate some troublesome vocal sibilants in the first verse of a track. Then, once you hit Freeze, the Edits track will contain just these sibilants, the Track track will comprise everything else from the first verse, and the Not Separated track will contain the complete song, but with the first verse muted.)
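The way Freeze routes material to the three tracks can be sketched as follows. This is a simplified model under stated assumptions: plain sample values stand in for spectral data, a single edited region is used, and the function name freeze and its arguments are hypothetical rather than anything exposed by DeMIX Pro.

```python
def freeze(song, region, brushed):
    """Split a song into Edits, Track and Not Separated, as described above.

    song    -- list of sample values (a stand-in for spectral data)
    region  -- (start, stop) indices of the section that was edited
    brushed -- set of indices inside the region that were brushed away
    """
    start, stop = region
    edits, track, not_separated = [], [], []
    for i, value in enumerate(song):
        inside = start <= i < stop
        edits.append(value if inside and i in brushed else 0.0)
        track.append(value if inside and i not in brushed else 0.0)
        not_separated.append(0.0 if inside else value)
    return edits, track, not_separated


# The 'first verse' is indices 0-2, and index 1 is a brushed-out sibilant.
song = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
edits, track, not_separated = freeze(song, region=(0, 3), brushed={1})
recombined = [e + t + n for e, t, n in zip(edits, track, not_separated)]
```

In this toy model the three tracks sum back to the original song, mirroring the sibilance example: Edits holds only the brushed material, Track holds the rest of the verse, and Not Separated holds everything outside it.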
The quality of the spectral editing seemed generally good to me, and I was particularly impressed by the ability of the 'harmonic wand' tool to lock onto notes and their associated overtones. A particularly nice touch is that if, as is often the case, it's the first or second overtone rather than the fundamental that is most clearly visible within the spectrogram, you can follow that and tell DeMIX Pro to look for one or more lower harmonics as well.
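The arithmetic behind following an overtone back to its fundamental is simple, assuming an ideal harmonic series in which each partial is a whole-number multiple of the fundamental. The helper below is purely illustrative and says nothing about how DeMIX Pro actually detects partials.

```python
def harmonic_series(visible_hz, harmonic_number, count):
    """Return the first `count` harmonics given one visible partial.

    harmonic_number is 1 for the fundamental, 2 for the first overtone,
    3 for the second overtone, and so on.
    """
    if harmonic_number < 1 or count < 1:
        raise ValueError("harmonic_number and count must be at least 1")
    fundamental = visible_hz / harmonic_number
    return [fundamental * n for n in range(1, count + 1)]


# A clear trace at 440 Hz that is really the first overtone of a 220 Hz note.
series = harmonic_series(440.0, 2, 4)
```

Here the visible 440 Hz trace implies a lower harmonic at 220 Hz, plus further overtones at 660 and 880 Hz. Real instruments deviate slightly from these ideal multiples, so a practical tool would need some tolerance around each one.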
That said, DeMIX Pro's spectral editor is considerably less refined than, say, RX7's, being clunkier in the user-interface department and lacking features in comparison. You can't view or edit the left and right channels of a stereo signal independently, and although the selection tools are quite sophisticated, there isn't anything you can actually do to selected audio apart from cutting it out and freezing it to a separate track. There were also a few times when I found myself stuck in the spectral editor and unable to return to the rest of the program either by freezing or resetting my edits. Considered as a bonus feature that can help to get the most from DeMIX Pro's other capabilities, it's very handy, but if you wanted a spectral editor and had no need for source separation, you'd probably choose a different package.
Pros
- Impressive vocal extraction that easily beats preset-based rivals.
- Innovative pan-based separation can be very effective.
- Vocal reverb can be separated independently of the vocal itself.
- Spectral editing window is a useful feature in its own right.
- Choice of automated or guided vocal separation caters both to those who need results fast, and to those wanting the best possible results.
Cons
- Separation is cloud-based, can be slow, and must be redone every time you change a setting.
- The program's v1.0 status is apparent in some aspects of the user interface.
- Although what DeMIX does is remarkable, it still can't achieve separations that are free from noticeable artifacts.
Summary
Source separation technology takes another step forward with AudioSourceRE's innovative package. A few user-interface quirks notwithstanding, DeMIX Pro does a good job of presenting mind-bendingly complex DSP in a user‑friendly and straightforward way that lets you get results fast.
Information
DeMIX Pro £575; DeMIX Essentials £139. Prices include VAT.
DeMIX Pro $749; DeMIX Essentials $179.