You are here

Mark Lippett & XMOS

Mark Lippett is CEO of XMOS.

Most audio interface designs are based around technology from British innovators XMOS. What makes the xcore platform so ubiquitous, and what does it mean for musicians?

Which manufacturer’s products are found in more studios than any other? Whatever the popularity of Shure microphones, Apple computers or Behringer synths, the crown almost certainly belongs to XMOS.

Whether it’s a portable laptop rig or a sophisticated multi‑channel setup, nearly all musicians and engineers rely on USB audio interfaces. Yet although there’s fierce competition between many well‑established manufacturers, most designs are based around the same family of platforms. Lift the lid on a USB interface, and its beating heart will most likely be an xcore chipset from XMOS.

How did one company come to be so central to all of our music‑making? CEO Mark Lippett fills in the back story. “We spun out of the University of Bristol in 2005 with a novel processor technology that was designed to deliver the sort of flexibility to software engineers that hardware engineers had become accustomed to in FPGA platforms. Our processor arrays can emulate all of the types of compute that you encounter in embedded systems, including AI, DSP, I/O and control.”

Mark Lippett: The thing that really sets us apart is that the underlying processor architecture is fast enough and reliable enough to implement hardware in software.

Hardware In Software

“The thing that really sets us apart is that the underlying processor architecture is fast enough and reliable enough to implement hardware in software,” explains Mark. “For example, we can implement SPDIF, ADAT, I²S: all of those protocols are actually software libraries to us, not distinct pieces of hardware on the chip. So, by deploying different software builds, you can effectively create different system‑on‑chip designs on an existing semiconductor platform, using software alone. Our objective was to give the embedded software community an efficient way of deploying software onto platforms in order to create fully integrated bespoke solutions with a rapid time to market.

“When you have a somewhat, dare I say disruptive technology, you’re looking for market discontinuity — points of entry for the technology in the market. The one that we discovered very early on was USB audio. Apple were going to stop putting Firewire into MacBooks, and they said ‘You’re going to use USB audio from now on.’ And the peripherals industry said, ‘That’s all very well, but there isn’t a chipset for that.’ Seeing the opportunity, we built a solution internally using our applications engineering resources. And the rest is history.”

Bridge Building

The XMOS chipset performs the task that’s most fundamental to any audio interface: it serves as a bridge between the various audio input and output streams, and the USB data bus. “Essentially, there’s a collection of I/O protocols that need to talk to each other in a certain sort of segment — we’re very good at joining them together. They might be at different sample rates and require some interim processing for one reason or another, but we can connect those things together. You are then able to select which combinations you want just using software. So, insofar as your PCB will allow you to do so, you could effectively do runtime changes to the I/O protocols that you’re supporting.”

This is the key advantage XMOS has over rival technologies: the xcore chipset can easily be configured to cope with whatever I/O streams the interface designer wants to include, simply by loading the appropriate software onto it. If no such product was available, interface manufacturers would have to cobble together multiple hardware chips each dedicated to one individual function, such as sample‑rate converters and ADAT transceivers, or employ field‑programmable gate array (FPGA) chips. FPGAs are similarly versatile and are used by some high‑profile manufacturers, but the barrier to entry is high as they are expensive and require specialist programming skills. By contrast, any software programmer with a knowledge of C or C++ can take advantage of XMOS’s library code.
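The “interim processing” between streams at different sample rates that Mark mentions usually means sample‑rate conversion. As a purely illustrative sketch (not XMOS’s implementation, which would use properly filtered polyphase conversion), a naive linear‑interpolation resampler shows the basic idea of mapping one stream’s timeline onto another’s:

```python
# Illustrative sketch only: the kind of "interim processing" a bridge may
# apply when two I/O streams run at different sample rates. Real converters
# use polyphase filtering; linear interpolation just demonstrates the idea.

def resample_linear(samples, src_rate, dst_rate):
    """Resample a block of PCM samples from src_rate to dst_rate."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position on the source timeline
        j = int(pos)
        frac = pos - j
        a = samples[min(j, len(samples) - 1)]
        b = samples[min(j + 1, len(samples) - 1)]
        out.append(a + frac * (b - a))       # interpolate between neighbours
    return out

ramp = [0.0, 1.0, 2.0, 3.0]                  # four samples at the source rate
print(resample_linear(ramp, 44_100, 88_200)) # eight samples, same ramp shape
```

In a software‑defined bridge, swapping this stage in or out (or changing its rates) is just a different build, which is the flexibility being described.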

“It’s a sort of chicken and egg situation,” says Mark. “When the company was founded back in 2005, FPGA hardware platforms were becoming higher and higher performance and more and more expensive. They were chasing communications applications and, consequently, people in the embedded space and the consumer space couldn’t afford them. And then there was no point in having an FPGA engineer on the staff, so FPGA engineers disappeared and now they can no longer program FPGAs. It was almost a self‑fulfilling prophecy that FPGAs were not that accessible in that part of the industry. The other way of looking at it is there’s probably 100 times more software programmers than hardware programmers. So if you want to make a very empowering creative platform available, make it available to the biggest community of creative engineers.”

One Chip

“There are various different ways of interacting with XMOS technology,” continues Mark. “You can take it as a processor and do programming ‘on the metal’. Or we provide a library — an SDK, if you like — that’s a whole stack of capabilities built around USB audio, and you can take the SDK and the tools and put things together in a Lego box style. You can deconstruct it and reconstruct it if you want, so you can switch on interfaces, you can change the number of interfaces and so on. It’s quite a high level of abstraction. The majority of our customers use that, and that’s where they get a lot of flexibility across their product portfolio by effectively rebuilding different configurations from the SDK.

“One of our internal mantras is to be the only chip in the box, and in many cases we achieve that. In some cases, for other reasons, there might be an application processor in there. If you’ve got a very sophisticated windowing display, or something that’s clearly going to lean on a lot of open source software, then generally speaking, you’re going to want to be running Linux and running an application processor. But if it’s a more basic user interface or a deeply embedded application, our ambition would be to be the only processor in there.

“There’s a wide variety of things that people do with the processing that they can get their hands on, which is actually all of it. If you choose to, you can just peel everything away. I mean, Ableton’s Push 2 has a fantastic display driver on it, which, to my understanding, is driven by XMOS. And they did that. They’re a great bunch of very talented engineers and that, I think, was a ‘bare metal’ implementation.”

Mark Lippett cites Ableton’s Push 2 as an example of creative XMOS programming: not only its audio interfacing and mixing but also the display was driven from the xcore chip.

There have now been several generations of XMOS chipsets, and the company have recently announced a migration of their technology to the open‑source RISC‑V platform. “The reason for that is not because we’ve changed horses and decided to just build RISC‑Vs. We’ve actually taken our existing architecture and made it RISC‑V compatible. So we’ve still got this array of processors, but each processor is now essentially an extended RISC‑V instruction set machine. It’s still very unique in terms of the way it behaves and delivers very unique benefits.”

Beyond USB

Although USB audio was the first commercial opportunity that XMOS exploited, their technology is equally applicable in other contexts. “We’ve also got customers building audio solutions in different markets. AI has come along in the last couple of years with tiny ML [machine learning] models that do things like keyword spotting, audio event detection, glass break detection, even gunshot detection. Those are audio or acoustic applications that are nothing to do with what we would traditionally have regarded as being an audio [market] segment. We’ve got all sorts of different applications for audio as a sensing technology, but actually interpreting it into metadata and using that metadata to enable other classes of applications.

“While there are a lot of legs in audio, you’ll also see XMOS devices in applications that span the consumer, industrial and automotive markets, because fundamentally, if you buy a piece of silicon from us, it’s completely uncommitted. There’s nothing really on there that determines which application you move into. It just so happens that for commercial strategy, we selected a form factor and a cost point that fits neatly into the embedded audio sector. And actually the performance is appropriate for processing audio. While we undoubtedly have a great audio solution, it’s not a dedicated audio processor.

“We tend to say ‘USB audio’ when we probably should just say ‘audio’, because in many cases our customers aren’t using USB. Many of our voice technologies don’t have USB. They’re I²S and I²C on the backhaul. Back in the day, we had the first standards‑compliant Ethernet AVB interface. And that was an interesting one because we were in a head‑to‑head race with an FPGA company. We started three months later, and we beat them to getting the first compliant endpoint. And that just demonstrates the time‑to‑market advantage.”

Stemming The Flow

One advantage of USB from the point of view of XMOS’s technology is that the USB2 protocol has an inherently limited data bandwidth. Because this bandwidth is known and can be accounted for, the xcore chipsets can be designed to cope with any possible USB2 data stream. That’s not the case with more modern interface protocols such as USB3/4, PCIe and Thunderbolt, which are intended to permit massively fast, high‑bandwidth data transfer. In these cases, an additional device is needed to filter out unwanted or unusable data and reduce the bandwidth to a level that the chipset can accept. “With Thunderbolt, we would need an external physical layer and, depending on what you’re doing, the data rates might exceed our capabilities. We have had higher bandwidth interfaces, but we’ve always needed that external device to essentially ‘drink from the fire hose’ and just send the extracted data back to the xcore device. When you’re into very high bandwidth serial interfaces, there’s no way you can do it in a software pipeline in a processor. You need some dedicated hardware.”
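A back‑of‑envelope calculation shows why USB2’s ceiling makes worst‑case design practical. The figures below are illustrative assumptions, not an XMOS specification:

```python
# Rough check: even a generous multi-channel USB2 interface stays well
# under the bus's fixed ceiling, so a bridge chip can be sized for the
# worst case in advance. Figures are illustrative, not a specification.

USB2_HIGH_SPEED_BPS = 480_000_000  # raw signalling rate of USB2 high speed

def audio_stream_bps(channels, sample_rate_hz, bits_per_sample):
    """Payload bit rate of an uncompressed PCM stream."""
    return channels * sample_rate_hz * bits_per_sample

# A generous interface: 32 channels each way at 24-bit/192kHz.
one_way = audio_stream_bps(32, 192_000, 24)
print(f"32ch 24/192 one way: {one_way / 1e6:.1f} Mb/s")      # ~147.5 Mb/s
print(f"Both directions:     {2 * one_way / 1e6:.1f} Mb/s")  # ~294.9 Mb/s

# Before protocol overhead, the worst case is bounded well below the
# bus ceiling -- unlike Thunderbolt or PCIe, where it is not.
assert 2 * one_way < USB2_HIGH_SPEED_BPS
```

A Thunderbolt 3 link, by contrast, can carry tens of gigabits per second, which is why an external device has to extract just the audio before it reaches a software pipeline.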

AI & DSP

On most interfaces, XMOS’s xcore hardware doesn’t just handle bridging. It also performs real‑time audio processing, which is what the control panel applications supplied with your interface are controlling. The interface manufacturer is free to code their own signal‑processing algorithms, but many make use of XMOS’s own libraries. These include code that can handle audio mixing and routing as well as common processes such as compression, EQ and reverb. Most recently, the company have been focusing on integrating machine‑learning tools, which has knock‑on benefits for more conventional applications.

“With the third‑generation xcore.ai, we put a vector processing unit into the architecture, primarily because we wanted to run edge AI models. But AI decomposes down to multiply‑accumulate operations, with a couple of fancy things added on. So what we ended up adding was a very large SIMD (Single Instruction, Multiple Data) pipeline for AI that also works really well with DSP. So now we’ve got a strong AI proposition, but also the opportunity to use those same resources to do DSP. We’ve developed reverbs, compressors, you know, the sort of basic building blocks of some of these external sound cards. But now we’ve got much more horsepower and much more capability, more memory as well, which is important to start to really pull some of the more heavyweight DSP into the system.
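The observation that AI “decomposes down to multiply‑accumulate operations” is easy to see in miniature. As a plain‑Python stand‑in for what a SIMD pipeline accelerates (the function names here are illustrative, not XMOS APIs), the same inner loop serves both an FIR filter and a neural‑network layer:

```python
# Sketch: the multiply-accumulate (MAC) loop at the heart of both DSP
# and neural-network inference. Plain Python stand-in for a vector unit.

def mac(coeffs, samples):
    """Dot product: the operation a SIMD pipeline accelerates."""
    acc = 0.0
    for c, x in zip(coeffs, samples):
        acc += c * x
    return acc

# As DSP: one output sample of an FIR filter is a MAC over the most
# recent len(taps) input samples.
def fir_sample(taps, history):
    return mac(taps, history)

# As AI: one neuron of a dense layer is the same MAC, plus a bias
# and a nonlinearity (ReLU here).
def neuron(weights, inputs, bias):
    return max(0.0, mac(weights, inputs) + bias)

taps = [0.25, 0.5, 0.25]                      # simple smoothing filter
print(fir_sample(taps, [1.0, 1.0, 1.0]))      # -> 1.0
print(neuron([1.0, -1.0], [2.0, 0.5], 0.0))   # -> 1.5
```

Because both workloads reduce to the same operation, hardware added for edge AI models doubles as extra DSP horsepower, which is the point being made.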

“We’re seeing a lot of customers experimenting with converting DSP algorithms or DSP functions into AI. We’re a very good platform for mixing and matching DSP and AI, because everything happens in the same place, so you don’t have to export a load of data to an AI accelerator and then bring it all back again, reducing complexity, latency and power consumption.

You can’t buy an xcore product off the shelf, but if you’re an interface developer, you can work with an evaluation kit such as this.

“Our first sort of foray into that area was around voice processing: keyword detection, for example, on far‑field voice processors. And we are also now seeing growth in opportunities in automotive, because if you look beyond the current generations to driverless vehicles, you’re starting to see the cabin becoming an extension of a living room or office space, so again, there’s more interest in high‑quality, high‑definition audio as well in those contexts. There’s almost a new wave of audio applications happening now, a resurgence of something that we’re all quite familiar with, but also some new use cases like automotive platforms and industrial defect detection. But people are also interested in AI in what I might regard as being a more traditional audio space, for signal conditioning and noise reduction and things like that, and we’re a great platform for integrating that.”

Moving Forward

XMOS’s market dominance means that stable, versatile and affordable solutions are available to anyone who wants to build an audio interface, with no need to figure out the arcane lore of FPGA programming, code their own DSP algorithms, or integrate multiple chips to achieve the necessary functionality. But is there also a down side? Is there a risk that innovation is stifled when so many manufacturers are using the same platform? Mark doesn’t see it that way. For him, XMOS solves one set of problems and in doing so allows manufacturers to focus on innovating elsewhere.

“The technology industry is so interdependent, and I think it’s a question of picking where you want to innovate. XMOS may be doing all the USB audio bridging, and we’re starting to bring the DSP in. But we’re standing on the shoulders of giants as well. We’ve got tools companies that we use, we’ve got lots of open source technology that we use, we’re using TSMC’s silicon technology.

“There is a lot of innovation around bespoke DSP algorithms and bespoke AI algorithms, and we don’t do those things in‑house, but we’re a great target platform. There’s also the user interface that we’re one or two levels of abstraction away from. There’s significant creativity there. So I think it’s just a question of picking your battles as far as technology is concerned and figuring out where you really want to differentiate yourself and partnering for the rest of it.”

Drivers & Latency

From the user’s point of view, software drivers are one of the most opaque elements of audio interface design. The confusion is partly one of terminology — strictly speaking, an ASIO ‘driver’ is more than just a driver — but in essence, most of us understand the need for low‑level code that allows our audio software to talk to our audio interface. Generic driver code is built into the macOS and Windows operating systems, but it has its limitations. Core Audio on macOS offers acceptable low‑latency performance, but only supports class‑compliant USB interfaces and the AVB protocol. On Windows, meanwhile, it became standard practice in the late ’90s and early 2000s to install third‑party ASIO drivers to bypass the built‑in audio protocols, as the latter did not offer adequate performance. Although that has changed and Windows now has integrated support for the UAC2 multi‑channel USB2 class‑compliant audio format, ASIO remains the de facto standard.

So, if you buy an interface designed to work with music recording software, you’ll usually need to install an ASIO driver on Windows, and possibly an additional driver on macOS too. But further confusion arises because the driver is often installed alongside other software, most typically a control panel application that allows internal settings in the audio interface to be changed. These control panel applications are developed by individual manufacturers and often have a very different look and feel from each other. However, if your interface uses an XMOS chipset, as most do, the chances are that the driver code will actually be the same. You’ll either be using the class‑compliant drivers built into the operating system, or the ASIO driver developed by Thesycon to work with XMOS’s chipsets.

“We don’t actually develop drivers at XMOS,” says Mark. “We partnered with Thesycon and they did the UAC2 drivers for us and they had an established reputation for building drivers for that space. Nowadays, UAC2 is supported by Windows, so the challenge isn’t quite so great, but Thesycon did a great job of bridging the gap for many years between people wanting multi‑channel UAC2 and the arrival of support in Windows.”

The widespread use of XMOS chipsets and generic drivers means that driver performance is perhaps no longer the yardstick it once was for choosing an audio interface. CPU overhead, and latency caused by input and output buffering, will be similar for all interfaces that use the same driver. However, there are other factors that can add latency, such as the implementation of any onboard digital mixer or routing matrix.
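The buffering component of that latency is simple arithmetic: buffer size divided by sample rate, in each direction. A quick sketch (illustrative figures, ignoring converter and onboard‑mixer delays, which add to the total):

```python
# Rough latency arithmetic: buffer time depends only on buffer size and
# sample rate, not on the brand of driver. Converter delay and any
# onboard mixer add to the totals shown here.

def buffer_ms(frames, sample_rate_hz):
    """Time, in milliseconds, represented by one buffer of audio."""
    return 1000.0 * frames / sample_rate_hz

for frames in (32, 128, 512):
    one_way = buffer_ms(frames, 48_000)
    print(f"{frames:4d} frames at 48kHz: {one_way:.2f} ms per direction, "
          f"~{2 * one_way:.2f} ms round trip minimum")
```

This is why two interfaces sharing the same driver and buffer settings tend to measure alike, and why differences show up mainly in what else sits in the signal path.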