CUTTING EDGE

Hyper-threading



Developments at Intel could have a big impact on computer-based musicians, including a way of potentially achieving dual-processor performance from a single-chip system.


Dave Shapton

It's a sign of the times when the biggest music-related announcements come from Intel, and not companies like Steinberg or Yamaha. In particular, it's an indication of how far the industry that most people think of as revolving around physical mixing desks and (although this is pushing it a bit) tape recorders has moved on. At the end of February I met with Dan Snyder from Intel Europe, Africa, and the Middle East to talk about the new mobile Pentium 4 processor and some other developments that might be of interest to Sound On Sound readers.

The Mobile Pentium 4

The Pentium 4 processor-M isn't the big news from Intel, but it's worth mentioning because notebook computers have started to look decidedly underpowered compared to their desktop counterparts, which have used Pentium 4 chips for about a year. Although first reports suggest that mainstream office applications don't benefit much from the new chip, software that can make use of the P4's SSE2 'multimedia' instruction set, which is likely to include music software, should see big gains over earlier Pentium processors. So while you can expect music and audio applications to run better than before, they'll still suffer from other notebook drawbacks, such as slow hard disks. However, it's important not to forget that most notebook computers now come with IEEE 1394 connections, enabling you to connect fast and cheap FireWire drives and record and edit with them directly.

The really interesting stuff wasn't to do with the mobile Pentium 4 chip, however. It was about Intel's plans for their processors in general, and how future optimisations might enhance music and video processing. The biggest revelation was hyper-threading, and no, before my visit to Intel, I didn't know what it meant either!

Hyper-threading

When you measure the amount of useful processing carried out by modern processors, you'll find that most of the current designs are surprisingly inefficient. It's not uncommon for a chip to process only 40 percent of the data you'd expect it to, and as chips become more complex, with multiple pipelines and extraordinarily complicated caching schemes, the very quest for speed can ironically lead to inefficiencies. What works well with one type of data can lead to inefficient processing of the 'wrong' type of information, with as much as 60 percent of the processing potential wasted, which is partly caused by what the chip-makers call 'dependencies'. Simply put, it doesn't matter how fast a chip is if it still has to wait for the result of other processes before it can continue.

Memory speed and cache inefficiencies can result in empty 'slots' in the data processing path, with the processor having to effectively 'miss a turn' and insert a wait-state (essentially a processing blank) until it receives the next clump of data. You can probably sense that this isn't good when you're dealing with digital audio or video, but, despite this level of inefficiency, modern Pentium chips do a remarkable job with real-time media production. However, consider how much more they could do if some of the processing 'down time' could actually be used — and this is exactly what hyper-threading aims to do.
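To make the idea of filling empty slots concrete, here is a deliberately simplified toy model in Python. It is not how Intel's hardware works: the single shared 'slot', the `simulate` function and the wait times are all assumptions made up for illustration, with the waits chosen so that one thread on its own keeps the slot busy 40 percent of the time.

```python
def simulate(threads):
    """Run threads on one shared execution slot; return (cycles, busy_cycles).

    Each thread is a list of wait times. An entry of w means the instruction
    must sit through w wait-states (for example, waiting on memory) before
    it can use the slot.
    """
    pending = [list(t) for t in threads]
    waiting = [q[0] if q else 0 for q in pending]
    cycles = busy = 0
    while any(pending):
        cycles += 1
        issued = False                      # the slot takes one instruction per cycle
        for i, queue in enumerate(pending):
            if not queue:
                continue                    # this thread has finished
            if waiting[i] == 0 and not issued:
                queue.pop(0)                # instruction executes: useful work
                waiting[i] = queue[0] if queue else 0
                issued = True
                busy += 1
            elif waiting[i] > 0:
                waiting[i] -= 1             # data still on its way
    return cycles, busy


# Hypothetical workload: waits chosen so one thread keeps the slot 40% busy.
thread_a = [1, 2, 1, 2]
thread_b = [1, 2, 1, 2]

alone = simulate([thread_a])
shared = simulate([thread_a, thread_b])
print(f"one thread:  {alone[1]}/{alone[0]} cycles busy")
print(f"two threads: {shared[1]}/{shared[0]} cycles busy")
```

In this toy run, one thread needs 10 cycles to complete 4 instructions, while two threads sharing the slot complete 8 instructions in 11 cycles, because each thread's wait-states are filled with the other's work. Real processors are far more complicated, so treat the numbers as an illustration of the principle only.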

Hyper-threading makes use of the empty processing slots so effectively that the result is much like having an additional processor in the machine. Indeed, a hyper-threading-enabled processor presents itself to the operating system as two 'logical' processors, just as a disk with multiple partitions appears to be several 'logical' drives. Taking this further, a dual-processor system with hyper-threading would appear as though it had four processors.
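You can see the 'logical processor' idea from software. As a small sketch, Python's standard `os.cpu_count()` reports the number of logical processors the operating system exposes, so on a hyper-threading system it would be expected to return a figure larger than the number of physical chips. The standard library doesn't say how many of those are physical, so the snippet makes no attempt to tell them apart.

```python
import os

# Number of logical CPUs the operating system exposes, or None if unknown.
logical = os.cpu_count()

if logical is None:
    print("The operating system did not report a processor count.")
else:
    print(f"The operating system reports {logical} logical processor(s).")
```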

However, hyper-threading is not something for nothing and, while it should add little to the cost of the chips, applications have to be modified to make use of it. Intel admit that this isn't a trivial job, but the good news is that some of the big names in music software are already working with Intel to make their applications hyper-threading aware. Unmodified programs will still run, but with no speed increase, and applications can be recompiled with an Intel-supplied compiler that tries to make the most of hyper-threading. However, the only way to ensure a program will use hyper-threading to its full potential is to hand-code it from the ground up.

Hyper-threading & Digital Audio

Digital audio and video aren't like other types of data, such as a Word document or a computer program. Data files produced by a typical office application are highly structured, and predictable in a way that allows them to be compressed and uncompressed without loss of data, for example. The same goes for program files (such as those with the '.exe' file extension on Windows), which contain repetitive elements that a modern processor chipset can use to speed up the operation of the program. If a program contains a frequently called routine, the data describing that routine can be held in a cache: fast memory that's nearby or even onboard the processor. Modern general-purpose processors, such as Pentiums, also emulate dedicated DSPs (digital signal processors) in some respects, and can use some onboard registers (dedicated byte- or multibyte-length memories) to store the number of times a program element has completed a loop. The SSE2 instruction set, which I mentioned earlier when describing the mobile P4, is another DSP-like feature, letting a single instruction operate on several data values at once.

Jargon Buster
To understand hyper-threading, we need to explain a few terms:

A thread is a process within a program that can function independently from the rest of the program, while still being part of it. (An easy example would be the print function in a word processor where you can print a 50-page report while you begin writing the next one.) You can think of threading as a kind of multitasking within a program, and it's possible for a programmer to have as many threads as they like. But the best thing about threads is that they lend themselves perfectly to multi-processing because, at a basic level, you could run different threads on different processors.
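The word-processor example can be sketched in a few lines of Python using the standard `threading` module. The `print_report` function and the timings are invented stand-ins rather than anything from a real word processor; the point is only that the 'print job' runs on its own thread while the main program carries on.

```python
import threading
import time

printed_pages = []

def print_report(pages):
    """Stand-in for a slow print job; each 'page' takes a moment."""
    for page in range(1, pages + 1):
        time.sleep(0.001)           # pretend the printer is busy
        printed_pages.append(page)

# Start the print job on its own thread...
job = threading.Thread(target=print_report, args=(50,))
job.start()

# ...while the main thread carries on with the next document.
next_report = "Chapter 1: drafting continues while the printer runs."

job.join()                           # wait for the print job to finish
print(len(printed_pages), "pages printed;", len(next_report), "characters written")
```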

Multi-processing is where you have two or more processors on one motherboard, and both an operating system and applications that recognise the multiple processors — Microsoft and Intel call this Symmetric Multi-Processing (SMP). But writing software that uses multiple processors is not a trivial task. For example, let's say you want to calculate the result of 2+2 — you can't send half of the sum to one chip and the other half to the other, and it doesn't even make sense as a concept, never mind as a practical proposition. But what you can do is send a different stream (or streams) of audio to each processor, and any application needing multiple elements to be processed in parallel is likely to benefit from SMP. Multitrack audio is intrinsically parallel, because it exists in the form of 'streams'.
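Here is a minimal sketch of the 'one stream per worker' idea, using Python's standard `concurrent.futures` module. The tracks, the `apply_gain` function and the choice of two workers are all made up for illustration. Note also that in standard CPython builds the global interpreter lock stops CPU-bound threads from truly running at the same time, so this shows the structure of the approach rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def apply_gain(stream, gain=0.5):
    """Process one stream independently of all the others."""
    return [sample * gain for sample in stream]

# Four short, made-up 'tracks' of samples.
tracks = [
    [1.0, 0.5, -0.5],
    [0.2, 0.4, 0.6],
    [-1.0, 0.0, 1.0],
    [0.8, 0.8, 0.8],
]

# Two workers stand in for two processors; each takes whole streams.
with ThreadPoolExecutor(max_workers=2) as pool:
    processed = list(pool.map(apply_gain, tracks))

print(processed[0])
```

Because no stream depends on any other, the work divides cleanly between workers, which is exactly what can't be done with a single sum such as 2+2.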

A stream is a file in motion, and to process a 44.1kHz audio stream you have to be able to process 44,100 samples per second — if you can't keep up, you'll get clicks and other nasty artifacts. So to emulate a digital mixer on a single chip (with reverb, EQ and other effects), you'll need to be able to process dozens of simultaneous streams. Just think how many signal paths there are in a conventional analogue mixer (including all the insert points, effects sends, and so on) and you can quickly get an appreciation of just how fast modern processors are and have to be.
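Some back-of-envelope arithmetic shows what 44,100 samples per second means for the processor. The 3GHz clock is the figure Intel expect to ship this year; the 48 simultaneous streams is purely an assumed number for a busy virtual mixer. A clock cycle is not the same as a completed instruction, so these are upper bounds at best.

```python
SAMPLE_RATE_HZ = 44_100

def cycles_per_sample(clock_hz, streams, sample_rate_hz=SAMPLE_RATE_HZ):
    """Clock cycles available for each sample of each stream."""
    return clock_hz / (sample_rate_hz * streams)

# Time between consecutive samples of one stream, in microseconds.
sample_period_us = 1_000_000 / SAMPLE_RATE_HZ

one_stream = cycles_per_sample(3e9, streams=1)     # assumed 3GHz clock
mixer = cycles_per_sample(3e9, streams=48)         # assumed 48 streams

print(f"sample period: {sample_period_us:.2f} microseconds")
print(f"cycles per sample, 1 stream:   {one_stream:.0f}")
print(f"cycles per sample, 48 streams: {mixer:.0f}")
```

That works out to roughly 22.68 microseconds per sample, about 68,000 clock cycles per sample for a single stream, and only around 1,400 cycles per sample per stream once 48 streams share the chip.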

 
While both of these techniques might help a program operating on digital media data to run faster, they don't help to move the actual data through the processor. This might not be an issue if only one or two tracks of digital audio are needed in real-time, but if you're emulating an entire studio of virtual instruments, reverbs and digital mixing devices, every last drop of processing power counts. To a caching algorithm, a digital audio stream just looks like random data.

The more processors you have, the more streams you can process in real-time, subject, of course, to bottlenecks in the rest of the system. So if hyper-threading can cause a single chip to behave like two processing elements, this has to be good for processing digital audio — well, maybe. I'm only speculating here because I couldn't get any hard statistics from Intel, so I'll have to resort to a simplistic domestic analogy.

Imagine you have to prepare hundreds of meals a day: you've got plenty of cooks but only one kitchen, containing one cooker, one fridge, and one sink. The best way to speed up the process is to build an extra kitchen: set up two production lines and you can make twice as many meals as before.

Too Many Cooks Spoil The, Er... Allocation Overheads

Before we get carried away with this analogy, the reality is that dual-processor systems are nowhere near twice the speed of single-processor setups with similar specifications. Allocating work to the two processors takes processing time itself, imposing a significant load on the system as a whole. Applying this reality check to our kitchen analogy means we have to imagine that, irrespective of how many kitchens we have running in parallel, there's only one person 'supervising' the whole process. If a query crops up in one of the kitchen units, it will lie idle until the supervisor can visit to sort things out. However, even with this restriction, parallel processing can lead to greater throughput.
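One standard way to put a number on the 'single supervisor' effect is Amdahl's law, which I'm borrowing here as a rough model rather than anything Intel quoted. It assumes some fraction of the work can be shared between processors while the rest has to be done by one at a time. The 80 percent figure below is an invented example, not a measurement.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Overall speed-up when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# Assumed example: 80% of the work can be split across the processors.
two_chips = amdahl_speedup(0.8, 2)
four_chips = amdahl_speedup(0.8, 4)

print(f"2 processors: {two_chips:.2f}x")
print(f"4 processors: {four_chips:.2f}x")
```

With those assumed figures, two processors give about 1.67 times the throughput rather than twice, and four give 2.5 times rather than four, which is the kitchen supervisor's bottleneck in numerical form.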

So, can we apply this culinary analogy to multiple logical processors on a single chip? We can, because when you view a kitchen unit as a whole, it appears to be flat out — you can imagine people falling over themselves as they run to the fridge, peel the carrots, and throw pies in the oven. But when you take a closer look, you realise there are almost always some 'processing elements' not in use. For example, the fridge isn't always full, and the oven is empty when food is being served. If we regard each food processing element in the kitchen as an actual processing unit on a processor, it's clear that, even when the workload is at maximum, there are processing elements sitting idle; and by knowing this, hyper-threading can take stock of the free resources and direct a second program thread towards them. But there's no easy way to determine exactly how much additional processing will happen because so much depends on the state of the system at a given time. So while I doubt hyper-threading will provide the same kind of performance hike that multiple physical processors can, I do expect there will be a very worthwhile improvement.

Meanwhile, back at the ranch, clock speeds are still going up. Intel expect to be shipping a 3GHz Pentium before the end of the year, and have shown a working 4GHz prototype (albeit with some 'exotic' cooling). This is eight times faster than the slug I'm writing on, which, aside from music production, still seems perfectly fast to me. However, it looks like we're on the verge of a dramatic advance in the music capabilities of desktop computers — again.

 


All contents copyright © SOS Publications Group and/or its licensors, 1985-2013. All rights reserved.
The contents of this article are subject to worldwide copyright protection and reproduction in whole or part, whether mechanical or electronic, is expressly forbidden without the prior written consent of the Publishers. Great care has been taken to ensure accuracy in the preparation of this article but neither Sound On Sound Limited nor the publishers can be held responsible for its contents. The views expressed are those of the contributors and not necessarily those of the publishers.
