An Introduction To Forensic Audio

By Detective Phil Manchester
Published January 2010

Noisy, muffled, incoherent recordings are an audio engineer's worst nightmare, but all too often they contain vital evidence in criminal trials. It's the job of the forensic audio specialist to extract that evidence.

It is 1972. Richard Nixon is President of the most powerful country in the world: the United States of America.

  • 17th June 1972: 'Burglars' are arrested in the process of planting audio surveillance bugs in the Democratic National Committee offices at the Watergate Hotel, Washington DC.
  • 13th July 1973: Following a tip‑off from a Senior Federal Bureau of Investigation (FBI) agent, it is believed that the 'burglars' are in fact connected to the United States Government and, more specifically, to President Nixon. It is soon established that all telephone calls at the White House had been recorded and may contain information surrounding the alleged 'burglary' in 1972.
  • 18th July 1973: President Nixon orders the White House recording system to be disconnected and refuses to hand over presidential recordings to the Senate. Various Presidential staff resign or are fired by the President.
  • 1st March 1974: The 'Watergate Seven', former aides of the President, are indicted for conspiring to hinder the Watergate investigation. The grand jury also secretly name President Nixon as an unindicted co‑conspirator.
  • August 1974: A previously unknown tape from 23rd June 1972 is released, revealing a conversation between the President and the White House Chief of Staff, HR Haldeman. Both men are heard discussing their intention to use the Central Intelligence Agency (CIA) to obstruct the FBI's investigation of the burglary by claiming, falsely, that national security is involved. The tape is crucial evidence, identifying an obstruction of justice and potentially proving the attempted cover‑up of the burglary. The tape in question is known as 'the Smoking Gun' (for a transcript, see http://nixon.archives.gov/forresearchers/find/tapes/excerpts/watergate.php).
  • 8th August 1974: The 37th President of the United States of America, Richard Nixon, resigns.

Recordings made of conversations between President Nixon and his aides were central in unravelling the Watergate scandal. The chain of events that began with a break‑in at the Watergate Hotel would eventually lead to Nixon's resignation.

The FBI, followed by the Senate Watergate Committee, House Judiciary Committee and the US press, established that the burglary was one of many illegal operations authorised and conducted by President Nixon's advisors and staff. The investigations also revealed large‑scale abuse of office and the cover‑up of serious crimes, which included illegal wiretapping, illegal break‑ins, sabotage, political espionage, improper tax audits, campaign fraud and the existence of a secret 'slush' fund. The fund, laundered in Mexico to finance such operations, was used to buy the silence of the seven men who had been indicted for the burglary on 17th June, 1972.

Behind The Noise

So where does forensic audio fit in? On one of the Watergate recordings seized, specialists identified an 18.5-minute section of noise. Upon inspection, this appeared to be the result of re‑recording of noise: specifically, electric network frequency interference, which was believed to be masking the original audio material. This section of 'noise' has widely been considered to hide crucial evidence concerning the burglary in 1972, or perhaps even something more sinister.

The original content of this section has never been recovered, but the problem of the authenticity of the Watergate tape recordings brought together some of the world's leading audio experts. These specialists were appointed to conduct scientific testing on the tape by John Joseph Sirica, Chief Judge for the United States District Court for the District of Columbia. This forensic process was well documented, and elements of it form the basis of how we conduct forensic audio analysis today. Some of the scientific procedures involved are detailed at the Audio Engineering Society's web site: www.aes.org/aeshc/docs/forensic.audio/watergate.tapes.introduction.html.

Many lessons were learned from Watergate, still the most significant forensic audio investigation to date, and some of the techniques and types of equipment applied are still utilised today. Other investigations have also played a significant part in forensic audio history, both in the US and the UK. These include Bruce Koenig's analysis of the gunshots that killed President John F Kennedy in 1963, and Koenig's authentication and enhancement of tapes in the John Gotti trials. Alan French's work enhancing audio recordings helped to convict British serial killer Colin Ireland, dubbed the 'Gay Slayer'. Forensic specialists Koenig and Douglas Lacey helped to make intelligible cockpit voice recordings of United Airlines Flight 93, on September 11th, 2001. My own work has been critically important in 2007's counter‑terrorism investigation, Operation GAMBLE, which convicted those responsible for the conspiracy to kidnap and behead a British Muslim soldier, and the supply of terrorism‑related materials to support the international threat of terrorist activity.

Audio analysis of gunshot recordings played an important role in the investigation into the assassination of President John F Kennedy.

Forensic Audio Today

As with most music recording, the days when forensic audio involved racks and racks of equipment are gone. Software plug‑ins can almost always achieve what our invaluable analogue friends of the past did, or so a good friend of mine tells me!

Today's forensic audio investigators can call upon sophisticated, specialised equipment. This is CEDAR Audio's new CCS 3000, an 'all in one' workstation designed to make the process of audio enhancement as simple and fast as possible.

However, although some analogue recording technologies are considered a thing of the past, they still have huge significance in today's audio forensics world. Analogue recordings may be less common within forensic audio, but a sound working knowledge of analogue functionality and an appreciation of its characteristics is crucial to understanding where we have been and where we may go in the future. Law enforcement agencies are gradually trusting the digital world in every aspect of forensics, and it could be said that even the audio forensics community are catching up with the professional audio industry. (Well, sort of!)

The remit of a forensic audio laboratory is to provide audio evidence in criminal or civil investigations. On a day‑to‑day basis, a forensic audio laboratory will deal with sensitive law‑enforcement recordings, 999 emergency calls, and audio from mobile phones, DVD, video, CCTV, computers, solid‑state devices and memory cards: in fact, just about every type of recorded audio media there is and has ever been. Many of the tasks will at some point involve forensic enhancement of audio for use as evidence at trial. However, general advice and guidance concerning the correct capture and subsequent review of audio material is also essential. This provides what is commonly referred to as 'best evidence'.

Ultimately, the responsibility of the forensic audio laboratory is to present evidence that can be relied upon within a court of law. The role of a forensic audio practitioner is to serve the courts, as opposed to their tasking agency (ie. the prosecution or defence). This involves considerable audit of processes and procedures: our work must remain repeatable by other forensic audio specialists. The forensic audio professional must act with impartiality and integrity in the pursuit of justice for all. It has been suggested by some that experts retained by either the prosecution or defence are likely to side with their tasking agency. However, this is not my experience within the forensic audio community that I serve.

So what are the ways in which forensic audio can serve the interests of justice? To give you an idea of the scope of the field, here is a run‑down of the services a forensic audio lab can offer.

Authenticity

In brief, this is the science behind establishing whether a recording is original and whether it has been tampered with, either maliciously or accidentally. This task is not performed by all forensic audio laboratories, and requires very specialist skills and equipment. Advances in technology are allowing alternative methods for dealing with digital audio, which are currently under close scrutiny by the forensic audio world.

Electric Network Frequency (ENF) analysis is a hot topic at the moment. It can offer an assessment of the true integrity of a piece of digital audio/video evidence, providing information as to whether a recording has been edited and helping to establish at what time and on what date the recording was made. The technique involves the collation of ENF data from the National Grid and is currently being conducted in both Europe and the United States.

All digital recording devices are susceptible to induced hum at the 50Hz or 60Hz electric network frequency, which in turn leaves an identifiable waveform signature within the recording. This is true of both mains‑powered units and portable devices, when the latter are used in close proximity to transmission cabling or mains‑powered equipment. The frequency of the alternating current supplied by the mains grid is nominally 50Hz in Europe and 60Hz in the US, but in both cases is subject to small fluctuations. These are consistent throughout the grid, and can be used to authenticate the date of recordings containing mains hum.

ENF within the UK at any given time (say, 12:00:25 on 1st January 2010) is exactly the same in London as in the Midlands, or Scotland, resulting in a consistent signature across the whole of the UK. The same holds within each interconnected grid in mainland Europe and the United States. This frequency fluctuates slightly around the 50Hz (or 60Hz in the US) value, so at any given time it may be fractionally higher or lower; for instance, the ENF at 12:00:25 on the 1st January 2010 may be 49.8Hz, while at 12:01:25, the frequency may be 50.2Hz. These variations are recorded in a database, catalogued by exact time and date.
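To make the idea of dating a recording against such a database concrete, here is a minimal Python sketch. It assumes that an ENF trace (one reading per second) has already been obtained from the recording, and it slides that trace along a logged reference to find the best fit by mean squared error. The log values, the start time and the function name locate_trace are all invented for illustration; real casework uses far longer traces and more careful statistics.

```python
from datetime import datetime, timedelta

def locate_trace(trace, reference):
    """Return (offset, mean squared error) for the best fit of trace in reference."""
    best_offset, best_error = None, float("inf")
    for offset in range(len(reference) - len(trace) + 1):
        error = sum((value - reference[offset + i]) ** 2
                    for i, value in enumerate(trace)) / len(trace)
        if error < best_error:
            best_offset, best_error = offset, error
    return best_offset, best_error

# Hypothetical grid log: one reading per second from a known start time.
log_start = datetime(2010, 1, 1, 12, 0, 20)
reference_log = [50.00, 49.98, 49.95, 49.90, 49.84, 49.80, 49.83,
                 49.91, 50.02, 50.10, 50.17, 50.20, 50.16, 50.08]

# Hypothetical trace measured from a recording, with small measurement errors.
recording_trace = [49.85, 49.80, 49.82, 49.92, 50.01]

offset, error = locate_trace(recording_trace, reference_log)
estimated_start = log_start + timedelta(seconds=offset)
print(estimated_start, error)
```

A poor best-fit error, or a trace that fits in two separate pieces at different offsets, would be the kind of result that prompts closer examination for editing.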

The process of extracting the ENF signature from a recording involves band‑pass filtering over the range 49‑51Hz (for a nominal 50Hz supply), without any resampling of the material, to separate the ENF waveform from the original recording. The results may then be plotted and analysed against the database to prove or disprove the recording's integrity and establish when the actual recording took place, thus providing evidential and scientific authentication of the material in question. This scientific process is currently under review by many of the world's leading experts, and the conclusions thus far would indicate that it is both reliable and accurate.
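The extraction step can be illustrated with a short Python sketch. Note that this is not the band‑pass filter described above: for simplicity it substitutes a direct scan of the 49‑51Hz band, frame by frame, and reports the frequency with the most energy in each frame. The 1kHz sample rate, one‑second frames, 0.05Hz search step and the synthetic test signal are all assumptions made for the example, and estimate_enf is a hypothetical name rather than part of any forensic tool.

```python
import math

def estimate_enf(samples, sample_rate, nominal_hz=50.0, span_hz=1.0,
                 step_hz=0.05, frame_seconds=1.0):
    """Return one ENF estimate (in Hz) per frame by scanning a narrow band."""
    frame_len = int(sample_rate * frame_seconds)
    # A Hann window limits leakage from speech and other out-of-band energy.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    steps = int(round(2 * span_hz / step_hz))
    candidates = [nominal_hz - span_hz + k * step_hz for k in range(steps + 1)]
    estimates = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        best_hz, best_power = nominal_hz, -1.0
        for hz in candidates:
            omega = 2 * math.pi * hz / sample_rate
            real = sum(x * math.cos(omega * n) for n, x in enumerate(frame))
            imag = sum(x * math.sin(omega * n) for n, x in enumerate(frame))
            power = real * real + imag * imag
            if power > best_power:
                best_hz, best_power = hz, power
        estimates.append(round(best_hz, 2))
    return estimates

def synth(enf_hz, seconds, sample_rate):
    """Synthetic test signal: faint mains hum under a much louder 300Hz tone."""
    return [0.05 * math.sin(2 * math.pi * enf_hz * n / sample_rate)
            + 1.0 * math.sin(2 * math.pi * 300.0 * n / sample_rate)
            for n in range(int(seconds * sample_rate))]

sample_rate = 1000
recording = synth(49.8, 2, sample_rate) + synth(50.2, 2, sample_rate)
enf_trace = estimate_enf(recording, sample_rate)
print(enf_trace)
```

On this synthetic signal the trace should sit close to 49.8Hz for the first two seconds and close to 50.2Hz for the last two, even though the hum is 20 times quieter than the tone standing in for programme material.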

Another technique, 'forensic imaging' of digital audio material, provides a digital analysis of the core data, ie. the ones and zeros that make up a digital recording. This, when used in conjunction with ENF analysis, greatly assists in establishing the integrity of the recording. There are also further techniques and processes that can contribute to the experts' findings, although there isn't space here to go into detail. What is important is that they exist: in the Watergate investigation, experts used many different procedures to establish their findings, and this good practice should continue for future digital authentication. If the results from many varied techniques point to the same conclusion, the authentication of the audio evidence can be regarded with real confidence.

Forensic Enhancement

Enhancement is the process of 'cleaning' an otherwise unintelligible recording by removing unwanted noise. This can be described as 'audio archaeology': its principal task is to uncover evidence cautiously and without unnecessary damage to the original recording. This provides the listener with the opportunity to hear 'what is said', which is often sufficient to prove or disprove an individual's involvement in crime. Often, the 'enhanced' recording will sound cosmetically worse than the original, but 'what is said' is revealed. This is in complete opposition to the music industry, where cosmetics are everything! On a daily basis, investigations are turning to forensic audio enhancement as a final 'roll of the dice' when all other forensic practices and techniques have failed or are unavailable. Forensic audio alone continues to routinely solve high‑profile criminal investigations and convict serious criminals.

Identification of the noise problem is the key here. If the noise can be reverse‑engineered in some way, or it can be established how the noise became part of the original recording, the noise can be exploited and researched to allow for its subsequent removal or attenuation. There are established products and techniques, but the true skill here lies in the experimental element of the art. Much as in music, sometimes there are no rules! It is often the case that a technique or a particular product that is expected to work does not come up to expectations. An open mind, a willingness to try anything and, above all, perseverance are crucial in establishing what can and what cannot be done. Having said that, there comes a point when you need to know when to stop. Unnecessary damage to a recording can prove disastrous, especially if the speech starts to sound like a different word or phrase!

Forensic enhancement begins with critical listening to the original material: the complete recording is reviewed, in order to formulate a sound forensic strategy. Work is never conducted on the master recording, so clones, duplicates or forensic images are essential. Throughout the complete enhancement process, the processed audio is constantly referenced against the original, unprocessed recording, thus preventing any over‑processing and pre‑empting issues that may be raised later within a trial. Established audit and working procedure allows for another independent specialist to repeat the same process and achieve the same results, without question.
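One widely used way to show that a working copy is bit‑for‑bit identical to the master is to compare cryptographic hashes of the two files. The article does not say which checks any particular laboratory uses, so the following Python sketch is only an illustration: the file names are hypothetical and the 'recording' is dummy data written to a temporary folder.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so that large recordings need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

work_dir = tempfile.mkdtemp()
master_path = os.path.join(work_dir, "master.wav")        # hypothetical name
clone_path = os.path.join(work_dir, "working_copy.wav")   # hypothetical name

# Dummy data standing in for the master recording.
with open(master_path, "wb") as handle:
    handle.write(bytes(range(256)) * 64)

shutil.copyfile(master_path, clone_path)
clone_matches = sha256_of_file(master_path) == sha256_of_file(clone_path)

# Changing a single byte of the copy is enough to change its hash.
with open(clone_path, "r+b") as handle:
    handle.seek(100)
    handle.write(b"\xff")
altered_matches = sha256_of_file(master_path) == sha256_of_file(clone_path)

shutil.rmtree(work_dir)
print(clone_matches, altered_matches)
```

Recording the hash of the master alongside each processing step is one simple way to support the kind of audit trail that lets an independent specialist confirm they started from the same material.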

Intelligibility

The noise‑reduction processes used in the music and video industries may make noisy audio sound subjectively cleaner and more pleasant, but they do not improve intelligibility. Indeed, the very best that one can hope for is that they do not damage the intelligibility too severely. Consequently, forensic audio investigators use a different set of filters (commonly referred to as adaptive filters) to extract voices and other meaningful information when obscured by noise and audio interference such as radios and TVs.
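As a rough illustration of what an adaptive filter does, the Python sketch below implements a basic least‑mean‑squares (LMS) noise canceller. It assumes something the text above does not state: that a separate reference copy of the interference (for example, the radio or TV broadcast itself) is available alongside the noisy recording. The filter length, step size, 'room' response and test signals are all invented for the example, and commercial forensic tools are considerably more sophisticated.

```python
import math
import random

def lms_cancel(primary, reference, taps=4, step=0.02):
    """Subtract an adaptively filtered copy of the reference from the primary."""
    weights = [0.0] * taps
    history = [0.0] * taps
    cleaned = []
    for noisy_sample, ref_sample in zip(primary, reference):
        history = [ref_sample] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = noisy_sample - noise_estimate   # what remains is the output
        weights = [w + step * error * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def mean_square(values):
    return sum(v * v for v in values) / len(values)

random.seed(1)
count = 5000
# A slow sine wave stands in for the wanted voice.
speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(count)]
# Reference copy of the interference, eg. the broadcast audio itself.
reference = [random.uniform(-1.0, 1.0) for _ in range(count)]
# Hypothetical acoustic path from the loudspeaker to the recording device.
room = [0.8, -0.4, 0.2]
interference = [sum(room[k] * reference[i - k]
                    for k in range(len(room)) if i - k >= 0)
                for i in range(count)]
primary = [s + v for s, v in zip(speech, interference)]

cleaned = lms_cancel(primary, reference)

# Compare the unwanted component before and after, once the filter has settled.
tail = slice(count - 1000, count)
noise_before = mean_square([p - s for p, s in zip(primary[tail], speech[tail])])
noise_after = mean_square([c - s for c, s in zip(cleaned[tail], speech[tail])])
print(noise_before, noise_after)
```

The filter is never told what the room response is: it learns it from the data, which is why this family of techniques suits interference whose route into the recording is unknown.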

The intelligibility of human speech is governed by how a recording has taken place and what equipment was used. Evidential research that uncovers this information can help predict whether forensic enhancement will provide greater speech intelligibility. This research may involve identifying device specifications, the format of the recording and the acoustic properties of the space where it was recorded. Interpreters from many linguistic and cultural backgrounds also play a crucial role within forensic audio investigations, often providing vital insight into an otherwise misunderstood recording.

Many factors contribute to the intelligibility or otherwise of a recording, often coming down to equipment limitations such as restricted bandwidth. The single most important factor affecting both forensic enhancement and speech intelligibility is whether a recording has been subject to data compression. Psychoacoustic data‑compression algorithms such as MP3 encoding are designed to 'throw away' low‑level information, but in forensic audio it is often the case that the low‑level signal information is precisely what we are interested in: the quiet speaker or the whispered conversation, not the loudmouth!

Forensic Phonetics

A screen from one of Intelligent Devices' in‑house forensic analysis tools. The upper half shows a formant analysis and the lower half a sonogram. The yellow graphs are FFT (Fast Fourier Transform) readings at the cursor position.

Forensic phonetics and voice biometrics play a huge part in forensic audio, helping to prove who someone is and often what they are saying when this is disputed. This is becoming vital in the pursuit of justice. Many years ago it was only relevant to serious and organised crime, but it is now becoming essential to the detection and conviction of lesser offences.

Forensic phoneticists and linguists, who are experts on the idiosyncrasies of the human voice, conduct this type of specialist analysis. They are usually from multi‑linguistic backgrounds, and more often than not are senior academics who act as legal experts in their chosen academic field, not only for the prosecution and the defence, but also as advisors to the courts.

One part of forensic phonetics involves what is commonly referred to as F1, F2 and F3 analysis. These are the first three formants of speech, which are resonant frequencies of the vocal tract, and differ from person to person. From these 'F' values the specialist is able to plot the data against a known sample of a person's speech. Formant data can be affected by transmission systems such as GSM, so caution must be applied when dealing with mobile phones and similarly transmitted speech recordings.

Different parts of the mouth, including the teeth, lips and tongue, together with the lungs, the vocal tract and vocal cords, are utilised to provide the many different sounds that we make as human beings. The lungs supply the air, the vocal cords excite this air and the mouth, teeth, lips and tongue shape the sound that is produced. Cultural upbringing, schooling and our social way of life also play a part in how we individually speak and communicate. The expert will consider all of these elements when trying to pin down a disputed utterance or unknown word for the uninitiated.

All recordings supplied to the specialists must be free from any processing or alteration, as these tasks hinder the scientific processes that the expert is reliant on and, more importantly, will change how an individual actually sounds. Although processing does not affect what is said, as long as it is conducted by a professional, it can alter the 'F' values and can cause issues for the forensic analysis of the material. Many cases have utilised forensic phonetics very successfully, including some of the above‑mentioned criminal investigations.

Voice Biometrics

Voice biometrics is a specialist area of forensic audio concerned with the modelling of the human voice, much like the coding of the voice for digital transmission in GSM mobile phones. This allows for the comparison and analysis of a known sample against an otherwise unknown sample, in an automated way. Often used in conjunction with other methods, this new science already features in everyday life in areas such as telephone banking, and although it is in its infancy within the law enforcement community, many anticipate that it will soon be the next big thing.

Voice biometric technologies utilise a process which, in simple terms, takes hundreds of 'snapshots' every second and extracts distinctive features from them, to create a model of the speech‑production mechanism of the individual. These distinctive features are used to compare the known model against the unknown one. In adults, the speech‑production mechanism holds a valuable key to our identity, and its properties remain largely unchanged during adulthood unless affected by significant injury. Voice biometrics is also language‑ and text‑independent.

The analysis and comparison are automated, but this technology is heavily dependent on a specialist to interpret the results. This technology permits comparisons to be made across multiple speakers and speech databases, allowing a specialist to compare literally thousands of samples or models in the same time it would formerly have taken to compare just one. Voice biometrics is also the only biometric that can be retrieved remotely without the knowledge of the individual in question, and therefore provides a very powerful scientific tool for criminal investigation.
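The general shape of such an automated comparison can be shown with a toy Python sketch. Everything in it is an assumption made for illustration: the three‑number feature vectors are invented, each model is simply the average of its 'snapshots', and cosine similarity stands in for the far richer features and statistical models that real voice biometric systems use.

```python
import math

def build_model(snapshots):
    """Average per-snapshot feature vectors into a single model vector."""
    size = len(snapshots[0])
    return [sum(vector[i] for vector in snapshots) / len(snapshots)
            for i in range(size)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(unknown_model, known_models):
    """Return (name, score) pairs, most similar first."""
    scores = [(name, cosine_similarity(unknown_model, model))
              for name, model in known_models.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Invented models for three known speakers.
known_models = {
    "speaker_A": [0.9, 0.1, 0.4],
    "speaker_B": [0.2, 0.8, 0.5],
    "speaker_C": [0.4, 0.4, 0.9],
}

# Invented snapshots taken from the unknown recording.
unknown_model = build_model([[0.2, 0.8, 0.5], [0.3, 0.7, 0.6]])

ranking = rank_candidates(unknown_model, known_models)
for name, score in ranking:
    print(name, round(score, 3))
```

The output is only a ranked list of scores. Deciding what a given score means in the context of a case is exactly the interpretation that still depends on the specialist.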

Forensic casework may involve languages that few experts speak. Current practice in these cases is to have a forensic phonetician working with an independent translator.

Other Forensic Audio Topics

Other forensic topics surrounding audio include weapon signature analysis. This can help to identify particular weapon types, such as Glock or Beretta, and establish who fired first, which is crucial in the instance of a police shoot‑out. Likewise, vehicle signature analysis can identify which make or model of vehicle has been involved in a crime where the identification of the vehicle is paramount to the investigation. Forensic audio techniques are also used in air accident investigation, where analysis of aircraft engine noise can help to establish the cause of the accident.

Forensic Audio As A Career

What avenues are there for audio engineers to work in forensic audio? With only two police forensic audio laboratories in the UK, opportunities working directly for the police are fairly scarce, but they do exist. Other positions are available in the commercial arena working for the many companies that supply forensic audio services to law‑enforcement agencies and defence counsels.

If there are any engineers reading this who are contemplating a career within forensic audio, consider these final points. Professional engineers in the music business usually start with the best mics, the best rooms, the best recording kit, the best instruments and the best singers. In forensic audio we have no control over the room, the mics, often the recording kit and definitely not the vocal. More often than not the recordings have the worst possible signal‑to‑noise ratio, are highly compressed and sound like they have been recorded under a train wreck! Do you still want a career in forensic audio?

About The Author

Covert audio specialist Detective Phil Manchester is the West Midlands Police Force Forensic Audio Specialist. He has conducted in excess of 450 major criminal investigations surrounding national security, counter‑terrorism, murder, kidnap and anti‑corruption. He is a forensic audio consultant to many UK Police Forces, the UK military and many other law enforcement agencies and associated forensic audio and speech specialists. He is one of the 30 technical specialists, and the only law enforcement specialist worldwide, sitting on the Audio Engineering Society's Audio Forensics Technical Committee.