What is Digital Audio?

Digital audio refers to the way music and sound are stored on digital devices such as hard drives, iPods, websites (servers) and CDs. To understand the technicalities of what this means, we must first make sure we understand sound in its own right. There are separate pages for the physics of sound propagation in air and for the various types of audio storage formats.

Analogue Audio
The term Analogue (U.S. 'Analog') is defined as follows:

 * noun
   1. A person or thing seen as comparable to another: "an interior analogue of the exterior world"

 * adjective
   1. Relating to or using signals or information represented by a continuously variable physical quantity such as spatial position, voltage, etc.: "analogue signals"

An analogue electrical signal uses a varying voltage to emulate the shape of a sound pressure wave in air. Analogue audio uses these signals to transmit a recorded/reproduced sound through an electrical circuit.

A comparison between a sound pressure wave in air and an analogue electrical signal can be seen in Fig. 1. The green wave on the bottom is called a sine wave; in the context of an electrical system, it represents a changing voltage over time, corresponding to the change in air pressure that makes up a sound wave in air. Signals like the sine wave seen in the diagram can be used to record, transmit and play back sound through a process known as transduction.
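The idea of a continuously varying signal can be sketched in code. The frequency and amplitude below are illustrative values, not taken from Fig. 1; the point is that an analogue signal, unlike a digital one, has a value at every instant in time:

```python
import math

def sine_voltage(t, frequency=440.0, amplitude=1.0):
    """Instantaneous value of a pure sine tone at time t (seconds).

    An analogue signal is continuous: this function is defined for
    *every* value of t, not just at discrete sample points.
    """
    return amplitude * math.sin(2 * math.pi * frequency * t)

# The signal has a value at any arbitrary instant:
print(sine_voltage(0.0))       # 0.0 - the wave starts at a zero crossing
print(sine_voltage(0.000568))  # an intermediate voltage between -1 and 1
```

A real analogue circuit carries this shape as a physical voltage; the function is only a mathematical stand-in for it.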

Analogue audio is used in many different places, and many different ways. The place we encounter it almost every day is in loudspeaker and headphone systems. A transducer is used in all kinds of analogue sound recording and reproduction equipment to convert sound into electrical signals, and vice versa.

A good example of a simple transducer can be seen in Fig. 2, in a dynamic microphone. In many ways, microphones and speakers work a lot like the human ear. They begin with a diaphragm, which is a thin piece of flexible material held in tension across the top of the microphone assembly, in the area known as the capsule. This is much like the human eardrum.

When sound pressure waves travel through the air and hit the diaphragm, they cause it to vibrate along with the wave. Imagine hitting a drum - this is essentially the same process. The drum skin is a diaphragm of a sort, and it is excited (vibrated) by the drum stick or beater. The difference with a microphone is that the diaphragm is not designed to produce a sound, but to capture one.

As the diaphragm moves, it drives a coil of wire back and forth through the field of a permanent magnet (which is why this design is also known as 'moving-coil'), and the coil is connected to the output of the microphone. The movement of the coil through the magnetic field induces a varying voltage in it - a process known as electromagnetic induction - and that voltage matches the pattern of the sound waves hitting the diaphragm. As a result, an analogue electrical signal is created in the wires leaving the microphone. Those signals can then be recorded and manipulated in a number of ways (see the post-electrical era of Analogue Audio Formats), which allows us to create and reproduce music for the rest of the world to hear.

Most microphones use an XLR cable to carry the signal to its destination. This uses a balanced output signal to minimise noise and other unwanted effects. The voltage that leaves a microphone is generally very small, and must be run through amplifiers to be brought up to a level where it can be easily heard through a set of speakers.

For a closer look at the different parts of a dynamic microphone, see Video 1 below.

As mentioned before, this same process of transduction takes place in loudspeakers and headphones, only operating in reverse. In fact, all headphones and speakers can theoretically be used as microphones, although their design will not necessarily produce a good output signal. The easiest way to try this out is to use a cheap pair of earphones and plug them into the microphone input of a computer. Microphones, on the other hand, should never be used as speakers: the system is not designed to receive the kind of voltages used to drive speakers, and parts of it (especially the diaphragm and/or the transformer) are likely to be permanently damaged if used in this way.

The way in which a diaphragm moves (as well as some interesting information about modes and standing waves) can be seen in Video 2:



Analogue Audio Summary

 * Electrical signals of varying voltage used as an 'analogue' to sound pressure waves in air.
 * Analogue signals created and used by transducers, found in microphones and loudspeakers.
 * The most common type of microphone is the dynamic microphone; other types include condenser microphones and ribbon microphones.

Digital Audio
Although many very successful kinds of Analogue Storage Formats served for much of the 20th century, by far the most popular, widespread and easy-to-use storage formats are now digital. The term digital implies that the information being stored consists of simple numbers (digits), as opposed to physical analogues like grooves on a vinyl record or magnetic patterns on tape.

Most modern digital audio formats use the PCM system. The process of recording sound to a digital medium using PCM involves, in short, taking an analogue electrical signal and translating that information into binary code which a computer system can understand and store. The PCM process is covered in detail on the PCM wiki page. PCM digital audio is made up of a large number of individual split-second samples.
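A minimal sketch of the sampling stage described above, using a hypothetical 440 Hz sine tone in place of a real incoming analogue voltage (a real converter measures a live electrical signal, not a formula):

```python
import math

SAMPLE_RATE = 44_100  # samples per second (Hz)

def sample_tone(frequency, duration_seconds):
    """Take evenly spaced snapshots of a sine tone - the 'split-second
    samples' that make up PCM audio."""
    n_samples = int(SAMPLE_RATE * duration_seconds)
    return [math.sin(2 * math.pi * frequency * n / SAMPLE_RATE)
            for n in range(n_samples)]

samples = sample_tone(440.0, 0.01)  # 10 ms of a 440 Hz tone
print(len(samples))                 # 441 samples in one hundredth of a second
```

Each entry in the list is one snapshot of the signal's amplitude; the later quantization and encoding stages turn these numbers into storable binary data.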

Digital audio recording works much like a film camera - except that instead of capturing frames of light, it captures samples of sound.

The concept of Hertz is discussed along with Sound in Air, but for clarity it bears repeating here. Hertz is the unit of measurement for frequency. One Hertz (Hz) means that an action occurs once per second; one thousand Hertz (or 1kHz) means that it occurs one thousand times a second. The Hertz system is used to measure the frequency of sound waves, which defines their perceived pitch, tone or note, but it is also used to measure and describe sample rate. In that context it refers, naturally, to the number of samples recorded each second.
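A quick worked example of the unit, using the sample rate discussed below for illustration:

```python
# 1 kHz = 1,000 cycles per second; the period is the reciprocal.
frequency_hz = 1_000
period_seconds = 1 / frequency_hz
print(period_seconds)  # 0.001 - each cycle lasts one millisecond

# At a sample rate of 44.1 kHz, a new sample is taken roughly
# every 22.7 microseconds:
sample_rate_hz = 44_100
print(1 / sample_rate_hz)
```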

Each sample is a snapshot of the sound for that very brief moment. The most common sample rate used in digital audio is 44.1kHz, or 44,100Hz. The reason for this particular number is explained on the page for Sample Rate, specifically in the section on Nyquist Theorem.

A sample rate of 44.1kHz means that for each second of recording, 44,100 individual samples are captured. That is roughly the lowest rate generally needed to cover the full range of human hearing, so that the brain perceives a direct, original sound source rather than a somewhat-distorted series of individual samples.
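The Nyquist limit can be illustrated numerically: a tone above half the sample rate produces exactly the same sample values as a lower-frequency 'alias', so it cannot be told apart from it after sampling. The frequencies in the sketch below are chosen for illustration: a 30 kHz tone sampled at 44.1 kHz yields samples identical to an inverted 14.1 kHz tone (44.1 - 30 = 14.1), which is why frequencies above 22.05 kHz cannot be faithfully captured at that rate.

```python
import math

SAMPLE_RATE = 44_100
NYQUIST = SAMPLE_RATE / 2  # 22,050 Hz: the highest capturable frequency

def sampled(freq, n):
    """Value of the n-th sample of a sine tone at the given frequency."""
    return math.sin(2 * math.pi * freq * n / SAMPLE_RATE)

# A 30 kHz tone is above Nyquist. Sample by sample, it is
# indistinguishable from an inverted 14.1 kHz tone:
for n in range(10):
    assert abs(sampled(30_000, n) - (-sampled(14_100, n))) < 1e-9
print("30 kHz aliases to 14.1 kHz at a 44.1 kHz sample rate")
```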

By capturing the amplitude of the sound at each of these sampling points, many thousands of times a second, a more-or-less accurate reproduction of the original signal can be created in a format which is nearly ready to be stored as binary, digital data. The PCM process continues by quantizing the samples to the nearest available values - a process which is explained on the bit depth wiki page. Finally, once each sample has been rounded to a nearby digital value, it is encoded as binary data using whichever audio codec has been selected. We are then left with a digital audio file which can be stored, played back and transmitted via computers, iPods, mobile phones and all other kinds of digital audio technology. In the production stage, it can also be used to manipulate the sound further for the mixing and mastering processes.
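The quantization step can be sketched as follows, assuming 16-bit samples (the bit depth used on CDs): each sample, a continuous value between -1.0 and 1.0, is rounded to the nearest of 65,536 integer levels.

```python
def quantize_16bit(sample):
    """Round one sample (a float in [-1.0, 1.0]) to the nearest
    16-bit integer level, from -32768 to 32767."""
    level = round(sample * 32767)
    return max(-32768, min(32767, level))  # clamp out-of-range input

print(quantize_16bit(0.5))       # 16384
print(quantize_16bit(-1.0))      # -32767
print(quantize_16bit(0.300001))  # 9830 - nearby inputs share one level
```

The last line shows the rounding error inherent in quantization: inputs closer together than one level become indistinguishable, which is why higher bit depths give a more faithful recording.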

Now that the analogue signal has been successfully converted to binary code using the PCM process, it can be transmitted across digital networks throughout the world via the internet and streaming services, uploaded to programs and sites like iTunes, Spotify, BandCamp and others for near-instantaneous distribution to virtually any point on the planet. All the recipient needs at the other end to hear the same sounds recorded thousands of miles away is a computer or other hardware with the right software to read and understand the newly encoded information. Some of the various ways in which digital audio can be encoded and stored are listed and described on the Digital Storage Formats wiki page.

Digital Audio Summary

 * Sound initially captured as an analogue electrical signal can be recorded to a hard drive or other digital storage through sampling.
 * Sampling works in a similar way to a film camera, with many thousands of individual samples recorded each second.
 * Each sample must be quantized - rounded to the nearest of a limited number of possible values.
 * Samples are then encoded to the binary language used in digital systems for distribution via the internet and other digital mediums like CD, DVD, flash memory (USB) and more.

Implications of Digital Audio
The fact that we now listen to music primarily via digital formats (although older analogue styles such as vinyl have recently made a resurgence) makes it important to understand the process by which they are produced, and the differences between them. Popular Digital Audio Formats come in many different types, and each has its own benefits and pitfalls. One of the greatest obstacles to audio fidelity is sheer storage space. However, as hard drive space and processor power continue to increase, perhaps we will gradually see a shift away from low-bitrate mp3s towards higher-quality, larger and more detailed sound files. That said, whether such a change is necessary, beneficial or even desired is a question which still lacks a conclusive answer. The pursuit of that answer is what spurred the creation of this wiki, and the execution of the following small study: C. Brown Listening Tests. Take a look at the results discussion, have a listen to the samples and decide for yourself!