
Fig. 1 - PCM sampling, converting an analogue signal into a digital format.
Pulse-code modulation (PCM) is a method of converting analogue signals into digital data. It represents an audio signal as binary numbers so that it can be stored on a digital medium, such as a hard drive or CD.
Three Stages of Translation
PCM takes place over three stages:
- Sampling
- Quantization
- Encoding
There are two important factors to bear in mind with a PCM recording process, and these will begin to make more sense as the three steps above are explored. These factors are:
- Sample rate
- Bit depth
If you are familiar with film cameras or animation, then the concept of sample rate will probably be quite easy to understand. It is essentially the same thing as frame-rate when dealing with capturing and reproducing light, as with a film camera. In order to create the impression of one continuous moving image, many thousands of individual frames are captured by the camera, each of which is a distinct and separate still image. However, because of the limitations our brains have in terms of processing speed, it is possible to trick them into thinking that this very rapid series of individual frames is in fact one very real moving scene. A frame-rate of around 30 frames per second is generally sufficient for this piece of psychology to work.
Digital audio recording works in exactly the same way - except instead of capturing frames of light, it involves capturing samples of sound.
Each sample is a snapshot of the sound for that very brief moment. The most common sample rate used in digital audio is 44.1kHz, or 44,100Hz. The reason for this particular number is explained on the page for Sample Rate, specifically in the section on Nyquist Theorem.
A sample rate of 44.1kHz means that for each second of recording, 44,100 individual samples are taken. This is roughly the minimum rate needed to cover the full range of human hearing: the highest frequency a sampled signal can represent is half the sample rate (here 22.05kHz), which sits just above the usual upper limit of hearing of about 20kHz. At this rate the listener perceives a direct, original sound source, and not a somewhat-distorted series of individual samples.
What each of these samples actually records is the amplitude of the signal at that moment. Amplitude is closely related to perceived volume or loudness, and in an electrical circuit it is represented by voltage. Fig. 1 illustrates the sampling process. The grey shaded pyramid-like objects represent the digital, sampled information. The red line represents the original incoming analogue signal.
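The sampling step can be sketched in a few lines of Python. This is only an illustration: the function name `sample_sine`, the 441Hz test tone and the amplitude range of -1.0 to 1.0 are assumptions made for the example, and a real converter measures an incoming voltage rather than computing a sine wave.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return amplitude snapshots of a sine wave taken at regular intervals."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# One second of a hypothetical 441Hz tone at the common 44.1kHz sample rate.
samples = sample_sine(freq_hz=441.0, sample_rate_hz=44_100, duration_s=1.0)
print(len(samples))  # 44100 snapshots for one second of audio
print(samples[:3])   # the first few amplitude values
```

Each entry in `samples` is one snapshot of the amplitude, taken 1/44,100 of a second after the previous one.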
Because of the limitations of a computer system (which operates on discrete, separate numbers, or digits, rather than on a continuous signal or wave like the one in an electrical circuit or in the air), it is necessary to use these huge numbers of samples per second. Every digital system works in binary, and all pieces of software have a particular bit depth at which they operate. You may be familiar with terms like 32-bit or 64-bit operating systems, where the bit count can affect the speed of a computer's everyday operations. Digital audio recordings also have a bit depth, and the numbers along the Y/vertical axis of Fig. 1 each represent one of the amplitude levels that the bit depth makes available. A bit is a single binary value, which can be either 0 or 1. More information on this topic is to be found on the binary language wiki page.
So, in Fig. 1 we can see that there are 16 individual values, between 0 and 15. Sixteen values can be counted using 4 bits (2^4 = 16), so Fig. 1 illustrates a 4-bit system. Each of those values represents one possible amplitude, or volume, level. (A 16-bit system, such as that used on CD, has far more: 2^16 = 65,536 possible values, as described on the bit depth wiki page.) It can clearly be seen in Fig. 1 that, at this level of detail, the digital sampling method provides only a noticeably rough approximation of the original analogue signal. There are lots of gaps where the wave does not exactly match up with the samples. This is a symptom of the computer limitation mentioned above: there are only a limited number of separate values to which the wave can be 'snapped.' This process of rounding off, or approximating, the signal to the nearest amplitude value is the second step of the PCM process: Quantization.
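The number of available amplitude levels is 2 raised to the power of the bit depth, so 16 levels correspond to 4 bits and 16 bits give 65,536 levels. The sketch below shows that relationship and a simple way of snapping an amplitude to the nearest level. The names `levels_for` and `quantize`, and the amplitude range of -1.0 to 1.0, are assumptions made for this example; real converters may space and number their levels differently.

```python
def levels_for(bits):
    """Number of distinct amplitude levels available at a given bit depth."""
    return 2 ** bits

def quantize(value, bits, v_min=-1.0, v_max=1.0):
    """Snap value to the nearest of 2**bits evenly spaced levels.

    Returns the index of the chosen level, from 0 to 2**bits - 1.
    """
    step = (v_max - v_min) / (levels_for(bits) - 1)
    clipped = min(max(value, v_min), v_max)  # out-of-range input is clipped
    return int(round((clipped - v_min) / step))

print(levels_for(4))     # 16
print(levels_for(16))    # 65536
print(quantize(0.1, 4))  # 8
```

With 4 bits the returned index always lies between 0 and 15, matching the 16 values on the vertical axis of a figure like Fig. 1.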
If you have worked with (and particularly, if you have composed) electronic music, you may already be familiar with the term quantization. It simply refers to the process of snapping to the closest possible value within a restricted system. In music production, quantization can be employed to take an otherwise out-of-time performance and match it up with the tempo or rhythm of the track. Much the same thing happens in digital recording during the quantization stage, except that it is the amplitude of each sample, rather than the timing of each note, that is snapped to the nearest available value. The resolution to which a performance can be quantized in a sequencer for a MIDI track is similarly limited, in that case by the timing resolution of the sequencer, although one is highly unlikely to play as many as 44,100 notes per second.
For anyone else, try to imagine each sample as a pile of building blocks such as a child might play with. If you were to attempt to build a perfectly curved archway using only stacks of flat-sided cubes, the arch would end up looking jagged and rough, in much the same way as samples come out in digital audio. Taking a very large number of samples per second, each rounded to one of a large number of amplitude levels, keeps these errors small. If the sample rate is too low, it can produce unwanted audible effects such as aliasing; if the bit depth is too low, the rounding errors become audible as quantization noise. Aliasing is discussed on the bit depth wiki page.
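Aliasing can be demonstrated with a short sketch. It assumes a 44.1kHz sample rate and two hypothetical test tones, 30kHz and 14.1kHz, chosen only for illustration; `sample_tone` is a helper name invented for this example.

```python
import math

SAMPLE_RATE_HZ = 44_100

def sample_tone(freq_hz, n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Sample a sine tone of the given frequency at the given sample rate."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

high = sample_tone(30_000, 50)   # above half the sample rate (22,050Hz)
alias = sample_tone(14_100, 50)  # 44,100 - 30,000 = 14,100Hz

# The two sets of samples match apart from a sign flip (an inverted phase),
# so the largest value of high + alias is very close to zero.
worst_mismatch = max(abs(h + a) for h, a in zip(high, alias))
print(worst_mismatch)
```

Once sampled, the 30kHz tone cannot be told apart from a 14.1kHz tone, which is why frequencies above half the sample rate are normally filtered out before sampling.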
The third and final stage of the PCM process is encoding, in which the analogue-to-digital converter writes each of these thousands of quantized sample values out as a binary code word, producing the stream of binary code that is stored.
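As a minimal sketch of the encoding step, each quantized level can be written as a fixed-width binary number. The function name `encode` and the example level values are assumptions made for illustration, and plain unsigned binary is used here for simplicity; real converters and file formats often use other codes, such as two's complement.

```python
def encode(level_indices, bits):
    """Write each quantized level index as a fixed-width binary code word."""
    return [format(index, f"0{bits}b") for index in level_indices]

# Four hypothetical 4-bit sample values, each between 0 and 15.
code_words = encode([0, 7, 15, 9], bits=4)
print(code_words)  # ['0000', '0111', '1111', '1001']
```

Joining the code words end to end gives the stream of 0s and 1s that a digital system stores or transmits.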
Colin Yao goes into more detail about the PCM process, as well as how it relates to voltage, in Video 1.
What is Pulse Code Modulation (PCM)
Video 1 - Colin Yao on PCM
PCM Summary
- PCM is one method of translating an analogue electrical signal into binary language for a digital system
- PCM works in three stages: Sampling, Quantization and Encoding
- Samples are snapshots of the incoming signal which record the amplitude of the signal at that given moment
- Quantization rounds those amplitude values to the nearest available value in the digital system, based on its bit depth
- Encoding is the final stage, where the quantized sample values are converted to binary code and written to a hard drive or other digital storage medium in a given format to be used elsewhere