One of the things I get asked often is “what is the best sample rate for my project?”
I'll try to answer as best I can, backed by measurable facts, and explain why I make the choices I do when choosing a sample rate.
For those of you who came here because you are preparing your sessions for web transfer, perhaps because you are about to send a session for mixing or for some satellite recording, I will quickly note which sample rates you should be working at before I venture into a more in-depth analysis:
- 44.1kHz – if you aim for CD quality, and you don’t think that extreme processing and editing will be necessary. However, I always recommend bumping up to 48kHz, as even light editing (crossfades, elastic audio, pitch shifting) can generate audible artefacts.
- 48kHz – is the sample rate I use most, and I would highly recommend converting your session to 48kHz after you have finished tidying up. (Want tips on how to organize your session for transfer? Click HERE!). This is a good compromise between size, functionality and quality. It allows for a noticeable improvement over 44.1kHz, whilst still keeping your files small and manageable, and allowing you to use the full array of I/O from your interfaces (as opposed to the reduced I/O at the double and quad speeds of higher sample rates).
- 96kHz – generally used by recording engineers who aim for a very high quality recording, usually in classical settings or similar situations (eg. where detail of acoustics and transparency is a must). Also used by film sound designers, as extensive editing is usually required, eg. vocaligning dialogue to film. This will generate big audio files, and I recommend converting your project to 48kHz if you are going to work over the cloud and do not have dedicated servers (eg. digidelivery, etc.). However, if you feel your work fits in the above categories, feel free to contact me and I will make sure I can accommodate your files over the cloud.
- 192kHz – is NOT recommended. Not only will most systems not play back such high sample rates, but most gear will reduce the available I/O by a factor of 4. Also, please look at this brilliant article from Lavry Engineering for more information on how sample rate converters actually induce distortion at such high sample rates – here
Bit Depth relates directly to the dynamic range available for your audio material. There are a lot of misconceptions flying around about what you actually gain from increasing bit depth. These are some pointers on the real benefits behind increased bit depth:
- 16-Bit offers 96.33dB of dynamic range, from 0dBFS (clip point) to -96dBFS. This is the bit depth used on your audio CDs (Red Book standard). However, I recommend using 24-bit for recording and processing. When engineers were recording with 16-bit systems, it was usual to record pretty hot to avoid quantisation noise. It could be rather tricky to keep the signal from clipping or dropping too low into the noise floor. With the advent of 24-bit, recording became a much easier task. You can record at lower levels without worrying about clipping or burying your signal in noise.
- 20-Bit was used by some pro audio gear, now largely discontinued. Not recommended due to compatibility issues.
- 24-Bit offers a dynamic range of 144.49dB. This is more than enough for modern recording requirements, for the simple reason that no system/converter has a dynamic range that wide. In fact, almost no professional equipment has a noise floor more than 120dB below full scale. And remember: the threshold of pain for human hearing is around 130dB SPL.
- Note: the 56bit fixed point depth used by software such as ProTools is there so the headroom can accommodate almost anything. The idea behind it is that you could sum 128 tracks of full-code signal with the fader up to +12 and never clip the internal bus of the console.
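The dynamic range figures quoted above follow from the number of quantisation steps a given bit depth provides. As a minimal sketch (the function name `dynamic_range_db` is my own, and this is the theoretical figure for linear PCM, not what any real converter achieves):

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM: full scale versus one quantisation step."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 20, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.2f} dB")
```

Each extra bit adds roughly 6dB, which is why moving from 16 to 24 bits buys about 48dB of extra room at the bottom of the scale.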
For those of you who are more tech-savvy, let’s delve into what sample rate is and why it matters. Historically, the sample rates available were pretty low, and they increased as storage and processing power allowed more data to be manipulated:
“Single-Speed” digital audio:
- 8000Hz – used in old telephones, budget wireless microphones and walkie-talkies. Suitable for the human voice, and helps control sibilance and other noises.
- 22050Hz – used in (older) games. The idea was to achieve maximum compression. Most consumers were using very cheap desktop speakers that wouldn’t reproduce anything correctly above 6-9kHz.
- 44.1kHz – Used in CD and most MP3 reproduction formats. More than adequate for music reproduction (see oversampling below).
- 48kHz – Some professional recorders use this as the default sample rate for recording. This is also the format used by DVDs. This frequency was selected as it can handle 22kHz of audio bandwidth in all frame formats (24, 25, 30 and 29.97FPS)
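How a sample rate lines up with the frame formats listed above can be checked with a little arithmetic. This is only a sketch: the names `FRAME_RATES` and `samples_per_frame` are my own, and I am assuming 29.97FPS means the exact NTSC rate of 30000/1001.

```python
from fractions import Fraction

# Frame rates as exact fractions; 29.97 is assumed to be 30000/1001.
FRAME_RATES = {
    "24": Fraction(24),
    "25": Fraction(25),
    "29.97": Fraction(30000, 1001),
    "30": Fraction(30),
}

def samples_per_frame(sample_rate_hz: int, fps: Fraction) -> Fraction:
    """Exact number of audio samples that fit in one video frame."""
    return Fraction(sample_rate_hz) / fps

for rate in (44100, 48000):
    for name, fps in FRAME_RATES.items():
        spf = samples_per_frame(rate, fps)
        if spf.denominator == 1:
            note = "whole number of samples per frame"
        else:
            note = f"whole number only every {spf.denominator} frames"
        print(f"{rate}Hz @ {name}FPS: {float(spf):.2f} samples/frame ({note})")
```

A rate that gives a whole number of samples per frame makes it easier to cut audio on frame boundaries.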
“Double-Speed” digital audio:
- 88.2kHz – double the speed of 44.1kHz. Used by default by some professional equipment when working at a higher sample rate is necessary but the end result is usually 44.1kHz
- 96kHz – used for DVDs and others.
“Quad-Speed” digital audio:
- 192kHz – used for DVDs and others
- 352800Hz – used for recording/editing DSD material, as 1-bit audio is not suitable for editing
- 2822400Hz – used for 1-bit Sigma Delta modulation
What do all these numbers mean for an engineer? Very simply, there are some points to note:
- A higher sample rate means higher storage requirements (with the exception of 1-bit systems);
- A higher sample rate needs more processing power;
- A higher sample rate does not necessarily translate into more definition; in fact, it may result in distortion and other artefacts;
- A higher sample rate MAY result in a higher bandwidth, but is it worth it? (see Oversampling and aliasing filters below);
- Diminishing returns apply, limited to the bit depth. You cannot increase the Sample Rate without increasing bit depth. Because the voltages are rounded to the nearest bit value, increasing sample rate would simply result in the addition of more of the same value to our string of discrete signals, instead of adding more steps taken from the continuous (analogue) signal;
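To put a number on the storage point, here is a rough sketch of how uncompressed PCM size scales with sample rate and bit depth. The function name is my own, and it ignores file headers and any compression.

```python
def pcm_megabytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> float:
    """Size of one minute of uncompressed PCM audio, in megabytes (10^6 bytes)."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * 60 / 1_000_000

for rate in (44100, 48000, 96000, 192000):
    size = pcm_megabytes_per_minute(rate, 24)
    print(f"{rate}Hz / 24-bit mono: {size:.2f} MB per minute")
```

Doubling the sample rate doubles the size of every track, so a large session at 96kHz or 192kHz grows quickly.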
Whilst the first 2 are pretty self-explanatory, let’s try to understand the other points shown above. If you haven’t yet, please read Lavry Engineering's awesome article here.
The first goal of high sample rates is to clear the Nyquist Frequency. This is the maximum frequency that can be converted before Aliasing artefacts occur. Aliasing artefacts sound like ghost signals. They are extremely noticeable and alien to the original sound, and they will ruin almost any recording.
The Nyquist frequency is half the Sample Rate frequency:
- 44.1kHz -> Nyquist frequency of 22050Hz
- 48kHz -> Nyquist frequency of 24000Hz
Both sample rates are more than enough to cover the range of frequencies audible to the human ear (which ranges from 20Hz – 20000Hz, but in practice the average 30-40 year old male will start rolling off somewhere between 13-15kHz).
But we know that some instruments are capable of generating sounds above 20kHz. Wouldn’t these be picked up by the microphones and passed into the converters? Yes, you are correct. And if these signals were allowed to pass through our converters without any sort of control or processing, Aliasing would occur. That’s why there are anti-aliasing filters on our converters. An anti-aliasing filter is, simply put, nothing more than a low pass filter, silencing any content above the Nyquist frequency.
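To see where such a "ghost signal" would land, here is a small sketch that folds an unfiltered tone back into the audible band. The function name `alias_frequency` is my own, and it assumes an ideal sampler with no filtering at all.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after sampling with no anti-aliasing filter."""
    folded = tone_hz % sample_rate_hz
    # Anything above Nyquist mirrors back down below it.
    return min(folded, sample_rate_hz - folded)

for tone in (10000, 25000, 30000):
    for rate in (44100, 48000):
        print(f"{tone}Hz tone sampled at {rate}Hz appears at {alias_frequency(tone, rate):.0f}Hz")
```

A 30kHz overtone that nobody can hear would show up at 14.1kHz in a 44.1kHz recording, right where the ear is still sensitive and unrelated to the music.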
For those of you who have used filters extensively (equalizers, low cut filters/high pass filters, etc.), you will remember that with very steep filters you get artefacts such as resonances and distortion. If we simply tried to apply a filter for anti-aliasing purposes, we would soon realise that the filter would have to be extremely steep: it would need to bring a signal down 144dB in roughly 1/7th of an octave (from 20kHz to the 22.05kHz Nyquist frequency of a 44.1kHz system). This would be, in fact, too steep to be practical. It would introduce severe resonance peaks and distortion, and cripple the signal-to-noise ratio, as you would experience signal loss.
The answer to aliasing is pretty simple, and it has been in place in most contemporary converters for a long time: Oversampling! Most converters actually run 11 or more times faster than the indicated sample rate (eg. 44.1kHz). In fact, modern professional converters run 64-512 times faster than the indicated sample rate. (Relative to a 48kHz base rate, this is 16 to 128 times faster than a 192kHz sample rate.) This allows a very gentle anti-aliasing filter curve to be applied to the signal, making the requirements for our low-pass filter much easier to meet.
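To get a feel for how much oversampling relaxes the filter, here is a back-of-the-envelope sketch. It assumes the filter must pass everything up to 20kHz and reach 144dB of attenuation by the Nyquist frequency of whatever rate the converter really runs at. That is a simplification (real converters finish the job with digital decimation filters), and the function name is my own.

```python
import math

def required_slope_db_per_octave(passband_edge_hz: float,
                                 converter_rate_hz: float,
                                 attenuation_db: float = 144.0) -> float:
    """Average filter slope needed between the passband edge and Nyquist."""
    nyquist_hz = converter_rate_hz / 2
    octaves = math.log2(nyquist_hz / passband_edge_hz)
    return attenuation_db / octaves

base_rate = 44100
for factor in (1, 64, 512):
    slope = required_slope_db_per_octave(20000, base_rate * factor)
    print(f"{factor}x oversampling: about {slope:.0f} dB/octave")
```

Without oversampling the slope works out to around a thousand dB per octave, while at 64x it drops to a couple of dozen, which is well within reach of a gentle analogue filter.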
But the honest, cold truth is that there is no energy above 40kHz (instrument-wise), the speakers cannot reproduce it and the human ear cannot hear it. Unfortunately, anti-aliasing is still required, as some (poorly) designed gear might introduce very high frequency artefacts due to self-oscillation or other design issues. This is pretty common with high gain processors that weren’t designed correctly. So the risk of Aliasing is still present.
I hope this has shed some light on the difference between Sample Rate and Bit Depth. Please feel free to leave a comment or start a discussion. If anything is not clear, I will make an effort to clarify or even introduce some illustrations.