What Is Dithering in Audio?
In this piece you’ll learn:
- The simple answer to the age-old question: “When do I need to dither?”
- How dither actually improves dynamic range
- Why dither isn’t masking anything
But first, I’ll answer the question I pose in the title: What is dither? Dither is simply noise. It’s noise added to a signal when changing bit depth to make quantization distortion less noticeable.
Give it to me straight, doc
Ok, you Googled “how do I dither audio” (or something to that effect) and just want the straight, simple answer. I get it. Ready? Here it is:
Apply dither any time you reduce bit depth. If you’re reducing to 24 bits, the type and strength of dither almost don’t matter. If you’re reducing to 16 bits (or less), a low to medium dither level with some noise shaping is probably best.
If you’re curious as to why I’ve made these recommendations, or how it is that dither can actually improve the dynamic range of a signal, let’s keep going! I promise I’ll keep things to the point and free of math.
What the heck is dither even doing?
Dither is the solution to one of the fundamental problems in digital audio, so if we want to understand what it’s doing, we first need to understand the problem. In a nutshell, the problem is one of amplitude resolution, or how accurately we can measure the level of a signal using ones and zeros.
When we try to measure an infinitely variable analog source (our audio) using a finite number of digital values (those ones and zeros), there are bound to be some errors. Sometimes the analog level will be a little above the closest digital value, while other times it will be below. You can imagine this is sort of like trying to measure someone’s height using a measuring tape that only displays feet: sometimes you’ll need to round up to the nearest foot, other times you’ll need to round down.
In digital audio, this rounding error is known as quantization distortion. Using a 32-bit floating point system, as nearly all modern audio editors do, renders the resulting distortion so low in level that you really don’t need to worry about it. However, as bit depth is reduced, the level of this distortion creeps up. As you approach 16 bits, it can start to get rather noticeable and nasty sounding in reverb tails, fade-outs, and other quiet sections.
Without going into too much detail, this is because the number of bits dictates how many discrete values you have to store levels at. To go back to the measuring tape analogy, you could think of it like this: if 8 bits let you measure only in feet, 9 bits would give you 6 inch increments, 10 bits would give you 3 inch increments, etc. Every time you add a bit, you double how accurately you can measure. Going the other way, this means that every time you lose a bit, you double the potential rounding error.
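If you'd like to see the doubling in numbers, here is a minimal Python sketch. It assumes plain fixed-point audio, and the function name `dynamic_range_db` is made up for illustration; it isn't taken from any particular tool.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio between the full range and a single step, in dB.

    Assumes fixed-point audio with 2**bits equally spaced values.
    """
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(f"{bits:2d} bits: about {dynamic_range_db(bits):.1f} dB")

# Each extra bit halves the step size, which is worth roughly 6 dB.
print(f"per bit: {dynamic_range_db(17) - dynamic_range_db(16):.2f} dB")
```

For 16 bits this works out to roughly 96 dB, which is where the often-quoted 16-bit floor of about -96 dBFS comes from.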
Let’s quickly inspect what this quantization error looks like. First, here is a 1kHz sine wave at -96 dBFS, represented at 32-bit floating point.
1kHz sine wave at 32 bits
Next, here is the same sine wave, but reduced to 20 bits, without dither (I skipped 24 bits because the difference is very difficult to appreciate visually). A little worse for wear, but still more or less recognizable.
Finally, here it is reduced to 16 bits, again without dither.
1kHz sine wave at 16 bits, no dither
Yikes! What’s going on here?! Two things:
First, only the very peaks of the sine wave were high enough in level to get rounded up to the smallest value a 16-bit file can represent, while the rest were rounded down to zero. Second, depending on where the peak of the sine wave fell in relation to the sample timing, either one or two samples were rounded up.
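You can reproduce this effect with a few lines of Python. This is only a sketch under my own assumptions: a 48 kHz sample rate (the exact rate behind the image isn't stated), simple round-to-nearest, and helper names (`sine`, `quantize_16bit`) invented for this example.

```python
import math

SAMPLE_RATE = 48_000   # assumed for this sketch
FULL_SCALE = 2 ** 15   # 16-bit signed audio: one step is 1/32768 of full scale

def sine(freq_hz, level_dbfs, n_samples):
    """Generate a sine wave at the given level, with full scale = 1.0."""
    amp = 10 ** (level_dbfs / 20)
    return [amp * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def quantize_16bit(samples):
    """Round each sample to the nearest 16-bit step, with no dither."""
    # floor(x + 0.5) avoids the round-half-to-even behavior of Python's round()
    return [math.floor(s * FULL_SCALE + 0.5) for s in samples]

# One cycle of a 1 kHz tone at -96 dBFS is 48 samples at this sample rate.
one_cycle = quantize_16bit(sine(1000, -96, 48))
print(one_cycle)
```

The printed list is mostly zeros, with a few samples at 1 near the positive peak and a few at -1 near the negative peak. How many samples get rounded up depends on the sample rate and on where the peak lands between samples.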
Clearly, this won’t do. Dither to the rescue!
OK, now will you tell me what the heck dither is even doing?
Yes, yes I will. At its heart, dither is simply noise, and noise, by virtue of its very nature, is random. Back in the early days of digital audio, some clever engineers realized they could use a random noise signal to their advantage. By mixing it with the signal being quantized, they could add enough variation that the original signal could be preserved.
The key here is that the dither noise needs to be completely unrelated to the signal that you’re quantizing, sometimes stated as being “decorrelated.” When this condition is met and the level of dither noise is correct, any given input sample has a chance of being rounded up or down in a way that depends on the incoming signal value. Not only does this help preserve the signal, it actually removes the distortion that is tied to its frequency content.
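Here is a small Python sketch of that idea. It assumes TPDF (triangular) dither spanning plus or minus one step, which is a common choice but my assumption for this example, and it uses a constant input that sits 0.3 of a step above zero. The names `tpdf_dither` and `quantize` are invented for illustration.

```python
import math
import random

def tpdf_dither(rng):
    """Triangular (TPDF) noise spanning +/-1 step: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def quantize(value_in_steps, rng=None):
    """Round to the nearest whole step, adding dither first if rng is given."""
    noise = tpdf_dither(rng) if rng is not None else 0.0
    return math.floor(value_in_steps + noise + 0.5)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
level = 0.3              # input level, in steps
trials = 100_000

plain_avg = sum(quantize(level) for _ in range(trials)) / trials
dithered_avg = sum(quantize(level, rng) for _ in range(trials)) / trials
print("average without dither:", plain_avg)
print("average with dither:   ", dithered_avg)
```

Without dither, every sample rounds down to zero and the 0.3 is lost. With dither, individual samples land on -1, 0, or 1, but they round up more often than down, so the average comes out close to 0.3. The information survives in how often each value occurs.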
Let’s look at a couple more visual examples. First, here’s a 1 kHz sine wave that fades from about -92 dBFS down to -116 dBFS. Notably, at the tail end, that’s lower in level than we would expect to be able to capture with 16 bits of resolution.
1 kHz fade at 32 bits
We can see from the spectrum analyzer that it’s a very pure 1 kHz tone, and I’ve also added a marker at the point where the signal falls below -96 dBFS, the theoretical lower limit of 16-bit audio.
Now let’s reduce this to 16 bits, without dither.
1 kHz fade at 16 bits, no dither
As we’ve come to expect, the signal drops to zero once it passes -96 dBFS, and we can see the distortion products that crop up while signal is present.
What if we add dither before reducing to 16 bits? Am I really saying that adding some very low level noise to the signal before reducing the bit depth will mathe-magically fix this? You bet I am!
1 kHz fade at 16 bits, with dither
There are three rather remarkable things you should notice here:
- The original signal no longer abruptly cuts off at -96 dBFS, but instead smoothly fades into the dither noise.
- This results in a dynamic range increase of about 16 dB.
- The previously present distortion tones are gone. Not masked or buried below the noise, but actually removed!
I really want you to let these facts sink in, especially that last one. Another way to think about this is that you’ve replaced tonal distortion with noise, which, in a way, is its own form of distortion. That said, a consistent, evenly distributed bit of very quiet noise is sonically preferable to harmonic distortion tied to both level and frequency.
All this by introducing a little random variation into the mix.
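To check that a tone below the 16-bit floor really does survive, here is a rough Python sketch. It quantizes a 1 kHz sine at -110 dBFS to 16-bit steps, then measures how much of the 1 kHz tone is left by correlating the result with a reference sine. The 48 kHz sample rate, the TPDF dither, and the function name `recovered_level_dbfs` are all my assumptions for this example.

```python
import math
import random

SAMPLE_RATE = 48_000   # assumed for this sketch
FULL_SCALE = 2 ** 15   # 16-bit signed audio

def recovered_level_dbfs(level_dbfs, use_dither, n_samples=SAMPLE_RATE, seed=1):
    """Quantize a 1 kHz sine to 16-bit steps and estimate the tone level left over."""
    rng = random.Random(seed)
    amp_steps = 10 ** (level_dbfs / 20) * FULL_SCALE   # amplitude in 16-bit steps
    acc = 0.0
    for n in range(n_samples):
        ref = math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        noise = 0.0
        if use_dither:
            # TPDF dither spanning +/-1 step
            noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        quantized = math.floor(amp_steps * ref + noise + 0.5)
        acc += quantized * ref          # correlate with the reference tone
    est_amp_steps = 2 * acc / n_samples
    if est_amp_steps <= 0:
        return -math.inf                # nothing of the tone survived
    return 20 * math.log10(est_amp_steps / FULL_SCALE)

print("no dither:  ", recovered_level_dbfs(-110, use_dither=False))
print("with dither:", recovered_level_dbfs(-110, use_dither=True))
```

Without dither, every sample rounds to zero and the function reports negative infinity. With dither, the estimate lands within a couple of dB of the original -110 dBFS, even though no individual sample can hold a value that small.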
Odds and ends
Before we wrap up, there are a few specific cases, along with a couple pernicious myths, that I’d like to address.
Self-dither: From time to time you may hear someone mention that you don’t need to apply dither if you’re using this or that plug-in, because it will self-dither. While technically this can be true in some very specific cases, it’s a risky assumption to make. Believe it or not, not all noise is created equal, so unless the device you’re using has a specific dither setting, you should be adding dither if you’re reducing bit depth.
24-bit and noise-shaping: One thing we haven’t really touched on is noise-shaping. In a nutshell, it’s basically like applying EQ to the dither noise to make it less audible. At bit depths of 8 or 16 bits, this can make an appreciable difference. At 24 bits though, the dither noise is so quiet that at normal listening levels it’s inaudible, even without noise-shaping. Still, it will remove quantization distortion which, due to its tonal nature, has a much higher chance of being audible. As such, a flat, TPDF-type dither is really fine.
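For the curious, here is a sketch of one very simple form of noise shaping: first-order error feedback, where the error made on each sample is subtracted from the next sample before it is quantized. This is my choice for illustration only. Commercial dither algorithms typically use more elaborate filters, and the function name `requantize_shaped` is invented.

```python
import math
import random

def requantize_shaped(samples_in_steps, seed=0):
    """TPDF dither plus first-order error-feedback noise shaping.

    Input samples are expressed in units of the target step size.
    """
    rng = random.Random(seed)
    out = []
    prev_error = 0.0
    for x in samples_in_steps:
        target = x - prev_error     # cancel the error made on the previous sample
        noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        y = math.floor(target + noise + 0.5)
        prev_error = y - target     # error introduced on this sample
        out.append(y)
    return out

signal = [0.25] * 10_000            # a constant input a quarter step above zero
shaped = requantize_shaped(signal)
print("input sum: ", sum(signal))
print("output sum:", sum(shaped))
```

Because each error is cancelled on the following sample, the errors never pile up: over 10,000 samples the output total stays within a step or two of the input total. Put another way, the slow-moving, low-frequency part of the noise is reduced and the energy is pushed toward higher frequencies.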
Bouncing, flattening, freezing: No, we’re not talking about some obscure food preparation method. Different audio workstations operate in different ways, but most offer some method to commit a complex audio effects chain to a file. If you’ve not explored the options for doing this in your DAW, it may be time to give them a look. When possible, 32- or 64-bit floating point is your best option, but if you’re forced to use 24-bit, check to see if there’s an option to enable dither.
Hopefully this helps you understand why dither is so crucial to digital audio, how and why it works, and when it should be applied. Now, you’ll never have to dither about dithering again. If you’re reducing bit depth, whether from 64 or 32-bit floating-point to 24-bit fixed point, or from 24-bit down to any lower fixed-point value, add dither! It will always do more good than harm.
If your interest has been piqued and you want to dive even further into the topic, we have a full guide available here. It references some older products, but the fundamentals haven’t changed and the information is as good today as it ever was.