Q&A with Andreas Koch
Can you give us an overview of what DSD is and how it differs from PCM?
The term Direct Stream Digital (DSD) was coined by Sony and Philips when they jointly launched the SACD format. It is nothing other than processed Delta-Sigma modulation, first developed by Philips in the 1970s. Its first wide market entry did not come until later in the 1980s, when it was used as an intermediate format inside A/D and D/A converter chips.
Figure 1 shows how an analog source is converted to digital PCM through the A/D converter and then back again to analog via the D/A converter. The A/D internally contains 2 distinct processes:
- Delta-Sigma modulation: the analog signal is converted directly to DSD with a very high sampling rate. Various algorithms are in use depending on the application and required fidelity. They can generate 1-bit DSD or multibit DSD oversampled at 64x or 128x compared to regular CD rate.
- Decimation filter: the DSD signal from the previous step is downsampled and converted to PCM. Word length is increased (for instance 16 or 24 bits) and sample rate reduced to CD rate or a low multiple of it for high resolution PCM formats.
The D/A converter performs the reverse in three steps:
- the PCM signal is upconverted to a much higher sample rate
- then converted to DSD via the Delta-Sigma modulator (to reduce word length)
- then converted to analog
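The A/D half of the chain described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: a first-order modulator and a plain block average stand in for the higher-order modulators and proper decimation filters that real converter chips use, and the function names are made up for this example.

```python
import math

def delta_sigma_1bit(samples):
    """First-order Delta-Sigma modulator: floats in [-1, 1] -> list of +1/-1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus previous output
        feedback = 1.0 if integrator >= 0.0 else -1.0   # 1-bit quantizer
        bits.append(int(feedback))
    return bits

def decimate(bits, factor):
    """Crude decimation: average each block of `factor` bits into one PCM sample."""
    return [sum(bits[i:i + factor]) / factor
            for i in range(0, len(bits) - factor + 1, factor)]

OSR = 64                        # oversampling relative to CD rate
PCM_RATE = 44_100
DSD_RATE = OSR * PCM_RATE       # 2,822,400 Hz
TONE_HZ = 1_000

n = OSR * 441                   # 441 PCM sample periods = 10 ms
analog = [0.5 * math.sin(2 * math.pi * TONE_HZ * i / DSD_RATE) for i in range(n)]
dsd = delta_sigma_1bit(analog)  # 1 bit per sample at 2.8224 MHz
pcm = decimate(dsd, OSR)        # longer words at 44.1 kHz

print(len(dsd), "one-bit samples ->", len(pcm), "PCM samples")
```

Each PCM value is a fraction with 65 possible levels, so the word length grows as the sample rate drops, which is the trade described in the second bullet.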
"...we can say that since about the late 1980’s we have been listening to some form of DSD without even knowing it."
While DSD is used at a sample rate of 2.8224MHz (64 x 44.1kHz) with 1 bit per sample mostly for SACD production, recording equipment has also been used at double that rate at 5.6448MHz (128 x 44.1kHz). Often studios use this format to archive their library of analog recordings. Recording equipment for this double rate DSD is available relatively inexpensively at great quality so that consumers can use it to archive their beloved vinyl and tape recordings onto a digital format and then play that back directly via an audiophile grade D/A converter (such as any Playback Designs product) in the comfort of their own listening room.
The theoretical frequency bandwidth of a DSD signal with a sample rate of 2.8224MHz (64 x 44.1kHz) is 1.4112MHz. Compare this to a 96kHz PCM signal, which has a theoretical bandwidth of 48kHz, or a 192kHz PCM signal with a bandwidth of 96kHz. However, this wide bandwidth comes at a price: pure Delta-Sigma signals are quantized to 1 bit and, therefore, do not have a great dynamic range by themselves. That is why Delta-Sigma converters need to incorporate a process called “noise shaping” that increases the dynamic range in the usable audio range (0-20kHz) and then slowly decreases it over higher frequencies. It is this noise-shaped Delta-Sigma signal that is then called DSD. Fig.2 below shows the typical dynamic range of a DSD signal sampled at 2.8224MHz and at 5.6448MHz. The slowly rising noise floor at higher frequencies also follows to some degree our hearing threshold for transient signals, which have been proven to be audible up to 100kHz.
Of course, DSD at double the rate (5.6448MHz) has an extended audio range of 0-40kHz, above which the noise floor then starts to rise gently.
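A rough way to see what noise shaping buys is to compare two 1-bit streams after averaging them back down to one value per 44.1kHz sample period, which here stands in for "looking only at the audio band". The sketch below is an assumption-laden toy: a first-order feedback loop represents noise shaping, a dithered comparator without feedback represents a plain 1-bit quantizer, and `block_error_rms` is a hypothetical helper name.

```python
import math
import random

def block_error_rms(osr, shaped, blocks=200, tone_hz=1_000, seed=1):
    """RMS error of a 1-bit stream after averaging over one 44.1 kHz period."""
    rng = random.Random(seed)
    rate = osr * 44_100
    integrator, feedback = 0.0, 0.0
    total = 0.0
    for b in range(blocks):
        sum_x = sum_y = 0.0
        for i in range(osr):
            n = b * osr + i
            x = 0.5 * math.sin(2 * math.pi * tone_hz * n / rate)
            if shaped:
                integrator += x - feedback      # feed the quantizer output back
                y = 1.0 if integrator >= 0.0 else -1.0
                feedback = y
            else:
                # dithered comparator, no feedback: error is spread evenly
                y = 1.0 if x + rng.uniform(-1.0, 1.0) >= 0.0 else -1.0
            sum_x += x
            sum_y += y
        total += ((sum_y - sum_x) / osr) ** 2
    return math.sqrt(total / blocks)

results = {osr: (block_error_rms(osr, shaped=False),
                 block_error_rms(osr, shaped=True))
           for osr in (64, 128)}
for osr, (plain, shaped) in results.items():
    print(f"{osr}x: plain 1-bit {plain:.4f}, noise-shaped {shaped:.4f}")
```

Both streams consist of nothing but +1 and -1, yet the noise-shaped one leaves less error inside the averaged band because the feedback loop pushes the quantization error toward higher frequencies. In this toy model the in-band error of the shaped stream is bounded by a few counts divided by the oversampling ratio, so doubling the rate halves that bound.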
Fig.2 also shows the theoretical dynamic ranges of high resolution PCM signals at various sample rates. Note the steep brickwalls that PCM signals typically have. It is those brickwalls that can generate very audible side-effects such as pre-ringing, if not processed with special algorithms (such as in all Playback Designs products). By design DSD signals do not generate these side effects.
As we can see from this, DSD is characterized by the following:
- great dynamic range in the audio band (0-20kHz)
- slowly rising noise floor in higher frequencies (no brickwalls)
- extended frequency range into MHz
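The pre-ringing attributed to brickwall filters above can be made visible by looking at impulse responses. The following is a minimal sketch, not the filter of any actual product: a linear-phase windowed-sinc lowpass stands in for a steep PCM-style filter, and a short Hann-shaped kernel stands in for a slow roll-off. Tap counts and cutoff are arbitrary illustration values.

```python
import math

def hann(n, length):
    """Hann window value at tap n."""
    return 0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))

def steep_lowpass(length=101, cutoff=0.25):
    """Linear-phase windowed-sinc lowpass; cutoff in cycles per sample."""
    center = (length - 1) // 2
    taps = []
    for n in range(length):
        k = n - center
        ideal = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        taps.append(ideal * hann(n, length))
    return taps

def gentle_lowpass(length=9):
    """Slow roll-off: a Hann-shaped bump, normalized to unity gain at DC."""
    taps = [hann(n, length) for n in range(length)]
    total = sum(taps)
    return [t / total for t in taps]

steep = steep_lowpass()
gentle = gentle_lowpass()
center = (len(steep) - 1) // 2

# Taps before the main peak are what an impulse produces *ahead* of itself.
pre_ring_undershoot = min(steep[:center])
gentle_undershoot = min(gentle)

print("steep filter, most negative tap before the peak:", pre_ring_undershoot)
print("gentle filter, most negative tap:", gentle_undershoot)
```

The steep kernel swings negative before its peak, which is the oscillation heard as pre-ringing, while the gentle kernel never goes below zero.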
You have been involved with SACD from the beginning. Can you provide us with an overview of your history with SACD/DSD?
I have been involved in the creation of SACD from the beginning. While working at Sony I led a team of engineers that designed the world’s first multichannel DSD recorder and editor for professional recording (the Sonoma workstation) and the world’s first multichannel DSD converters (ADC and DAC), and I participated in various SACD standardization committees worldwide. Later I founded AKDesign, which designs and markets OEM products incorporating a number of proprietary DSD processing algorithms for converting PCM to DSD and DSD to PCM, as well as other technologies for D/A conversion and clock jitter control in DACs. In 2008 I co-founded Playback Designs to bring my exceptional experience and know-how in DSD to market in the form of D/A converters and CD/SACD players.
Some manufacturers, including Gordon Rankin of Wavelength Audio, have pointed out that there are no current production DAC chips that handle DSD natively. If this is the case, are all DSD DACs that use current production chips converting DSD to PCM internally?
Most DSD DAC chips, if not all, lowpass filter the DSD signal to get rid of the high frequency noise (see Fig.2) before the signal gets converted to analog. The resulting signal behind this lowpass filter (and before the actual analog conversion) may still have the same sample rate as the original DSD signal, but it is no longer 1 bit. So can this still be considered DSD?
It is all a matter of definition: DSD, or Delta-Sigma Modulation, can be encoded with more than just 1 bit, and PCM can have a very high sample rate. When looking at the criteria of word length and sample rates only, the boundary between DSD and PCM can become fuzzy. Sometimes it is more useful to distinguish DSD from PCM in the frequency domain and look for the characteristic behavior in the higher frequency bands, as pointed out in Figure 2 above.
Since I believe that the sonic difference between DSD and PCM stems from how these signals behave at higher frequencies, I also believe that filtering a DSD signal aggressively to flatten the upper frequencies will make it behave and sound more like a PCM signal again.
The reason why chip manufacturers like to add an aggressive lowpass filter at the input to their DACs is simple: the analog output measures better. Whether it sounds better with real music signals instead of measurement tones is an entirely different question. Similarly, most audio manufacturers and even end users who do not understand DSD are mostly concerned with Signal-to-Noise performance even at high frequencies and would not choose a chip with a frequency response that is not completely flat and optimally low all the way up to Nyquist.
With that, the answer is yes, most if not all DSD DAC chips convert to PCM before converting to analog.
That opens the door for discretely built DACs that don’t have to follow the criteria of measurements with sine waves, but rather the listening experience with real music signals.
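The point made earlier in this answer, that a lowpass-filtered DSD signal keeps its sample rate but is no longer 1 bit, is easy to check numerically. The sketch below assumes a first-order modulator as the DSD source and an 8-tap running sum as the lowpass filter. Both are stand-ins chosen for brevity, not what any particular DAC chip does.

```python
import math

def one_bit_stream(n, rate_hz=2_822_400, tone_hz=1_000):
    """Toy 1-bit source: first-order Delta-Sigma modulation of a sine."""
    integrator, feedback, bits = 0.0, 0, []
    for i in range(n):
        integrator += 0.5 * math.sin(2 * math.pi * tone_hz * i / rate_hz) - feedback
        feedback = 1 if integrator >= 0.0 else -1
        bits.append(feedback)
    return bits

def running_sum(bits, taps=8):
    """Very simple lowpass: sum of the most recent `taps` input samples."""
    out, window = [], 0
    for i, b in enumerate(bits):
        window += b
        if i >= taps:
            window -= bits[i - taps]    # drop the sample leaving the window
        out.append(window)
    return out

stream = one_bit_stream(4096)
filtered = running_sum(stream)

levels = sorted(set(filtered))
bits_needed = math.ceil(math.log2(len(levels)))

print("input levels:", sorted(set(stream)))
print("output levels:", levels)
print("same number of samples:", len(filtered) == len(stream))
print("bits needed per filtered sample:", bits_needed)
```

The filtered signal still has one value per input sample, but each value now needs several bits, which is why word length and sample rate alone make a fuzzy boundary between DSD and PCM.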
How do people know if their "DSD capable DAC" is able to handle DSD natively or not?
Many manufacturers define “DSD capable” as being able to receive DSD signals natively via their digital input. What happens to the DSD signal once it is inside their converter is an entirely different question. To find out more, you need to ask the manufacturer of your DAC which chip is being used or, if no off-the-shelf chip is used, what kind of algorithm. Most DAC chips have a publicly available data sheet that you can download and study, but these are not always easy to read for the less technically inclined.
Unfortunately, this question is not so easy to answer for many users. But in the end, shouldn’t we use our own ears to be the ultimate judge for what sounds great and what sounds not so great?
Gordon Rankin goes on to point out that the DoP (DSD over PCM) protocol introduces overhead in the encoding/decoding process. Can the DoP protocol be improved upon and if so will these improvements result in better sound quality?
The overhead associated with the DoP protocol is for the identification of DSD signals while being transmitted in “PCM containers”. It has no bearing on the actual bits of the sound signal. The argument surely couldn’t be that overhead negatively impacts the sound quality, because then I wouldn’t know why USB generally can sound so good. The overhead of USB is huge compared to more traditional audio transmission formats.
DoP is a compromise solution for applications that do not allow a dedicated DSD signal transmission (for instance between Mac computer and external DAC). It was created with the contributions from a number of manufacturers. Of course, such solutions are never ideal and can create bigger or smaller headaches for certain manufacturers depending on their existing architecture.
As with anything in life, there is always room for improvement, and that is certainly true for DoP as well. But whether any improvement would also improve the sound quality is quite questionable.
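To illustrate why the “PCM container” overhead does not touch the audio bits, here is a minimal sketch of DoP-style packing for one channel. The layout used (24-bit words, 16 DSD bits in the lower two bytes with the oldest bit first, and a top byte alternating between the markers 0x05 and 0xFA) is my reading of the published DoP open standard and should be checked against the specification. The function names are invented for this example, and a real receiver would detect the marker pattern rather than assume the stream starts with 0x05.

```python
DOP_MARKERS = (0x05, 0xFA)      # assumed marker bytes, alternating per word

def dop_pack(bits):
    """Pack 0/1 DSD bits (one channel, oldest first) into 24-bit words."""
    if len(bits) % 16:
        raise ValueError("need a multiple of 16 DSD bits")
    words = []
    for i in range(0, len(bits), 16):
        payload = 0
        for b in bits[i:i + 16]:
            payload = (payload << 1) | b        # oldest bit ends up in bit 15
        marker = DOP_MARKERS[(i // 16) % 2]
        words.append((marker << 16) | payload)
    return words

def dop_unpack(words):
    """Check the markers and recover the original DSD bits."""
    bits = []
    for idx, w in enumerate(words):
        if (w >> 16) != DOP_MARKERS[idx % 2]:
            raise ValueError("marker mismatch: not a DoP stream")
        payload = w & 0xFFFF
        bits.extend((payload >> s) & 1 for s in range(15, -1, -1))
    return bits

dsd_bits = [1, 0, 1, 1, 0, 0, 1, 0] * 8         # 64 arbitrary DSD bits
words = dop_pack(dsd_bits)
restored = dop_unpack(words)

container_rate = 2_822_400 // 16                # PCM frame rate carrying 2.8224 MHz DSD
print([hex(w) for w in words])
print("bit-exact round trip:", restored == dsd_bits)
print("container rate:", container_rate, "Hz")
```

The marker byte costs 8 of every 24 bits on the wire, but the 16 payload bits come back unchanged, so the overhead only serves identification.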
One byproduct of DSD is unwanted ultrasonic noise. Can you talk about why we should or should not be concerned with this?
The real question here is whether this ultrasonic noise is unwanted or not. Our human hearing is a complex and very dynamic process. We often make the mistake of trying to describe it with a frequency response resulting from measurements with sine waves. We have to understand that this is only a very rudimentary approximation and doesn’t explain at all how the process works with dynamic signals such as music.
In order to even begin to understand the complexity of the hearing process we have to go deeper into the psychology of it. The physical process of transforming sound pressures to nerve signals, as performed by the cochlea, is the easy part. What happens next in our brain is the least understood and therefore quite controversial part.
Without going further into details on this subject, I want to just point out that it has been shown that our hearing does not stop at 20kHz, but goes much beyond that for dynamic transient signals. The dynamic range above 20kHz gets gradually reduced, but never shows a sharp edge, just like naturally occurring sounds, which never show a sharp drop-off either. Now look again at the graphs in Fig.2 and tell me which curves may be most similar to the human hearing thresholds. Wouldn’t you pick the DSD signals?
The ultrasonic noise of the DSD signal is first of all uncorrelated to the music signal. Our hearing algorithm is very good at “tuning out” uncorrelated signals; this is how two people can have a conversation in a noisy environment. In addition, this noise is for the most part very low in amplitude, often even below the hearing threshold.
By relying on the largely misunderstood ability of our hearing to act as a filter, we can leave certain algorithms out of our designs and so avoid the pitfalls associated with them. In that sense a flat PCM-like response in higher frequencies (with the potential of negative side effects) may be “overkill”, because our ear and associated psychology already perform the function of noise removal.
At the recent RMAF 2013, I heard a number of attendees asking about double rate DSD. What are the benefits of double rate DSD?
Double rate DSD pushes the noise shaper up in the frequency domain as shown in Fig. 2 above. That is most interesting for recording and post production when the intent is to release the product in DSD, because DSD2x gives the extra headroom that recording engineers need in order to record and edit without causing any degradation when releasing their final product in single rate DSD.
It may also be interesting for hobbyists who for instance want to archive their analog music library to a digital format. In such applications you may not care about the extra storage space that is required and you certainly wouldn’t be bothered with bandwidth bottlenecks when sending DSD2x files through the internet.
However, as a delivery format from studio to end user single rate DSD seems to offer an optimal combination of sound quality, bandwidth and storage space.
A number of people including myself and Stereophile's Stephen Mejias have commented on the sonic qualities of DSD playback. Stephen said it very well in his RMAF show report, "There’s an overall smoothness and effortlessness, combined with wonderfully natural and powerful dynamics." Is there some technical aspect of DSD that you could point to that accounts for DSD's superb dynamic capabilities?
We already talked about that a little above. I think it is DSD’s lack of any “sharp edges” in its characteristics that makes it sound superior. Wherever there are sharp edges in nature, funny things happen. That is true of sunlight hitting the corner of a house, of radar dish antennas, and also of audio encoded in PCM with a brick wall.
Some people appear to be reluctant to get involved with DSD because of their experience with SACD, essentially buying into a technology that was more or less abandoned by Sony. Is DSD different and if so, how?
Encoding formats generally don’t disappear; it is usually the physical delivery format that ages and then disappears. DSD is an encoding format and is no longer tied to a physical carrier.
The Playback Designs MPD-5 DAC can handle up to 6.1MHz DSD through USB. Why 6.1MHz DSD?
6.1MHz, or 128 x 48kHz (6.144MHz), is the theoretical limit up to which the input receiver accepts data in all PBD DACs. The actual D/A converter runs at an even higher frequency (built-in future-proofing).
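The rates quoted throughout this interview all follow from multiplying a base PCM rate by 64 or 128. The short calculation below makes that arithmetic explicit. The labels are informal descriptions chosen for this example, and the bandwidth shown is simply half the sample rate.

```python
def dsd_rate_hz(multiplier, base_hz):
    """DSD sample rate as a multiple of a base PCM rate."""
    return multiplier * base_hz

rates = {
    "single rate, 64 x 44.1kHz": dsd_rate_hz(64, 44_100),
    "double rate, 128 x 44.1kHz": dsd_rate_hz(128, 44_100),
    "double rate, 128 x 48kHz": dsd_rate_hz(128, 48_000),
}

for label, rate in rates.items():
    bandwidth = rate / 2        # theoretical bandwidth is half the sample rate
    print(f"{label}: {rate / 1e6:.4f} MHz, bandwidth {bandwidth / 1e6:.4f} MHz")
```

With one bit per sample, each figure is also the raw data rate per channel in bits per second.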