Q&A with Charles Hansen of Ayre Acoustics Page 2
What elements of a DAC chip do you like to see implemented in the chip and which do you like to see implemented outside the chip?
Basically we want to do as little as possible inside the DAC chip. There are two reasons for this:
a) Anything that happens inside the chip, such as running a digital filter, generates both radiated RFI and RFI conducted through the power supply, so the digital signals contaminate the audio signals.
b) Everything in the world of consumer electronics is done for the lowest possible price. This means we can always do a better job by doing it ourselves outside of the DAC chip. External current-to-voltage conversion is absolutely critical for good sound quality. We can implement any kind of digital filter we want in an external FPGA, including minimum phase, with whatever rolloff and window we want. Simply put, we can just do it better ourselves.
Can you discuss the cause and effect of Pre-Ringing and Pre-Echo?
It's really quite simple. Any filter steeper than 6 dB/octave (first order) will ring when a transient event comes along. The ringing can be minimized to any arbitrary degree by making the transition as gentle (as opposed to sharp) as desired.
It is not practical to have a gentle transition with single-rate audio (such as found on CDs). You only have 2 kHz (20 kHz to 22.05 kHz = Fs/2) to get at least 96 dB (16 bits) of attenuation, so there will always be a lot of ringing. We can minimize it by letting the rolloff start at (say) 18 kHz instead of 20 kHz. Already that cuts the problem in half as now you have twice the bandwidth to perform the filtering, which means half as much ringing. There are additional compromises that can be made, but they will always be compromises.
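As a rough illustration of that arithmetic, Kaiser's rule of thumb for FIR filters says the required filter length grows with the stopband attenuation and shrinks in proportion to the transition width. The sketch below uses filter length in milliseconds as a stand-in for how long a filter can ring. The function name `fir_length_ms` and the choice of this particular estimate are illustrative assumptions, not anything described in the interview.

```python
def fir_length_ms(atten_db, transition_hz):
    """Kaiser's estimate of FIR filter length, expressed as a duration.

    Taps ~ (A - 7.95) / (14.36 * transition / Fs), so dividing by Fs gives
    a duration that depends only on the transition width in Hz.
    """
    return 1000.0 * (atten_db - 7.95) / (14.36 * transition_hz)


NYQUIST_CD = 22050  # Hz, half of the 44.1 kHz CD sample rate

# Rolloff starting at 20 kHz: about 2 kHz of transition band
cd_20k = fir_length_ms(96, NYQUIST_CD - 20000)
# Rolloff starting at 18 kHz: about 4 kHz of transition band
cd_18k = fir_length_ms(96, NYQUIST_CD - 18000)

print(f"20 kHz rolloff: ~{cd_20k:.2f} ms of filter")
print(f"18 kHz rolloff: ~{cd_18k:.2f} ms of filter")
print(f"ratio: {cd_20k / cd_18k:.2f}x")
```

Under this estimate, widening the transition band from roughly 2 kHz to roughly 4 kHz cuts the filter length nearly in half, in line with the "half as much ringing" figure.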
On the other hand, by the time you get to quad-rate sampling (176.4 kHz or 192 kHz), the compromises are practically non-existent. One can have flat frequency response to 40 kHz or 50 kHz, and still have a filter with little or no ringing (in the case of the moving-average filter).
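To make the "little or no ringing" point concrete, here is a minimal sketch comparing the step response of a moving-average filter with that of a sharper low-pass filter. The sharper filter is a generic Hamming-windowed sinc chosen only for contrast; the tap counts, the cutoff, and all function names are assumptions for illustration, not the filters used in any Ayre product.

```python
import math


def convolve(x, h):
    """Direct-form convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def moving_average(n):
    """n equal taps that sum to 1."""
    return [1.0 / n] * n


def windowed_sinc(n, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    mid = (n - 1) / 2
    taps = []
    for k in range(n):
        t = k - mid
        s = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1))
        taps.append(s * w)
    total = sum(taps)
    return [v / total for v in taps]  # normalize to unity gain at DC


# A step from 0 to 1; outputs are trimmed so the end of the input is ignored
step = [0.0] * 40 + [1.0] * 80
ma_out = convolve(step, moving_average(4))[:len(step)]
sinc_out = convolve(step, windowed_sinc(31, 0.25))[:len(step)]

ma_overshoot = max(ma_out) - 1.0      # how far above the final value
sinc_overshoot = max(sinc_out) - 1.0
sinc_undershoot = -min(sinc_out)      # dip below zero before the step arrives

print(f"moving average overshoot: {ma_overshoot:.4f}")
print(f"windowed sinc overshoot:  {sinc_overshoot:.4f}")
print(f"windowed sinc undershoot: {sinc_undershoot:.4f}")
```

Because every tap of the moving average is positive, its step response can only rise toward the final value and never overshoots. The sharper filter has negative taps, so its step response swings past the final value after the step and dips below zero before it.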
When a linear-phase filter is used, there is no phase shift at all, but then half of the ringing occurs before the impulse, and the other half occurs after the impulse. This never happens in the real world. In the real world there are always echoes (sound reflections from nearby objects), but they always occur after the impulse. It is impossible for echoes to occur before the event. This is one reason that standard digital technology tends to sound un-natural.
On the other hand, a minimum-phase filter has the same total amount of ringing, but it is all moved to after the impulse. This type of filter does have some phase shift, but it is very small and only at very high frequencies. It is like moving your head 1/2" or so closer to or further from the speakers.
There seems to be some confusion between Upsampling vs. Oversampling. Could you explain the differences?
No, I cannot. The standard technical term is actually interpolation, which simply means to calculate interpolated data points between the original data points that were captured during the original recording. In video this process has traditionally been called upsampling, and in audio it has traditionally been called oversampling.
Then about a decade ago a company that made sample-rate conversion boxes found that if they used this box between the transport and the DAC box, it changed the quality of the sound. They called this "upsampling" because there was already a digital oversampling filter in the DAC box.
Nobody ever came up with an explanation for why this made any difference in the sound, because it is impossible to add any actual new real data (ie, real resolution) once a recording has been made. And adding another oversampling filter to the existing one is nothing new, as virtually all oversampling filters are a concatenation of several 2x filters in series. For example 99.9% of the time an 8x filter is made from three 2x filters in a row, simply because it is the cheapest way to do it.
Probably the best explanation is simply that if you "upsample" the data by (say) 4x and then "oversample" the data by (say) 8x, you end up with 4 x 8 = 32x oversampling, and changing the oversampling rate changes the sound. Technically, the higher the oversampling ratio, the easier it is to filter out the "image" frequencies.
What this means in real life is that the "steps" in the stairstep waveform output of the DAC chip are smaller, which means the non-harmonically related frequencies represented by the steps are at a higher frequency and easier to filter out from the desired original signal.
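The effect of the oversampling ratio on the image frequencies can be sketched with a little arithmetic. Assuming an ideal interpolation filter, the lowest image of a tone left at the DAC output sits just below the oversampled rate. The function name `first_image_hz` and the 20 kHz test tone are illustrative choices.

```python
def first_image_hz(tone_hz, base_rate_hz, oversampling):
    """Lowest image of a tone after ideal interpolation to oversampling * base rate."""
    return oversampling * base_rate_hz - tone_hz


BASE_RATE = 44100  # Hz, CD sample rate
TONE = 20000       # Hz, top of the audio band

for ratio in (1, 8, 4 * 8):
    image = first_image_hz(TONE, BASE_RATE, ratio)
    step_us = 1e6 / (ratio * BASE_RATE)  # width of one "stair step"
    print(f"{ratio:>2}x: step {step_us:6.2f} us, first image at {image / 1000:8.1f} kHz")
```

With no oversampling the first image of a 20 kHz tone lands at 24.1 kHz, right next to the audio band. At 32x it lands near 1.39 MHz, where even a gentle analog filter can remove it.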
Over the years, some people have decided to call interpolation to a non-integer multiple of the original rate (eg, 44.1 kHz to 96 kHz) "upsampling" and to call interpolating to an integer multiple of the original rate (eg, 44.1 kHz to 88.2 kHz) "oversampling". But it is all just marketing terms and not technical terms, so people can call it anything they want.
In general, interpolation by a whole integer (eg 2 x 44.1 kHz = 88.2 kHz) sounds better than interpolation by a non-integer (~2.176870748 x 44.1 kHz = 96 kHz), probably because it is a simpler operation with less error in the calculation.
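One way to see why the non-integer case is the harder operation is to reduce each rate change to a ratio of whole numbers. In the textbook rational resampler (an assumption here; the interview does not say how any particular product does it), the signal is interpolated by the numerator and then decimated by the denominator. The helper name `resample_ratio` is hypothetical.

```python
from fractions import Fraction


def resample_ratio(src_hz, dst_hz):
    """Return (interpolate_by, decimate_by) for a rational sample-rate change."""
    ratio = Fraction(dst_hz, src_hz)
    return ratio.numerator, ratio.denominator


for dst in (88200, 96000):
    up, down = resample_ratio(44100, dst)
    print(f"44100 -> {dst}: interpolate by {up}, decimate by {down}")
```

Going to 88.2 kHz is a plain 2x interpolation, while going to 96 kHz works out to 320/147, a far more involved calculation.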
Some people claim that Non-Oversampling (NOS) DACs have a distinct sound as compared to DACs that employ oversampling and digital filters. What is your experience with the NOS approach and are there benefits to non-oversampling that cannot be achieved in any other way?
The lowpass playback filter in a traditional DAC (technically called a "reconstruction filter") filters out the steps in the waveform, and therefore more accurately recreates the original waveform. But since most digital audio is played back at the CD sample rate of 44.1 kHz, this requires a very steep filter with a lot of ringing.
A non-oversampling DAC removes the reconstruction filter entirely. This obviously eliminates any chance for the filter to ring, but it also leaves the large "stair steps" in the playback waveform. Quite often the analog stage of this type of DAC will incorporate an analog low-pass filter in the form of a transformer.
But it doesn't matter whether a digital filter is used or an analog filter is used -- there is always a tradeoff between the sharpness of the filter (amount of ringing) or the gentleness of the filter (amount of "aliasing", or leakage of stairsteps). The only difference is that an analog filter is always a minimum-phase type, while a digital filter is almost always implemented as a linear-phase type.
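The "stair steps" left by a non-oversampling DAC can be put into numbers. A DAC that simply holds each sample until the next one arrives (a zero-order hold, which is an idealized assumption here) has a sin(x)/x frequency response. The sketch below evaluates that response for a 20 kHz tone and for its first image at 24.1 kHz; the function name `zoh_gain_db` is illustrative.

```python
import math


def zoh_gain_db(freq_hz, rate_hz):
    """Gain in dB of an ideal zero-order hold running at rate_hz."""
    x = math.pi * freq_hz / rate_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))


RATE = 44100               # Hz, CD sample rate with no oversampling
tone_db = zoh_gain_db(20000, RATE)           # droop at the top of the audio band
image_db = zoh_gain_db(RATE - 20000, RATE)   # first image of that tone

print(f"20.0 kHz tone:  {tone_db:.2f} dB")
print(f"24.1 kHz image: {image_db:.2f} dB")
print(f"image is only {tone_db - image_db:.2f} dB below the tone")
```

In this idealized model the hold by itself rolls off the top of the audio band by about 3 dB, yet leaves the first image less than 2 dB below the tone that produced it. That is the leakage the analog stage is then left to deal with.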
The Ayre QB-9 DAC has been very well received and employs "single-pass" 16x oversampling and a Minimum Phase filter. Can you talk about the benefits of this approach?
These are two completely independent parameters with two completely independent benefits. Performing all of the calculations in a single pass (rather than a concatenation of 2x stages) requires more computational "horsepower". We use an FPGA (Field Programmable Gate Array) with hundreds of thousands of gates that can be easily configured to suit our needs. In the past this would have been prohibitively expensive, but today the prices for these parts are quite reasonable (although still far too expensive to be used in a mass-market product).
The advantage of performing the computation in a single pass is that there are always rounding errors at each oversampling operation. If you perform it all at once, the rounding errors are minimized. But if you perform 16x oversampling as a series of four 2x stages (the normal way, as it is the cheapest way), then the rounding errors are compounded four times.
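A toy example can show how rounding compounds across stages. It uses plain linear interpolation as a stand-in for a real oversampling filter and rounds to whole numbers as a stand-in for a fixed word length. Both are simplifying assumptions, and the function names are hypothetical. The input is a deliberately awkward case: two samples that differ by one least-significant bit.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with halves going up."""
    return math.floor(x + 0.5)


def interp2x(samples, quantize):
    """Insert a midpoint between each pair of samples, optionally rounding it."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        mid = (a + b) / 2
        out.append(round_half_up(mid) if quantize else mid)
    out.append(samples[-1])
    return out


def interp_cascade(samples, stages, quantize):
    """Run 2x interpolation `stages` times in a row."""
    for _ in range(stages):
        samples = interp2x(samples, quantize)
    return samples


original = [0, 1]  # two samples one LSB apart

exact = interp_cascade(original, 4, quantize=False)     # ideal 16x result
single_pass = [round_half_up(v) for v in exact]         # one rounding at the end
cascaded = interp_cascade(original, 4, quantize=True)   # rounding after every 2x stage

single_err = max(abs(s - e) for s, e in zip(single_pass, exact))
cascade_err = max(abs(c - e) for c, e in zip(cascaded, exact))

print(f"single pass worst error: {single_err} LSB")
print(f"four 2x stages worst error: {cascade_err} LSB")
```

In this contrived case the single-pass result is never off by more than half an LSB, while the four-stage cascade ends up almost a full LSB away from the ideal value because each stage rounds a result that was already rounded.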
As mentioned before, a minimum-phase filter has all of its ringing after the impulse. This is the only way that echoes occur in nature, and therefore a minimum-phase filter generally sounds more natural than a linear-phase filter.
I think that linear-phase filters became popular because around the time that digital audio was being commercialized, there was a trend to develop linear-phase loudspeakers and therefore linear-phase was perceived to be a generally good thing. Since it is trivial to make an FIR (Finite Impulse Response) digital filter either linear phase or minimum phase, and somewhat cheaper to use a linear-phase design, linear-phase filters have dominated since the beginning of digital audio.
You've mentioned that Ayre will be adding DSD capabilities to the QB-9. Do you have a planned rollout date and how will existing QB-9 owners implement this change?
We almost never have planned rollout dates. We do have an order in which we will tackle various projects, but the thing is that all of our projects incorporate new technology that has never been used before. This is our greatest strength and also our greatest weakness. It means that when we release a product it is always a leading edge design that breaks new ground and will have a long lifetime in the market, but at the same time usually takes longer to develop than we anticipate. So we only end up developing two or three products a year, and they almost always take longer than planned.
When your "new product" consists of the same old technology in a new box, it is very easy to predict how long it will take to complete. But when you are doing something that has never been attempted before you continually run into problems that have never been solved before. Sometimes you can find a solution in a few days and sometimes it takes a few months to solve a problem. The end result is that we rarely work to a strict schedule.
In the case of sending DSD signals over the USB connection disguised as PCM, Gordon Rankin and I first discussed this several years ago when we found out that Sony had introduced a format called "DSD-Disc". This is basically an SACD, but without the copy protection so that it can be played on any computer.
We discussed that conceptually it would be quite easy to implement, but in the end decided that there wasn't any real reason to do so. At that time the only way to obtain DSD files was to (illegally in this country) rip a copy-protected SACD to your computer. But in the interim several small recording companies have introduced DSD music downloads. dCS was the first company to announce a standard for what is now called DoP (DSD over PCM). They made it an open standard, and Andreas Koch, now with Playback Designs but one of the original developers of the Sony SACD recording hardware, worked hard to perfect the standard, with a lot of input from Gordon Rankin.
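For readers curious what "disguised as PCM" looks like in practice, here is a minimal sketch of DoP packing as the published open standard is commonly described: each 24-bit PCM sample carries 16 DSD bits in its lower two bytes, and the top byte alternates between the markers 0x05 and 0xFA so the DAC can recognize the stream. The function name `pack_dop` is hypothetical and the sketch handles a single channel only; it is not taken from any Ayre or dCS code.

```python
DOP_MARKERS = (0x05, 0xFA)  # alternating marker bytes in the top 8 bits
DSD64_RATE = 64 * 44100     # 2.8224 MHz, one bit per sample


def pack_dop(dsd_bytes):
    """Pack one channel of DSD data (oldest bit first) into 24-bit DoP samples."""
    if len(dsd_bytes) % 2:
        raise ValueError("DoP carries 16 DSD bits per sample; need an even byte count")
    samples = []
    for i in range(0, len(dsd_bytes), 2):
        marker = DOP_MARKERS[(i // 2) % 2]
        samples.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
    return samples


# 16 DSD bits per PCM sample means DSD64 travels as 176.4 kHz PCM
dop_pcm_rate = DSD64_RATE // 16

frames = pack_dop(bytes([0x69, 0x96, 0xAA, 0x55]))
print([f"{s:06X}" for s in frames], dop_pcm_rate)
```

A DAC that sees the alternating markers treats the lower 16 bits as DSD data rather than as PCM audio.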
All of our USB DACs already have DSD-capable DAC chips, so it will be relatively easy to add this to our products via firmware updates. We may or may not make other upgrades at the same time; we haven't decided yet.
This leads us to the question of DSD and PCM playback. Do you see a benefit to one approach over the other?
The advantage of DSD is that there is no filtering, and therefore no ringing on the record side. There is some relatively gentle filtering on the playback side with minimal ringing. The disadvantage is that above 20 kHz the noise climbs very rapidly, so that at 100 kHz the S/N ratio is only about 30 dB. Well, in a recording of music (as opposed to a recording of, say, bats) there is absolutely no signal at 100 kHz that is within 30 dB of the full-scale output.
So the effective bandwidth of DSD is really only 30 to maybe 40 kHz. It also requires completely new equipment, not only for playback, but also for the entire recording chain.
In my opinion, the best solution is to use quad-rate PCM (176.4 kHz or 192 kHz) with very gentle filters that exhibit little or no ringing. It is that lack of ringing that gives DSD its sonic benefits and this can be obtained with quad-rate PCM, but there is no problem with out of band noise, and standard equipment can be used for recording and playback. This is what we have done in our new QA-9 A/D converter.
In the old days, to play a high-res recording you had to use either a DVD-Audio player or an SACD player. There were a few audiophile-grade SACD players made, but almost no audiophile-grade DVD-Audio players. So there wasn't much point in buying the discs, because the proper playback equipment wasn't available. But with the advent of computer-based digital audio playback, high-sample-rate PCM is trivial, and even DSD playback is possible. So the entire game has changed. The format war is a thing of the past and it is easy to download high-res audio files with a broadband internet connection.
Can you talk about any new products that are on the horizon for Ayre?
Gosh, at any given time we have at least a dozen or two products to choose from. We are finishing up an integrated amplifier that sounds incredible, we would like to make a proper headphone amplifier, and maybe later we will make one that includes a DAC for those that don't already have a computer audio system. We have been getting requests for loudspeakers ever since I left Avalon twenty years ago. We really want to make a multi-input DAC, and we have some great new ideas about how to do that. I personally would like to make a set of headphones, as I have some different ideas about how those could be improved. It would be a lot of fun to make some $50,000 "statement" monoblock amps, and it would be just as much fun to make some entry-level equipment that more people could afford. This list is practically endless, and the only question is what order to do them in.
For more on Ayre's Minimum Phase filter, check out their MP Filter white paper.