Q&A with John Swenson. Part 3: How bit-perfect software can affect sound

In this third installment I will cover possible methods by which bit-perfect software can affect sound quality in a DAC.

The traditional Holy Grail of playback software has been to be “bit-perfect”: the software does nothing to change any bit of the samples on their way from file to DAC. Many people assume that if all the bits are the same, then different software players MUST all sound identical. This article will attempt to show how different bit-perfect players can sound different.

In order for this to make any sense you need to understand the information in the first two articles (see Part 1 and Part 2). They explain how what happens inside a computer CAN work its way across a cable and into a USB DAC running in asynchronous transfer mode.

First I am going to cover differences transmitted through USB packet timing. Remember from article two that the arrival of each USB packet in the receiver generates a burst of noise on the power traces and ground plane. Variations in the timing of these packets will significantly change the spectrum of the noise. Note: the bits still make it across correctly and the AVERAGE data rate stays the same. I’m going to call this “packet jitter”.

Packet jitter is most frequently caused by software. In many systems the time at which each packet is scheduled for transmission is computed in software. If that software is late in doing its job, the packet timing will change. The software that does the packet scheduling is almost always interrupt driven. The exact time the interrupt routine is called can be affected by other software on the computer, in particular process and thread priorities. The kernel scheduling protocol also has a significant effect.

"So not only is packet jitter affected by the thread structure of the player software, it is also significantly affected by other programs and services running on the computer."

So not only is packet jitter affected by the thread structure of the player software, it is also significantly affected by other programs and services running on the computer.

If the player software uses multiple threads the thread priorities can significantly change the interrupt behavior of the code scheduling the packets.
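To make the idea concrete, here is a toy Python simulation (purely illustrative; it does not model any real USB stack or hardware). Packets are scheduled every 125 microseconds, but software lateness delays each one by a random amount. The average interval, and thus the average data rate, is unchanged, while the spread of intervals, the packet jitter, grows.

```python
import random
import statistics

def packet_intervals(n_packets, frame_us=125.0, max_delay_us=0.0, seed=1):
    """Departure times for n_packets: an ideal 125 us schedule, plus a
    random software-induced delay of up to max_delay_us per packet."""
    rng = random.Random(seed)
    times = [i * frame_us + rng.uniform(0.0, max_delay_us)
             for i in range(n_packets)]
    # Return the intervals between consecutive packets
    return [b - a for a, b in zip(times, times[1:])]

clean = packet_intervals(10000)                       # packets sent right on time
jittered = packet_intervals(10000, max_delay_us=5.0)  # each packet up to 5 us late

# Average interval (and thus average data rate) is essentially identical...
print(statistics.mean(clean), statistics.mean(jittered))
# ...but the interval spread -- the "packet jitter" -- is very different,
# which changes the spectrum of the noise bursts at the receiver.
print(statistics.stdev(clean), statistics.stdev(jittered))
```

The numbers 125 us and 5 us are arbitrary stand-ins; the point is only that identical average throughput can coexist with very different timing spectra.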

The next category is just general purpose noise on the power and ground planes. This can come from many sources, many of which are influenced by the software running on the computer.

"I have heard many people talk about “processor load” when talking about this, but all my tests seem to show that just raw processor activity has little correlation with the noise."

I have heard many people talk about “processor load” when talking about this, but all my tests seem to show that raw processor activity has little correlation with the noise. It certainly does generate noise, but it is hard to find much correlation.

"What does correlate fairly well is main memory accesses."

What does correlate fairly well is main memory accesses. Every time an access is made to main memory, a huge number of wires on the motherboard switch all at the same time. These lines have to run fast, so they are driven hard. Thus every time memory is accessed, huge voltage drops and spikes occur on the planes. The spectrum of this noise is VERY sensitive to the exact flow of the program, not just raw processor activity.

A major part of this memory access noise is what is called cache performance. Every processor includes a “cache memory”: memory in the processor which operates in a sort of shadow mode. Once the processor accesses a particular memory location, it stores that data in the cache, so the next time it needs that location it doesn’t have to access main memory. There is only a fairly small amount of cache memory available, so an access does not always find its value in the cache; this is called a cache miss. Cache is used for both instructions and data.
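A toy simulation makes the hit/miss idea concrete (illustrative only; real caches are set-associative and far more complex, and the sizes below are made up). A tight, repetitive access pattern is served almost entirely from the cache, while a scattered pattern keeps going back to main memory:

```python
import random

def cache_misses(addresses, cache_lines=64, line_size=16):
    """Count hits/misses in a toy direct-mapped cache: each memory block
    maps to exactly one cache line, which remembers the block it holds."""
    cache = [None] * cache_lines
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size      # which memory block the address is in
        line = block % cache_lines     # direct-mapped: block -> one fixed line
        if cache[line] == block:
            hits += 1                  # served from cache, no bus activity
        else:
            misses += 1                # main memory access -> noise on the planes
            cache[line] = block
    return hits, misses

# A tight loop touching the same 256 addresses over and over...
tight = [a % 256 for a in range(100_000)]
# ...versus a scattered pattern roaming over a megabyte of memory.
rng = random.Random(0)
scattered = [rng.randrange(1 << 20) for _ in range(100_000)]

print(cache_misses(tight))      # almost all hits after the first pass
print(cache_misses(scattered))  # almost all misses
```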

A program can influence cache performance in many ways. One of the easiest to understand is the plain size of the code. If the main loop of a program is small enough to fit entirely in cache, the program doesn’t have to do any main memory accesses for its instructions.

Branch complexity is also important. A program testing this, calling that, going off to there all the time is going to have a much higher cache miss rate than a program that stays in a fairly tight loop.

Special instructions that do a lot with just one instruction can also significantly decrease cache misses. But not all processors support all the different types of these instructions, so few programs really make use of them.

The way in which data is moved through the system can also make a big difference. Many programs, and much OS-level code, make use of “hierarchical” programming with different “layers” that do different things in the system. This makes for nice modular programming, but it also means that the data usually gets moved from buffer to buffer inside the different layers. This can dramatically increase the number of main memory accesses.
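The layered-copying point can be sketched like this (a hypothetical illustration, not code from any real audio stack): both paths deliver bit-identical data, but one re-copies the whole payload at every layer while the other just passes a view of the same buffer down the stack.

```python
def through_copying_layers(data, n_layers=4):
    """Each 'layer' copies the payload into its own buffer, the way a
    modular stack often does. Returns the data plus total bytes re-copied."""
    copied = 0
    buf = data
    for _ in range(n_layers):
        buf = bytearray(buf)    # a fresh copy per layer...
        copied += len(buf)      # ...so every byte crosses the memory bus again
    return bytes(buf), copied

def through_zero_copy_layers(data, n_layers=4):
    """Each layer hands down a memoryview instead: no payload is re-copied."""
    view = memoryview(data)
    for _ in range(n_layers):
        view = view[:]          # just another view of the same buffer
    return view, 0

samples = bytes(range(256)) * 1024   # 256 KiB of stand-in audio data
out_a, moved_a = through_copying_layers(samples)
out_b, moved_b = through_zero_copy_layers(samples)

assert bytes(out_a) == bytes(out_b) == samples   # bit-perfect either way
print(moved_a, moved_b)  # same bits delivered, very different memory traffic
```

The design point: the bits are identical at the output of both paths; only the amount of memory-bus activity (and thus, per the article's argument, the noise) differs.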

One interesting aspect of this is that compiler optimizations can significantly increase cache misses. Compiler optimization options usually perform operations that either make the code run faster OR take up less total room in memory. Unfortunately these optimizations almost always take nice simple loops and break them out into much more complicated code, which usually causes cache misses. Thus, fairly frequently, using optimizations makes more noise.

"Thus fairly frequently using [compiler] optimizations makes more noise."
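As a sketch of one such transform, loop unrolling, here is a hand-unrolled loop written in Python for readability (a real compiler does this to machine code, and which transforms fire depends entirely on the compiler and flags). The results are identical; the unrolled body is simply several times more code.

```python
def dot_simple(xs, ys):
    """Straightforward loop: a tiny body that fits easily in cache."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_unrolled(xs, ys):
    """Hand-unrolled by 4, mimicking what an optimizing compiler does.
    Fewer loop iterations, but the loop body is several times larger."""
    total = 0
    n = len(xs)
    i = 0
    while i + 4 <= n:
        total += (xs[i] * ys[i] + xs[i + 1] * ys[i + 1]
                  + xs[i + 2] * ys[i + 2] + xs[i + 3] * ys[i + 3])
        i += 4
    while i < n:                      # handle the leftover elements
        total += xs[i] * ys[i]
        i += 1
    return total

xs = list(range(1000))
ys = list(range(1000))
assert dot_simple(xs, ys) == dot_unrolled(xs, ys)  # identical answer, bigger code
```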

Over the last several years there have been reports that lower latency programs sounded better. My take is NOT that the latency itself is important, but that code that is low latency tends to have small tight loops, and THIS decreases the cache misses.

"My take is NOT that the latency itself is important, but that code that is low latency tends to have small tight loops, and THIS decreases the cache misses."

Obviously this is far from an exhaustive list; there are many things that can change the noise generated by a computer. But I hope this gives a taste of the kinds of things that can significantly change the noise inside a computer, and that can ultimately end up changing the sound of the DAC.

Please remember that nothing I am talking about actually changes the bits. This is all a byproduct of the particular way a particular program goes about its task of getting the bits to the DAC.

CG's picture

Great stuff!  Thank you for taking the time to put it into words we all can read.

Steven Plaskin's picture

Thank you John for this excellent information. For those of us who do hear sonic differences in bit-perfect software, this explanation provides a good theoretical foundation for this much debated issue.

John Sully's picture

As a former engineer for SGI, I've done a fair amount of cache performance analysis, and have done my fair share of hardware design work and very low level hardware support code. I've also written several hard disk controller drivers, ethernet drivers, serial drivers and some more exotic hardware drivers.

The real cache killer is the sequential access to the DMA buffers of your data. You will NEVER get a cache hit on the data you are decoding/transferring, so worrying about cache locality in the executable is worthless.  The noise on the computer from code cache misses will be swamped by those from the data buffer misses.

There is a lot else wrong with this article on the computer side but a lot of it depends on specific hardware implementations. Bearing in mind that I know very little about the USB protocol, I even have a hard time giving much credence to "packet jitter". Such a problem can be solved with a minimal amount of buffering in the DAC and as long as the system is not bogged down (that's a technical term) interrupt latency should be on the order of microseconds.

Richard Dale's picture

Thanks for a very interesting series of articles.

"Packet jitter is most frequently caused by software. In many systems the time at which each packet is scheduled for transmission is computed in software. If that software is late in doing its job, the packet timing will change."

My understanding is that a player program writes a buffer of audio data to the USB device driver via the USB sub system, where it may be queued (on Linux at least). The device driver will subsequently handle actually sending the data to a DAC over the bus when requested by the DAC, if it is running as a master in asynchronous isochronous mode. So I wouldn't expect the audio player software to directly affect the timing of when packets are sent.

CG's picture

Isn't a device driver software?  These descriptions get hard to follow...

John Sully's picture

A device driver is a piece of software which runs in a privileged or kernel mode. Running in this mode allows it to run at a heightened priority. This means that if it wants to do something it can take all the time it wants, as long as a higher priority task does not come along. User software tends to run at the lowest priority.

CG's picture

I get that.  The point is that it is software.

I believe that John S. is trying to explain how variations in various hardware processing, as instructed by software, causes noise.  The spectrum of this noise is variable as is the amplitude.  A lot is determined by the software at various levels, from drivers to the operating system to the application software.

At least that's what I think he's saying.  (Don't want to put words into his mouth.)

John Sully's picture

Yes, you are correct that a program writes a buffer of data (it will generally point the USB subsystem to it) and it will then be read via DMA (or byte by byte, depending on hardware).  However, when more data is needed an interrupt will occur which will awaken the client program, allowing it to write again -- the write is said to be blocking. The program could also use async writes, which return immediately and allow the program to perform other tasks (think readying the next buffer); the interrupt will then result in the program handling the end-of-write notification.

UpTone Audio's picture

The factors John discusses are not only experienced audibly by users of various premium player software, but are also issues that the developers of the best players are focused on.  I know for a fact that Damien Plisson (Audirvana Plus) considers and adjusts/tries-to-control many of the parameters John outlined (some of which account for the great advances in SQ A+ has made this year), and others such as PeterSt. (XXHigh-End) and Miska (Signalyst HQPlayer) are also quite conscious of these issues.


John's research (and it is that--including h/w design--he is not an armchair theorizer) also shines a light on the many other variations in sound quality that can be heard by changing things we do not usually think of as making a difference.  In the past couple of months I have been experimenting--and prompting others to do the same via my reports--to great result, with things like drive interfaces, for both operating system and music storage.  Although one might think a memory player software program should not be affected by location of music data, in a revealing system the differences are surprisingly large.  Copying tracks to a RAM disk, while inconvenient, has proven to be the gold standard for realism, followed by SD card (as long as it is not on the USB bus), direct SATA SSDs, and then trailed by various external interfaces (each of which have ways of introducing sonic compromises).

[For those of you who are interested in the active discussions and progressive real-world SQ discoveries a group of us have been sharing lately, here are links to three threads over at the other friendly computer audio forum:

EDIT: Looks like the AudioStream site does not allow me to post the links.  My username is Superdad (no 62) over at ComputerAudiophile if you are interested in the reports and discussion.]


By the way, John Swenson will be coming to stay and work/play with me next weekend.  He will be bringing two of his recent cutting-edge DAC creations (done for other firms), as well as the pre-production boards for a project we are working on together.  I have known and worked with John for more than 8 years (since tapping him for co-development of an ahead-of-its-time network digital music player/DAC while at Hovland Company), and I am excited to once again attempt to play "Jobs" to his "Wozniak."  He is a gifted, generous, and very kind man whom I feel blessed to have as a close friend.  Sorry to go on like this.  It's the last day of the year and I'm just feeling thankful for the sparks that music and audio continue to ignite in me. Guess it's in my blood.


--Alex Crespi

John Sully's picture

But as I said, if memory access noise is such a huge factor, the sequential accesses to the i/o buffers are going to dominate any cache locality issues associated with the code. That is just a fact. When a filesystem block is transferred from the disk into memory it uses DMA, which doesn't (in general, in an Intel architecture machine) prime the cache, and causes noise on the bus. When the program starts copying bytes from the i/o buffer for decompression/decoding it causes noise on the bus as the corresponding cache lines are read. The program then writes these decompressed/decoded bytes out to the output buffer. When these are flushed to memory for i/o purposes, they cause more memory bus noise. Then when DMA moves them from the output buffer to the USB controller for output to the DAC, more memory bus noise occurs. This is all w/o even considering the memory accesses which may be needed to fetch code if cache locality for the code is poor.

Sequential access to data is a cache killer and it can't be helped in applications such as audio. Code would have to be almost criminally poor to get to the point where code in the inner loops would be stepping on itself.

I've spent more time than I care to think about working on cache issues and control code for Silicon Graphics workstations. I know how this works, and I see no evidence that John is a computer hardware engineer in his bio in part 1. I am a software engineer who worked at the hardware/software interface for more than a decade. I know how this works, and I understand the implications for performance.

deckeda's picture

Cue Michael's image of the unicorn for the mythical, it-would-fix-it-all Ethernet DAC!

John Sully's picture

If you are worried about interpacket arrival times, ethernet is awful.  Of course it is packetized, so buffering and clock generation should do the trick....

UpTone Audio's picture

To address your question about John Swenson's background with computers:

He is an electronics engineer who has worked for a well-known semiconductor company for 30 years (he prefers not to say which one in public).  He lays out integrated circuits, mostly large high-speed digital with sensitive analog circuitry as well. His specialty is power distribution networks in those chips: providing low noise power and ground networks to achieve very low jitter on the interface between the digital and analog sections.

JS has also been designing and building DACs for about 10 years.  His hardware testing is very empirical, and he also has an excellent ear (he plays large hand bells and sings in a church choir).

I don't think his article was meant to be a comprehensive assessment of all the factors associated with SQ variations in computer audio software/hardware.  Rather it was to explain how "bit-perfectness"--output of the same ones and zeros--is not the only important goal to getting the best output from the computer.

Mr. Sully brings up a number of other areas where noise can be injected into the system during different stages of I/O, and I don't think there is any doubt that those too affect SQ.  In fact, I think some of what you say may support and explain why I (and others) hear such dramatic differences between various storage interfaces--with the simpler and quieter ones (such as SD card) sounding best.

Also, if you look at some of the better music player software, you will see that one of the designers' goals is to bypass as many of the OS's layers as possible to get smooth and exclusive output to the USB port.  Some aspects of that have been made easier with new OS versions and drivers, but the player designers still dig in to further optimize that.

I just know what I hear with my own ears…

John Sully's picture

I suspect the main reason, if indeed SD cards do provide better sound, is latency.  The I/O path should be basically the same, in that they hang off the disk controller and look like a disk to the OS.  There will be less noise associated with the motors which position the head.  To write an article which focuses on second or third or fourth order effects and ignores the elephant in the room is silly.

I can see how noise in the computer power supply could induce noise in a USB DAC, especially if the DAC is pulling power from the USB bus, but I would like to see some data.

jim tavegia's picture

It seems from all of this that getting USB to work for streaming audio was not that hard for most MP3s, but now with 24/192 and DSD in the mix, would not FireWire have been the way to go?  I would think that staying away from all the "daily" uses for USB for keyboards, your mouse, and your printer would have been a good idea.

The industry is full of very smart people as this discussion shows, but it seems like an awful lot of hard work is going to be needed to really make this all work well. Maybe it is a miracle it works as well as it does now. 

Michael Lavorgna's picture

Gordon was having problems getting this posted so I did the honors.

John, Michael;

First a little clarification: USB packet jitter is the timing between packets. USB jitter is the movement of the framed data and EYE pattern (the ability to discern the difference between a 1 and a 0) from ideal. Audio-related jitter error is the re-serialization of the received audio data as it is put into an audio format like I2S.

John this is my crazy USB test set:

MacBook Pro (dual boot) <==USB Analyzer==> DAC--->Prism dScope III

Symmetricom jitter/phase noise analyzer

Tektronix MSO4 scope with Audio Module and USB Module. The Audio module allows me to look at I2S framed data; the USB module allows testing of USB EYE patterns and data in Full Speed and High Speed networks.

Two years ago I posted that USB packet jitter really didn't happen. With the USB Analyzer I can see the variation of packets sent and it's less than 1ns on High Speed links. FYI... at Full Speed we receive data frames every 1ms; at High Speed we receive 8 microframes per 1ms (one every 125uS).

I sent an email to John Atkinson and Charlie Hansen saying I give up on figuring out what's making the applications sound different, or for that matter file types sound different. To me my research showed that more processing made for worse sound. That is why an ALAC or FLAC file will not sound as good as a flat PCM file like AIFF/WAV does.

The data found in the files and on the USB link were identical with identical timing. The data received at the DAC and verified on I2S was the same. The Symmetricom found the bit clock jitter to be the same.

To me I think we need to look harder at what is going on here. I also think the term JITTER to reflect all digital errors to be really lacking at this point.



FireWire is not that hot. Since it does not have asynchronous modes, the clock has to be recovered with a PLL and the I2S stream has to be cleaned. Still not a great interface for streaming.


John Sully

SD cards are much worse as they use a 4-bit interface, which takes much longer, and the technology is "not smart", meaning there is no caching or buffering, which means it will take longer to access.

~~~ Cache and DMA...

Guys, I write device drivers and do hardware all day long. Cache has nothing to do with DMA. When a device driver gets audio data it is strung together in the protected memory of the device driver with other packets going to that device or other devices. The USB controller takes the data that the DMA sends to it and delivers that to the DAC and other devices.

This is why I don't think packet jitter (referred to in most texts as frame jitter) has any relevance there.

I think that maybe RFI/EMI, power spiking is a more relevant area that should be looked at.



John Sully's picture

I was thinking SSD drives. Acronyms :-)

Caching and I/O are intimately related. For Intel architecture chips, the cache snoops the bus during I/O read cycles and invalidates matching lines in the cache w/o software intervention. For I/O write operations any data in the cache is flushed to memory. For low to mid range SGI machines, software invalidates/flushes the cache prior to the I/O occurring (the supercomputers used bus snooping). The reason this is done is because one doesn't want stale data returned from the cache once a read operation is complete and you don't want stale data copied by the device on a write operation. All of this is completely invisible to the device driver writer for both Unix and NT flavors. I haven't written Linux drivers, but for Intel architectures this is taken care of by the hardware, so no software intervention is necessary.

Unless you've got some magic hardware that can move data across the bus by divine intervention, DMA and caching will play a role, especially if you need to process (decompress an ALAC or FLAC file or decode an MP3 or AAC file) the data. If you are dealing with .wav or .aiff you should be able to just hand a pointer to the data buffer off, but there will still be memory accesses.

CG's picture

"I think that maybe RFI/EMI, power spiking is a more relevant area that should be looked at."

Hear, hear!

This is easier said than done.  Much of the noise is really a current that is hard to measure except by differentially voltage probing across a power or ground plane, or by using an actual current probe.  The levels aren't really high, either, so you pretty much need to use a high frequency spectrum analyzer instead of an oscilloscope. 

This really isn't as simple as turning up the volume control, sticking your ear next to your speaker's tweeter when nothing is playing, and declaring that your system is noise free. Or the woofer in the case of ground loops.

At the risk of sounding like a broken, ahh, wav file, many of the problems that people are trying to address and attack are system level issues. If you get down to it and perform serious engineering analysis of the various current paths in an audio system, just how they might be imperfect, and what the effects of these imperfections might be, you get into the endless battles of how "that doesn't matter" and so on.  Again, it isn't simple once you get beyond a certain point. It's not only engineering and physics, but how the human auditory system works.

Wavelength's picture


Actually I am thinking broader than the effects on the DAC; I mean the system as a whole. OK, so say we have opto-isolated USB, so that the ground of the computer system can be isolated from the audio section.

Then why do the applications still present a different presentation?

I am not a fan of upsampling, but some 80's tracks got me thinking and I upsampled some 16/44.1 stuff to 88.2, 176.4 and 352.8. Each of course increases the amount of code the processor runs and the demand on the system. The 176.4 sounded the best and the 88.2 was actually much better than the 352.8.

Well, 352.8 has diminishing returns, as even the best DAC chips are then working out of their optimized area and everything is doubled.

Note to all you people asking for more, more, more... just because you can do it doesn't mean it's going to sound better. Better to listen than to just go out and buy something because it does something your current gear doesn't.



Wavelength's picture

John Sully,

You are thinking of external cache, which was popular in the late 80's and 90's. Now the cache is central to the CPU itself, not the external memory which DMA works off of.

They found years ago that DMA and cache screwed up processing, as the code loop would be violated by a DMA access. Cache is only helpful in processing, not in moving memory around, so DMA has no influence or effect on the processor's cache.



John Sully's picture

Intel has been using internal caches for a while (1989 with the 486), same with MIPS processors. In those days both had external L2 caches. Internal or external, the same things have to happen to guarantee that your i/o operations are coherent. Without it, you get the wrong data on reads or the device gets the wrong data on writes (assuming you use the processor). It is possible that Intel has added instructions which allow read/write operations to bypass the cache if data is from/to an i/o buffer, but that really doesn't make a whole lot of sense since it would slow down both reads (slightly since you would be reading words, dwords or quadwords instead of cache lines) and writes (since you would be skipping the cache).

When we were developing the SGI VisualWorkstation line in 98 and 99 I went over and over this with our hardware team, who were mostly MIPS people and used to software controlled caches. I eventually won, which saved the company a ton of money. Unless Intel has made a huge breaking change to their architecture, I would expect things to be quite similar, although the L1 cache might be quite a bit larger than they were back in those days (it was 2-way set associative, physically tagged/indexed) and the L2 caches were typically much larger and 4-way or more associative. Now they are using unified L3 caches of several MB.

I just took the time to skim a paper on the newest(?) Nehalem (i7) architecture, and while there are (large) changes to optimize interprocessor communication, it remains much the same in terms of the coherency protocol, an additional line state having been added to minimize bus snooping on memory accesses.  The protocol is a variant of the MESI protocol which has been used in Intel (and most other hardware coherent) cache architectures for decades.

Link: http://rolfed.com/nehalem/nehalemPaper.pdf  See figure 1.

In that figure, note where the i/o bus controller sits and that it is able to communicate with the QPI, which in turn communicates with the cache and the memory.  Figure 1 illustrates a NUMA architecture, but Figure 4 (left diagram) shows a more conventional desktop implementation.  Notice where the i/o bus sits in both cases.  It is interesting to see some of the same issues SGI was dealing with in its supercomputers 15 years ago being dealt with on the desktop now, although not on as large a scale (we were working with several hundred processors on some of the larger Origin based systems).

pisymbol's picture

...cache coherence, like anything else that deals with shared data, is a concern on writes, not reads.

i.e. You and Gordon are talking over each other.

This discussion is within the context of playback software, in which DMA is armed with user buffers of LPCM data for an HCI, which will schedule the packets for transfer on the USB bus.

The flow is entirely outbound with respect to main memory so as Gordon points out, cache coherence is not really at play here.

But your comments about QPI and hierarchical memory are spot on. DMA addressable memory is volatile and there has to be some synchronization methods to maintain cache coherence when external (external to the CPU's mem controller) devices write to it.

Regardless, this article doesn't explain how ANY of this affects sound!

Gordon's definition of packet jitter and his subsequent measurements seem to indicate there is zero difference between software with respect to frame delivery through the cable to the DAC. None. I have not read one author of any of these pieces of software explain why they believe their bit-perfect player is better than the other guys' other than hand-waving explanations like "we use best software practices." Silly and insulting to the guys at foobar2k or the myriad of well written free players out there.

FURTHERMORE, almost ALL competent modern DACs buffer the incoming data anyway, which means any timing issues or "jitter" will be intrinsic to the clocks used within the DAC itself, nothing related to outboard spikes on your motherboard. A few nanoseconds of packet delivery jitter to the inboard buffer of the DAC should be inconsequential to any timing artifacts by the time the recon filter gets at it. As every engineer worth their salt is aware, recon filters resample the data at extremely high rates internally to do things like deal with quantization error more effectively (I'm not getting into this).

NOTE: I bought Audirvana for the Mac. I like its design. It has features that a lot of the other players do not, like on-the-fly sample rate switching. It works with all of my equipment flawlessly. Damien is a great developer and very responsive. Do I think it sounds better than any of the well written pieces of software? No.

Gordon, what I'd like to know from you is this:

Do you still believe Adaptive USB implementations always sound inferior to Asynchronous even if today's DAC effectively buffers and reclocks the data? If so, why?


Michael Lavorgna's picture

A number of people have reported problems posting comments. We're looking into it. In the mean time, this response is from CG:

I will not pretend to speak for John, or Gordon, or Michael, or even my dog.

My understanding of this article is that the processing of files containing audio information creates electrical activity within the computer.  Depending on how the software processes the data, and what processing is done by the application software as well as the rest of computer operations (drivers, operating system, etc.), this electrical activity varies in level and in spectral content.  Different application software coding as well as algorithms can produce the very same bit perfect data stream, but with different electrical activity within the computer.

Compared to an analog electrical representation of an audio waveform - which is what actually drives the loudspeakers and headphones in an audio system - this electrical activity is noise like.  Noise in the electrical engineering sense, not just hiss in the loudspeaker.

This electrical noise can, and probably does, migrate downstream in the audio system through the ground connections, the signal connections, the power connections, as well as the regular analog connections like interconnect cables.  Not only by direct conduction, but also by electromagnetic coupling of currents between these system components.

The noise might contaminate the conversion clock itself.  Or the clocked convertor component in the DAC circuit.  Or the analog processing following the DAC.  Or the preamp stage (if there is one).  Or the power amplifier.  Or some of the above.  Or all of the above.  "None of the above" is probably unlikely.

Please note that even if this contamination is outside the normally considered audio band of 20 Hz to 20 kHz, it still can cause degradation.  Obviously, well more than half of the circuitry working to convert the data stream into analog signals operates outside that band, so if it is somehow damaged, the desired output will be less ideal than we might want.  Even in a power amplifier designed specifically for use only in the audio band, the circuitry is still subject to out-of-band energy.

But, every engineer worth their salt knows all this, so I'm just outlining for the folks who might not be engineers.  These sorts of problems manifest themselves not just in audiophile music systems, but in just about every other electrical system there is. Naturally, what matters in one situation may not matter, or matter much, in another, so you have to examine each problem closely to see what can be ignored and what shouldn't be.

pisymbol's picture

Michael, I respect your efforts to try to explain this, but it simply doesn't fly, in any way...

I won't even go into the fact that, if what you said were true, transferring files from an external USB disk would be quite an interesting undertaking depending on the memory access spikes, as this original article seems to allude to.

It's just wrong.

As John Sully and Gordon explained, what the original author of this article describes does not jibe with how computers work, how motherboards and power supplies are designed, etc. etc.

I'd love him to DBT the software players based on the number of cache hits (we can count them via special registers on the CPU).

The bottom line is this (in as close to layman's terms as I can muster right now):

Audio data (PCM) that is contained in a WAV file or decompressed out of a FLAC file is encoded into USB packets (or frames) that are transferred through a wire as 1's and 0's. These 1's and 0's are represented by voltage/current transitions through the copper in the USB cable, clocked at a certain rate (Hi-Speed vs. Full-Speed, etc.). How these frames are laid out in a continuous stream, as well as the electrical characteristics of the cables and ports, is governed by the USB spec (the physical layer in the OSI model).
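To make the framing concrete, here's a little Python sketch (my own illustration, not code from any USB stack) of why a 44.1 kHz stream carried over full-speed USB's 1 ms frames has to alternate between 44- and 45-sample packets to keep the average rate exact:

```python
# Sketch: mapping a fixed PCM sample rate onto a fixed frame cadence.
# 44100 samples/s over 1000 frames/s is 44.1 samples per frame, so the
# host sends a repeating pattern of 44- and 45-sample packets whose
# average hits the rate exactly.  Integer accumulator avoids float drift.
def packet_sizes(sample_rate_hz, frames_per_s=1000, n_frames=10):
    sizes, acc = [], 0
    for _ in range(n_frames):
        acc += sample_rate_hz
        n, acc = divmod(acc, frames_per_s)  # whole samples owed this frame
        sizes.append(n)
    return sizes

sizes = packet_sizes(44100)
print(sizes)        # [44, 44, 44, 44, 44, 44, 44, 44, 44, 45]
print(sum(sizes))   # 441 samples over 10 frames = 44.1 per frame average
```

The bits are identical either way; the point of the article we're arguing about is whether the *timing* of those packets matters.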

Any noise leaking through, as you say, still cannot have any effect that would cause the USB spec to be violated; i.e., manufacturers must design components that deal with noise in such a way that they stay within the margin of error that keeps them compliant. This is true for all modern bus protocols (SATA, SCSI/SAS, Ethernet, etc.).

Gordon already talked about the variation in the rate at which these packets are transferred and delivered to the receiver, i.e., jitter. Read his post: he could not find any noticeable differences. There are other examples of folks testing various USB chipsets at FULL LOAD who couldn't find any real differences. Do you have any idea how much intrinsic noise you are going to cause running stress tests like that?

Moreover, and even more importantly, all the data streamed over the USB transfer will be BUFFERED. That's right, your outboard DAC or USB/SPDIF converter will buffer the packets and RECLOCK them with its own high-performance crystal oscillator. So even if all the packets in the USB stream that represents your favorite gangsta rap track didn't arrive at exactly the same rate, as long as the arrival buffer is full enough to reclock from, any jitter caused by this process is inconsequential. I am pretty sure asynchronous USB was developed by Gordon in an effort to let the receiver (the DAC) control the rate at which this buffer is filled, making reclocking a much more controlled process, instead of the host dictating to the receiver when these packets should arrive. He can correct me if I am misrepresenting it, or add anything else he feels is necessary.
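Here's a toy Python model of what I mean by that feedback loop (purely illustrative, invented class and numbers, not how any real USB audio firmware is written): the DAC drains its buffer at its own fixed clock rate and asks the host for slightly larger or smaller packets to hold the buffer near a target level, rather than slaving its clock to the host's delivery rate.

```python
# Toy model of asynchronous-mode rate feedback (illustrative only).
class AsyncDacBuffer:
    def __init__(self, capacity=1000, target_fill=500, nominal=44):
        self.capacity = capacity
        self.target = target_fill
        self.nominal = nominal
        self.fill = target_fill

    def request_size(self):
        # Feedback to the host: one extra sample per packet when the
        # buffer is running low, one fewer when it is running high.
        if self.fill < self.target:
            return self.nominal + 1
        if self.fill > self.target:
            return self.nominal - 1
        return self.nominal

    def host_delivers(self, n_samples):
        self.fill = min(self.capacity, self.fill + n_samples)

    def dac_drains(self, n_samples):
        self.fill = max(0, self.fill - n_samples)

buf = AsyncDacBuffer()
buf.fill = 400                 # start with the buffer running low
for _ in range(200):
    buf.host_delivers(buf.request_size())
    buf.dac_drains(44)         # the DAC's own clock consumes 44 per packet
print(buf.fill)                # converges back to the 500-sample target
```

The data timing out of the buffer is set entirely by the DAC's local oscillator; the feedback only nudges the host's delivery so the buffer never runs dry or overflows.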

John Sully's picture

If they are, and it appears so from a brief look at Wikipedia, shouldn't common mode noise rejection be pretty good?

CG's picture

They are mostly balanced for high speed mode, less so for full speed mode.  But, that only helps for the signal voltage being sensed by the USB data "receiver".  Noise currents still flow in the "ground" the rest of the system is attached to.  That's where the problems originate.

It's possible to minimize this noise, but it isn't especially cheap or easy.  

Great question!

pisymbol's picture

John S., that's right. It has to be otherwise non-audiophiles would be very, very upset.

The idea that memory controller spikes caused by cache misses which then somehow cause packet delivery "jitter" which somehow mystically degrades an outboard DAC's ability to do its job is well ...  let's just say dubious at best.

Anyway, I'm not going to defend my position or try to explain myself to the guy who says I completely missed the point of the article, because he's right, I did, since it's nonsensical ... the author of this article needs to define and correlate his noise "threshold" with audibility; otherwise it's just more FUD.

Put simply, how much noise needs to be generated by the host (computer) before audible artifacts can be detected? How many memory controller spikes or voltage swings are needed before audible artifacts exist? THEN, he needs to explain how these artifacts manifest themselves and circumvent the MANY MANY provisions put into the USB specification to deal with noise.

EDIT: In no way was I blaming Michael for the contents of the article; I was just explaining my position, perhaps too abrasively. Mea culpa.


CG's picture

The noise manifests itself as a current on the ground connection. A good discussion of this can be found at:  


What level of common mode noise in the ground might be a problem?  A high speed USB signal is nominally 400 mV.  Imagine that there is only 1 mV of common mode noise within the audio frequency band riding on the two data lines.  That isn't a lot and should not cause any trouble with the USB data detection, right?  The eye diagram would be pretty clean, at least with regard to noise. 

Now, what is the noise current level in the ground connection with 1 mV of noise? There must be some noise there, or nobody would've specified a balanced receiver to minimize common mode noise interference on the USB data lines.  The common mode impedance for high speed USB at the receiver is the two 45 ohm termination impedances in parallel, or 22.5 ohms.  Let's call it 20 to make the arithmetic easy. The noise current is therefore 50 microamperes, by Ohm's law.

If all of this noise current manages to pass through a 50 kohm input impedance of a typical preamp (or power amp), it becomes 2.5 volts of noise at the input.  Compared to the typical 2 V rms full-scale audio output from a DAC, that's not so appealing. 
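If you'd like to check the arithmetic, here it is spelled out in a couple of lines of Python, using the same assumptions as above:

```python
# Back-of-envelope check of the figures above.
v_noise = 1e-3            # 1 mV of common mode noise on the data lines
r_cm = 20                 # 22.5 ohm common mode termination, rounded to 20
i_noise = v_noise / r_cm  # Ohm's law: I = V / R
print(i_noise)            # 5e-05 A, i.e. 50 microamperes

r_preamp = 50e3           # 50 kohm preamp input impedance
v_at_input = i_noise * r_preamp   # worst case: all the current flows there
print(v_at_input)         # 2.5 (volts)
```

Again, this is the worst case; the real question is how much of that current actually takes the path through the preamp input.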

Of course, the entire noise current does not flow through the input impedance of the preamp, but, just how much rejection is there?  It will depend on the equipment design and the way the pieces are really connected together.  Don't forget, unless everything is battery powered and entirely independent from the AC mains wiring, there is a conduction path there as well.


My point is that noise generated in one part of a system does not magically go away just because we choose to change what we call it.  Noise currents will continue to circulate within closed loops unless the loops are broken or the noise power is either radiated or turned into heat by lossy elements. Sure, the USB receiver might be configured in a way that gives it pretty good rejection of common mode noise for the task it needs to do, but that does not break the noise current loop for the rest of the system.

In addition, that simple example was just for noise within the audio band. More likely, the computer-borne noise is higher in frequency.  Even as the frequency response of most audio electronics rolls off dramatically above a few tens of kilohertz, its linearity goes straight into the toilet. This is why you hear stories of AM radio stations or shortwave broadcasts bothering audio gear. So, even though the noise current may be even lower at high frequencies, the audio circuitry might also be more susceptible to problems at those frequencies. 



For those who'd like to read more technical information on the general topic:


I want to say that I'm not in any way trying to pick on anybody here. But, over time I've watched as various simplifications (what I believe the argument here is about) and other outright deceptions (I DO NOT believe that is true in this discussion) have found their way into common belief. That's not right or good for any person who is just trying to make an audio system they can happily listen to.

cundare's picture

>The idea that memory controller spikes caused by cache misses which then somehow cause packet delivery "jitter" which somehow mystically degrades an outboard DAC's ability to do its job is well ...  let's just say dubious at best.

OK, I was becoming increasingly agitated as I read through this article and the seemingly endless stream of replies.  As an electrical engineer for nearly 30 years, I found the article to be gibberish and, finally, I see another reader had the same reaction.  A detailed explanation of the ways that timing may vary in an asynchronous communication is, um, kind of irrelevant.  Async  communications is *supposed* to have variable timing.  That's its strength. And that's why, as you say, async comms interfaces are buffered.  As for voltage spikes on buses causing degradation in the digital domain, well, if that's the author's (tenuous) hypothesis, that's the issue he needs to support, not an explanation of why asynchronous packets don't arrive at the same time as synchronous packets.

This piece was just silly.  The most interesting revelations here are about the technical backgrounds of the people on this list masquerading as audio gurus.  Anybody seriously discussing the nits of this article as though it means something stands revealed as, um, not knowing all that much about communications protocols.  For shame.

John Sully's picture

Depending on what was done with the buffer prior to the read (if the line is in the cache for some reason), the line needs to be invalidated.  That's the only reason I mentioned reads.

CG's picture

First, do not blame Michael for that post.  As he wrote right at the very top, he was just relaying a post I was having technical difficulty posting.

Second, I think you missed the entire point.  Completely.  In fact, so much so, that I am probably not the right guy to try to help out here.

judmarc's picture

pisymbol, two things re buffering:

- First, I see where you've asked Gordon Rankin about async USB versus adaptive "even if today's DAC effectively buffers and reclocks the data."  If the DAC uses adaptive mode, then the clock in the DAC after the buffer will "adapt" to (i.e., be controlled by) the clock back at the source (player/computer), so it isn't reclocking in the sense of clocking separately from the source.  Thus the buffer doesn't remove any problems involved with the clock signal in the source or getting that signal to the DAC, it just means you're not going to drop bits.  If the clock in the DAC is indeed reclocking separately from the source player/computer, that's the definition of asynchronous mode.

- Second, I think it was CG who mentioned potential effects of noise on the DAC clock itself.  Since the DAC clock is timing data *out* of the buffer, the buffer won't help with that sort of problem.  Something else that I believe may have an effect on timing after the buffer, and even after the DAC clock, is the following: when the DAC chip evaluates the signal to determine whether it's seeing a 1 or a 0, it does so by comparing signal to ground.  Let's say there's electrical noise on the signal lead that elevates the value of the signal slightly with respect to ground.  Then a signal on its way up from 0 would get to the "zero crossing point" and a value of 1 slightly sooner, while a signal on its way down from 1 would get to the "zero crossing point" and a value of 0 slightly later.  Electrical noise on ground would have just the opposite effect.  So you've got small timing variations looking a lot like jitter, taking place right in the DAC chip, after the buffer and after the DAC's clock.
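To put rough numbers on that threshold-shift effect, here's a small sketch (the slew rate and noise level are made up purely for illustration): a constant offset on a finite-rise-time edge moves the instant it crosses the decision threshold.

```python
# Toy numbers for the threshold-crossing shift described above
# (hypothetical slew rate and noise level, purely illustrative).
slew = 1.0 / 10e-9        # edge goes 0 -> 1 (normalized) in 10 ns
threshold = 0.5           # decision point halfway up the swing

def crossing_time(offset):
    # Rising edge v(t) = slew * t + offset; solve v(t) = threshold.
    return (threshold - offset) / slew

t_clean = crossing_time(0.0)
t_noisy = crossing_time(0.01)      # 1% of full swing riding on the signal
print((t_clean - t_noisy) * 1e12)  # about 100 ps: the edge lands early
```

The shift scales with the noise amplitude and inversely with the edge's slew rate, which is why a slow edge plus a little ground noise behaves just like jitter.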