Further Adventures in FLAC

During the course of my video interview with Scott Wilkinson of Home Theater, a viewer asked a question regarding Async USB DACs and uncompressed FLAC playback and I did not have a good answer. So I did some research. And I still did not come up with a compelling answer. So I sent off an email to Gordon Rankin of Wavelength Audio since he was the first to implement the Async USB mode in a consumer audio DAC and he's done some listening tests comparing FLAC to WAV files. Here's what I asked Gordon:
I did some listening to compressed FLAC v WAV and found I could hear a difference between them (and I preferred WAV). The most popular theory I've seen offered to explain this difference in playback is that the extra processing to 'unpack' compressed FLAC may introduce timing errors.

The question that was raised went something like this - if an Async USB DAC offers near-zero levels of jitter, wouldn't the DAC essentially correct these timing errors caused by FLAC playback?

And here's Gordon's answer:
Actually this does not introduce jitter or timing issues. I wrote about this 4 years ago and still have no better understanding of why this is the case. We did case studies during 10 shows with a MacBook Pro and bootcamp and doing AB FLAC/ALAC and AIFF/WAV/WAVB and always found that the flat PCM file (AIFF/WAV) sounded better than the lossless file. To take this further I set up the following test:

MAC(Bootcamp Windows & OSX) <==USB=={USB Protocol Analyzer}===>DAC--->Prism dScope Audio Analyzer

Off the DAC internally I hooked up my jitter analyzer and my Tek Scope which has an I2S decoder on it. I therefore could look at the File, USB DATA Stream and the I2S DATA stream and compare them.

Nothin...

As a programmer I can tell you the process overhead with converting the FLAC/ALAC to a flat PCM file which is what is required for play back is really heavy duty.

I mean the same can be said of different applications. Ok we have six OSX programs and when they are all bit true, then why would they sound different? Got me... I really cannot tell you why.

Ok maybe EMI/RFI is getting into the system. Well we did all the testing on the Cosecant which is optically isolated and considering my lab is a mess then it should be worst case. No matter....

So the idea that compressed FLAC playback causes an extra load on the processor thus introducing timing errors appears to be put into question by Gordon's test. You can see some recent discussion of this very topic on Audio Asylum. Here's an interesting proposition from that discussion put forth by Steve Nugent of Empirical Audio:
I believe that the reason why FLAC sounds so bad with USB and Firewire is that the behavior of the execution of the FLAC code is radically different than when streaming networked, partly because of the real-time nature of the former, but also because the core audio stack undoubtedly causes higher latency and multi-thread issues with the CODEC. There may be synchronization problems created by the core audio stack because of the jerky nature of its execution. It may be that running a stream to the network allows the CODEC to run contiguously.
And this from ItemAudio:
We've also run numerous sighted and unsighted audition tests, in house and with customers, and only in a minority of cases have listeners not favoured WAV/AIFF over FLAC/ALAC. The difference is subtle, but remarkably consistently described. Of all the ideas thus far advanced, none seem wholly satisfactory, but there's clearly something going on.
And this from Tony Lauck:
Many, myself included, have heard differences playing FLAC vs. WAV files, at least with certain software. Without specifics as to the equipment and software involved it's hard to draw any conclusions. Also, "numerous tests..." is not a source of authority without details of how the tests were conducted. Those of us who have been around for a long time recall many occasions where we have fooled ourselves, have been fooled by others and in some cases, as much as we'd like not to admit it, have fooled others.
As with other topics in hi-fi such as the audible effects of cables, dimensionally-challenged room treatments, LP demagnetizers, and so on, there exists a clear break between what is perceived and what is felt to be possible or accounted for using the "scientific method" or, some may suggest, just common sense. Many people believe that without a rational explanation and documented testing to confirm that there is (or isn't) a difference between uncompressed and compressed playback, it is irresponsible to suggest that there is. From my point of view we are talking about a hobby whose ultimate goal is the enjoyment of music so a perceived difference that consistently adds to our enjoyment is as valid a measure of effectiveness as any other.

In this particular instance we are not talking about investing in a piece of kit that may or may not in fact do what it says it does. We are talking about file formats and the only difference relevant to this debate that we can all agree on is that uncompressed formats take up more storage space. On the other hand, most reasonable people seem to agree that there may in fact be a notable difference in the playback of uncompressed and compressed formats but the reason for this difference may not be limited to or caused by the specific file format in question. In other words, something else in the playback chain or some combination of things may be causing a perceived difference.

One of the issues that makes determining difference in this case so difficult is the sheer number of possible combinations of things that need to be addressed in order to come up with conclusive evidence one way or another. And since there's no financial angle in play, I can't imagine anyone devoting the time and money to take on this task. So my guess is we'll never know for certain one way or another.

On the bright side comparing file formats in your system is free and easy so the most sensible answer to this question appears to be just try it and decide for yourself. That's what I did.

COMMENTS
Patrick Butler's picture

I've had the exact same experience.  Initially I was skeptical that there should be any difference between apple lossless and wav.  Then I listened.

crowfax's picture

Without the results from blind ABX testing that prove the listener can tell the difference between the two identical sound files, one in WAV and one in FLAC, beyond all statistical doubt, this article holds almost zero weight as a piece about FLAC compression and decompression.

If anyone would like to do their own tests then I'd recommend using Foobar2000 with the ABX Comparator plugin on Windows. I'm sure the Mac has something similar with which you can get actual scientific results so you don't just spout nonsense about which format you "prefer".

Michael Lavorgna's picture

But you'd be no closer to determining the cause of any perceived difference.

Preference, in a hobby concerned with the enjoyment of music, is hardly nonsense. It’s actually common sense.

usernaim's picture

Unfortunately for our purposes here, the ABX comparator creates a temp .wav file for each sample, and that is what you are hearing, since it assumes no difference in real time decoding.

whell's picture

When reading posts in web forums or comments in articles where folks claim to hear a difference between FLAC and WAV, I'm never sure if they're referring to fully uncompressed FLAC in the comparison, or FLAC with some level of compression.  In fact, some folks seem surprised to learn that FLAC compression levels can be adjusted at all, since several of the more frequently used conversion or encoding programs come with FLAC compression preset at "5", and the compression level adjustments are somewhere hidden in the software's menu (dbPowerAmp for example).  Therefore, I tend to reject references to "hearing" differences when I read them from participants on web forums.

Michael references uncompressed FLAC as the comparison source in the article, but the quotes in the article do not specify whether their anecdotal examples used compressed or uncompressed.  I'd "assume" uncompressed, but then there's that old saying about "When you assume...."  

Other factors which are absent from the article - specifically in the quotes - are the software and operating system used.  I'd suspect additional differences in "processing" heard during playback are derived from the operational efficiency / overhead consumption of the software or OS.  Whether the comparison is conducted using MPD on Linux or J River on Windows 7 or Mac OS with iTunes, I suspect, matters significantly.  

That said, I'd love to have this matter sorted out once and for all.  As the poster above suggested, ABX comparisons with different playback software and operating systems involved in the test would be ideal.  Not sure how realistic it is, however.

Michael Lavorgna's picture

Which was the main point of the comparison - compressed v. uncompressed.

It seems clear that due to the number of variables involved, the most practical solution is for those interested in exploring this issue to do so for themselves.

whell's picture

OK, so maybe I'm confused.  My observation was based on the initial question that you posed to Gordon Rankin which referred to "compressed" FLAC.  It didn't specify uncompressed FLAC.  I read that entire article in that context.

"I did some listening to compressed FLAC v WAV and found I could hear a difference between them (and I preferred WAV). The most popular theory I've seen offered to explain this difference in playback is that the extra processing to 'unpack' compressed FLAC may introduce timing errors.

The question that was raised went something like this - if an Async USB DAC offers near-zero levels of jitter, wouldn't the DAC essentially correct these timing errors caused by FLAC playback?"

If the comparison is based on uncompressed FLAC, then I'd assume there's far less "unpacking" to do, and far less opportunity for timing errors or other artifacts.

Michael Lavorgna's picture

I think it may help to clarify that there are different compression levels for FLAC files, as you have pointed out, which are typically represented by numbers (0 through 8 for example) although some software like XLD uses a scale from "None" to "Normal" to "High". The file I used for the comparison was a compressed FLAC file using a compression level equivalent to “5”.

dBpoweramp uses the terminology “Lossless Uncompressed” for its uncompressed FLAC format as opposed to “Lossless Level [0-8]”.

If I had used an uncompressed FLAC file for the comparison I would have said something like - I did some listening to uncompressed FLAC v WAV

I hope that clears things up.
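
For readers unsure what a "compression level" changes and what it leaves alone, here is a small sketch. It is an analogy only: it uses Python's built-in zlib rather than FLAC, and the sample bytes are a made-up stand-in for PCM audio. The point it illustrates is that a lossless compressor can store data uncompressed (level 0) or pack it harder (level 9), and either way the decoded bytes come back identical; only the stored size and the decoding work differ.

```python
import zlib


def round_trip_sizes(data: bytes, levels=(0, 5, 9)) -> dict:
    """Compress `data` at each level, verify it decodes to the same bytes,
    and return the compressed size per level."""
    sizes = {}
    for level in levels:
        packed = zlib.compress(data, level)
        # A lossless codec must return exactly the bytes it was given.
        if zlib.decompress(packed) != data:
            raise ValueError(f"round trip failed at level {level}")
        sizes[level] = len(packed)
    return sizes


# Hypothetical stand-in for PCM data (not real audio).
pcm_stand_in = bytes(range(256)) * 64
sizes = round_trip_sizes(pcm_stand_in)
print("original size:", len(pcm_stand_in))
print("compressed sizes by level:", sizes)
```

Level 0 here plays the role of "Lossless Uncompressed": the file is slightly larger than the raw data because of the container overhead, but nothing needs unpacking. None of this says anything about how a given player sounds; it only shows that the decoded data does not depend on the level.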

crowfax's picture

But you'd be no closer to determining the cause of any perceived difference.

Can we see the ABX results that prove you can tell the difference between the two files?

Michael Lavorgna's picture

And you already know the answer to your other question so it isn't really a question.

You are making the point that without the use of a valid and verifiable ABX test, I cannot prove to you that I was able to perceive a difference between the two files.  And I agree.

When I’m ready to prove something I’ll let you know. In the mean time be forewarned - you can expect that I will continue to talk about listening and preference.

crowfax's picture

The whole reason I'm subscribed to your RSS feed is to listen to your opinions, because I have no doubt that you know much more about audio equipment than I do. Speakers, amps, DACs, headphones, they're all things that are subjective. Liking them and disliking them is based on preference, that's why I'm here.

But when you're discussing two file formats that can reproduce bit-perfect copies of the content that was on the CD, and you're saying you can hear a difference, I need to see test results otherwise there's no way of knowing if it's just placebo or not.

Michael Lavorgna's picture

As I mentioned in my initial post, I listened to the same track repeatedly throughout the day and into the evening. This track was playing on my MacBook Pro which sits on my equipment rack and I cannot see or read what’s on the monitor from where I sit.

I left the room a number of times throughout the day and evening so when I came back in I had no way of knowing which file was playing (unless I walked over to the MacBook and looked). By early evening I heard differences between the two files, which I talk about in the original post. Once I became aware of the differences, I could easily and quickly hear them and I correctly picked which file was playing every time. I did not count how many times I identified the correct file but after about 10 out of 10 I felt confident that I could in fact hear a difference and I could, without knowing which file was playing, correctly identify the WAV from the compressed FLAC file.

It's important to note that I am not questioning whether or not FLAC is in fact lossless, rather suggesting that something happens during playback that causes an audible difference in my system between compressed FLAC and WAV. As to the exact cause of this difference, or whether or not it will be the case in every system, I have no idea. The number of possible causes and system configurations is so great that it just makes sense to me that anyone interested in this issue listen for him or her self.
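
For anyone who wants to put a number on an informal tally like "10 out of 10", the arithmetic can be sketched as below. This is a minimal sketch, assuming each trial is an independent two-way choice where a pure guess is right half the time; the function name is hypothetical, and the calculation says nothing about whether a given listening session actually met those assumptions.

```python
from math import comb


def chance_probability(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# 10 correct out of 10 by luck: 1 in 1024.
print(chance_probability(10, 10))
# 8 or more correct out of 10 by luck is far more likely.
print(chance_probability(8, 10))
```

The same function works for any tally, which is why writing down the number of trials and the number of correct answers matters more than the overall impression.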

Vincent Kars's picture

It is hard to find any documentation on the Internet about how Foobar ABX comparator works.

I understand it converts all audio to 32 bit float PCM files first and then randomizes them.

Great, you have 2 WAV files you can’t identify so indeed a blind test.

Not so great: if the hypothesis is that differences in sound quality between WAV and FLAC are due to the computational effort required during conversion to raw PCM, this ABX test won’t help as you are comparing bit identical raw PCM to raw PCM

http://www.stevehoffman.tv/forums/archive/index.php/t-265346-p-2.html

Michael Lavorgna's picture

Thanks for sharing that information.

PNCD's picture

This is totally uninformed "opinionering"!

First, Mike, you could indeed test Uncompressed FLAC versus WAV and see if you get the same effect as with the 5 compression level (still lossless but with more processing).  If it is distinct then it suggests something about file format rather than data content.

Second, the actual audio stream being passed to the DAC is not FLAC but native PCM so USB, Firewire, DAC are out of the loop.  Just the processing by a computer to unwrap a FLAC format versus raw WAV and send it down a wire.  I would have expected the typical communication de-jitter method used by the DAC to handle jitter in the WAV-originated PCM to be just as effective at handling the, potentially, different jitter that might come from the FLAC-originated PCM.

Probably all wrong on my part.

bobvin's picture

In your article you mention the processing load to decompress the FLAC file as a potential source for the sound difference you claim to be hearing. I have a small windows computer dedicated to just serving FLAC via JRiver to an ARC Dac8. I have trimmed down just about every non-essential service in the Windows OS, leaving 8Gig of memory to serve just the JRiver app and little else. The app, when idle, allocates only about 6meg of memory, and when playing a file the memory use jumps to about 75meg... BUT, the processor (in this case an Intel i3) seldom shows more than 1% usage. (And spawns only 3 additional threads when a file is played.) Perhaps decompressing a FLAC file is more of a burden on a lesser processor, and an i3 is just slightly more robust than a core 2 duo. So on any relatively modern chip (even a Celeron) the task of decompressing the file is not heavily demanding of the processor. And since the DAC sees only the decompressed bits, you have to rule out anything downstream of the application. Now you're into some heavy computer science to determine if accessing the data from storage in WAV format vs FLAC and getting it to the bus can possibly result in a sound difference. I wonder if your perceived sound difference will appear more or less pronounced depending on playback software?

And as a newb to all this, how with a WAV do you maintain the metadata, which is a vital component of the computer audio experience?

Michael Lavorgna's picture

 

While I do refer to processing load for compressed FLAC playback, I also say this:

“So the idea that compressed FLAC playback causes an extra load on the processor thus introducing timing errors appears to be put into question by Gordon's test.”

My best guess is there are a number of factors that contribute to this perceived difference which is why I recommend people try this comparison for themselves.

Re: WAV - I recommend ripping to Uncompressed FLAC not WAV. I get into some detail on this subject here.

milosz's picture

Anyone who says he hears a difference between format A and format B is entitled to believe he hears that difference.  And I am entitled to believe that he JUST THINKS he hears that difference unless he can prove to me with statistically valid data that, in fact, he CAN hear the difference.  That means blind comparisons.

So, until I see some decent data from decent comparisons, when I say that these folks JUST THINK they hear differences, my assertion is JUST AS VALID as their assertion that they hear these differences.

 

So, I assert:  NO ONE can hear a difference between FLAC and WAV; but many THINK they can.

 

Prove me wrong.

Michael Lavorgna's picture

Have you ever seen Roman Polanski’s film “The Fearless Vampire Killers”? There’s a scene where a vampire breaks into a woman’s bedchamber. Startled, she holds out a crucifix at arm’s length to keep him at bay. To which the vampire responds, “Boy, have you got the wrong vampire.”

But back to your point, I would say….oh damn. I think I hear someone calling me. 

deckeda's picture

... is that previous explanations for any sound differences have been guesses*. That doesn't make any perceived differences "fake" but will forever invite criticism and in my opinion, more than a little unproductive argumentation. And if that last sentence includes me as well, then ... case in point.

So I'd press those that have heard the differences to press on and discover why they exist.

* Gordon mentions "process overhead" --- again, without testing on a supercomputer (or whatever ... where's that limit again? What kind of monster computer would one need to do it right, if the computer is the weak link?). Gordon says upfront it's not a timing issue.

* Steve Nugent thinks it is a timing issue ("latency") AND a too-weak computer problem ("multi-thread issues") And then talks about the codec being streamed (more "latency") --- All of that is basic stuff that should be handled with aplomb via adequate CPU, clear bus paths, adequate buffering etc. If Steve is right, he's advancing a theory that there are some basic software transcoding or translation issues that the software and hardware engineers aren't aware of. I wish him well.

Run the question by Gordon again but with uncompressed FLAC as the bogey. Let's learn if FLAC remains an issue, or if as he supposes, all uncompressed is OK. If that's right, and uncompressed FLAC gets a clean bill of health, Steve Nugent needs to at the very least shore up his hypothesis regarding the FLAC codec not streaming properly.

Vigna ILaria's picture

Here is how my playback software ("BitPerfect") handles playback.  The source file is opened, read and decoded (converted into a raw PCM data stream), and stored in RAM.   For the biggest, most bit-heavy, sample-rate intensive files, this takes all of five seconds or thereabouts.  Meanwhile, playback starts as soon as the first few chunks of the PCM data are available.  The ONLY parts of the whole playback process which are in any way, shape or form impacted by the format of the source file, are done and dusted after 5 seconds max.  Once the PCM data is in RAM there is not one line of code that executes at all differently according to the format of the original source file (until near the very end of the track, when it will start to pre-load the next track).  This from the guy who wrote the code.

I know that Audirvana works more or less the same way, and I would be surprised if other playback software was not essentially similar.

I hear no differences whatsoever between formats.

markg's picture

Vigna's post caused me to think a little harder about "how", even within his software architecture, things may not play out as described. Depending on exactly how a specific piece of software is written, what we think we wrote, and how the computer executes that code, may sometimes differ. This is particularly true in modern multithreaded operating systems.

The more code execution exhaustive a program is (i.e. large program size as compared to physical instruction cache size), the more the OS will experience instruction cache misses for the opcode fetches. This will cause the CPU's caching hierarchy to kick in, and old or new instruction memory pages will be brought into lower levels of cache. This may (usually) even cause "paging" where memory sections are swapped on and off of physical disk.

The key point is that the same holds true for data memory consumption as compared to physical data cache size. If a program is using a much larger amount of data memory (as is the case when anything is decompressed), there will likely be cache misses and multiple page faults. Thus data that the programmer may have thought was in a physical memory buffer has really been swapped out to disk. Then during the playback phase, as the CPU is reading deeper and deeper into the decompressed buffer, the OS is swapping pages (i.e. the data) back into physical memory from disk. All of this would cause additional bus loading and RF noise within the PC itself. All of the above happens transparently to the application.

Now, the above is the simplistic model. Many OS's do allow the programmer some limited control over how and where memory comes from. But even in the Linux environment, this can be limited (unless the execution is happening in the kernel itself). It's possible that audio software programmers have already taken paging and OS memory management into account in their designs, or maybe they are just letting the OS do its thing. If they have dealt with these memory management issues, then we have to keep digging. If they haven't, then possibly (only possibly) there is some room for incremental improvement.

Of course this is just scratching the surface. We'd also have to examine how operating system provided device drivers are designed and written, etc. etc. It's possible that the only way to truly control all of these variables is with purpose built hardware, running a purpose built (or hand tailored) operating system (e.g. no paging). Then ensuring that platform is then executing an application program which itself exploits the hardware and OS design features provided. Of course a specific OS kernel could be provided (or tailored) for off the shelf hardware too.

Until then, it's too much guesswork.

Michael Lavorgna's picture

For taking the time to share this thought-provoking and well-considered post. It's very much appreciated and I hope it serves to keep any further discussion moving in a positive direction.

markg's picture

There exists a body of observations where it is reported that reducing and eliminating unnecessary applications on a given computer has resulted in improved sound quality.

If we examine how an OS achieves the illusion of multiple concurrent application execution, we will see some of the same OS behaviors I described above. Specifically, "paging" and "increased bus utilization".

A single CPU can only execute as many applications as it has execution pipelines. Multi-core CPUs provide at least one execution pipeline per core. That is until one of the pipelines is stalled by some contention. The point I am making is that multiple concurrent applications tax the computer system in order to provide the illusion of true application concurrency. It works great if the computer isn't being used as a DAQ (Data Acquisition system) with "hard real time" constraints.

Under the OS hood, there is a scheduler deciding which applications get how much time to execute and on which cores that execution will occur. The more applications that are in the "ready to run" state, the more context switching that is performed by the OS within the "system". Each context switch (i.e. application swap, or thread swap (depending on application design)), causes the OS to swap the old application out. On many common CPUs, this is simply done by "paging out" the old application, and "paging in" the new application.

Within the computer system, this would lead to the same issues implied in my earlier post (i.e. greater noise and jitter in the whole "system").

I think that companies like Auraliti, Bryston, Olive, Naim and their ilk, who are building dedicated audio server platforms, have probably realized a lot of these issues and their correlation to audio quality. The solutions won't be truly simple, but they are easily attainable if we are willing to acknowledge that the solution requires an overall design that encompasses hardware, operating system and application, all working in concert.

In a few years, I expect that computer audio will be way ahead of where we are now. Which is way way beyond where we were just two years ago.

Shroud's picture

This stuff about metadata is just not accurate from a pragmatic view.

That foobar screen shot with the missing info for wave is funny.  You can easily get the album artist info displayed if you view by folder structure and have set up your ripper to rip to a folder structure that has the artist and album in separate folders.  In Foobar you just set the artist and album to show as : $directory(%path%,1)  where 1 is the album and 2 is the artist. Pretty simple, and most players support browsing based on a directory structure, which your flac files are probably in anyhow.  For instance if you just view in directory structure then Foobar will show albums and artists.  It is only when you play them that you need to do the directory trick.

Foobar and others also support cue sheets which will also show all the info and are much easier to do, not that the directory structure is hard.

This is simple.  Just use wav and a player that can handle them properly (the best sounding ones do anyhow, though I don't count Foobar in that group) and there is no need to worry or to mess with uncompressed flac for that matter.

There is no need to miss album info and artist info with wave files and the right player!!

Michael Lavorgna's picture

FLAC encoder wording changed, also includes a FLAC Uncompressed encoding option (which stores audio uncompressed, for those who want WAVE PCM but with better ID Tagging).

I referred to the "...with better ID Tagging" part and I still agree with it. FLAC offers better tagging because it uses embedded metadata and it is universally supported. Sure you can get tags to work with WAV files in a given scenario but I’m interested in a ripping and tagging method that works in most scenarios as well as one that is portable.

Besides, there is no advantage to ripping to WAV as compared to uncompressed FLAC while there is an advantage to ripping to uncompressed FLAC. Sounds like a pretty pragmatic choice to me.

Shroud's picture

--Sure you can get tags to work with WAV files in a given scenario but I’m interested in a ripping and tagging method that works in most scenarios as well as one that is portable.

I found waves to work nicely in foobar, jriver, winamp, and jplay and cplay.  xxhighend also has wave support.  For sound quality these are pretty much the best players.  What other players compete?  Squeezeboxes also do waves nicely.

You do raise a great point with portable players.  I never had any luck with getting any kind of good sound with the portable players I have tried, so I don't see the need.

I'll have to listen to uncompressed flac and see if it sounds as good as wave, though I am skeptical.

Michael Lavorgna's picture

I look forward to hearing about your uncompressed FLAC/ WAV listening comparisons.

coleco's picture

Try recording an output with a digital loopback interface then inverting and mixing with the original wav file. In the case of bitperfect output the resulting file should be all 0s. Often this can settle the 'it's different' debate once and for all. If you're really paranoid then uncompress all your FLACs back to WAVs, thereby ending the need for any debate at all.
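
The invert-and-mix check described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: both inputs are raw 16-bit little-endian signed PCM, already the same length and sample-aligned (a real loopback capture would first need trimming and alignment), and the function names are hypothetical. It compares data only, so it cannot say anything about timing, electrical noise, or other effects outside the sample values.

```python
import struct


def decode_pcm16(raw: bytes) -> list[int]:
    """Unpack raw 16-bit little-endian signed PCM bytes into sample values."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}h", raw[: count * 2]))


def null_residual(original: list[int], captured: list[int]) -> list[int]:
    """Invert the captured samples and mix (add) them with the original."""
    if len(original) != len(captured):
        raise ValueError("streams must be the same length and aligned")
    return [a + (-b) for a, b in zip(original, captured)]


def is_bit_identical(original: list[int], captured: list[int]) -> bool:
    """True when the inverted mix cancels to silence on every sample."""
    return all(sample == 0 for sample in null_residual(original, captured))


# Made-up sample data standing in for the original file and two captures.
original = decode_pcm16(struct.pack("<4h", 0, 1200, -1200, 32767))
clean_capture = decode_pcm16(struct.pack("<4h", 0, 1200, -1200, 32767))
altered_capture = decode_pcm16(struct.pack("<4h", 0, 1200, -1200, 32766))

print(is_bit_identical(original, clean_capture))
print(is_bit_identical(original, altered_capture))
print(null_residual(original, altered_capture))
```

A residual of all zeros shows the two streams carried the same samples; any non-zero entry marks where they diverged.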

Michael Lavorgna's picture

I prefer curious.

And here's a relevant quote from the article:

From my point of view we are talking about a hobby whose ultimate goal is the enjoyment of music so a perceived difference that consistently adds to our enjoyment is as valid a measure of effectiveness as any other.

Shroud's picture

That test won't settle anything.  It just captures 0s and 1s.  I don't think anyone thinks flacs don't have the same data.  The data is the same though the processing isn't, the electrical situation isn't, rfi and emi is different, etc. and those are what can cause a difference in sound.

---If you're really paranoid then uncompress all your FLACs back to WAVs, thereby ending the need for any debate at all.

EXACTLY what I am saying.  Just use waves and there is no need for debate!!!!
