Brent has been a professional audio journalist since 1989, and has reviewed thousands of audio products over the years. He has served as editor-in-chief of Home Theater and Home Entertainment magazines, contributing technical editor for Sound & Vision magazine, senior editor of Video magazine, and reviews editor of Windows Sources magazine, and he also worked as marketing director for Dolby Laboratories. He's now on staff at Wirecutter.
What do audiophiles hate the most? Well, there's Bose. And the late Julian Hirsch. After those two, it would probably be blind testing, specifically ABX testing. Why? Because the results of ABX testing tend to conflict with much of what audiophiles believe. The topic of ABX testing may be poised for another look with the recent emergence of the Audio by Van Alstine AVA ABX, which to my knowledge is the first commercially produced ABX box released in more than a decade. In this article, I'll discuss what ABX testing is, explain the criticisms of ABX testing, and get a little into my first experiences with the AVA ABX.
When I found out about the AVA ABX from reading the comments section of this website, I immediately contacted Audio by Van Alstine's namesake, Frank Van Alstine, to see if I could borrow one to try out and then buy if it met my needs. I was attracted to it not for its ABX capabilities, but because it looked like a well-made and versatile switcher I could use in my reviews. I have a good switching system that I designed for this purpose; however, like most hand-built, one-off electronic products, it's not very reliable. I could see from the interior shot of the AVA ABX that it was built the same way my switcher is--with high-quality relays, minimalist controls for level matching, and a switching system. But the AVA ABX was designed by an experienced audio engineer, Dan Kuechle, who has the knowledge and resources to build products with professional-grade reliability.
What is ABX testing?
I've been using the AVA ABX for level-matching and switching in my reviews for a few months, but I hadn't actually experimented with the ABX function until recently. Here's how ABX testing works: The ABX box presents two audio signals, A and B, plus a third, X. X is either A or B; the assignment is random, and it changes (or doesn't change) with every trial. So you listen to A, listen to B, listen to X, and then decide whether X is A or B. Then you or the test administrator activates a function on the ABX box that displays whether X was A or B for each trial.
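The trial procedure described above is easy to express in code. Here's a minimal sketch of one session; the function and variable names are my own illustration, not anything in the AVA ABX's actual circuitry or firmware:

```python
import random

def run_abx_session(can_hear_difference: bool, trials: int = 8) -> int:
    """Simulate one ABX session and return the number of correct
    identifications of X. `can_hear_difference=True` models an ideal
    listener who always identifies X; False models pure guessing."""
    correct = 0
    for _ in range(trials):
        x_is_a = random.choice([True, False])   # X randomly assigned to A or B each trial
        if can_hear_difference:
            guess_a = x_is_a                    # ideal listener: always right
        else:
            guess_a = random.choice([True, False])  # coin-flip guess
        if guess_a == x_is_a:
            correct += 1
    return correct
```

The key property is in the first line of the loop: the assignment of X is re-randomized every trial, so neither the subject nor the administrator can know it in advance.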
Random guessing will, after enough trials, result in correct selections 50 percent of the time. So, to demonstrate a significant difference between A and B, you have to identify X correctly significantly more often than 50 percent of the time. Even someone randomly guessing might get 6 or 7 out of 10 right, so the results aren't meaningful unless you can do considerably better than that. For a 95 percent confidence level (a typical standard for statistical significance), you'd have to have correct identifications on 23 out of 24 trials. That's three test sessions on the AVA ABX, which provides eight trials per test session--quite a high hurdle.
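The arithmetic behind these thresholds is just the binomial distribution: the probability that pure guessing produces at least a given score. A minimal sketch (the function name is my own, purely for illustration):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided p-value: the probability of getting at least
    `correct` right out of `trials` by pure guessing (p = 0.5)."""
    total = 2 ** trials
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / total

# Even 7 of 10 correct is unremarkable for a random guesser:
print(round(abx_p_value(7, 10), 3))   # 0.172 -- happens about 1 time in 6
# 23 of 24 correct, by contrast, is vanishingly unlikely by chance:
print(abx_p_value(23, 24))            # ~1.5e-06
```

This is why a single lucky-looking session proves little, while a consistent score across many trials is hard to dismiss.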
A and B can be, well, anything: two speakers, two amplifiers, two preamps, two cables, two types of digital music files, etc.
What's the problem with ABX?
It seems pretty straightforward, right? The assignment of X is random, so neither the test subject nor the test administrator knows whether it's A or B until someone goes back to check. Thus, there's no chance that the brands, appearance, or prices of the components under test will influence the results.
The problem for audiophiles is, ABX testing has, to date, rarely revealed differences in sound among audio electronics components. This is why the debate about ABX testing became so fierce when the process emerged in the 1980s.
On one side, we have millions of audio enthusiasts and professionals who report hearing differences among audio electronics, between hi-res files and standard-res files, etc. And 50,000,000 fans can't be wrong, right? They present many scientific (or at least scientific-sounding) reasons why ABX testing is invalid. Some of those reasons are obviously questionable, as I'll detail below. And of course, audio writers are unlikely to embrace a methodology that may cast doubt on their previous written statements, and that might threaten their status as opinion-makers; and enthusiasts who just spent $5,000 on an amplifier don't want to hear that it's no better than a $300 receiver.
On the other side, we have a small group of scientifically oriented researchers, enthusiasts, and writers (many now retired or deceased) who insist that ABX testing proves such differences cannot be heard. When I read their articles (which are hard to find unless you have a stack of Stereo Review magazines from the 1990s), I sometimes get the sense that their work began not as an effort to find the truth, but as an effort to prove audiophiles foolish. Of course, there are many ways to set up a blind test to "prove" two products are alike in their performance and characteristics. You can use test material and conditions that make the differences among products hard to distinguish. Or you can get panelists who aren't particularly interested in the subject, or who have already made up their minds. To take an extreme example, I wonder if my late father could have passed an ABX test with Led Zeppelin's "Immigrant Song" as A and Deep Purple's "Highway Star" as B. When he heard tunes like these as I played them on the 8-track, it sounded like nothing but noise and screaming to him. So, if he couldn't have reliably identified which tune was which, does that mean they're indistinguishable?
Considering that both sides in the debate have an axe to grind, and that both seem so convinced of the correctness of their positions, I don't find either side persuasive. That's why I decided to take a fresh look at ABX. My hope is that, as a writer who doesn't fit into any particular camp of the audio world, I can weed through all the invective to find some honest, unbiased answers.
Criticisms of ABX
I probably couldn't pursue this without the AVA ABX, which I believe addresses some of the criticisms many audiophiles have of ABX testing. Let's examine the key criticisms here:
1) ABX boxes are poorly built and degrade the sound quality of the components under test.
2) ABX testing places too much stress on the test subject, whose performance will thus be impaired.
3) Judging the quality of audio components requires long-term listening.
4) Blind testing employs the left side of the brain, but art can be appreciated only with the right side of the brain.
None of these statements is, to my knowledge, verifiable or supported. I suspect that's in part because most of the critics of ABX have little or no actual experience with it. Here's how I respond to the above contentions:
1) I have never seen any audio writer present specific criticism of the supposed technical flaws in ABX boxes. You can see the guts of the AVA ABX box in the picture included with this article; tell me what the technical flaw is. And what would be the technical flaw in the ABX testing plug-in for the digital music player software Foobar2000?
2) Based on my experience with the AVA ABX so far, I can confirm that ABX testing is difficult and requires great concentration, but so does a serious comparison of any two products that are fundamentally similar. It's stressful only if you're worried that you won't get the "right" answer. And if you think there's a "right" answer, you're using your own bias as the standard to judge the validity of the results.
3) The idea that long-term listening allows audio components to be more easily and reliably distinguished is one that a lot of audio writers throw around, but I've seen no actual research supporting it. Fortunately, by owning the AVA ABX, I can make my comparisons as long as I wish. And how does that long-term listening test work, anyway? Let's say you've been listening to amplifier B for a month, and you think, "Wow, this thing really does seem to throw a bigger soundstage than amplifier A that I was listening to last month." No one's acoustic memory is anywhere near good enough to remember the subtleties of something you heard a month ago, so you have to go back to amplifier A to confirm--and then you're making short-term A/B comparisons again.
4) When you're judging amplifiers, you're judging the technical qualities of an electronic component, not art. Judging art would involve, for example, listening for how melodic or lyrical or rich or smooth or original a tenor saxophone player sounds. I can easily measure an amplifier's performance; no one can measure a saxophone player's performance.
The beauty of owning the AVA ABX (besides the fact that it makes my reviews more accurate and much easier to set up) is that it gets past so many of the criticisms noted above. In most cases, there's nothing but a relay (i.e., a switch) in the circuit, plus a simple volume control circuit. It lets me test products at my leisure, with whatever music I want, for as long as I want; I can do an ABX trial with a six-second snippet of music, the complete works of Gustav Mahler, or anything in between. I can "engage my left brain" and listen closely and repeatedly to a single element of a recording (such as a cymbal crash or a vocal phrase) or just let the music play and "engage my right brain" to form a gut-feel assessment of the sound.
I've gotten enough experience with the AVA ABX to know that I need a lot more experience with it before I make any proclamations about the validity of ABX testing. And that's just what I'll be doing in the coming months. I'll be testing various categories of products, and I'll bring in outside listeners to add their results to mine. Maybe, just maybe, we can get past some of the old debates about ABX testing--and some of the attitudes that have, in my opinion, calcified the craft of audio writing.
Like anything else in audio gear, cables need auditioning, ideally on your personal audio system. For myself, I use our own bespoke-design Malbru silver cables for analogue and the Malbru CX version for digital coaxial signals; sonically they're miles away from other highly priced options I've had and heard. In your search for a really good digital coaxial cable, or any other cable, please do consider some small brands; you may be surprised at what you get. We can change components and so on, but good-performing cables are there to stay as an essential part of our audio gear. Keep your options open, DO audition on your system or at your local hi-fi dealer, and let your ears be your judge.
ABX is only one way to use the "double blind" concept. "Double blind" simply means that neither the experimental subject nor the experimenter knows what is being tested, so both will be completely neutral. The ABX test is double blind and asks a question in such a way as to ascertain whether a difference is perceived or imagined. The test cannot prove there is no difference, but it can prove there is a difference. Of course, failing to prove a difference after many trials certainly suggests there is none. In a wine-tasting test, you might set it up so that there are two different wines, for sure, but neither subject nor experimenter knows which is which. You can then ask the question "Which do you like better?" or "Which is the better wine?" Both tests are double blind, but they answer different questions.
What a bizarre argument against testing. "I don't want to see automobile safety tests because it might make me look bad for buying an unsafe car." Yeah, I suppose it might. But isn't the real purpose to help people make better decisions going in? I just bought a Hegel H80 and KEF LS50 speakers. How much should I spend on speaker cables? $20? $50? $130? $320? I don't want to spend a lot of time auditioning cables because it's hard to do, and I don't live in the U.S., so the cable company won't ship me loaners. Should I play it safe and spend $300+, or by doing so am I just throwing away $200? I support testing to prevent the kind of mistakes you are so concerned about.
How about if we all put up some prize money and see if he can do it blind? I'll put in $100. Anyone else?
Why would that be? Because people are actually unable to tell the difference between the wines being tested? As explained earlier, ABX testing is not about determining which is "better" but only about determining whether a difference can reliably be distinguished.
Just buy the $800 Emotiva XPA-2 and you'll never need another amp, unless you want something that looks better. They all sound the same.
In the past, the issue was not so much the minimal values of resistance and such from the connectors and relays; it was the measured blending of electrical characteristics between the DUTs (devices under test). Such was pointed out years ago by Frank Van Alstine himself! Others have suggested doing interconnect comparisons via a Y-adapter from a single source to multiple inputs on a preamp. While that sounds perfectly reasonable, what you find when you measure the results is that you end up comparing nothing. The values for inductance and capacitance are summed and measure the same at either input: you compare A+B vs. A+B. I gather this box, unlike those used earlier, has finally solved the problem by not using shared grounds while eliminating potentially damaging switching transients. I see its only drawback is that it does not support balanced connections.
Terry, I'm not totally sure of your point, but this would not prove a memory effect. Creating fake switching and showing random-chance guessing (the same as a null test failing to reject the null hypothesis) only proves that the individuals could not tell the difference between groups better than chance. Since no switching actually took place, you still can't conclude that it's a memory effect, because it's still also possible that no difference existed (which was in fact the case). The experimental design needs to isolate all other possible explanations first, which this would not. You need to switch the signal in a way that actually does create an audible difference but would theoretically be masked by the memory effect. (I don't know how to do that, but I would guess we could alter a signal slightly in a way that we believe is interpreted by the brain in some more complex way, something known to have significant masking potential: for instance, adding and removing distortions thought to typically be masked by the signal, and varying the time to see whether shorter intervals are masked while longer ones are not.) In AB testing of sound, it is my understanding that we have yet to really measure and understand whether such a memory effect exists. As I understand it, there are optical illusions created with certain strong, bright colors switched rapidly that are said to be a sort of memory effect, and because they're repeatable and respondent descriptions are consistent, we believe we understand them. No such parallel test exists in sound that I am aware of. I think it's also worth noting that, in the grand scheme of the scientific field, sound quality differences are not on anyone's radar. There are few if any academic institutions looking to pursue the quantitative measurement of qualitative differences in sound. It's considered unimportant and an issue for marketing.
Rather, where the field studies this at all, it tends to focus on quantitative measurements believed to be correlated with qualitative differences in sound quality. As such, the likelihood that anyone is really going to move this field forward in any meaningful way, or ever really end this debate, is nil. There is no money in it and there is no academic value.
You may have developed ABX testing in audio, but ABX testing as a method dates back far earlier than the 1980s and was developed outside of the audio community. I have texts written in the 1960s extolling the virtues of double-blind experimental testing. "Null testing" is just another name for any test that compares against a null hypothesis, so this is still a null test, since a probability distribution is being applied. I think people need to remember that a test's inability to show significant differences between A and B does not prove there are no differences. That claim violates the scientific experimental approach. Rather, we conclude that we fail to reject the null hypothesis. In simpler terms, the experiment lacked the resolving power to show that differences exist. While it may be true that no difference exists, we haven't proven that and cannot. All of these tests are not proof of anything either, a very important point. I am a social scientist, a federal government consultant, and a nationally recognized expert in evaluation design. In the field I work in, we are careful to always remember that failing to reject the null basically means we have no evidence to support the notion of a difference. (In my field we are looking for benefits, so not just differences but effect sizes as well.) If even one well-constructed experimental test shows a significant difference between treatment A and control B, then a difference exists, no matter what other tests have shown. To say that ABX has rarely shown differences in audio components is somewhat erroneous, too. Improperly designed tests using improper experimental conditions often fail to show differences, but carefully constructed tests performed by Harman Labs, Earl Geddes and colleagues, and even some of my own consultation work have in fact shown differences.
I'm not talking about cable differences or amplifier differences, but things like the audibility of distortions introduced by circuit designs or driver designs: things that can be tightly controlled and easily replicated. The loosest design I've ever seen that still showed differences was by Earl Geddes, his wife, and his research team, looking to see whether there were audible differences and preferences among compression drivers from JBL, B&C, and TAD. The point is this: the biggest criticism of any test that routinely fails to find differences between A and B is that it seems to lack the necessary resolving power. Certainly, if enough tests are performed, with all sources of error that impact resolving power controlled, and they continue to never show differences, then we can start to feel that the resolving problem is likely in the human subject and not the experiment itself. My problem with this is that in very controlled AB testing conditions, we do find audible differences that should replicate in things like circuit design or, more often, speaker design, yet we often see common ABX tests fail to show this. The example I've given recently is the recent work showing that group delay in the midrange is audible at unbelievably small amounts when experimentally manipulated in the signal. Designing two speakers or two amplifiers that are identical in every way but group delay is impossible, so certainly no experiments have shown this yet in that regard (nor have they been performed). I think we need to be very careful in how we interpret the results of ABX testing. I'm still convinced that we need to work on the resolving power of the test.
The lessons learned in blind wine taste testing are fascinating, and probably have many applications to blind audio testing. The most interesting is that people can actually taste the exact same wine (without knowing it), but if told that the wines are priced differently, their brains react differently: https://www.wired.com/2011/04/should-we-buy-expensive-wine/ But, yeah. Non-blind testing is really like the guy who could identify the music on records by looking at the grooves. Sure, he could still do it if the label was left on the record. But the only way to know for sure was to take the label off.
I agree totally. Unless you're comparing, for example, two speakers that are the same except for the types of capacitors they use, there's no point in ABXing speakers. No one's contending that all speakers sound the same. At least I hope not.
Mark, I'm sorry to be so long-winded and contentious about this, but I'm tired of seeing unsupported statements like these repeated so often that they are considered fact. Just for a minute (or perhaps forever), forget everything you've read in high-end audio magazines and look at it this way: Just for the sake of conversation, let's say the ABX box does make a difference, something on the order of -0.1 dB at 20 kHz. Then let's say we're comparing amplifier A, which has been reported as sounding slightly bright, and amplifier B, which has been reported as sounding slightly soft. The ABX box's -0.1 dB error will make amplifier A sound more neutral, and amplifier B sound even softer. It will not make them sound the same. To make them sound the same, it would have to low-pass amp A and/or high-pass amp B. It would also have to know which amp needs which type of filter. This is the huge fallacy in statements like yours (which, to be fair, originated with high-end audio writers who should have put more thought into the subject). A slight difference in sound caused by the ABX box (if such a difference exists) would not diminish the magnitude of differences between components. It's no different from moving your speakers 1 inch closer to the wall. Moving the speakers might slightly change the system's sound, and might slightly favor one amp over another, but wouldn't erase the differences between the two amps. For the ABX box to make both components sound the same, it would have to either magically apply just the right filtering to each component so their sound matches, or it would have to introduce gross coloration that would obscure any differences -- such as maybe adding a 2nd-order low-pass at 8 kHz, or running the source signals through 32 kbps MP3. Of course, the ABX box doesn't do either one. 
This idea that ABX boxes, through some technical flaw, magically make everything connected to them sound the same is exactly what I'm talking about when I refer to the calcification of audio writing. Like your statement that "less connections are better," it's born of supposition rather than technical knowledge or research. Yet it's written and repeated so often that for many audiophiles, it is now considered fact.
When double blind wine tasting has been done, the results are often embarrassing to the expensive wine producer.
Why are you concerned with the respondent's memory capabilities during the ABX test, yet not concerned with that variable when someone pronounces that component A sounds much better than component B? I see the memory variable as identical in both situations, and hence not relevant to the discussion of difference. How do you propose to do your "null" test? We tried that early in our research and found it nearly impossible to set levels such that a perfect null was maintained. During ordinary listening, levels matched to within 0.1 dB are close enough that differences in level cannot be perceived; however, this is far from close enough when attempting to electrically null all signals. And, of course, this won't work with speakers, or phono cartridges, or even CD players.
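A back-of-the-envelope calculation illustrates why level matching that's plenty good for listening is nowhere near good enough for an electrical null. A sketch, with illustrative values (1 kHz tone, 48 kHz sample rate, 0.1 dB mismatch):

```python
import math

# Two identical 1 kHz sine waves, one attenuated by 0.1 dB:
level_db = -0.1
gain = 10 ** (level_db / 20)                       # ~0.9886 linear gain
samples = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]

# "Null" the mismatched copy against the original (invert and sum):
residual = [a - gain * a for a in samples]
peak_residual_db = 20 * math.log10(max(abs(r) for r in residual))
print(round(peak_residual_db, 1))                  # about -38.8
```

A residual only ~39 dB below the signal is easily measurable, so a 0.1 dB level error, inaudible as a loudness difference, completely spoils a null test even when the two signals are otherwise identical.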
I don't use "cheap" cables, but not because they "sound different" when working properly. They do clearly sound different when the contacts become oxidized and make a bad connection, or when flimsy connectors bend or break, making a bad connection.
Actually, we did that (tested for ABX box audibility). Nobody could reliably hear whether it was in the circuit or not, although several members thought they could hear it. They were right 50 percent of the time.
Hi Brent, I wasn't arguing; I was simply stating a fact that fewer connections are better. You appear to be saying that the ABX box would be undetectable in any system and make no difference whatsoever... So I'm assuming you use Radio Shack cables as interconnects, or am I wrong? Well, heck, there are fewer joints in a couple of RS phono cables. Yes, I understand where you're coming from with the complexity of active components, but they do tend to (generally) have negative feedback, which does seem to level the ballpark, so to speak. Somehow I don't think you use Radio Shack cables, so quite how you can say the ABX box makes no difference I don't honestly know. If it didn't, then you (just like myself) would use cheap cables and be quids in...
Much like homeopathy, dowsing, acupuncture, and other pseudosciences, many of the more "magical" aspects of high-end audio are being forced to the fringe. It makes sense, considering people have much more information at their fingertips today than ever before. As long as someone in the market for a piece of audio equipment goes into their research without any preexisting biases, it is really not that difficult to conclude that purchasing $1,000 speaker cables is akin to buying an "isotope isolator that aligns unregulated negatrons in order to eliminate suborbital quark disturbances within the speaker cone's plane of existence," or whatever cobbled-together technobabble is out there vying for your hard-earned dollar. It's getting harder and harder to fool customers with the audio equivalent of the ideomotor effect. There will always be those who buy into magical thinking when it comes to their audio passions, but like those who still think an invisible hand can reach up through hundreds of feet of soil and yank a hickory stick earthward when the holder passes over an aquifer, their numbers are dwindling as science continues to enlighten our race. ABX testing is science. Science is good. Science is real.
Hallelujah, Mark! Someone who actually gets it. Spend time (and money) on the things that make an audible difference: speakers, cartridges if you still listen to vinyl, amplifiers if you have difficult loudspeakers. Don't waste your time and money where it will make zero or almost zero difference, such as preamps, DACs, and, dare I say it, interconnects.
Every medical student knows about the placebo effect. To prove that ABX testing's independent variable is short-term memory, create a BOGUS test by eliminating the *switching* variable: only one playback setup is used, even though respondents and the facilitator are told there are TWO power amps for trial comparison! Have any number of trials. A perfect score would be all "A" or all "B" for every "X." What do you think the outcome will be? Having demonstrated the same among those who are either arrogant or ignorant, the outcome is the same... No one sees the Emperor's new clothes!
A *null* test [1 + (-1) ≠ 0] can do the same to tell the DIFFERENCE between two. It's long been used for Q/C testing in manufacturing. When is SOMEONE going to mention that ABX testing is a test of respondents' memory, short-term or otherwise?
If you could identify the difference between non-malfunctioning USB cables in a blind test, you may be able to qualify for the James Randi Prize (as well as re-writing the physics of engineering of digital signal transmission, possibly qualifying you for a Nobel Prize). This is especially odd because the Benchmark DAC2 uses asynchronous USB control, so the DAC is controlling the timing and clock information of the audio coming over the USB. The cable is an irrelevant link as long as it is transferring the signal; all the timing and buffering is being done after the cable! If the cable makes a difference in how the data is getting to the DAC, then that might indicate a serious problem with the timing control and buffering in the DAC. This would seem to be an ideal case for blind testing; no one is doubting that you can tell the difference when you know which USB cable is being used. The real question is whether you can tell the difference when you don't know which cable is being used.
You're obviously right; a single ABX test with a single user doesn't tell us much beyond what the capabilities of that single user were (or weren't). But I can't imagine a supporter of ABX testing ever suggesting a single user would be an ideal test case. Obviously, you would want as many different users as possible, with different variables introduced as needed. All of this testing can only provide valuable information. But as Shike points out, if someone makes a claim regarding the audibility of something (be it expensive cables, a unique DAC, etc.), at the very least their claim should be tested and placebo effects need to be isolated. It's just basic psychology. That doesn't mean additional non-blind tests can't be performed, but at least the manufacturer (or reviewer) could be honest and say "In 40 blind ABX tests, none of the participants could reliably distinguish between the $400 speaker cable and the $20 one. In a non-blind test, 40 of the 40 could, and all preferred the $400 one."
I'm not saying it makes any sense. I purchased a $75 3' Audioquest USB cable to go from a Windows 7 PC (2nd gen i5, Intel board, 8GB, SSD) to a Benchmark DAC2 HGC from Amazon because, as someone who has owned an IT company for 30 years, I knew it couldn't make any difference and would be returned. Half right. I did return it but purchased the $130 version instead. As far as power cords and SATA cables - nope, just the standard variety. Analog interconnects to my power amp are $40 4' cables from BlueJeansCable. (Tried some cables from MIT and they were horrid) Speaker cables are 20 year old Tara Labs Prism BiWire ($5/ft at time of purchase). I've tried others but like the sound of these best. Playback software is JRiver which, to me, blows the doors off Foobar2000 and everything else I've tried. Music is stored as .wav files because there is a clear difference between the .wav and .flac codecs. I can transcode back and forth all day long but playing the .wav file always sounds better. I agree with you 100% that a USB cable between the PC and DAC can't make any difference. I disagree with you, however, since it does.
I'm looking forward to some results! Soon, I hope. Thanks for your efforts.
This is the same old argument: that the extremely tiny and probably unmeasurable effect of a couple of wires, a relay, and a few solder joints in an ABX box is enough to completely erase the differences in performance between two components that are vastly more complex and have readily measurable differences. Embarrassingly, many high-end audio writers make the same obviously flawed argument.
How much burn-in is recommended for the wires in the ABX?
I think ABX isn't particularly suited, or is instead overkill, for that application. A simple level-matched blind listening test, with someone on hand to make sure the sweet spot doesn't get compromised, would probably be sufficient. With speakers, you likely don't need to determine whether you hear a difference at all, as they usually differ to a good degree, especially when you bring the room into it; you just need to know which you like better at some point. Having some blind element would be good, though.
What were you testing with the USB cable? When data is digitized, compressed, packetized and buffered, and when the entire signal path through the computer is taken into account (including the traces on the motherboard), I can't imagine how the cable is supposed to affect the sound. Do you also buy expensive SATA cables for the hard drives on your computers?
But even with those limitations, ABX can still be preferable to any other alternative for comparing speakers. At the very least, it could be one part of an overall evaluation. If we know the brand, price, and reputation of a pair of speakers, it can have a huge influence on how we perceive the sound. Eliminating these influences can have a huge benefit, even if there are still differences in placement and room acoustics between speakers.
So why not ABX the ABX box itself? If you think it's introducing noticeable degradation, then test it. See if you can hear the difference between a signal path with the ABX box, and one without it. The resistance to ABX testing has always mystified me. There is literally nothing to lose, and everything to gain (and money to save). If you compared a $20 speaker cable and a $1,000 speaker cable and discovered you could not discern a difference in the sound between the two, you just saved $980. What's wrong with that (unless you are a manufacturer of $1,000 cables, of course)? If you ABX'd a pair of Pioneer SP-BS22 ($127/pr) speakers against some Dynaudio Focus 160 speakers ($2900/pr) and discovered that you preferred the sound of the Pioneers, what is the downside? And if you ABX something and find that the much more expensive option is actually much "better" and preferable, again, what have you lost? Nothing.
I have been involved with audio for many years, and I believe a lot of the claims for products are exaggerated. I have always felt that the thing that made the largest difference in the quality of the music was the speakers. I have a sensitive ear, and years ago I could describe exactly what people should listen for when auditioning a set of speakers and tell them the flaws to look for in that particular speaker. Amplifiers seem to be the next link; reserve power and slew rate tend to be large factors, letting an amp react quickly to the music and reproduce the sound with impact. After that, things start to get tricky and very subtle. I would love to see your panel pass the "speaker cables sound different" test. Phono cartridges are hard, but yes, there is a definite difference. As for hearing the difference between an $800 turntable and a $40,000 turntable: just not going to happen. I think the audio business would be so much better with this ABX box; then we could concentrate on spending our money in areas that WILL make a difference and leave all the voodoo behind. As an aside: I was at CES on the third day of the show, and people in a room were listening critically to some $70k speakers. As soon as I started listening, I could tell they were out of phase! I told the exhibitor, and his reaction was one of fright that I might be right. He checked, and I was right. Not everyone has a clue what to listen for or even how to listen. It becomes a curse; you're never satisfied with your system.
I don't believe that ABX tests are legitimate for speakers. However, for cables or electronics they are completely valid. For the test to be accurate, you need to use the same material when listening to A, B, and X. Anyone who suggests that you are listening to different material in a different way does not understand the methodology of the testing procedure. It is hard for some to understand that many types of electronics, such as amplifiers, may sound similar if their basic measurements are the same and they are asked to perform within their own parameters. I can remember my own shock in 1980 when, as a joke, I purchased a $175 NAD 20-watt amplifier and plugged it into a system that was using a pair of 100-watt AR tube amplifiers, and heard almost identical results and actually better damping for the bass. This was probably five or six years before I had heard of double-blind testing or participated in it. I enjoy listening to music and keep a 2-channel system for music separate from my home theater. I have not participated in double-blind testing in over 20 years. It always amazed me that the ones who were sure something was rigged were usually the ones with the most to lose in the tests, such as shop owners selling $100-per-foot cables or people who had just invested $20k in electronics. I have always enjoyed Brent's reviews and look forward to hearing of any tests that he performs with the box. And by the way, I am not saying that all amplifiers sound or perform the same. My mono tube amps were purchased to drive Quad electrostatic speakers. Later I needed a lot of horsepower for Carver Amazing loudspeakers. Sometimes you need an amp that will handle highly reactive loads with strange impedance curves, or speakers that are just plain inefficient. If you do your A/B/X testing with two dissimilar amplifiers at a loud volume into one of these speakers, you will more than likely be able to pick out the more solidly built product.
If you take the same amps and connect to the two inexpensive speakers I am now playing with (Kvart & Bole Sound Sommeliers) at a reasonable level, you probably won't hear any difference.
So you think a myriad of connections in a system will show differences? Let's just take this ABX box, for instance. Where you might normally connect a preamp, you have the input connectors to this box. Then there are the soldered wires to the phono connectors, followed by who knows what cable, which connects to a PCB (more soldered connections), then through PCB traces to a relay, more soldered connections, relay contacts, and yet more soldered connections to more PCB traces, more soldered connections to who knows what cable (once again), more soldered connections, and finally the output phono sockets. That's a lot of connections. I think we try to minimize these things as much as possible. There's no way for an outsider to verify the quality of this ABX box; it might be classed as professional, but what does that mean? Is this the ultimate quality? I very much doubt it. I'm not a cable freak, but I can see a whole load of problems given the introduction of so many new connections into any system. Anyone who disagrees might like to see if their system sounds the same with a few thousand extra connections. I know you wouldn't consider it; the fewer the better. So introducing this box will, no doubt, in one way or another reduce the sound quality, ABX or not.
To say that an ABX box doesn't degrade the sound is laughable, given that in a single-blind test with multiple people, I have achieved the funniest faces and 100% accuracy by swapping a $130 USB cable for several 50-cent versions. Note that EVERY victim expected to hear no difference (including myself) and none of them had a dog in the race. Several years ago I made a copy of a CD using Exact Audio Copy and had someone swap discs; I was wrong 100% of the time, which proves I did hear a difference consistently. I tried the ABX feature of Foobar2000 and failed miserably, but then again, Foobar2000 isn't the best-sounding playback software on the planet to start with. NPR posted a test a couple of months ago comparing .wav with several compressed files, and I was hard-pressed to consistently tell a difference (I think I got 5 of 6 right). There was a massive difference between their .wav files and a CD of the same recording played back on the same computer through the same DAC, amp, and speakers, however. In order for the test to work, you actually have to start with decent source material. If JRiver or some other comparably good software had an ABX feature, I would love to try it, but to suggest that inserting ANYTHING into the signal path, analog or digital, doesn't degrade the sound is nonsensical.
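For what it's worth, the "5 of 6 right" result above can be checked with simple binomial arithmetic: under pure guessing, each ABX trial is a coin flip, so the chance of a score at least that good is a binomial tail probability. Here is a minimal sketch (the function name is my own illustration, not part of Foobar2000's ABX tool or the NPR test):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of getting at least
    `correct` answers right out of `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 5 of 6 right happens by chance about 11% of the time --
# suggestive, but short of the usual 5% significance threshold.
print(round(abx_p_value(5, 6), 3))  # prints 0.109
```

So a 5-of-6 run, taken alone, is consistent with guessing; more trials would be needed to call it a reliably heard difference.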
"Unfortunately, this is an invalid conclusion. Just because 1 listener didn't perceive a difference doesn't entail or even imply that there are no differences" For the user taking the test, there are effectively no differences. They could be deaf, or there legitimately may be no difference, but all that matters in the test is what the user is able to hear in an ABX, and thus to them any differences they choose to go on will not be based on sound. You spoke in regard to a single user, so my response was worded in relation to your example. Please don't move the goalposts. If we want to speak globally, we have yet to see a positive case of similarly measuring electronics being positively identified in a controlled ABX. While lack of evidence is not evidence of non-existence, it's a good indicator of where the burden of proof lies, e.g., Russell's teapot. It's not upon me to prove the lack of something; it's upon the one asserting existence to prove it. So I will happily say that electronics operating with similar linearity, within their power and operational constraints, will sound no different. It's the null hypothesis. "Additionally, there is absolutely no information entailed or implied as to why that listener failed to hear them." The inability to hear a difference is all ABX is built to indicate, and if listeners consistently fail to hear one, arguing that more research is needed, without even a positive sample to the contrary or a proper hypothesis to test, is a fool's errand. "Concluding that they do not exist requires more." Which is why you can throw the components through a measurement barrage and see that they usually perform extremely close if they are properly designed and not driven past their limits. If you wish to assert that these aren't sufficient evidence, an argument grounded in facts is required.
"If these are enough to make it so you cannot identify a piece of equipment in an ABX, then clearly there is little to no difference worth discussing as far as sound is concerned." Unfortunately, this is an invalid conclusion. Just because 1 listener didn't perceive a difference doesn't entail or even imply that there are no differences -- only that that listener didn't hear them (which is usually why these tests are run with large pools of subjects). Additionally, there is absolutely no information entailed or implied as to why that listener failed to hear them. Concluding that they do not exist requires more.
I would never use ABX for speakers, honestly; there are far too many variables, and an ABX test isn't needed for things that measure widely differently. Dispersion characteristics alone, which excite different room modes and create different reflections, can have a huge impact.
"Am I *really* listening for the same things in trials 4-7 that I was in trials 1-3? What if my attention wanders? What if the doorbell rings? What if I'm thirsty or it's a humid day or my power grid is under particular load during one part of that testing period?" These are all normal conditions when you're listening without the ABX as well. Are you always listening to the same thing each time you listen to a track? Do you know if you'll be thirsty that day? Etc. Trying to blame the test for normal conditions outside of the test is entirely missing the point. If these are enough to make it so you cannot identify a piece of equipment in an ABX, then clearly there is little to no difference worth discussing as far as sound is concerned. This does not mean that all equipment is the same, as you can discuss company support, build quality, aesthetics, etc. Do you want to go through multiple replacements of an amp every three to five years because of its lower cost, or have something that only needs its caps serviced after twenty years? Does it look and feel cheap enough to make you dislike it generally? Do you want to hide it because you can't proudly display it on your rack? These are valid concerns for some, and it's not my business to tell you you're wrong for valuing them. But don't claim it's about the sound when it isn't.
Thank you. Thumbs up!
The ABX box has no ax to grind. It is impartial. Audiophiles may hate it because it reveals that their listening skills are generally not that special, or at least not special enough to reliably discern a difference between two similarly performing devices.
I am one of the six people who developed the ABX test and formed the ABX company in the 1980s. We developed the instrumentation and the algorithm to help settle the question of whether certain components sound different from one another. Our audio club had spent many hours arguing such issues, and a medical researcher in the group (me) suggested a double-blind approach as the one least influenced by non-audible factors. This was embraced by the club with enthusiasm, and many such tests were performed. And many differences were proven to be heard: speakers always sounded different; small level differences were reliably heard, and often misinterpreted as something else; small frequency-response differences were reliably heard; amplifier overload characteristics differed, as did noise levels; etc. There are a lot of misconceptions about ABX testing out there, some of which have been well covered by Brent. Here are some more: 1. That ABX testing determines which component is "better." It is NOT for that purpose. It is to tell whether two components sound "different." If they sound different, then one may or may not sound better, and that is a matter for a different test. 2. That an ABX test determines that two things sound the same. It does not. It can determine that two things sound different; failing to prove that they sound different does not prove that they sound the same. However, if on repeated short- and long-term listening with the ABX algorithm you fail to prove a difference, it is certainly reasonable to conclude that if there is a real difference, it is so small it probably doesn't matter. 3. That the average listener who enjoys his system must concern himself with ABX. He need not. However, someone who reviews products, or someone who wants to spend megabucks on wire, would do well to prove that the products being reviewed or purchased do indeed sound different, and having confirmed that they sound different can then intelligently decide whether they are worth the cost.
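The arithmetic behind point 2 is worth making concrete: "failing to prove a difference" means the score didn't clear the binomial significance bar, and that bar depends on how many trials you run. Here is a small sketch of that threshold; the function is my own illustration, not part of the original ABX company's algorithm:

```python
from math import comb

def min_correct(trials: int, alpha: float = 0.05) -> int:
    """Smallest score whose one-sided binomial p-value under pure
    guessing (p = 0.5 per trial) is at or below `alpha`."""
    tail, total = 0, 2 ** trials
    for k in range(trials, -1, -1):
        tail += comb(trials, k)   # accumulate P(score >= k) numerator
        if tail / total > alpha:
            return k + 1          # the previous k was the last significant score
    return 0

for n in (6, 10, 16):
    print(n, min_correct(n))      # 6 -> 6, 10 -> 9, 16 -> 12
```

Note that with only 6 trials you need a perfect score to claim a difference at the 5% level, which is one reason short ABX runs so often end in a "no proven difference" result.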
And if YOU can't hear the difference in an ABX test (short term or long term), it probably isn't worth the cost premium to you.
Bromo, I think the point of the ABX box is so the average fan can go out and reliably get "good" speakers, especially since we can all agree "higher priced" doesn't always mean "better" quality.
This is exactly the same reason why Robert M. Parker, Jr. of The Wine Advocate refuses to take a double blind TASTE test.
This is one of the most well-balanced articles I've read on the pros and cons of ABX testing; the arguments on both sides were very well laid out, and I really look forward to your findings. I do find it very interesting that lines are already being drawn in the comments. What are you people afraid of? I don't think anyone is going to claim that if you can't statistically pick out a component there is no audible difference, but it will be interesting to see how many times components can be identified.
LOL. These tests (in situations where they can be implemented PROPERLY) force audiophiles to confront the fact that they do not hear many of the differences that they THINK they do. At a stereo club meeting, a guy I brought along tried to sell people his "transmission line" RCA interconnect cables, which went for $300/pair. They put him on the ABX box with his cables vs. $2.00 cheap blister-pack cables. He batted about 50%, meaning he could NOT distinguish them. I have seen so many articles disparaging the ABX test. I know in ADVANCE what people will say when they take one about a $5,000 DAC or something, and FLUNK IT. (NOT that all expensive DACs are indistinguishable; that is just an example.) It has become the JOB of most high-end audio writers to PROMOTE this stuff, rather than face reality and let many of these companies go the way of laetrile.
No one is saying, suggesting or implying (or has probably EVER said) that average listeners need to participate in ABX testing as a supplement or alternative to music listening. You and Part-Time Audiophile below have constructed a straw man. Likewise, no one is saying that you shouldn't enjoy listening to your music on whatever good stereo system you own. In fact, the only times I can think of when a commenter implied that someone else SHOULDN'T be enjoying their music are when audiophiles criticize mass-market equipment as being "not audiophile grade," as someone did yesterday on the Home Theater Junkies FB page when referring to a Samsung system he'd never heard.
The fact that speakers will sound different in different locations in the room is a major consideration. As anyone who has ever been to the Harman Northridge, CA listening lab will tell you, they factor this out by doing double-blind testing with the speakers behind a screen and a conveyor system that moves each speaker under test into the exact same spot. When I first took this test about 10 or so years ago, I was pretty nervous, as my reputation as a critical listener and audiophile was on the line. I did very well in the double-blind testing (whew). But I wonder how many people are simply afraid to take an A-B-X test. That said, I don't think you need to do an ABX test to "scientifically prove" that you're hearing differences between speakers and components. But I can't prove that!
While you may be okay with being fooled into parting with ridiculous amounts for average equipment, so long as you can enjoy your music, a lot of us are not. You hardly qualify as a troll for wanting better reviews. Your condescending attitude against it, more so.
It IS difficult to conclusively settle this seemingly obvious question. I could see and agree with one side, then go have coffee; after coffee, I could then agree with the other side. This perplexes me. Suffice it to say, I can certainly understand the knee-jerk purchases that get made in audio. In that moment, you have taken all of the stress out of being logical and correct. Who wants stress? We sum up our buys as enjoying the path along the way. Only a few have made a choice for themselves that lasts for decades.
It saddens me that there is enough demand to justify a "pro-level ABX box." This signals that the trolls and the flame wars are winning, and enjoying yourself and your music is losing. I love listening to music in the few hours I get to unwind at the end of a day. Enough so that we bothered to get good speakers and a nice stereo to enjoy it all on. I have no desire to disrupt this by opening it up to people picking through it, declaring choices I have made to be genius or foolish. That's not listening to music, that's trolling, and I gave up Usenet newsgroups in the 1980s when the trolls took over. But I don't judge how people want to use their free time. If their hobby isn't listening to music but performing ABX testing, who am I to judge? If it makes them happy, and so long as they don't bother me, I'm fine. I will say that, based upon the back-and-forth between the debunkers and the believers, neither side seems particularly happy, and I'd suggest they'd have a happier time spinning a few sides of a favorite band and having a few beers.
I've always been curious as to why "testing the skill/talent of an average listener" is a valid method for learning about anything in the world that that listener happens to be in. Seems more than a bit obtuse. Better still, instead of listening to the sounds occurring around you, now what are you doing with an ABX or DBT test? "Is this clip different from the last clip? Is this clip the same as the clip from 4 trials ago, or different? Am I *really* listening for the same things in trials 4-7 that I was in trials 1-3? What if my attention wanders? What if the doorbell rings? What if I'm thirsty or it's a humid day or my power grid is under particular load during one part of that testing period?" Not saying it's not interesting, but as a methodology, it's a little curious why it's so revered. If I wanted to know whether or not a particular drug was going to impact a general population? DBT the hell out of it. If I wanted to know whether a particular painting was "more compelling" or not? Not sure a DBT is the thing I'd be pulling off the shelf to answer it.
As someone who worked in the industry for several years, I agree with your article. I would caution against speaker A/B/X unless the setup is done in a complete and fair way. It is easy to give one pair of speakers an unfair advantage by setting up a "favorite" in the sweet spot of a room. We used to set up more expensive speakers in the preferred spot of a room to make them sound far superior to how they would have otherwise sounded. I will not mention the name of the store for obvious reasons. That this used to be standard practice was a given inside many high-end sound rooms. Perhaps this has changed? I'm not so sure. A/B/X is much more valid for electronics.