
Are Audio Double Blind Tests (DBT) To Be Doubted (TBD) ?


Message added by Marc,

This thread will be allowed to continue, but each post will need to be approved by moderators. We understand this slows the discussion down, but it is the only way to ensure that posts are on topic, helpful and relevant to the thread. Many posts have now been removed that were way off topic or considered to be trolling.


In a recent thread this happened

 

  

11 hours ago, frednork said:
13 hours ago, Ittaku said:

Well that's the thing. They ALWAYS fail at these tests of questionable changes, so either every single DBT ever conducted in the history of audiophilia has been done wrong, or something else. Some random person saying they conducted a DBT at home and heard a difference doesn't count since no one knows exactly how it was truly conducted.

It all depends on what you are trying to prove and what was done. Broadly speaking, if the goal of many of the DBTs was to prove that people with unknown abilities of discrimination, when put into an unfamiliar situation with an unfamiliar sound system and room, in random listening positions, and tested once, were unable to determine a difference, then I would say the tests have been successful.

 

Of course it would be better to look at the individual tests and how they were done and go from there.

 

So this thread will have a look at some that have been performed and see how they stack up.

 

To be clear, although I have worked in sensory analysis, it is not my area of expertise. I am not coming at this as an expert practitioner but as someone who has been intimately involved in many tests on both sides (i.e. the tester and the tested). My observation from sensory analysis of food and drink is that blinding already makes it very hard for random people to discern a difference. Sometimes I have seen people on here advocate that test subjects should have no information whatsoever about what is being tested. That is fine for an involuntary response (a change in some blood marker), but applying my previous experience here, I would say the more the subjects know about what is being tested, the more likely they are to reliably discern a difference. A question I would put to those people is: "As long as effective blinding is used, what does it matter how much a subject knows about the test being done?"

 

Sometimes I do informal testing or blind testing with a friend. On the occasions where I have known nothing at all about the difference being tested, I have found it very difficult to determine a difference, as I don't know what I should be listening for and there are many dimensions of possible difference. Is it in the bass, mids, treble, sibilance, soundstage depth, width or height, level of background noise (blackness), transient speed, dynamics, or one of the many variations within those? I think of it as a Where's Wally exercise where I am being asked what colour Wally's top is, but I can't even find Wally because he is not easy to find. Try assessing across all those parameters in a 15 second ABX sample.

 

Most serious audio DBTs seem to be set up by engineers rather than sensory specialists, and I would suggest this is where the issues come in. Imagine trying to measure something with your super duper sensitive measuring rig, BUT you don't know if it will give the correct answer because it can get performance anxiety, and it gets distracted at some random point and becomes much less accurate. You can't make it measure too many things too quickly because it gets fatigued. And most importantly, some of the gear (test subjects) never worked in the first place: although it looks like it works, it is fundamentally unable to.

 

This is how listening trials should be approached, and all these issues (and more) should be mitigated. The blinding bit, which is most often obsessed over, is the easy bit.

 

The trials have been chosen from the list in this thread: https://www.head-fi.org/threads/testing-audiophile-claims-and-myths.486598/ (provided by @Ittaku). I have had a quick look and, heads up, not all are DBTs; some are just links to ABXs and other resources. I will go through every one regardless, as if you haven't seen it before it is probably still of use.

 

1 - ABX Double Blind Comparator.

This is a web site dedicated to such testing. Back in May of 1977 there was a comparison of amplifiers which found over three tests of two amps each, listeners could tell a difference in two, but not the third which was an even split. It is important to note that not all of the ABX tests here are negative. Some do find differences can be identified. That shows that with some parts of the hifi chain there are real differences, but with others there are not.

http://www.foobar2000.org/components/view/foo_abx

A test of interconnects and speaker cables found that no one could pick out the differences between a series of wires, from $2.50 ‘blister pack’ cable to $990 speaker cable. All the results were even, with approximately 50% going for the cheap and expensive options.

There is an interesting comparison of ‘video cables’ which found that once over 50 feet it was easy to spot which was the 6 foot cable and the much longer one.

DACs don’t fare well: an original CDP was found to be distinguishable from a more modern one, but an expensive standalone DAC was found to be the same as a CDP.

None of the tests involve a large number of people and some are of just one person.

 

OK, the tests mentioned are not there or have been removed. You can still try the ABXs there, and they are worthwhile for building your "listening muscles". In an ideal audio trial all participants would first be tested across some parameters to ensure they at least have a fair chance of being able to discriminate some difference. Better still would be regular training in the type of conditions that you would be doing the test in.

 

2 - Effects of Cable, Loudspeaker and amplifier interactions, an engineering paper from 1991.

http://www.aes.org/e-lib/browse.cfm?elib=5975

Twelve cables are tested, from Levinson to Kimber and including car jump leads and lamp cable, priced from $2 to $419 per metre. The results are based on the theory that loudspeaker cable should transmit all frequencies unscathed to any speaker from any amplifier, and that loss is due to resistance. There is an assumption that letting through more frequencies with less distortion will sound better, but that seems reasonable to me.


The best performance was with multi core cables. The car jump leads did not do well and cable intended for digital transmission did! The most expensive cable does not get a mention in the conclusions, but the cheapest is praised for its performance and Kimber does well. Sadly there is not a definitive list of the cost of the cables and their performance, so it is not clear as to whether cost equals performance, but the suggestion is that construction equals performance.

 

This is just a comparison of electrical measurements, no DBTs. The link doesn't take you to the full paper, but I found it here: http://tmr-audio.de/pdf/kabl_cap.pdf . If you don't think speaker cables can be different, get with the program and read this. Don't give me the "no difference to bell wire from Radio Shack" guff!!

 

3 - Do all amplifiers sound the same? Original Stereo Review blind test.

(The original Bruce Coppola link is broken, and I cannot find any existing link at this time)

A number of amplifiers across various price points and types are tested. The listeners are self declared believers and sceptics as to whether audiophile claims are true or not.

There were 13 sessions with different numbers of listeners each time. The difference between sceptic and believer performance was small, with 2 sceptics getting the highest correct score and 1 believer getting the lowest. The overall average was 50.5% getting it right, so that is the same as you would expect from a random guess result. The cheapest Pioneer amp was perfectly capable of outperforming the more expensive amps and it was ‘striking similar to the Levinson‘.

As an extra to this and for an explanation of how amps can all sound the same, here is a Wikipedia entry on Bob Carver and his blind test amp challenges

http://en.wikipedia.org/wiki/Bob_Carver#Amplifier_modeling

 

Again no link and no real DBT, but more informal sighted listening. I have added a link; it is still interesting and worth a read though: https://www.stereophile.com/content/carver-challenge

 

4 - Cable directionality.

Not the best link as it only refers to a test without giving too many specifics. The cable maker Belden conducted a test with an unnamed magazine, which found the result was perfectly random.

I liked the next sentence which was “Belden is still happy to manufacture and sell directional cables to enthusiasts”

http://www.aes-media.org/sections/pnw/pnwrecaps/2000/lampen/

 

No real info on how the test was done, but apparently it was a DBT. If you did it properly it would take some effort, and then you would want to let people know what you did, so I call TBD (to be doubted) on this one.

 

5 - Head-Fi ABX Cable Taste Test Aug 2006.

Three cables from Canare, Radio Shack and a silver one were put into the same sleeving to disguise them, a mark put on each one so only the originator knew which was which and then sent around various forum members. The result was that only one forum member got all three correct. The Radio Shack cheap cable and the silver were the most mixed up.

Unfortunately I cannot see from the thread, which is huge, how many members took part and what the exact results were.

Strangely there was no link to the test, even though it is on the same site. Here it is: https://www.head-fi.org/threads/blind-cable-taste-test-results.190566/ . It's a little confusing, but this guy sent out three cables which were covered so as to mask their true identity, and then people sent them on to each other.

Triangle Cable is Solid Silver

Circle Cable is Canare
Square Cable is Rat Shack
Blind Cable Taste Test is the arena where Head-Fiers await the challenge in their own home. The Head-fiers have one week to tackle the sonic mystery of the cables. Using all their senses, skills, creativity, they are to discover for themselves what the true mystery beholds.

 

OK, what's wrong with this? It's blinded, and people can do what they want on their own system and repeat as many times as they want. Well, what are we asking them to do? Identify 3 cables when we don't even know if they sound different, and we don't know if the participants have heard these cables before. It's just dumb and a big waste of time. How could they have done it better? One of many better options might be to send 2 identical cables and 1 different one and ask people to identify the different one. Even so, the potential inability of the system to show the difference, or of the participant to discriminate it, etc. is a factor here. I call TBD!!
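To put a rough number on the guessing problem in this design, here is a minimal sketch. It assumes a participant with no ability simply assigns the three labels to the three cables at random; that is an illustrative assumption, not how the Head-Fi thread analysed its results, and the function names are made up for this example.

```python
from fractions import Fraction
from itertools import permutations

CABLES = ("silver", "canare", "radio_shack")  # labels taken from the post above


def chance_all_correct(cables=CABLES):
    """Probability that a random labelling names every cable correctly."""
    guesses = list(permutations(cables))
    hits = sum(1 for g in guesses if g == tuple(cables))
    return Fraction(hits, len(guesses))


def chance_triangle_correct(n_samples=3):
    """Probability of picking the odd one out by luck in a 2-same/1-different test."""
    return Fraction(1, n_samples)


if __name__ == "__main__":
    print("all three identified by luck:", chance_all_correct())       # 1/6
    print("odd one out picked by luck:  ", chance_triangle_correct())  # 1/3
```

Under that assumption one pure guesser in six would label all three cables correctly, so "only one member got all three correct" cannot be interpreted without knowing how many took part. The 2-same/1-different version has a known 1-in-3 guess rate per attempt and asks a simpler question, which is what makes repeated trials easy to score.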

 

Wow time flies, will come back and do more at some point but not doing well so far, will see what the next tranche brings.

 

 


I've churned through a vast amount of gear in the last two years in search of HeadFi Nirvana (and yes I did have the Auris at one point). I've lost money and found the journey frustrating at times. But what a journey! I've discovered so much more music than I thought possible. I've made some very nice friends from this forum and learned so much about the audio chain. Sometimes people even ask for my opinion! I do feel that my listening muscles are well developed and striated! I'll get to the point....

I've done hundreds of hours of A-B on DACs, amps, streamers, headphones and cables. Using a solid set of tracks and listening parameters, like volume matching. Very rarely, blind. Why?......well it's really hard to do on your own! And very few people apart from my good friend Hugo @Indie Hi-Fi are willing to set things up and devote the time. You also need to hear things in your environment with your gear. I'm ok with subjective bias!

Here's just a small sample of what I've learned:

No one hears the same as someone else, even with the same genre likes. 

We all describe sound in a different way. Your transparency is my clarity.

Only change one thing in a system at once.......and let it burn in!

Look for synergies in your system - so hard to do when you already have that statement piece

Amps, headphones and speakers colour the sound most

Try to avoid all of the side grades - save up and buy up

I buy cables that are well made....not for the "air" they bring

Power management is way more important than I gave it credit for

 

Look, I could go on for days. When it comes to blind tasting.....I mean testing, just have fun. Try as much stuff as you can at whatever price point. Watch out for a lot of reviews that are just paid for. Listen to statement pieces, it helps avoid the side grade. Trust your own ears! YMMV!

2 hours ago, frednork said:

OK, what's wrong with this? It's blinded, and people can do what they want on their own system and repeat as many times as they want. Well, what are we asking them to do? Identify 3 cables when we don't even know if they sound different, and we don't know if the participants have heard these cables before. It's just dumb and a big waste of time. How could they have done it better? One of many better options might be to send 2 identical cables and 1 different one and ask people to identify the different one. Even so, the potential inability of the system to show the difference, or of the participant to discriminate it, etc. is a factor here. I call TBD!!

 

Wow time flies, will come back and do more at some point but not doing well so far, will see what the next tranche brings.

 

 

If you know that cables are what are being tested, as the subject, it is already not a DBT. 

 

1 hour ago, Eggcup the Dafter said:

If you know that cables are what are being tested, as the subject, it is already not a DBT. 

 

Excellent comment, and many seem to agree with you, but sadly totally wrong! Thank you for saying that, @Eggcup the Dafter, as this is a common fallacy: that if the subject knows what is being tested then it is not a blind test. The blinding bit only relates to the fact that the identity of the samples is not known to the test subject. You can tell the subjects exactly what is being tested for, and as long as they can discriminate what you have set up your experiment to measure, it is fine. There may be cases where you are looking for an involuntary response, e.g. blood pressure reduction or some chemical marker, where it is understood that knowing what the test is for may change the subject's response through placebo or another psychological effect. However, remember that the mere act of testing can change someone's normal response, so it can be a bit tricky.

 

When testing audio my suggestion would be to let the subject know as much as possible because God knows, doing these tests is already hard enough as it is. 

Edited by frednork


1 hour ago, frednork said:

Excellent comment, and many seem to agree with you, but sadly totally wrong! Thank you for saying that, @Eggcup the Dafter, as this is a common fallacy: that if the subject knows what is being tested then it is not a blind test. The blinding bit only relates to the fact that the identity of the samples is not known to the test subject. You can tell the subjects exactly what is being tested for, and as long as they can discriminate what you have set up your experiment to measure, it is fine. There may be cases where you are looking for an involuntary response, e.g. blood pressure reduction or some chemical marker, where it is understood that knowing what the test is for may change the subject's response through placebo or another psychological effect. However, remember that the mere act of testing can change someone's normal response, so it can be a bit tricky.

 

When testing audio my suggestion would be to let the subject know as much as possible because God knows, doing these tests is already hard enough as it is. 

In the case of a medical test, where people are taking a pill or something and you are testing for a placebo effect, you'd be right. The patient needs to know that they are taking a pill for a particular condition or the possible effect of the treatment under study.

 

In an audio DBT, the listener should only know that an audio system is under test and that a change is being made. We're trying to cancel out bias. If for example I tell someone we are testing "amplifiers" they are already biased towards listening in a particular way or for particular changes.

 

If an audio test is trying to elicit a particular response - we're testing the person not the system - then yes, the test design may involve giving that information.

 

 

 

 

Just now, Eggcup the Dafter said:

In the case of a medical test, where people are taking a pill or something and you are testing for a placebo effect, you'd be right. The patient needs to know that they are taking a pill for a particular condition or the possible effect of the treatment under study.

 

In an audio DBT, the listener should only know that an audio system is under test and that a change is being made. We're trying to cancel out bias. If for example I tell someone we are testing "amplifiers" they are already biased towards listening in a particular way or for particular changes.

 

If an audio test is trying to elicit a particular response - we're testing the person not the system - then yes, the test design may involve giving that information.

 

 

 

 

Sorry, still wrong. How exactly does knowing that you are testing for amplifiers help you identify a difference correctly?


ok let the fun continue

 

6 - HiFi Wigwam, The Great Cable debate. Power cable ABX test Oct 2005.

This is a very well done large scale ABX test. A similar set up to Head-fi, where four mains cables, including 2 kettle leads (stock power cords that had come with hifi products), an audiophile one and a DIY one, plus a tester CD, were sent out to forum members. The results were inconclusive to say the least, for example:

The kettle lead was C. There were 23 answers :
4 said that the kettle lead was A
6 said that it was B
8 said that it was C
5 said that they didn't know.
http://www.hifiwigwam.com/showthread.php?654-The-Great-Cable-Debate&highlight=blind+test

The overall conclusion was that the kettle lead could not be properly identified, nor could one cable be shown to be better than another.
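As a rough check on the figures quoted above, the sketch below computes how often 8 or more correct answers would turn up by luck alone. It assumes the 5 "don't know" answers are dropped and that a guesser picks A, B or C with equal probability; both are assumptions made for illustration, not details taken from the original test.

```python
from math import comb


def binomial_tail(hits, trials, chance):
    """P(X >= hits) when each of `trials` answers is right with probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Figures from the summary above: 23 answers, 5 "don't know", 8 correctly named C.
answered = 23 - 5   # assumption: "don't know" answers are excluded
correct = 8
p_value = binomial_tail(correct, answered, 1 / 3)  # assumption: 3 equally likely choices

if __name__ == "__main__":
    print(f"{correct}/{answered} correct, one-sided p = {p_value:.3f}")
```

With those assumptions the one-sided p-value comes out at roughly 0.22, so a result this good or better is expected from pure guessing about one time in five. That agrees with the "inconclusive" reading, but it says nothing about whether a better designed test would have found a difference.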

EDIT - one of the participants to this test has pointed out that the two kettle leads, described in the test as exactly the same were in fact not identical and were just basic leads which had come with hifi products.

 

Interesting that this was described as well done, as there is nothing well done about it. 4 cables sent out, why? Let's think about it from the flip side. If we assume that telling power cables apart by listening is very hard, how do we test for it? We can send 2 cables and ask "Is there a difference?", which will give us the answer. Or we can send 3 cables, with 2 being the same, and ask "Which is different?". This setup was destined to fail from the start. Definitely TBD.

 

7 - What Hifi The Big Question on cables. Sept 2009

From the Sept 2009 issue. Three forum members were invited to WHF and blind tested in a session where they thought the kit (Roksan, Cyrus, Spendor) was being changed, but instead the cables were. The same three tracks were used throughout.

The kit started out with the cheapest cables WHF could find and no one liked it saying it sounded flat and dull. Then a Lindy mains conditioner and Copperline Alpha power cords were introduced and the sound improved.

The IC was changed to some Atlas Equators, and two out of the three tracks were said to have improved, with better bass and detail.

Last, the 60p per metre speaker cable was changed for £6 per metre Chord Carnival Silverscreen. Again, changes were noticed, but they were not big.

Various swaps took place after that which confirmed the above, that the power cords made the biggest difference. When the test was revealed the participants were surprised to say the least!

But this is not an ABX test, it is a blind listening review, and as you read on you find the two produce different results. What is worrying is that when I asked Clare Newsome, the Chief Editor, about such tests she claimed that they were ABX, and elsewhere on their forum they have claimed to do ABX testing. But they do not: they are blind listening reviews, which allow people the chance to claim a difference but offer no evidence that they can really hear a difference.

 

3 people tested 4 different things. Sheesh! This is utter garbage, regardless of outcome. OK people, if you are planning on doing a better quality home DBT:

1. Think about the simplest/easiest test that would prove the subject could tell the difference.

2. Try to train them on the exact samples you will be using, or get them to help choose the samples. Maybe they have a particular track they know well.

3. Don't use more than 3 samples for the test: for instance, 2 the same and 1 different, or 2 samples where they choose whether they are different.

 

If your subject blitzes the test then you can make it harder if you want.
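One thing worth deciding before a home test like this is how many trials to run. The sketch below is a hedged illustration, not part of the original advice: it assumes independent trials, a 5% false-positive threshold, and a listener with a fixed "true" hit rate, and the function names are invented for the example. Pass `chance=1/3` for the 2-same/1-different format and `chance=0.5` for a same/different or ABX format.

```python
from math import comb


def tail(hits, trials, p):
    """P(X >= hits) for `trials` independent attempts with success probability p."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def hits_needed(trials, chance=0.5, alpha=0.05):
    """Smallest score whose probability under pure guessing is at most alpha."""
    for hits in range(trials + 1):
        if tail(hits, trials, chance) <= alpha:
            return hits
    return None  # no score is convincing with this few trials


def power(trials, true_rate, chance=0.5, alpha=0.05):
    """Chance that a listener with hit rate `true_rate` reaches the passing score."""
    needed = hits_needed(trials, chance, alpha)
    return 0.0 if needed is None else tail(needed, trials, true_rate)


if __name__ == "__main__":
    for n in (3, 10, 16, 30):
        print(n, hits_needed(n), round(power(n, 0.7), 2))
```

With very few trials no score can clear the threshold at all, and a listener who is genuinely right 70% of the time can still fail a short run more often than not. That is an argument for many trials on one simple comparison rather than a handful of trials spread over several.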

 

Super TBD for this one

 

8 - Secrets of Home Theatre and High Fidelity. Can We Hear Differences Between A/C Power Cords? An ABX Blind Test. December, 2004

A comprehensive article with pictures and the overall result was 73 out of 149 tests so 49% accuracy, the same as chance.

http://www.hometheaterhifi.com/volum...s-12-2004.html
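To see what range of hit rates the quoted 73 out of 149 is compatible with, here is a small sketch using the Wilson score interval. It treats all 149 trials as independent and pooled, which is a simplifying assumption because the trials came from several listeners; the helper name is made up for this example.

```python
from math import sqrt


def wilson_interval(hits, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    p_hat = hits / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


low, high = wilson_interval(73, 149)  # figures quoted in the summary above

if __name__ == "__main__":
    print(f"observed {73 / 149:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

The interval runs from about 41% to 57%. It contains 50%, so the pooled result is consistent with guessing, but a pooled figure cannot show whether one or two individuals scored well while the rest pulled the average down.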

 

Had higher hopes for this one, as they did put lots of effort in and at least the test was a two-choice thing. Well, it was fine as an exploratory study on the impact of music choice on the ability to detect a difference. Maybe do it again to see if things repeat themselves in the same way. Get rid of anyone who didn't get over 50%, get rid of tracks where the rest (the over-50%ers) didn't do well, and rinse and repeat. Thing is, if half or more of these participants are not very good at discriminating, you are sunk before you start. Think about how the wine industry does its testing. Wine quality is not easy for the lay person to identify, so how do they know if the wine is going as it should? They use expert panels. They have figured out what discriminatory powers people need to have a chance of being an effective wine taster, and also figured out tests to apply to these people to weed out the hopeless ones. And then they train them. It's not easy, and it's lots of work, but they have no choice. No machine can tell you how a wine will taste. No measurement will tell you how a stereo will sound.

Sadly TBD

 

9 - Boston Audio Society, an ABX test of Ivor Tiefenbrun, the founder of Linn. August 1984


A rather complex testing of Ivor Tiefenbrun himself, who at that time was very pro vinyl and anti digital (the opposite almost of how Linn operate now!). There are various different tests and the overall conclusion was

"In summary, then, no evidence was provided by Tiefenbrun during this series of tests that indicates ability to identify reliably:
(a) the presence of an undriven transducer in the room,
(b) the presence of the Sony PCM-F1 digital processor in the audio chain, or
(c) the presence of the relay contacts of the A/B/X switchbox in the circuit."

http://www.bostonaudiosociety.org/bas_speaker/abx_testing2.htm

Even the founder of Linn could not back up claims he had been making when subjected to an ABX test of those claims.

 

Poor old Ivor, why would he agree to such a thing? He certainly ended up looking like a bit of a dill. Unfamiliar room, system etc. This is really not much better than inviting a friend over who keeps telling you that they can easily hear the difference between product A and product B with added bee's dick difference. If you are one of these people, first get your ducks in a row and prove to yourself that you can do this, at least on your setup, in your environment, under no-stress conditions if you can. Second, find the track/s that make it easiest for you to tell the difference, and any other crutches you may need. Thirdly, insist it be done on your setup. It is OK to do this. Practice furiously over the course of months, maybe years, and then... you are ready. You still might collapse under the pressure though...

Another TBD I'm afraid

 

10 - The (In)famous Audioholics forum post, cables vs coathanger!. June 2004

http://forums.audioholics.com/forum...9d247bf955a57b3953326a34&p=15412&postcount=28
 

Dead link and couldn't find it.

 

Disappointingly, no studies yet worth not doubting. The power cable one would have been an OK start to a series of studies to figure out the best way to test for power cable differences. When the food and beverage industry test their products they have a whole bunch of real research which tells them what method to use, how to test for good levels of discrimination amongst participants, how many trials to do, how many samples you can test, etc. What has the audio industry got? NADA, it seems. But maybe in the next lot? I live in hope...

 

 

 

 

 

 

 

52 minutes ago, frednork said:

Sorry, still wrong. How exactly does knowing that you are testing for amplifiers help you identify a difference correctly?

It’s more likely to obscure a difference. With an amplifier you may listen for tonal difference, “life” in the music, soundstage, or your own set of changes marked “amplifier”

Now, the amplifier change does something different- let’s say it interferes  electrically with the DAC causing “digital wow”. Are you listening for wow from an amplifier?



Yes I think it is OK to have doubts as this is not a pure science; it's an applied science.

So it's OK to doubt DBT's with a NULL result and it's OK to doubt DBT's with a positive result.

Doubting the mechanisms used for testing is how future tests can be improved.

 

So why were DBT's thought up and invented in the first place?
Well I presume they were born out of doubts about poorly controlled observations.

 

It's interesting the points you bring up regarding performance anxiety and fatigue.

How can anyone say it's not possible at least sometimes?

I won't.

Edited by Satanica
1 minute ago, Eggcup the Dafter said:

It’s more likely to obscure a difference. With an amplifier you may listen for tonal difference, “life” in the music, soundstage, or your own set of changes marked “amplifier”

Now, the amplifier change does something different- let’s say it interferes  electrically with the DAC causing “digital wow”. Are you listening for wow from an amplifier?

Sorry, don't understand what you are saying here.

 

Let me put it like this. Say I tell you I can hear differences in ethernet cables, and, whilst knowing that we are testing for ethernet cable differences, I am able to discern reliably (when samples are blinded, of course) when there is a difference in ethernet cable and when there is not. Does the fact that I knew what we were testing for mean that I can't tell the difference, even though I could?

1 minute ago, Satanica said:

Yes I think it is OK to have doubts as this is not a pure science; it's an applied science.

So it's OK to doubt DBT's with a NULL result and it's OK to doubt DBT's with a positive result.

Doubting the mechanisms used for testing is how future tests can be improved.

 

So why were DBT's thought up and invented in the first place?
Well I presume they were born out of doubts about poorly controlled observations.

 

It's interesting the points you bring up regarding performance anxiety and fatigue.

How can anyone say it's not possible at least sometimes?

I won't.

Yes, but this problem has already been solved elsewhere, just not applied to audio. We can mitigate fatigue, anxiety and random poor performance. It's just that extremely few do.

3 minutes ago, frednork said:

Yes, but this problem has already been solved elsewhere, just not applied to audio. We can mitigate fatigue, anxiety and random poor performance. It's just that extremely few do.

 

OK, but even if this is a problem and it does somehow get solved it doesn't mean that the results will necessarily change to any relevant degree.



3 minutes ago, frednork said:

Sorry, don't understand what you are saying here.

 

Let me put it like this. Say I tell you I can hear differences in ethernet cables, and, whilst knowing that we are testing for ethernet cable differences, I am able to discern reliably (when samples are blinded, of course) when there is a difference in ethernet cable and when there is not. Does the fact that I knew what we were testing for mean that I can't tell the difference, even though I could?

How the hell did you get to that from what I said?

 

Let’s use your example on me, Mr Cynical. If you tell me we are listening for Ethernet cable changes, I’m likely to think “what a waste of time” and not bother....

 

What I was saying is that knowing what you are listening to is likely to influence what changes you listen for. If the change isn’t what you are listening for, there is a good chance you will miss it = false negative. 


It’s really funny that this discussion was kicked off on a thread about audio differences in Ethernet cables. It’s seriously trivial to build an automated ABX engine to test cables using a single switch with a remote management capability.
 

The testing can be randomised in a script and applied uniquely each and every time the test is run so there is no risk of someone, even the operator, guessing the sequence. To correlate results at the end, you pull the config log on the switch and tell people what they were listening to. 
 

What would you think would be the problem with an ABX test of this nature?
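The randomised sequence and after-the-fact scoring described above could be sketched roughly like this. It is only an outline under stated assumptions: the function names are hypothetical, the cables are simply called "A" and "B", and the step that actually drives the managed switch is left out because it depends entirely on the switch's own management interface.

```python
import secrets


def make_schedule(n_trials):
    """Secretly pick which cable ('A' or 'B') plays as X on each trial."""
    return [secrets.choice("AB") for _ in range(n_trials)]


def score(schedule, answers):
    """Count answers that match the hidden schedule; call this only after the session."""
    if len(schedule) != len(answers):
        raise ValueError("need exactly one answer per trial")
    return sum(1 for x, a in zip(schedule, answers) if x == a)


if __name__ == "__main__":
    schedule = make_schedule(16)   # would be kept in a log nobody reads mid-test
    answers = ["A"] * 16           # placeholder for the listener's calls
    print(f"{score(schedule, answers)}/16 correct")
```

Because the schedule is generated fresh for each run and only compared with the answers afterwards, neither the listener nor the operator has anything to guess from during the session. It does not address the other objections raised in this thread, such as listener training or whether the switch itself affects the sound.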


@frednork here is one that I mentioned in the other thread. Please note that I have linked you to a new thread started just today which has the original thread linked in the first post. Blind test was passed, measurements taken, friendships lost.

 

https://audiophilestyle.com/forums/topic/62416-bit-identical-playback-can-sound-different/?tab=comments#comment-1119666

 

 

10 minutes ago, BugPowderDust said:

 

What would you think would be the problem with an ABX test of this nature?

 

Many would say the switch is the problem...

1 hour ago, Eggcup the Dafter said:

What I was saying is that knowing what you are listening to is likely to influence what changes you listen for. If the change isn’t what you are listening for, there is a good chance you will miss it = false negative. 

 

Ah ok. In a well conducted trial this sort of issue would have been sorted out beforehand in pretrial test runs, to ensure there are no unforeseen circumstances.  Doing a proper sensory evaluation is a lot of work, and then you have to get a whole bunch of people in to be tested, so you don't want them to come in for a poorly thought through trial. It's a bit like doing a dress rehearsal before putting the show on in front of an audience.

 

Link to post
Share on other sites


When I think of the money, resources and time Harman puts into product development, including testing double blind and otherwise, I figure they put a lot of credence in the results they find, because unlike a lot of blokes sitting around in their lounge rooms treating the smallest changes as crucial to the future of hi-fi, there is money to be made.

 

A corporation in the hunt for profits in a competitive international economy developing product that will cause a significant uptick in sales as well as rob market share from its competitors across the world is going to get my  attention because it is a serious undertaking of which appealing to audiophiles may be the least of its concerns.

 

To get it right, the company double blind tests both trained and untrained listeners so that it doesn't get a false idea of its capabilities or its results, and can develop, manufacture and sell products that will be attractive to many and add some weight to the bottom line.

 

 

Link to post
Share on other sites
Posted (edited)
14 hours ago, BugPowderDust said:

What do you think would be the problem with an ABX test of this nature?

There is nothing wrong with it, as long as that is exactly what you are trying to test.

 

So, if my study was to determine whether random members of the public could distinguish between certain cables using an automated ABX test, then there is no issue.  But if the test was then extrapolated to say "well, there are no differences", some people might say: hang on, we think ABX is not a suitable test for listening for fine differences and you should use another kind of test. Another bunch of people might say: we don't think this is something the general public would be able to do, you need a tested expert panel that has a chance of doing this.  I might go "ok, well you do it then", or I might go "hang on, they might be right, and I don't want my work to be disproven by someone else", so I might do it myself and remain the expert in the area, and so forth.

 

BTW some don't consider a short ABX enough time to discriminate well in audio tests.

see my first post here 

 

Edited by frednork
Link to post
Share on other sites
25 minutes ago, acg said:

@frednork, here is one that I mentioned in the other thread here.  Please note that I have linked you to a new thread started just today which has the original thread linked in the first post.  Blind test was passed, measurements taken, friendships lost.

 

https://audiophilestyle.com/forums/topic/62416-bit-identical-playback-can-sound-different/?tab=comments#comment-1119666

 

 

 

Many would say the switch is the problem...

Yes, did see that one at the time. It was a heroic effort. They should have come back the next day and done it again as a new trial; I suspect there would have been an undeniable result one way or the other.

Link to post
Share on other sites

Millions of us, the great unwashed, can drive a car. But if Ford is developing a car and wants to see what it can do in terms of ultimate performance, are you going to hire the guy who commutes to work every day in traffic, or are you going to hire a professional driver to take it around the test track?

Link to post
Share on other sites
23 minutes ago, allthumbs said:

To get it right, the company double blind tests both trained and untrained listeners so that it doesn't get a false idea of its capabilities or its results, and can develop, manufacture and sell products that will be attractive to many and add some weight to the bottom line.

Yep, it's why any company does it. I guess due to the cottage industry style of most audio manufacturers it's not even on the radar, let alone something there is space for on the bottom line. It is expensive and difficult and you must be committed to it. But once you get it going and get good at it, it is very valuable.

Link to post
Share on other sites
16 hours ago, Satanica said:

OK, but even if this is a problem and it does somehow get solved it doesn't mean that the results will necessarily change to any relevant degree.

Possibly, but we just don't know unless we do it. All we can say is that the test is probably not sufficiently well done to draw a firm conclusion.

Link to post
Share on other sites
Just now, frednork said:

Possibly, but we just don't know unless we do it. All we can say is that the test is probably not sufficiently well done to draw a firm conclusion.

 

Who ever said a NULL result is conclusive, more importantly was it me? If NULL results are not conclusive that would make sighted listening experiences even less so, hey? But they seem to be given and taken like fact/gospel. Can you see the problem? 

Link to post
Share on other sites
5 minutes ago, Satanica said:

 

Who ever said a NULL result is conclusive, more importantly was it me? If NULL results are not conclusive that would make sighted listening experiences even less so, hey? But they seem to be given and taken like fact/gospel. Can you see the problem? 

Well, not sure why sighted listening is less conclusive, or even what less conclusive means. Neither is conclusive. The purpose of doing a DBT is to show, to a certain level of statistical confidence, that something is probably true. Absolute truths are not usually possible, but we can be reasonably confident as long as the test was designed well. These poorly conducted DBTs just muddy the water: most people will believe them to a greater degree because of the quasi-scientific procedure, whereas I would suggest they are doomed to failure before they start. If you don't understand the flaws they seem believable, yet they have no more validity, end up a big waste of time and give people the wrong impression.

Link to post
Share on other sites

I'd like to contribute a thought/question.  Does anybody not think that any attempt at blind, or double blind testing is better than a purely sighted test (where you make a change then decide yourself if it is good)?

 

Yes, there will still be some doubts about accuracy but surely it's better than the alternatives.  Still, I can recall people saying they think things like, any B or DB test is worse than sighted, because of reasons like - you have to listen for x hours, days, months, before you really know if there is a change and whether it is good.      I have big problems with these types of claims as I believe the brain can "get used to" the new sound, and also cannot remember the old sound.

 

I feel that any attempt to obscure clues about what is being tested, bringing the listener closer to judging by sound alone, has got to be a positive and yield more reliable results.

Link to post
Share on other sites
9 minutes ago, aussievintage said:

I'd like to contribute a thought/question.  Does anybody not think that any attempt at blind, or double blind testing is better than a purely sighted test (where you make a change then decide yourself if it is good)?

I not think that. 

 

10 minutes ago, aussievintage said:

I feel that any attempt to obscure clues about what is being tested, bringing the listener closer to judging by sound alone, has got to be a positive and yield more reliable results.

I think it's fair to say that most of the stuff these DBTs are done on is inherently difficult to discern, otherwise there would be no arguments. If you are willing to accept that, then rather than make it more difficult for the subject, it is incumbent on the researcher to make it as easy as possible, to try to get some sort of result. In the end you are testing whether it is discernible, which is what the arguments are about. Then you can argue about what the level of discernibility means, etc.

Link to post
Share on other sites
7 minutes ago, frednork said:

I think it's fair to say that most of the stuff these DBTs are done on is inherently difficult to discern, otherwise there would be no arguments. If you are willing to accept that, then rather than make it more difficult for the subject, it is incumbent on the researcher to make it as easy as possible, to try to get some sort of result. In the end you are testing whether it is discernible, which is what the arguments are about. Then you can argue about what the level of discernibility means, etc.

 

But without it being blind in some form,  you fall prey to all sorts of tricks your brain can play.  I do not want a result, if it's only due to that.  That does not show that it is discernible reliably.  Even just at home, for my own testing,  I do not trust myself,  especially when the differences are small.  I get my partner to control what I am testing, and her disinterest in the technical side of things makes it nearly a double blind test.

 

Link to post
Share on other sites
Posted (edited)
13 minutes ago, aussievintage said:

 

But without it being blind in some form,  you fall prey to all sorts of tricks your brain can play.  I do not want a result, if it's only due to that.  That does not show that it is discernible reliably.  Even just at home, for my own testing,  I do not trust myself,  especially when the differences are small.  I get my partner to control what I am testing, and her disinterest in the technical side of things makes it nearly a double blind test.

 

Am not suggesting blinding is not useful, but unless it is done properly it is no more useful than sighted (unless you get it consistently right). The reason it's worse is that people perceive a poorly conducted blind test with a null result as proving something, and it really doesn't. However, if even a poorly conducted DBT shows an ability to discern something, then that still has some value as long as the blinding was done correctly.

 

Should also say that doing simple DBTs at home should be encouraged, as it lessens the impact of doing a DBT on the result; essentially you are training yourself to do DBTs. However, if you want to tell someone that you can't hear a difference and that your DBT proves it, the DBT needs to be done to a high standard, as it is too easy to get a null.
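To put a rough number on "too easy to get a null", here is a small sketch of the statistical power of a short ABX run. The assumptions are mine, not from any of the tests discussed: every trial is independent, guessing is right half the time, and a "pass" means a score that guessing would reach no more than 5% of the time.

```python
from math import comb

def tail(n, k, p):
    """P(at least k correct out of n) when each trial is correct with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, alpha=0.05):
    """Smallest score that pure guessing (p = 0.5) reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail(n, k, 0.5) <= alpha:
            return k
    return None  # no score qualifies when n is very small

def power(n, p_true, alpha=0.05):
    """Chance that a listener with true hit rate p_true reaches the pass mark."""
    k = pass_mark(n, alpha)
    return 0.0 if k is None else tail(n, k, p_true)

print(pass_mark(10), round(power(10, 0.7), 3))
```

Under those assumptions a 10 trial run needs 9 correct to pass, and a listener who genuinely gets 70% right would pass only about 15% of the time. In other words, a real but modest ability would usually show up as a null in a short test.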

Edited by frednork
Link to post
Share on other sites
6 minutes ago, frednork said:

Am not suggesting blinding is not useful, but unless it is done properly it is no more useful than sighted (unless you get it consistently right). The reason it's worse is that people perceive a poorly conducted blind test with a null result as proving something, and it really doesn't. However, if even a poorly conducted DBT shows an ability to discern something, then that still has some value as long as the blinding was done correctly.

 

True, I was only concentrating on the non-null result.   Mainly because if I fail to hear a difference, whichever way I test it,  it means it's too close for me to worry about anyway.

 

8 minutes ago, frednork said:

Should also say that doing simple DBT's at home should be encouraged as it lessens the impact of doing a DBT on the result, essentially you are training yourself to do DBT's. However if you want to tell someone that you can hear something or not because you did a DBT so that proves it, the DBT needs to be done to a high standard as it is too easy to get a null.

 

I am not understanding this concern about a null.  If you get a null, and report it to others, that leaves the way open for someone else to try, if they are interested.

 

As to doing a DBT at home.  As I have said, it's relatively easy to do an informal DBT with someone's help, as long as you are honest with yourself and don't try to construct a biased test.  If you do it fairly, as I try to, then you can be reasonably confident of the results, for your own use.  If you describe the test to others and state your findings, they can then judge for themselves how much weight to put on those conclusions.

Link to post
Share on other sites
3 minutes ago, aussievintage said:

I am not understanding this concern about a null.  If you get a null, and report it to others, that leaves the way open for someone else to try, if they are interested.

As long as you call it a null or similar, rather than saying "that proves no difference exists", which in effect is what all these studies I am discussing in this thread do!! This is where the damage is done.

 

5 minutes ago, aussievintage said:

As to doing a DBT at home.  As I have said, it's relatively easy to do an informal DBT with someone's help, as long as you are honest with yourself and don't try to construct a biased test.  If you do it fairly, as I try to, then you can be reasonably confident of the results, for your own use.  If you describe the test to others and state your findings, they can then judge for themselves how much weight to put on those conclusions.

Absolutely, everyone should have a go! Just don't necessarily believe the DBT result is the real result. I guess what I am saying here is that unless the test results in the ability to discern successfully, your own DBT is TBD!!

Link to post
Share on other sites

How many of you wear glasses and have observed this?

 

I wear glasses and only take them off when I am going to sleep. Many times I have noticed that when I am listening to someone talk with my glasses on, I can clearly hear and understand what they are saying, but at times when I have my glasses off for one reason or another, I cannot understand them properly or I get the words mixed up. Maybe my hearing has indeed gone bad and I am using my sight to supplement the missing bits, or maybe I need both the sight and hearing as a whole for my brain to understand and process it.

 

I wonder how I would score in a blind test. Does this mean that when sighted my hearing is improved, or just that my brain's ability to put things in the correct perspective is improved? What if I am not able to differentiate between, say, Amp A and Amp B in a blind test; do these amps still sound the same when sighted? How does this affect stereo listening? After all, there are no visuals, just audio, yet I am perfectly fine in stereo listening.

Link to post
Share on other sites
1 hour ago, frednork said:

Well, not sure why sighted listening is less conclusive.

 

Because all bias has been added in to muddy the waters.

It's less\worse than a NULL result; it's an INVALID result.

Note that I've used the word 'result' and not 'conclusion'.

Edited by Satanica
Link to post
Share on other sites
2 minutes ago, Satanica said:

 

Because all bias has been added in to muddy the waters.

It's less\worse than a NULL result; it's an INVALID result.

Note that I've used the word 'result' and not 'conclusion'.

I would say this. In a sighted listening exercise, both an ability and an inability to discriminate should be doubted. In a (less than high standard) blind test, an inability to discriminate should be doubted at a similar level to the sighted tests, but an ability to discern, even in a poorly designed DBT, still has some value (as long as adequate blinding is undertaken). So in that sense a home blind test can still be valuable (from an absolute proof perspective). I still find doing blind tests valuable for my personal use though.

Link to post
Share on other sites
19 minutes ago, :) Go Away (: said:

I wonder how I would score in a blind test. Does this mean that when sighted my hearing is improved, or just that my brain's ability to put things in the correct perspective is improved? What if I am not able to differentiate between, say, Amp A and Amp B in a blind test; do these amps still sound the same when sighted? How does this affect stereo listening? After all, there are no visuals, just audio, yet I am perfectly fine in stereo listening.

These are good questions, and if they haven't been answered by the audio industry they should have been.

Link to post
Share on other sites
11 minutes ago, frednork said:

...but an ability to discern, even in a poorly designed DBT still has some value (as long as adequate blinding is undertaken).

 

Not necessarily, the blinding is only one aspect. What if one of the items was faulty or appropriate level matching didn't take place then there's no value as the result is INVALID.

Link to post
Share on other sites

11 - Matrixhifi.com from Spain. ABX test of two systems. June 2006.

Two systems were hidden behind a sheet and wired to the same speakers: one cheap (A), with a Sony DVD player and Behringer amp (supported on a folding chair) and cheap cables, and the other more expensive (B), with Classe, YBA and Wadia components, expensive cables and proper stands.



The results were;
38 persons participated on this test
14 chose the "A" system as the best sounding one
10 chose the "B" system as the best sounding one
14 were not able to hear differences or didn't choose any as the best.

http://www.matrixhifi.com/ENG_contenedor_ppec.htm
 

So, another better attempt with considerable effort, but they really jumped to a more difficult question, which system is better, rather than first establishing whether you can tell the difference. All they proved was that roughly 1/3 preferred the cheaper system, roughly 1/3 preferred the more expensive one and roughly 1/3 were undecided. When you get this sort of even spread you can accept that as the answer, or you should say: hang on, if I am getting essentially an even distribution across the possible answers (essentially a random response), is there an issue with my test?  All the usual problems arise: we don't know whether anyone on the panel has the discriminatory power to tell the difference, or what their sound preferences are, despite their claims to being audiophiles.  They all sat in different positions.  For me this result would be a "let's go back to the drawing board and see what we can prove" situation, and build it up from there.

highly TBD
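For anyone who wants to check the "essentially an even distribution" point, here is a minimal sketch that compares the reported 14 / 10 / 14 split against a perfectly even three-way split using a chi-square goodness-of-fit test. Treating "even split" as the reference is my assumption, not something the testers stated; the p-value formula used is the exact one for three categories (2 degrees of freedom).

```python
from math import exp

def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against an even split."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def p_value_three_categories(chi2):
    """Upper-tail p-value for 2 degrees of freedom, which is exactly exp(-chi2 / 2)."""
    return exp(-chi2 / 2)

counts = [14, 10, 14]  # preferred A, preferred B, no preference (as reported above)
chi2 = chi_square_uniform(counts)
p = p_value_three_categories(chi2)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")
```

The statistic comes out at about 0.84 with a p-value of about 0.66, so the reported split is entirely compatible with people answering at random, which is consistent with the "random response" reading above. It does not by itself show that the systems sound the same.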

 

12 - AVReview. Blind cable test. April 2008

Some of AVR's forum members attended at a Sevenoaks hifi shop and listened to the same kit with two cheap Maplins cables at £2 and £8 and a Chord Signature at £500. They found the cheaper Maplins cable easy to differentiate and the more expensive harder to differentiate from the Chord. Their resident sceptic agreed he could hear differences. The final conclusion was;

....from our sample of 20 near-individual tests, we got 14 correct answers. That works out at 70 per cent correct....

So that is the second ABX to join What Hifi which suggests there is indeed a difference. But like What Hifi it shows the difference in results from Blind to ABX testing and how easy it is to try and obscure the two types of test.

https://hifiwigwam.com/forum/topic/12043-av-review-blind-cable-test/?tab=comments#comment-341108 - note the link to test is broken, unable to find another

 

Sadly there is not much more info on this and how it was done. It seems there was an encouraging level of discrimination, but I can't really comment with the info given.
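As a rough check on the "14 correct out of 20" figure, here is a sketch of the chance of doing at least that well by pure guessing. It assumes each of the 20 answers was an independent two-way choice with a 50% guessing rate, which the write-up does not actually confirm ("near-individual tests" suggests they may not have been fully independent).

```python
from math import comb

def guess_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by coin-flip guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

p = guess_p_value(14, 20)
print(f"{p:.3f}")
```

Under those assumptions the probability is a little under 6%, so the result is suggestive but sits just outside the usual 5% cut-off. That fits with calling it encouraging while holding off on a firm conclusion.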

 

13 - Journal of the Audio Engineering Society, ABX test of CD/SACD/DVD-A. Sept 2007

a summary of which states "A carefully controlled double-blind test with many experienced listeners showed no ability to hear any differences between formats". The results were that 60 listeners over 554 trials couldn’t hear any differences between CD, SACD, and 96/24.

EDIT - this test is apparently flawed, see post 962 for full details, but basically the hi rez example used was from an original CD.

 

Not really sure if it is flawed, but possibly. On the face of it, though, it is easily the best conducted study so far on this list. Does it prove there is no difference? Maybe. I always get a bit thingy about studies of this kind where the authors hold the view that discrimination is not possible (which apparently these guys did). It just means they are coming from the wrong place to ensure every single little thing possible was done to allow a difference to be heard. Not suggesting this is a conscious decision, but it may still be there nonetheless.

Well I guess that does it, throw out your hirez stuff now!!!

 

until someone did another study

I will quote from here to save time https://www.soundandvision.com/content/yes-high-res-difference-audible

 

 

Seven years later, a new study was submitted to AES by Helen M. Jackson, Michael D. Capp, and J. Robert Stuart, all of whom are associated with Meridian Audio Ltd. in the U.K. The title is The Audibility of Typical Digital Audio Filters in a High-Fidelity Playback System. The abstract says: "This paper describes listening tests investigating the audibility of various filters applied in high-resolution wideband digital playback systems. Discrimination between filtered and unfiltered signals was compared directly in the same subjects using a double-blind psychophysical test. Filter responses tested were representative of anti-alias filters used in A/D (analog-to-digital) converters or mastering processes. Further tests probed the audibility of 16-bit quantization with or without a rectangular dither. Results suggest that listeners are sensitive to the small signal alterations introduced by these filters and quantization. Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD; and second, an audio chain used for such experiments must be capable of high-fidelity reproduction." (Emphasis added.) Twenty bucks will get you a look at the study here.

So here are two studies undertaken with double-blind methodology, which the Perfect Sound Foreverists (and objectivists in general) insist is the scientific gold standard. They say a double-blind study is always right; they slam anything else as "pseudo-science." What could go wrong? Yet these two double-blind studies contradict one another. Only one of them can be right. But which one?

We might drill deeper into the details and discuss the different hardware, content, listeners, and testing practices. One aspect of the newer study that I find interesting is the Training section. I probably can't quote the paywalled material at length but I'll summarize. Jackson, Capp, and Stuart believed, based on preliminary data and feedback, that listeners needed time to prepare themselves for the task. So they implemented a three-phase training program that allowed listeners to familiarize themselves with the 200-second piece of music used for comparison, the filtering used in the test to distinguish CD-quality audio from high-res audio, and the test conditions. Only when listeners had prepared themselves in this manner did the actual testing move forward.

The conclusions? Listeners could hear the difference between 16/44.1 and 24/192. The filters and quantization used to downsample high-res masters for CD release can have a "deleterious effect." However, not all music reveals this loss of transparency. It is more audible with music having prominent echoes. This is roughly consistent with my considerably less scientific high-res listening experience: Sometimes I can hear the difference, sometimes I can't. Jackson, Capp, and Stuart also caution that, to ensure meaningful results, psychophysical tests should "minimise cognitive load," which presumably was the intent of their training procedure.

 

I love this because it just confirms what I have been banging on about and is well understood in sensory science. You need to have done your homework and practised to get a positive result. It also helps you to understand just how important that phase of the work is, so next time I say they didn't assess the participants and didn't practise, you will know why.

So even this very scientifically performed study at number 13 is TBD

 

14 - What Hifi, Blind Test of HDMI cables, July 2010

Another What Hifi test of three forum members who are unaware that the change being made is with three HDMI cables. As far as they know equipment could be being changed. The cables are a freebie, a Chord costing £75 and a QED costing £150. Throughout the test all three struggle to find any difference, but are more confident that there is a difference in the sound rather than the picture. They preferred the freebie cable over the Chord one and found it to be as good as the most expensive QED. That result is common in blind testing and really differentiates it from ABX tests.

In my opinion the way the differences between the cables are reported, they can be explained by the fact that it would have taken three brave testers to have said there was no difference. They had been invited to a test expecting to be able to identify differences.

 

Do I really need to comment? Same again: why 3 cables? What assessment and training of participants was done? Mind you, if you have really read this far I will share something I wouldn't normally on the forum. In my own personal tests (think it was sighted in this case) I tried a Chord Company USB cable, thinking if it's good enough for Rob Watts it must be ok. Imagine my surprise when on listening it was not just similar to a generic cable but significantly worse, and at the time I felt it was "unlistenable". So maybe the study above worked to some extent.  Since then I have been made aware that Chord Electronics UK (which makes DACs) and Chord Company UK (which makes cables) are not related, and if I owned Chord Electronics I would sue the other F%%%%ers for attempting to destroy my reputation.

 

Anyhow will take a few deep breaths.

 

The next one will take some time as it is a Toole study and I wanna have a good look. Don't expect the next installment too soon!! I know, I know, but you will be ok.

 

 

Link to post
Share on other sites
1 hour ago, Satanica said:

 

Not necessarily, the blinding is only one aspect. What if one of the items was faulty or appropriate level matching didn't take place then there's no value as the result is INVALID.

Yep sure, would absolutely agree with that

Link to post
Share on other sites
1 hour ago, :) Go Away (: said:

maybe I need both the sight and hearing as a whole for my brain to understand and process it.

My very lay person's understanding is that the parts of our brains dealing with the senses in some ways work together.  That is, one sense can enhance the outcome of another.  It is interesting to consider that, through brain plasticity, the lack of one sense can be compensated for by the improved development of a remaining active sense.  Just consider the likes of Geoffrey Gurrumul Yunupingu or Nobuyuki Tsujii, both blind from birth yet amazingly gifted musicians.  In the case of the latter, he keeps in time with the conductor over the sound of a full orchestra as he can hear the conductor breathe.  Incredible.

John

Link to post
Share on other sites
1 hour ago, allthumbs said:

 

This is good in that it shows just how much effort it takes to be able to publish this sort of stuff. Dragging a few people into someone's house/shop for a few hours and having a go just doesn't cut it.

Link to post
Share on other sites