Quote Originally Posted by RGA
The problem is that most studied tests in controlled environments are assumed to have some correlation with real-world listening environments...and we can go on ad nauseam about the tests forever, but the test environment is not the same as (identical to) a non-test environment...and there is no demonstrated correlation between the two, only a lot of assumptions and innuendo as to what the result of a test says and what the real world says. Floyd Toole also notes that these are results for the test environment, not a real-world environment. DBTs have shown that within the testing environment and the controls set up, people have failed to distinguish differences to a statistically significant level better than chance.

That is ALL there is on the subject...Innuendo by the uninformed beyond this is why Americans got fat eating low-fat diets for 30 years instead of following the once-maligned, now-considered food god Dr. Atkins. The body of science was wrong because they took shortcuts and made ASSUMPTIONS without having ALL the facts. Audio may not be the same...but there are certainly ASSUMPTIONS. There are two terms in testing. Reliability means a test reproduces the same results over and over, so we can reliably predict what is going to happen in a test involving trials. Then there is validity: how does what is being tested directly relate to reality? If a stereo is designed to provide long-term musical enjoyment in one's home, then how valid is a test not set up to that goal? Vague, yes...but lots of bad tests have reliability, and of the two, validity is MORE important than reliability. You'd need both. The direct problem is that normal listening is sighted, which is contradictory to what a DBT demands...it is this that causes "some" of the confusion and bickering. There is nothing wrong with double-blind tests, but they are not the complete story according to psychologists or statisticians. The complete story to engineers? Pick your field.
Agreed, assumptions are made that completely invalidate the tests.
Two points I would like to reply to.

1) Who decides which tests are valid and which aren't?

2) One can duplicate a test over and over again and get the same results each time, because the same assumptions lead people to the same conclusions. The results are repeatable and still totally inaccurate.

For instance, how many times do you have the subjects listen to the same selection, and over what period of time? Listening over and over again certainly leads to the sound of the two different pieces of gear blending together. You will always get a result of no difference. This happens visually too. Pretty close to black will be perceived as black if shown enough times. (This applies to any color: show the actual color and another that is close to it.)
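
On the question of how many trials: the "statistically significant level better than chance" in the quote is usually judged with something like a binomial calculation. Below is a rough sketch in Python, not the method used in any particular study. It assumes a forced-choice test where pure guessing is right 50% of the time, and a 0.05 cutoff; the function names, the 50% chance level, and the cutoff are all my assumptions for illustration.

```python
from math import comb


def p_value_at_least(correct, trials, chance=0.5):
    """One-sided exact binomial p-value.

    Probability of getting `correct` or more right out of `trials`
    if the listener is purely guessing (assumed chance level: 0.5).
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def min_correct_for_significance(trials, alpha=0.05, chance=0.5):
    """Smallest score whose guessing probability is at or below alpha.

    Returns None if even a perfect score cannot reach alpha
    (too few trials). alpha=0.05 is an assumed, conventional cutoff.
    """
    for correct in range(trials + 1):
        if p_value_at_least(correct, trials, chance) <= alpha:
            return correct
    return None


if __name__ == "__main__":
    for n in (10, 16):
        print(n, "trials ->", min_correct_for_significance(n), "correct needed")
```

Under these assumptions a listener needs 9 of 10 or 12 of 16 correct to beat the cutoff. Note that this only speaks to reliability within the test. It says nothing about whether the test conditions (repetition, fatigue, short excerpts) are valid for normal listening, which is the point being argued here.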

Another problem is that if any comments are made during a test, they could end up being deceptive. And in fact, deceit was directed toward the subjects, causing erroneous results. Crafts used one reference in which this occurred.