# The Mind-Reading Salmon: The True Meaning of Statistical Significance

## The Mind-Reading Salmon: The True Meaning of Statistical Significance

If you want to convince the world that a fish can sense your emotions, only one statistical measure will suffice: the p-value.

The p-value is an all-purpose measure that scientists often use to determine whether or not an experimental result is “statistically significant.” Unfortunately, sometimes the test does not work as advertised, and researchers imbue an observation with great significance when in fact it might be a worthless fluke.

Say you’ve performed a scientific experiment testing a new heart attack drug against a placebo. At the end of the trial, you compare the two groups. Lo and behold, the patients who took the drug had fewer heart attacks than those who took the placebo. Success! The drug works!

Well, maybe not. There is a 50 percent chance that even if the drug is completely ineffective, patients taking it will do better than those taking the placebo. (After all, one group has to do better than the other; it’s a toss-up whether the drug group or placebo group will come up on top.)

The p-value puts a number on the effects of randomness. It is the probability of seeing a positive experimental outcome even if your hypothesis is wrong. A long-standing convention in many scientific fields is that any result with a p-value below 0.05 is deemed statistically significant. An arbitrary convention, it is often the wrong one. When you make a comparison of an ineffective drug to a placebo, you will typically get a statistically significant result one time out of 20. And if you make 20 such comparisons in a scientific paper, on average, you will get one significant result with a p-value less than 0.05—even when the drug does not work.
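The "one time out of 20" figure can be checked with a small simulation. The sketch below is only an illustration under assumed numbers: the group size (200), the heart attack risk (0.3), and the number of simulated trials are all hypothetical, and the two-proportion z-test is one common way to get a p-value, not necessarily the test any particular study used. Both groups share the same risk, so every "significant" result is a fluke.

```python
import math
import random


def two_sided_p_value(events_a, events_b, n):
    """Two-proportion z-test p-value for two groups of equal size n."""
    pooled = (events_a + events_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation at all, so no evidence of a difference
    std_err = math.sqrt(2 * pooled * (1 - pooled) / n)
    z = (events_a - events_b) / n / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(trials=2000, n=200, risk=0.3, alpha=0.05, seed=1):
    """Fraction of trials with an ineffective drug that still reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Drug and placebo groups have the SAME risk: the drug does nothing.
        drug = sum(rng.random() < risk for _ in range(n))
        placebo = sum(rng.random() < risk for _ in range(n))
        if two_sided_p_value(drug, placebo, n) < alpha:
            hits += 1
    return hits / trials


print(false_positive_rate())
```

The printed fraction should land close to 0.05, which is the threshold itself: the cutoff you choose is the rate at which an ineffective drug will look effective.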

Many scientific papers make 20 or 40 or even hundreds of comparisons. In such cases, researchers who do not adjust the standard p-value threshold of 0.05 are virtually guaranteed to find statistical significance in results that are meaningless statistical flukes. A study that ran in the February issue of the American Journal of Clinical Nutrition tested dozens of compounds and concluded that those found in blueberries lower the risk of high blood pressure, with a p-value of 0.03. But the researchers looked at so many compounds and made so many comparisons (more than 50), that it was almost a sure thing that some of the p-values in the paper would be less than 0.05 just by chance.
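How close to "a sure thing" this is can be worked out directly. The sketch below assumes the comparisons are independent and that none of the effects are real, which is a simplification; real comparisons within one study are usually correlated. It also shows the Bonferroni correction, one standard way to adjust the threshold (the article does not say which adjustment, if any, it has in mind). The function names are made up for this illustration.

```python
def familywise_error_rate(comparisons, alpha=0.05):
    """Chance of at least one p < alpha among independent null comparisons."""
    return 1 - (1 - alpha) ** comparisons


def bonferroni_threshold(comparisons, alpha=0.05):
    """Per-comparison threshold that keeps the overall error rate near alpha."""
    return alpha / comparisons


for m in (1, 20, 40, 50, 100):
    print(m, round(familywise_error_rate(m), 3), bonferroni_threshold(m))
```

With 20 comparisons the chance of at least one fluke is about 64 percent, and with 50 it is about 92 percent. Under Bonferroni, 50 comparisons would require p < 0.001 per comparison, so a p-value of 0.03 would not clear the adjusted bar.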

The same applies to a well-publicized study that a team of neuroscientists once conducted on a salmon. When they presented the fish with pictures of people expressing emotions, regions of the salmon’s brain lit up. The result was statistically significant with a p-value of less than 0.001; however, as the researchers argued, there are so many possible patterns that a statistically significant result was virtually guaranteed, so the result was totally worthless. p-value notwithstanding, there was no way that the fish could have reacted to human emotions. The salmon in the fMRI happened to be dead.

From: http://www.scientificamerican.com/article.cfm?id=the-mind-reading-salmon&WT.mc_id=SA_CAT_BS_20110812


**bruna**- Posts : 42

Join date : 04/02/2010

Age : 31

Location : Leiden, Netherlands

## Re: The Mind-Reading Salmon: The True Meaning of Statistical Significance

Absolutely relevant.

Aspirin has a different effect depending on your zodiac sign...

ISIS-2 Trial: Details of the Trial and the Subgroup Analysis by Astrological Sign

The large ISIS-2 trial involved 17,000 patients. The beneficial effect of aspirin for patients having a heart attack was very substantial and equal to the effect of streptokinase (a powerful clot dissolving medication). Both were life saving medications. (The trial result for aspirin was very statistically significant (2p <.00001) with much less than a 1/1000 chance of these findings being the result of chance.)

The ISIS-2 investigators note: "When in a trial with a clearly positive overall result, many subgroup analyses are considered, false negative results in some particular subgroups must be expected."

The ISIS-2 authors then give as an example that “subdivision of the patients in ISIS-2 with respect to their astrological birth sign appears to indicate that for persons born under Gemini or Libra, there was a slightly adverse effect of aspirin on mortality (9% increase, SD 13; NS), while for patients born under all other astrological signs there was a striking beneficial effect (28% reduction, SD 5; 2p <0.00001).”

The subgroup analysis suggesting that Gemini and Libra had an adverse effect rather than a beneficial effect with aspirin was not a true relation. These patients would benefit from aspirin to an equal degree as the rest of the group.

http://www.improvingmedicalstatistics.com/Subgroup%20analysis1.htm
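A rough simulation shows how easily this happens. In the sketch below the drug has exactly the same true benefit for everyone, yet the estimated effect differs a lot from sign to sign. The mortality risks (12% on placebo, 9.5% on the drug) and the seed are assumptions chosen for illustration; they are not the ISIS-2 data, and only the patient count of 17,000 comes from the text above.

```python
import random

SIGNS = ["Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
         "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"]


def simulate_subgroups(patients=17000, placebo_risk=0.12, drug_risk=0.095, seed=7):
    """Randomise patients to drug or placebo; the true effect ignores the sign."""
    rng = random.Random(seed)
    # counts[sign][arm] = [deaths, patients]
    counts = {sign: {"drug": [0, 0], "placebo": [0, 0]} for sign in SIGNS}
    for _ in range(patients):
        sign = rng.choice(SIGNS)
        arm = rng.choice(("drug", "placebo"))
        risk = drug_risk if arm == "drug" else placebo_risk
        cell = counts[sign][arm]
        cell[0] += rng.random() < risk  # True counts as 1 death
        cell[1] += 1
    return counts


def percent_change_in_mortality(drug, placebo):
    """Relative change in death rate, drug vs placebo (negative = benefit).

    Assumes both arms contain patients and the placebo arm has deaths.
    """
    drug_rate = drug[0] / drug[1]
    placebo_rate = placebo[0] / placebo[1]
    return 100 * (drug_rate / placebo_rate - 1)


counts = simulate_subgroups()
for sign in SIGNS:
    change = percent_change_in_mortality(counts[sign]["drug"], counts[sign]["placebo"])
    print(f"{sign:12s} {change:+6.1f}%")
```

With roughly 1,400 patients per sign, the per-sign estimates scatter widely around the true value of about -21 percent, and picking out the best or worst sign after the fact will tend to exaggerate or reverse the real effect.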

Unfortunately, most papers include so many subgroup analyses that something always comes out statistically significant. See also this:

Lagakos SW. et al. (2006)

The challenge of subgroup analyses--reporting without distorting.

N Engl J Med. 2006 Apr 20;354(16):1667-9


**old stat**- Posts : 86

Join date : 24/12/2009

## Re: The Mind-Reading Salmon: The True Meaning of Statistical Significance

And since I started with astrology, have a look at this too:

http://cver.upei.ca/files/cver/04_Astrological%20associations%20and%20illness_jce.pdf

Testing multiple statistical hypotheses resulted in spurious associations: a study of astrological signs and health

Journal of Clinical Epidemiology 59 (2006) 964-969


**old stat**- Posts : 86

Join date : 24/12/2009

## Re: The Mind-Reading Salmon: The True Meaning of Statistical Significance

...and one last thing.

An acquaintance of mine works in a credit scoring department. They were asked to analyze the bank's data to find the characteristics of customers who are bad payers.

So they found that birth month was statistically significant!

Can you imagine loans being offered at a different interest rate if you are, say, a Scorpio with the moon in Plutarch?

I got carried away...


**old stat**- Posts : 86

Join date : 24/12/2009

## Re: The Mind-Reading Salmon: The True Meaning of Statistical Significance

...and the story of how this result came about...

"... Aspirin displayed a strongly beneficial effect in preventing death after myocardial infarction (p<0·00001, with a narrow confidence interval). The editors urged the researchers to include nearly 40 subgroup analyses. The investigators reluctantly agreed under the condition that they could provide a subgroup analysis of their own to illustrate their unreliability. They showed that participants born under the astrological signs Gemini or Libra had a slightly adverse effect on death from aspirin (9% increase, SD 13; not significant) whereas participants born under all other astrological signs reaped a strikingly beneficial effect (28% reduction, SD 5; p<0·00001). Anecdotal reports of support from astrologers to the contrary, this chance zodiac finding has generated little interest from the medical community. The authors concluded from their subgroup analyses that: “All these subgroup analyses should, perhaps, be taken less as evidence about who benefits than as evidence that such analyses are potentially misleading.”"

See Lancet 2005; 365: 1591–95 and Lancet 2005; 365: 1657–61

**old stat**- Posts : 86

Join date : 24/12/2009
