Just For Fun: The Trouble with p<.05

In statistics, a little star next to a coefficient generally means that the result is statistically significant at the p<.05 level. In English, this means that there is only a 1 in 20 chance that the finding just popped up by pure random chance. In sociology, that’s generally considered good enough to conclude that the finding is “real.”

If one investigates a lot of relationships, however, this way of deciding which ones to claim as real has an obvious pitfall. If you look at 20 possible but false relationships, you should expect about one of them to be statistically significant by chance alone; if the tests are independent, the odds that at least one of them clears the bar are roughly 64%. Do enough fishing in a dead lake, in other words, and you’ll inevitably pull up some garbage.
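For anyone who wants to check the fishing arithmetic, here is a minimal sketch in Python. It makes two assumptions that are not in the post: the 20 tests are independent, and under a true null each p-value behaves like a uniform draw between 0 and 1 (which holds for continuous test statistics). The function names and the number of simulated trials are made up for illustration.

```python
import random

ALPHA = 0.05   # the significance threshold discussed in the post
N_TESTS = 20   # number of false relationships we go fishing in

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests

def simulate(alpha, n_tests, n_trials=20_000, seed=0):
    """Repeat the 20-test fishing trip many times and tally the false positives."""
    rng = random.Random(seed)
    trips_with_a_catch = 0
    total_false_positives = 0
    for _ in range(n_trials):
        # Assumption: under a true null, each p-value is uniform on [0, 1].
        false_positives = sum(rng.random() < alpha for _ in range(n_tests))
        total_false_positives += false_positives
        trips_with_a_catch += false_positives > 0
    return trips_with_a_catch / n_trials, total_false_positives / n_trials

expected_false_positives = ALPHA * N_TESTS
fwer = familywise_error_rate(ALPHA, N_TESTS)
sim_fwer, sim_mean = simulate(ALPHA, N_TESTS)

print(f"Expected false positives per 20 tests: {expected_false_positives:.2f}")
print(f"Chance of at least one (formula):      {fwer:.3f}")
print(f"Chance of at least one (simulated):    {sim_fwer:.3f}")
print(f"Average false positives (simulated):   {sim_mean:.3f}")
```

The formula and the simulation should land close to each other. If the tests are correlated, the "at least one" figure will differ, though the expected count of about one false positive per 20 tests still holds.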

Thanks, xkcd, for making this funny.

Lisa Wade, PhD is an Associate Professor at Tulane University. She is the author of American Hookup, a book about college sexual culture; a textbook about gender; and a forthcoming introductory text: Terrible Magnificent Sociology. You can follow her on Twitter and Instagram.

Comments 4

dandanar — January 4, 2015

"In statistics, a little star next to a coefficient generally means that the result is statistically significant at the p<.05 level. In English, this means that there is only a 1 in 20 chance that the finding just popped up by pure random chance."

That's not quite right - we wish that p<.05 meant that a finding was only 5% likely to have popped up by chance, but p-values can't actually tell us that. p<.05 actually means that *if* there were no relationship (if the null hypothesis, usually that some correlation is zero, were true), then we would only see data like our data or more extreme less than 5% of the time. But p-values can't actually tell us how likely it was that the null was true from the beginning - and thus can't tell us how certain we should be of a finding after seeing the data.

Hence Ioannidis' provocative claim that "most published research findings are false". These findings all achieved statistical significance, but because of multiple comparisons and other problems, they are still more likely to be false than true. See also Andrew Gelman's body of work on the problems with p-values and multiple comparisons.

physioproffe — January 4, 2015

Benjamini-Hochberg, FTMFW!!
