UK psychologist Monica Harris’s take on Steve Ziliak’s talk, “The Cult of Statistical Significance”
Social psychologists have been making Ziliak’s argument for a very long time. The best-known “classic” article along these lines is Meehl’s:
· Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806-834.
The first half is a ringing indictment of significance testing as it is (improperly) used in psychology; his best quote is “the almost universal reliance on merely refuting the null hypothesis…is a terrible mistake, is basically unsound, poor scientific strategy, and one of the worst things that ever happened in the history of psychology” (p. 817). The second half of the article presents a different way of doing things (consistency tests) that never caught on and can be safely skimmed or skipped.
Another classic article is Jacob Cohen’s, which has one of the best article titles I’ve ever seen:
· Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.
This should be required reading for every social scientist (and is for my students).
The Rosnow and Rosenthal article below covers other issues in addition to effect size estimates, but it is essential reading for its classic line, “surely, God loves the p of .06 nearly as much as the p of .04,” a line that I have used with varying success to get journal editors to accept results in my manuscripts that just miss traditional levels of significance.
· Rosnow, R. L., & Rosenthal, R. (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284.
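Rosnow and Rosenthal’s point is easy to make concrete with a little arithmetic (the effect size and sample sizes below are hypothetical, chosen purely for illustration): the very same standardized effect can land on either side of the .05 line depending on nothing but sample size.

```python
import math

def two_sided_p_from_z(z):
    # Two-sided p-value for a z statistic, via the normal CDF:
    # p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# Identical standardized effect size (d = 0.3), two different sample sizes
d = 0.3
for n in (47, 39):
    z = d * math.sqrt(n)          # z for a one-sample z-test
    p = two_sided_p_from_z(z)
    print(f"n = {n}: p = {p:.3f}")
```

Here the p of about .04 and the p of about .06 describe exactly the same underlying effect; treating one as a “finding” and the other as a failure is the dichotomous thinking the article criticizes.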
I am by no means suggesting that I belong up there in the hallowed company of Meehl, Cohen, and Rosenthal, but I might humbly recommend my own little effort at making a case for effect sizes:
· Harris, M. J. (1991). Significance tests are not enough: The role of effect size estimation in theory corroboration. Theory and Psychology, 1, 375-382.
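For readers who have not run into effect size estimation before, here is a minimal sketch of the most common standardized effect size, Cohen’s d, for two independent groups (the data are made up for illustration):

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    # Cohen's d: mean difference divided by the pooled standard deviation
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

treated = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
control = [4.2, 4.6, 5.0, 4.4, 4.8, 4.3]
print(f"d = {cohens_d(treated, control):.2f}")
```

Because d expresses the group difference in pooled standard-deviation units, it stays meaningful and interpretable regardless of whether the accompanying p-value happens to clear .05.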
Fairness compels me to suggest two articles that gamely struggle to make a case for significance testing:
· Frick, R. W. (1996). The appropriate use of null hypothesis testing. Psychological Methods, 1, 379-390.
· Abelson, R. P. (1997). On the surprising longevity of flogged horses: Why there is a case for the significance test. Psychological Science, 8, 12-15.
In my methods course, I assign all these articles and then attempt to hold a rousing debate on the role of significance testing in social psychology, a debate that usually goes nowhere fast as the students wisely hustle to line up on my side. ;-)