UK psychologist Monica Harris’s take on Steve Ziliak’s talk, “The Cult of Statistical Significance”
Social psychologists have been making Ziliak’s argument for a very long time. The best-known “classic” article along these lines is Meehl’s:
· Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806-834.
The first half is a ringing indictment of significance testing as it is (improperly) used in psychology; his best quote is “the almost universal reliance on merely refuting the null hypothesis…is a terrible mistake, is basically unsound, poor scientific strategy, and one of the worst things that ever happened in the history of psychology” (p. 817). The second half of the article presents a different way of doing things (consistency tests) that never caught on and can be safely skimmed or skipped.
Another classic article is Jacob Cohen’s, which has one of the best article titles I’ve ever seen:
· Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.
This should be required reading for every social scientist (and is for my students).
The Rosnow and Rosenthal article below covers other issues in addition to effect size estimates, but it is essential reading for its classic line, “surely, God loves the p of .06 nearly as much as the p of .04,” a line that I have used with varying success to get journal editors to accept results in my manuscripts that just miss traditional levels of significance.
· Rosnow, R. L., & Rosenthal, R. (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284.
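Rosnow and Rosenthal’s point is easy to make concrete with a little arithmetic (the effect size and sample sizes below are hypothetical, chosen purely for illustration): the very same standardized effect can land on either side of the .05 line depending on nothing but sample size.

```python
import math

def two_sided_p_from_z(z):
    # Two-sided p-value for a z statistic, via the normal CDF:
    # p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# Identical standardized effect size (d = 0.3), two different sample sizes
d = 0.3
for n in (47, 39):
    z = d * math.sqrt(n)          # z for a one-sample z-test
    p = two_sided_p_from_z(z)
    print(f"n = {n}: p = {p:.3f}")
```

Here the p of about .04 and the p of about .06 describe exactly the same underlying effect; treating one as a “finding” and the other as a failure is the dichotomous thinking the article criticizes.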
I am by no means suggesting that I belong up there in the hallowed company of Meehl, Cohen, and Rosenthal, but I might humbly recommend my own little effort at making a case for effect sizes:
· Harris, M. J. (1991). Significance tests are not enough: The role of effect size estimation in theory corroboration. Theory and Psychology, 1, 375-382.
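For readers who have not run into effect size estimation before, here is a minimal sketch of the most common standardized effect size, Cohen’s d, for two independent groups (the data are made up for illustration):

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    # Cohen's d: mean difference divided by the pooled standard deviation
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

treated = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
control = [4.2, 4.6, 5.0, 4.4, 4.8, 4.3]
print(f"d = {cohens_d(treated, control):.2f}")
```

Because d expresses the group difference in pooled standard-deviation units, it stays meaningful and interpretable regardless of whether the accompanying p-value happens to clear .05.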
Fairness compels me to suggest two articles that gamely struggle to make a case for significance testing:
· Frick, R. W. (1996). The appropriate use of null hypothesis testing. Psychological Methods, 1, 379-390.
· Abelson, R. P. (1997). On the surprising longevity of flogged horses: Why there is a case for the significance test. Psychological Science, 8, 12-15.
In my methods course, I assign all these articles and then attempt to hold a rousing debate on the role of significance testing in social psychology, a debate that usually goes nowhere fast as the students wisely hustle to line up on my side. ;-)