Null hypothesis significance testing (NHST) has long been the dominant method of inference in psychology, yet its flaws are well documented. Both Cohen (1990) and Schmidt (1996) highlight the dangers of relying on statistical significance as the primary criterion for evaluating research findings, though each emphasizes a different aspect of the problem. Together, the critiques show that NHST not only misleads individual researchers in their interpretation of results but also obstructs psychology’s ability to build cumulative scientific knowledge.

Cohen (1990) directed attention to the conceptual and interpretive flaws rooted in NHST. He observed that many researchers misunderstand the meaning of a p-value, mistakenly interpreting it as the probability that the null hypothesis is true. A p-value instead reflects the probability of obtaining data at least as extreme as those observed if the null hypothesis were true, which is a very different proposition. Cohen described the ritualistic use of p < .05 as a kind of secular religion, in which arbitrary cutoffs dictate scientific conclusions. More importantly, Cohen (1990) noted that statistical significance does not speak to the magnitude of an effect. Large samples can render trivial differences statistically significant, while small samples often fail to detect meaningful effects. This misalignment between significance and practical importance has, in Cohen’s (1990) view, diverted attention away from effect sizes, confidence intervals, and statistical power. He urged psychologists to focus on estimating the size and importance of effects rather than celebrating the mere achievement of significance.
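
The dependence of significance on sample size can be made concrete with a small calculation. The sketch below is an illustration only and is not taken from Cohen (1990): it assumes a two-group comparison with equal group sizes, a known standard deviation of 1, and a two-sided z-test, and the function name `two_sided_p` and the example values are hypothetical.

```python
import math


def two_sided_p(d, n_per_group):
    """Two-sided p-value for a two-group z-test.

    d is the standardized mean difference; the SD is assumed known and equal
    to 1 in both groups, which is a simplification chosen for illustration.
    """
    z = d * math.sqrt(n_per_group / 2)        # test statistic for the difference
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))


# A trivial effect in a huge sample versus a medium effect in a small sample.
for d, n in [(0.02, 50_000), (0.5, 20)]:
    p = two_sided_p(d, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"d = {d}, n per group = {n}: p = {p:.4f} ({verdict})")
```

Under these assumptions, the negligible effect (d = 0.02) yields p of roughly .002, while the medium effect (d = 0.5) yields p of roughly .11. The p-value alone would rank the two findings in the opposite order of their practical importance.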

Schmidt (1996), on the other hand, emphasized the consequences of significance testing for the field as a whole. He argued that NHST undermines the cumulative progress of psychology by fostering publication bias, wherein studies that fail to reach significance are suppressed in the file drawer. This selective publication inflates the apparent robustness of effects and creates a biased scientific record. Schmidt (1996) also highlighted how strongly the outcome of a significance test depends on sample size. Small studies are typically underpowered and prone to Type II errors, whereas very large samples can make even slight effects appear statistically compelling. As a result, NHST provides a poor basis for theory building and instead encourages fragmented findings that fail to accumulate into reliable knowledge. According to Schmidt (1996), the problems with NHST are not solely methodological but structural, hindering discovery and limiting psychology’s growth as a cumulative science.
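
The file-drawer mechanism can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not a reproduction of any analysis in Schmidt (1996): scores are normal with a standard deviation of 1, the true effect is d = 0.2, each study has 20 participants per group and is analysed with a two-sided z-test, and only results with p < .05 are "published". The function name `simulate_file_drawer` and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_file_drawer(n_studies=5000, n_per_group=20, true_d=0.2, seed=1):
    """Return (all_effects, published_effects) for simulated two-group studies.

    Assumptions (illustrative only): normal scores with SD = 1, a two-sided
    z-test per study, and publication only when p < .05.
    """
    rng = random.Random(seed)
    se = math.sqrt(2 / n_per_group)  # standard error of the mean difference
    all_effects, published = [], []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        d_obs = statistics.fmean(treated) - statistics.fmean(control)
        p = math.erfc(abs(d_obs / se) / math.sqrt(2))
        all_effects.append(d_obs)
        if p < 0.05:  # only significant studies leave the file drawer
            published.append(d_obs)
    return all_effects, published


all_effects, published = simulate_file_drawer()
print(f"studies run: {len(all_effects)}, published: {len(published)}")
print(f"mean effect, all studies: {statistics.fmean(all_effects):.2f}")
print(f"mean effect, published only: {statistics.fmean(published):.2f}")
```

In this setup most studies are underpowered and never reach significance, so only a small fraction are published. Those that are published must have observed an unusually large difference to cross the threshold, so their average overstates the true effect, while the average across all studies stays close to 0.2.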

References

Cohen, J. (1990). Things I Have Learned (So Far). American Psychologist, 45(12), 1304–1312.

Schmidt, F. (1996). Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers. Psychological Methods, 1(2), 115–129.
