The generalizability crisis - PubMed
- Wed Jan 01 2020
The generalizability crisis
Tal Yarkoni. Behav Brain Sci. 2020.
Abstract
Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned - that is, that the two must refer to roughly the same set of hypothetical observations. Here, I argue that many applications of statistical inference in psychology fail to meet this basic condition. Focusing on the most widely used class of model in psychology - the linear mixed model - I explore the consequences of failing to statistically operationalize verbal hypotheses in a way that respects researchers' actual generalization intentions. I demonstrate that although the "random effect" formalism is used pervasively in psychology to model intersubject variability, few researchers accord the same treatment to other variables they clearly intend to generalize over (e.g., stimuli, tasks, or research sites). The under-specification of random effects imposes far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints can dramatically inflate false-positive rates, and often leads researchers to draw sweeping verbal generalizations that lack a meaningful connection to the statistical quantities they are putatively based on. I argue that failure to take the alignment between verbal and statistical expressions seriously lies at the heart of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.
Keywords: Generalization; inference; philosophy of science; psychology; random effects; statistics.
Conflict of interest statement
Conflict of interest. None.
Figures

Consequences of mismatch between model specification and generalization intention. Each row represents a simulated Stroop experiment with n = 20 new subjects randomly drawn from the same global population (the ground truth for all parameters is constant over all experiments). Bars display the estimated Bayesian 95% highest posterior density (HPD) intervals for the (fixed) condition effect of interest in each experiment. Experiments are ordered by the magnitude of the point estimate for visual clarity. (A) The fixed-effects model specification in Eq. (1) does not account for random subject sampling, and consequently underestimates the uncertainty associated with the effect of interest. (B) The random-effects specification in Eq. (2) takes subject sampling into account, and produces appropriately calibrated uncertainty estimates.

Effects of unmeasured variance components on the putative “verbal overshadowing” effect. Error bars display the estimated Bayesian 95% highest posterior density (HPD) intervals for the experimental effect reported in Alogna et al. (2014). Positive estimates indicate better performance in the control condition than in the experimental condition. Each row represents the estimate from the model specified in Eq. (4), with only the size of $\sigma^2_{\text{unmeasured}}$ (corresponding to $\sigma^2_{u_2}$ in Eq. (4)) varying as indicated. This parameter represents the assumed contribution of all variance components that are unmeasured in the experiment, but fall within the universe of intended generalization conceptually. The top row ($\sigma^2_{u_2} = 0$) can be interpreted as a conventional model analogous to the one reported in Alogna et al. (2014); that is, it assumes that no unmeasured sources have any impact on the putative verbal overshadowing effect.
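The contrast between panels (A) and (B) of the simulated Stroop experiments can be illustrated with a small simulation. The sketch below is not the Bayesian model of Eq. (1) or Eq. (2), neither of which is reproduced on this page. It is a simplified frequentist stand-in: each trial is reduced to a single incongruent-minus-congruent difference, and all parameter values (`true_effect`, `subject_slope_sd`, `trial_sd`, trial counts) are hypothetical. It compares how often a nominal 95% interval covers the true effect when trials are pooled as if subjects were fixed, versus when subjects are treated as a random sample.

```python
import math
import random


def mean_and_se(values):
    """Sample mean and its standard error, treating values as independent draws."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def simulate_experiment(rng, n_subjects=20, n_trials=40, true_effect=30.0,
                        subject_slope_sd=40.0, trial_sd=100.0):
    """One experiment: per-subject lists of trial-level Stroop differences (ms).

    Every number here is an illustrative assumption, not a value from the paper.
    """
    data = []
    for _ in range(n_subjects):
        # Each new subject has their own true effect (random slope).
        subject_effect = rng.gauss(true_effect, subject_slope_sd)
        data.append([rng.gauss(subject_effect, trial_sd) for _ in range(n_trials)])
    return data


def ci_ignoring_subjects(data, z=1.96):
    """Pool all trials, ignoring that they are clustered within subjects."""
    pooled = [d for subject in data for d in subject]
    mean, se = mean_and_se(pooled)
    return mean - z * se, mean + z * se


def ci_by_subject(data, t_crit=2.093):
    """Treat subjects as the sampled unit; t_crit is t(0.975, df=19), i.e. n=20."""
    subject_means = [sum(s) / len(s) for s in data]
    mean, se = mean_and_se(subject_means)
    return mean - t_crit * se, mean + t_crit * se


def coverage(ci_fn, n_experiments=300, seed=0, true_effect=30.0):
    """Fraction of simulated experiments whose interval contains the true effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        low, high = ci_fn(simulate_experiment(rng, true_effect=true_effect))
        hits += low <= true_effect <= high
    return hits / n_experiments


if __name__ == "__main__":
    print("pooled (subjects ignored):", coverage(ci_ignoring_subjects))
    print("subject-level (random):   ", coverage(ci_by_subject))
```

Under these assumed values, the pooled interval is too narrow because it counts every trial as an independent observation, so its coverage falls well below the nominal 95%, while the subject-level interval stays close to it. The same logic applies to any factor a researcher intends to generalize over (stimuli, tasks, sites) but leaves out of the model.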
Comment in
- Psychologists should learn structural specification and experimental econometrics.
  Ross D. Behav Brain Sci. 2022 Feb 10;45:e29. doi: 10.1017/S0140525X21000108. PMID: 35139932
- Science with or without statistics: Discover-generalize-replicate? Discover-replicate-generalize?
  Ioannidis JPA. Behav Brain Sci. 2022 Feb 10;45:e23. doi: 10.1017/S0140525X21000054. PMID: 35139936
- Lessons from behaviorism: The problem of construct-led science.
  Dickins TE, Rahman Q. Behav Brain Sci. 2022 Feb 10;45:e12. doi: 10.1017/S0140525X2100008X. PMID: 35139939
- Mismatch between scientific theories and statistical models.
  Gelman A. Behav Brain Sci. 2022 Feb 10;45:e15. doi: 10.1017/S0140525X21000091. PMID: 35139940
- The four different modes of psychological explanation, and their proper evaluative schemas.
  Gilead M. Behav Brain Sci. 2022 Feb 10;45:e17. doi: 10.1017/S0140525X21000145. PMID: 35139941
- Random effects won't solve the problem of generalizability.
  Bear A, Phillips J. Behav Brain Sci. 2022 Feb 10;45:e3. doi: 10.1017/S0140525X2100011X. PMID: 35139942
- Syed M, McLean KC. Behav Brain Sci. 2022 Feb 10;45:e32. doi: 10.1017/S0140525X21000431. PMID: 35139943
- Mechanistic modeling for the masses.
  Turner MA, Smaldino PE. Behav Brain Sci. 2022 Feb 10;45:e33. doi: 10.1017/S0140525X2100039X. PMID: 35139944
- Addressing a crisis of generalizability with large-scale construct validation.
  Flake JK, Luong R, Shaw M. Behav Brain Sci. 2022 Feb 10;45:e14. doi: 10.1017/S0140525X21000376. PMID: 35139945
- Observing effects in various contexts won't give us general psychological theories.
  Donkin C, Szollosi A, Bramley NR. Behav Brain Sci. 2022 Feb 10;45:e13. doi: 10.1017/S0140525X21000479. PMID: 35139946
- Separate substantive from statistical hypotheses and treat them differently.
  Dacey M. Behav Brain Sci. 2022 Feb 10;45:e9. doi: 10.1017/S0140525X21000157. PMID: 35139947
- Generalizability, transferability, and the practice-to-practice gap.
  de Leeuw JR, Motz BA, Fyfe ER, Carvalho PF, Goldstone RL. Behav Brain Sci. 2022 Feb 10;45:e11. doi: 10.1017/S0140525X21000406. PMID: 35139948
- Impact on the legal system of the generalizability crisis in psychology.
  Brewin CR. Behav Brain Sci. 2022 Feb 10;45:e7. doi: 10.1017/S0140525X21000480. PMID: 35139949
- Without more theory, psychology will be a headless rider.
  Hensel WM, Miłkowski M, Nowakowski P. Behav Brain Sci. 2022 Feb 10;45:e20. doi: 10.1017/S0140525X21000212. PMID: 35139950
- The "'Crisis' Crisis" in psychology.
  Medaglia JD, Fernandez KA. Behav Brain Sci. 2022 Feb 10;45:e28. doi: 10.1017/S0140525X21000364. PMID: 35139951
- The crisis from above: Gatekeepers need better standards.
  Schiavone SR, Bottesini JG, Vazire S. Behav Brain Sci. 2022 Feb 10;45:e30. doi: 10.1017/S0140525X21000546. PMID: 35139952
- An accelerating crisis: Metascience is out-reproducing psychological science.
  Watson PD. Behav Brain Sci. 2022 Feb 10;45:e36. doi: 10.1017/S0140525X21000121. PMID: 35139953
- A crisis of generalizability or a crisis of constructs?
  King KM, Wright AGC. Behav Brain Sci. 2022 Feb 10;45:e24. doi: 10.1017/S0140525X21000443. PMID: 35139954
- Iliev R, Medin D, Bang M. Behav Brain Sci. 2022 Feb 10;45:e22. doi: 10.1017/S0140525X21000509. PMID: 35139955
- The role of generalizability in moral and political psychology.
  Harris EA, Pärnamets P, Brady WJ, Robertson CE, Van Bavel JJ. Behav Brain Sci. 2022 Feb 10;45:e19. doi: 10.1017/S0140525X2100042X. PMID: 35139956
- Causal complexity demands community coordination.
  Sievers B, DeFilippis E. Behav Brain Sci. 2022 Feb 10;45:e31. doi: 10.1017/S0140525X21000418. PMID: 35139957
- The cost of crisis in clinical psychological science.
  Grubbs JB. Behav Brain Sci. 2022 Feb 10;45:e18. doi: 10.1017/S0140525X21000388. PMID: 35139958
- Increasing generalizability via the principle of minimum description length.
  Bonifay W. Behav Brain Sci. 2022 Feb 10;45:e5. doi: 10.1017/S0140525X21000467. PMID: 35139959
- Improving the generalizability of infant psychological research: The ManyBabies model.
  Visser I, Bergmann C, Byers-Heinlein K, Dal Ben R, Duch W, Forbes S, Franchin L, Frank MC, Geraci A, Hamlin JK, Kaldy Z, Kulke L, Laverty C, Lew-Williams C, Mateu V, Mayor J, Moreau D, Nomikou I, Schuwerk T, Simpson EA, Singh L, Soderstrom M, Sullivan J, van den Heuvel MI, Westermann G, Yamada Y, Zaadnoordijk L, Zettersten M. Behav Brain Sci. 2022 Feb 10;45:e35. doi: 10.1017/S0140525X21000455. PMID: 35139960
- We need to be braver about the generalizability crisis.
  Braver TS, Braver SL. Behav Brain Sci. 2022 Feb 10;45:e6. doi: 10.1017/S0140525X21000510. PMID: 35139961
- From description to generalization, or there and back again.
  West KL, Soska KC, Cole WG, Han D, Hoch JE, Hospodar CM, Kaplan BE. Behav Brain Sci. 2022 Feb 10;45:e37. doi: 10.1017/S0140525X21000522. PMID: 35139962 Free PMC article.
- Generalizability challenges in applied psychological and organizational research and practice.
  Wiernik BM, Raghavan M, Allan T, Denison AJ. Behav Brain Sci. 2022 Feb 10;45:e38. doi: 10.1017/S0140525X21000492. PMID: 35139963
- Exposing and overcoming the fixed-effect fallacy through crowd science.
  Cyrus-Lai W, Tierney W, Schweinsberg M, Uhlmann EL. Behav Brain Sci. 2022 Feb 10;45:e8. doi: 10.1017/S0140525X21000297. PMID: 35139965
- Publishing fast and slow: A path toward generalizability in psychology and AI.
  Lampinen AK, Chan SCY, Santoro A, Hill F. Behav Brain Sci. 2022 Feb 10;45:e26. doi: 10.1017/S0140525X21000224. PMID: 35139966
- Wilford R, Ardila-Cifuentes J, Baggs E, Anderson ML. Behav Brain Sci. 2022 Feb 10;45:e39. doi: 10.1017/S0140525X21000285. PMID: 35139967
- Maniadis Z. Behav Brain Sci. 2022 Feb 10;45:e27. doi: 10.1017/S0140525X21000273. PMID: 35139968
- There is no generalizability crisis.
  Lakens D, Uygun Tunç D, Necip Tunç M. Behav Brain Sci. 2022 Feb 10;45:e25. doi: 10.1017/S0140525X21000340. PMID: 35139969
- Generalizability in mixed models: Lessons from corpus linguistics.
  Van de Velde F, De Pascale S, Speelman D. Behav Brain Sci. 2022 Feb 10;45:e34. doi: 10.1017/S0140525X21000236. PMID: 35139970
- Measurement practices exacerbate the generalizability crisis: Novel digital measures can help.
  Davidson BI, Ellis DA, Stachl C, Taylor PJ, Joinson AN. Behav Brain Sci. 2022 Feb 10;45:e10. doi: 10.1017/S0140525X21000534. PMID: 35139971
- We need to think more about how we conduct research.
  Gigerenzer G. Behav Brain Sci. 2022 Feb 10;45:e16. doi: 10.1017/S0140525X21000327. PMID: 35139972
- Citizen science can help to alleviate the generalizability crisis.
  Hilton CB, Mehr SA. Behav Brain Sci. 2022 Feb 10;45:e21. doi: 10.1017/S0140525X21000352. PMID: 35139973
- Causal analysis as a bridge between qualitative and quantitative research.
  Blersch R, Franchuk N, Lucas M, Nord CM, Varsanyi S, Bonnell TR. Behav Brain Sci. 2022 Feb 10;45:e4. doi: 10.1017/S0140525X21000558. PMID: 35139974
- There is no psychology without inferential statistics.
  Alzahawi S, Monin B. Behav Brain Sci. 2022 Feb 10;45:e2. doi: 10.1017/S0140525X2100056X. PMID: 35139976
References
- Acosta A, Adams RB Jr., Albohn DN, Allard ES, Beek T, Benning SD, … Zwaan RA (2016). Registered replication report: Strack, Martin, & Stepper (1988). Perspectives on Psychological Science, 11(6), 917–928.
- Alogna VK, Attaya MK, Aucoin P, Bahník Š, Birch S, Birt AR, … Zwaan RA (2014). Registered replication report: Schooler and Engstler-Schooler (1990). Perspectives on Psychological Science, 9(5), 556–578.
- Baayen RH, Davidson DJ, & Bates DM (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
- Balota DA, Yap MJ, Hutchison KA, & Cortese MJ (2012). Megastudies: What do millions (or so) of trials tell us about lexical processing? In Adelman JS (Ed.), Visual word recognition volume 1: Models and methods, orthography and phonology (pp. 90–115). Psychology Press.