If the recent so-called crisis in psychology has highlighted anything, it is the prevalence and danger of post hoc narratives. Although statistical practices (e.g., the use of significance testing) have gotten much of the blame -- at least in my corner of the research world -- the main problems are actually a level or two above that. Combining ill- or flexibly-defined theoretical concepts, post hoc reasoning, and publication bias yields a potent mixture that, I would argue, is responsible for the crisis.
I have been thinking about this in the context of assignments that we ask our undergraduates to do, and how we actually train post hoc reasoning early on. Here I'll offer examples of two undergraduate assignments that I think reward scientists-in-training for their BS generation skills. I'll also elaborate on what I think we can do about this.
|The "Texas Sharpshooter" paints the target around his bullet holes.|
Example assignment: Critique a peer-reviewed article

The assignment: Students are assigned an article from a peer-reviewed psychological journal and asked to critique it. Ideally, they choose a few critiques and argue for them in their essay.
The basic problem with this assignment is that students are not particularly well-versed in any particular psychological topic, nor in psychological research methods. Peer-reviewed articles, on the other hand, have been reviewed by people who are, which means that whatever problems remain with the research have evaded skilled reviewers. This is not to say that peer-reviewed research does not have major problems, but it does mean that students who have had only a few basic courses and little experience reading peer-reviewed research are unlikely to find good-quality critiques spontaneously.
Upon reading such an article and having difficulty finding a critique, a student is in an awkward position: they must write an essay. So what do they do? They come up with whatever critiques come to mind, which are likely to be low-quality ones. I suspect readers of this blog have encountered these sorts of critiques in student assignments: maybe there are cultural differences? The sample seems small. Are these really the best stimuli to use? Students must choose a number of these arguments and argue for them, despite not having sufficient knowledge on which to base such a critique. We're training them in the fine art of bullshit.
This is not to say that these problems don't occur in some studies. But forming a good argument for why they matter takes specialized knowledge that students don't yet have, so we get back noise. And who gets the best marks for such an assignment? Students who can write clearly about things of which they have little actual understanding.
We have to ask ourselves: is it any wonder that we have a replication crisis?
Example assignment: Do an experiment and interpret the results

The assignment: Students are asked to perform a simple experiment (often in groups), analyze the data, and report the results. They must interpret the results in light of the research they've read (often primarily the textbook).
Experience doing simple experiments and analyzing the results is critical for a psychologist-in-training. But how the assignment is framed and marked determines whether we are training the skills we want. Students in chemistry, biology, and physics all perform simple experiments and report the results; this is as it should be.
What is different about interpreting the results of a typical psychology experiment, compared with a chemistry experiment, is that there are very strong reasons to expect something specific to happen in the chemistry experiment. If the psychology experiment doesn't come out as the textbook predicts, though, students must describe why that might be. There are, of course, a hundred possible reasons, including the possibility that the original study was wrong, statistical noise, and sloppiness in their own experimental procedure.
But these are not the explanations they will explore. We require students to show creativity and independent reading and thought. In an assignment like this, students know that the best way to get a good mark is to find a paper whose logic might predict the results obtained, and to include a cogent argument for why this might have caused the differences. The students turn in the paper and, of course, never test their hypothesis; the argument is simply thrown in to get a better mark. The students who do the most independent reading and form the best-sounding argument get the best marks.
This should all sound eerily familiar: we are training them in the time-honored tradition of post hoc arguments for "hidden moderators".
Fixing the problems

If we want to train good psychologists, we must be very sensitive to the skills we're actually teaching, as opposed to those we think we are teaching. The practices in the field will be a reflection of what students are taught. How might we use assignments to train critical thinking, without teaching bad practice?
Critiquing pop science

The problem with the critique of a peer-reviewed article is that students are unlikely to be able to spot the real problems with it. This is somewhat like asking first-year sports therapists to critique a professional sports player's technique; the imperfections are simply too fine, because the professionals have been honing their craft with expert help for years. It would be better to ask them to critique amateur players' techniques, which have more glaring problems.
Unfortunately, there is no "amateur" peer-reviewed research. There is, however, a lot of very bad non-peer-reviewed pop science. Psychologists-in-training would benefit from assessing bad popular science (not just popular psychology): for instance, spurious claims of causation (vs. correlation), overblown effect sizes, and mismatches between what a pop article claims about a piece of research and what was actually done. Critiquing popular science develops similar skills to critiquing a peer-reviewed article, without the unfortunate side effect of asking students to BS their way to a good mark.
Separating critiques of method from critiques of results

Critiquing methods alongside results leads to an unfortunate asymmetry: if an experiment yields the expected result, the methods are not critiqued, whereas if it doesn't, students are encouraged to generate BS reasons why it might not have worked, with no expectation of testing those reasons. If students were asked to critique methods on their own, they would not be rewarded for such post hoc reasoning. Moreover, in an essay of typical length, this leaves more room to discuss why the methods are problematic; for instance, if the sample size is a concern, a methods-only critique would allow space for a power analysis. In a methods-plus-results critique, I often see critiques of sample sizes with no corresponding argument for why the sample size is a problem.
Being specific about potential critiques

In whatever assignments we give to undergraduates, we should be specific about what sorts of critiques we are expecting, preferably giving a short list of possible ones. The students will still have to read the target article, but instead of taking a shot in the dark and being forced to argue for it, they will be asked, for instance, "Does this research suffer from a confound with X?", "Is this experiment sufficiently powered to detect an effect size of Z?", or "Does this DV represent a good operationalization of W?"
Perhaps, for instance, power is not a problem; students would then be in a position to argue that, yes, the experiment is sufficiently powered, instead of always (vaguely) attacking the article. Always asking for critique teaches students that critical thinking is about dreaming up as many ways to attack an article as possible and then forming a plausible-seeming argument around them. In contrast, being very specific about possible critiques -- which may not, in fact, turn out to be problems -- will develop critical thinking and argumentation skills better.
If we believe psychology is in crisis, we should look at the way we train undergraduates to see whether part of the problem lies there; I think the crisis is, in some ways, reflected in our training. Doing better is not just about better statistical training or better open science training; it is also about ensuring a match between what we think we are teaching and what we actually teach.