Why I don’t like hypotheses
February 7, 2020 | General | 2 Comments
In the beginning
There was a time when I believed that stating a hypothesis was a crucial part of doing science — it was engrained in my education — starting in elementary school. I think every science fair project I did sported a neat and tidy hypothesis. And then, I dutifully set out in an attempt to “prove” that hypothesis, with a very vested interest in obtaining results toward that end. I certainly didn’t want any failure to harm my chances of getting a ribbon.
Over thirty years later, my own kids were clearly expected to display such a hypothesis for their early elementary school projects too. I did get involved and try to change this later, but I clearly remember trying to convince my young kids that they really didn’t need to include a superficial hypothesis on their science fair project board! I tried to make the case that being attached to a particular outcome could lead to them accidentally affecting the results, and that the exercise emphasized getting a particular “answer”, rather than investigating and learning. I won some years, but not all. The social pressure is real, even in elementary school.
I have seen many kids convey their conclusion as “my experiment failed because I didn’t prove my hypothesis.” I am not going to venture into the realm of issues with “confirmation” or “proof” here, as the point of this post is meant to be much simpler. We are in the habit of teaching and using superficial affirmative statements as hypotheses and it is not helping science. It might make us feel like we’re adhering to a systematic scientific method (and maybe convince some that we are), but it is not really supporting rigor in inductive inferences. Appealing to a theme of my blog posts — superficial hypotheses are yet another example of how we try to squeeze the process of doing science into frameworks that feel deductive — it’s comfortable to believe and act as if an answer is within reach.
This isn’t just happening in the beginning of our lives as scientists.
But not just in the beginning
I would like to say “Don’t worry, things improve after elementary school.” Unfortunately, I can’t honestly say that. It continues where stakes are much higher than a ribbon in the school science fair (though it doesn’t take much imagination to replace “ribbon” with “publication”, “award”, or “grant”).
What I have witnessed over the years from researchers with PhDs and plenty of grant funding is not much different from what happens in elementary school science fairs. This is not meant to be a commentary about the individual scientists, but the larger system in which they operate that feeds on its bad habits and superficial expectations. The use of superficial hypotheses is a particularly glaring example of one of those habits.
Over the years as a statistician, I have often brought up my negative view of hypotheses with researchers, usually in the context of helping with or reviewing grant proposals — and this is largely met with a look I don’t think I can adequately describe. The best I can do is “uh…-okay lady” or “Seriously?!!” Or, if not in person, it’s met with responses that it is a needed (i.e., expected) component of the paper or grant proposal. I think I have gotten through to a few (mainly those early in their careers and still open to views that might challenge norms) — but I think it’s rare for it to outlast the tidal wave of superficial-hypothesis-demanding culture they are diving into.
Template wording
What do I mean by superficial hypotheses? I’ll give a quick example. I am removing the specific context because the form is more important than the specific context — and this is a wording formula I saw repeatedly, particularly in the context of grants related to human health research. My suggestions to remove or change such wording were generally ignored in favor of staying within the template to increase chances of funding or publication. I have no idea whether it actually did increase the chances — but staying within the template was considered less risky, and I have no real evidence to argue against that.
So, here’s the example (in the context of an NIH grant proposal):
Aim 1: We will determine whether providing children with intervention A affects their academic achievement. Hypothesis 1: We hypothesize that children receiving intervention A will have higher school achievement scores compared to those who do not participate in the intervention.
I could spend more time discussing potential problems with the wording of the Aim and the Hypothesis, but I will force myself to stay on topic for now! Stated in this way, the hypothesis appears to exist to justify the researchers’ vested interest in a particular outcome. And, let’s be clear, they do have a vested interest in that outcome — their career and future grant funding probably depend on it (but that is a deeper part of the problem). Why dress up a very simple prediction and call it a Hypothesis? Does it trick us into thinking that rigorous science is being done through adherence to The Scientific Method? What does it really provide over information that could be included in the Aim or stated in a question?
I lied. I have to go off topic for a moment to mention just one thing about the wording of the Aim. I strongly believe we need to be more aware of when we use the word “determine” inappropriately. And, to me, this counts as an inappropriate context because I do not consider “determine” and “investigate” to be synonyms. This may seem like a subtle difference, but in my mind it is huge, particularly when non-scientists read and internalize that wording. I have had researchers counter my requests to remove the word with “But, it’s just what we use and everyone knows what we mean by it.” I disagree: it’s misleading to those who don’t know what you mean by it.
What about statistical hypotheses?
The connection (if any) between a scientific hypothesis and a statistical hypothesis (e.g., your “null hypothesis” and your “alternative hypothesis” from intro stats) was originally going to be part of this post. It’s clear now that it needs to be a separate one and will go on the draft list. But the simple response is that they are not the same thing! It’s possible that statistical hypotheses have contributed to the over-simplified way of stating scientific hypotheses, and I guess that’s something worth thinking more about. Regardless, the fact that some researchers think statistical hypotheses are their scientific hypotheses makes me dislike hypotheses even more. Has our use of Statistics helped ruin the potentially positive aspects of forming creative scientific hypotheses? Hmmm.
Strong Inference forgotten?
I was lucky that my first semester of graduate school landed me in a research methods class focused on J.R. Platt’s 1964 article in Science titled “Strong Inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others.” It is worth reading yourself, but the gist is an argument for following systematic steps in the process of carrying out inductive inferences, referencing ideas of Francis Bacon on conditional inductive trees, T.C. Chamberlin on multiple competing hypotheses, and Karl Popper on the importance of falsification.
According to Platt, Chamberlin recognized in 1897 the “we become attached to it” trouble with a single hypothesis. The ideas in Platt’s paper immediately resonated with me as a new graduate student looking forward to a career as a scientist. They fed into my then still naive and romantic notion of what graduate school and science would be like. I suspect they gave me a good push in the direction I’m still headed today.
J.R. Platt wrote the paper because he was worried that “many of us have almost forgotten” important foundations underlying the method of science (in 1964). Sadly, I think it has only gotten worse in the last half century, at least in some disciplines. There’s an interesting 2014 commentary in The Journal of Experimental Biology by Douglas Fudge called “Fifty years of J.R. Platt’s strong inference”.
Here’s one of my favorite quotes from the paper that I can’t resist sharing here:
How many of us write down our alternatives and crucial experiments every day, focusing on the exclusion of a hypothesis? We may write our scientific papers so that it looks as if we had steps 1, 2, and 3 in mind all along. But in between, we do busywork. We become “method-oriented” rather than “problem-oriented.”
Pg 348 Platt (1964)
In many fields, even the creation of hypotheses is rarely more than busywork, to the point that I believe we are generally better off without the hypotheses at all (see earlier example). I think it is worse to pretend we’re following a deeper systematic, and creative, process by using wording originally associated with it, than to honestly follow some different process.
Not alone in my dislike
I was genuinely excited, and even relieved, to find a similar opinion conveyed by Stuart Firestein in his book Ignorance: How it Drives Science. I had a draft of this post started and its first title was “Why I hate hypotheses”. It has a nicer ring to it, but I decided it might be too strong. Firestein goes ahead and says it.
You may have noticed that I haven’t made much use of the word hypothesis in this discussion. This might strike you as curious, especially if you know a little about science, because the hypothesis is supposed to be the starting point for all experiments.
…
The hypothesis is a statement of what one doesn’t know and a strategy for how one is going to find it out. I hate hypotheses. Maybe that’s just a prejudice, but I see them as imprisoning, biasing, and discriminatory. Especially in the public sphere of science, they have a way of taking on a life of their own. Scientists get behind one hypothesis or another as if they were sports teams or nationalities — or religions.
Pages 77-78, Stuart Firestein (2012). Ignorance: How it Drives Science. Oxford University Press
And a little more about the specific dangers:
At the personal level, for the individual scientist, I think the hypothesis can be just as useless. No, worse than useless, it is a real danger. First, there is the obvious worry about bias. Imagine you are a scientist running a laboratory, and you have a hypothesis and naturally you become dedicated to it — it is, after all, your very clever idea about how things will turn out. Like any bet, you prefer to be a winner. Do you now unconsciously favor the data that prove the hypothesis and overlook the data that don’t? Do you, ever so subtly, select one data point over another — there is always an excuse to leave an outlying data point out of the analysis (e.g., “Well, that was a bad day, nothing seemed to work,” “The instruments probably had to be recalibrated,” “Those observations were made by a new student in the lab.”). In this way, slowly but surely, the supporting data mount while the opposing data fade away. So much for objectivity.
Page 78, Stuart Firestein (2012). Ignorance: How it Drives Science. Oxford University Press
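The kind of selective exclusion Firestein describes is easy to illustrate with a little arithmetic. Below is a minimal sketch in Python using invented scores (not data from any study); the names `control`, `treated`, and `mean_difference` are hypothetical and chosen only for this illustration. It shows how leaving out a single “bad day” observation can flip the sign of an estimated group difference.

```python
from statistics import mean

# Hypothetical scores, invented purely for illustration.
control = [70, 72, 68, 71, 69]
treated = [71, 73, 69, 72, 60]  # the last value is the "bad day" observation


def mean_difference(treated_scores, control_scores):
    """Difference in group means: treated minus control."""
    return mean(treated_scores) - mean(control_scores)


# Estimate using every observation that was collected.
all_data = mean_difference(treated, control)

# Estimate after quietly dropping the inconvenient observation.
without_point = mean_difference(treated[:-1], control)

print("All observations:     ", all_data)
print("One point 'explained':", without_point)
```

With these made-up numbers the estimate moves from below zero to above zero after one exclusion. Nothing here says the exclusion is wrong in every case; the point is only that a researcher attached to one outcome has an easy path to it.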
What else is there?
I am not quite naive enough to believe that doing away with hypotheses will really change researchers’ vested interests in a particular outcome. But to dress up a preferred outcome as an objective scientific hypothesis feels dishonest and unethical to me (those may be strong words, but they do capture my feelings about it). At the very least, it feels like a bad case of bullshit.
What would we be losing if we just gave up the requirement to include hypotheses? In many cases, not much. If scientists are using sets of competing hypotheses in rich ways, they will still do it. They don’t have to be labeled as “hypotheses” to be a useful part of scientific thinking leading to new research. In my experience, the problem lies in forcing superficial, affirmative statements just for the sake of going through the motions of stating a hypothesis. I picture (perhaps unfairly) the grant reviewer with their checklist laid out in front of them: “Hypothesis? Check. Power analysis? Check.”
Let’s put more energy into great research questions, their justification, and then great experimental design to follow — and label predictions as what they are — predictions (not hypotheses). While they may often look the same in practice, I do think there is a different psychology around them. A “prediction” doesn’t carry the same guise of objective scientific thinking as the word “hypothesis” does. And then maybe our future scientists will learn the difference. A success is designing an experiment to learn something and inform next steps, not to get data to support a hypothesis in one study.
Why is it so hard to get traction to even just have informal conversations with some scientists about the limitations and potential harms of becoming wedded to specific hypotheses early in a process? Why do some graduate students live in fear of their hypotheses not being supported and thus not being able to publish and get their degree? Why do researchers live in similar fear about future grant funding and tenure? I am speaking in generalities that apply differently to different disciplines, but I assure you, from my firsthand experience, that the fear is there.
Here’s a bit from Firestein about the important role of creativity:
The alternative to hypothesis-driven research is what I referred to earlier as curiosity-driven research. Although you might have thought curiosity was a good thing, the term is more commonly used in a derogatory manner, as if curiosity was too childish a thing to drive a serious research project.
Stuart Firestein (2012). Ignorance: How it Drives Science
I’m not saying it’s easy
I do want to acknowledge that letting go of statements of predictions disguised as hypotheses can be more difficult in some research contexts than others, particularly when testing efficacy of interventions or treatments. But letting go of superficial hypotheses and bringing in creativity is not irrelevant or impossible in that setting just because it might look more difficult initially. We need to take steps toward being less vested in one obvious outcome, and to stop balancing a career and future funding on an affirmative result for something that cannot be proven.
For example, if a lot of previous (perhaps more mechanistic or theoretical) research points to potential benefits of an intervention, but little or no evidence is found when the idea is first investigated using real humans (and assuming this conclusion is not simply based on a large-ish p-value), then there are a lot of fun and creative questions to ask that can lead to more research! It is not necessarily a dead end, and it is not a failure. Is the instrument chosen for measurement able to get close enough to what we really wanted to measure? Can we improve it? What are the other sources of variability among individuals? Can we control for some of the sources in the future? Are we putting too much emphasis on group averages when really we care about individuals and don’t expect them all to respond similarly to the treatment? Why might some people respond positively and some negatively? And so on.
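To make the point about group averages concrete, here is a minimal sketch in Python. The individual changes are invented for illustration, and `individual_effects` and `summarize` are hypothetical names, not anything from a real analysis. It assumes we could somehow observe each person’s change under the intervention, which real studies usually cannot do directly.

```python
from statistics import mean

# Hypothetical per-person changes in a score after an intervention,
# invented for illustration only.
individual_effects = [5.0, 4.0, 6.0, -5.0, -4.0, -6.0, 0.5, -0.5]


def summarize(effects):
    """Report the group average alongside counts of positive and negative responders."""
    return {
        "group_average": mean(effects),
        "n_positive": sum(1 for e in effects if e > 0),
        "n_negative": sum(1 for e in effects if e < 0),
    }


summary = summarize(individual_effects)
print(summary)
```

In this toy case the group average is exactly zero even though every individual changed, half upward and half downward. A summary that reports only the average would read as “no effect”, which is a different statement from “nobody responded”.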
Buddy the Dinosaur
And, for those of you with kids (or those who just enjoy cartoons and science), here is a link where you can see a video of where I think my kids first learned the word hypothesis — thanks to PBS: “I have a hypothesis.”
References
Firestein, Stuart (2012). Ignorance: How it Drives Science. Oxford University Press.
Fudge, Douglas (2014). Fifty years of J.R. Platt’s strong inference. The Journal of Experimental Biology, 217, pp. 1202-1204. doi:10.1242/jeb.104976
Platt, J.R. (1964). Strong Inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others. Science, 146(3642), pp. 347-353.
2 Comments
Martha Smith
Good points. One item in particular well worth quoting: “A success is designing an experiment to learn something and inform next steps, not to get data to support a hypothesis in one study.”
Andrew Gelman
Megan:
Also see here for more along these lines:
http://www.stat.columbia.edu/~gelman/research/published/asa_pvalues.pdf