Rewind the last 1000 years (thanks to Ricky Gervais)
March 16, 2020 | General | 5 Comments
A moment
I had a moment yesterday. One of those moments when you hear someone else’s words and immediately realize they are the words you have been searching for to make a point. I owe thanks to comedian Ricky Gervais for that, and to Sam Harris’ conversation with Scott Galloway (about 10 minutes from the end of Making Sense podcast #189). It was one of those “I have to write this down right now so I don’t forget!!” moments. It was the push I needed to write this post.
Thoughts I have tried to convey (unsuccessfully)
For years, I have expressed my opinion that statistical inference, as currently relied upon in many scientific disciplines, is a historical fluke. I see it as the product of some combination of sociological, philosophical, and psychological factors – not a law of science we “discovered” or a logical result from mathematics. There is no proof we should rely on statistical inference — and the specific methods commonly used — to the extent we do. There was no Scientific Inference Summit of statisticians, philosophers, and other scientists that arrived at a consensus for how we should go about using statistical inference in science.
Instead, current approaches developed as if having a life of their own, often evolving differently in different disciplines. Few scientists are aware enough of methodology beyond their discipline’s boundaries to see the differences. The variability in dogmatic use of particular methods across disciplines has always been a huge red flag to me — if things were truly settled, wouldn’t they have settled at the same place? Statisticians, in the unique position to work in many disciplines simultaneously, see this. To me, scientists adopting discipline-specific statistical methodological dogma is analogous to people unquestioningly adopting the religion they are born into. Why not at least consider the others, or more importantly, what it means that there are others?
Scientists who rely heavily on statistical methods and inferences to do their science are rarely aware of the foundations and history that led them to that practice. Questions like “How did we end up relying on this method?” and “Why do I feel expected to use this method in my work?” are ignored in favor of time spent on technical skills needed to carry out the methods in practice — with justification conditional on the answers to questions that are not asked.
Methods of statistical inference are sold to future scientists and the rest of the public as the way to do research, an integral part of the scientific method, and as an objective way of doing science. This attitude, its perpetuation, and blind acceptance of methods are serious problems for science… and society.
The eye rolls
Okay, this is where I think I start to lose people (if I haven’t lost them already). I can see the eye shifts, the less subtle eye rolls, the transitions to blank faces, and other physical responses in reaction to my alarmist-sounding words. I can hear the voices. “Here we go again…” “Can we focus on doing science and be less philosophical?” “What else is there?” “What do you expect me to do?” “I don’t have time for history … I assume others have checked this or we wouldn’t be doing it.” “It wouldn’t have been taught in university if there wasn’t broad acceptance and justification.” And so on.
There is some hump that I’m rarely able to get over. I need a different strategy. I need something that doesn’t just lead to zoning out or defensiveness. Maybe Ricky Gervais can help me out.
The thought exercise we need, thanks to Ricky
If we destroyed all the work of the last roughly 1000 years (or rewound 1000 years) and started over again — what would reappear in the same form as we currently have it?
Gervais makes this point relative to religion — arguing that currently existing fiction and holy books would not appear in another version of history, while much in astronomy, physics, and chemistry would reappear. This thought exercise is an incredibly important one, and a clever working definition of what counts as hard science. Our physics and chemistry books might look similar to how they look now, but how broadly does this apply across what we currently call science? What information conveyed through textbooks now (as if fact) would not likely reappear in a replay of the last 1000 years? This is meant to be a fun intellectual exercise, not to inspire fear or defensiveness — we do not as a society assign value to work based only on the answer to this question.
Now, back to my discipline — Statistics reaches across disciplines and affects how we do a lot of science and decision making. Methods and skills for carrying out those methods (calculations and computing) are presented in textbooks, courses, and through mentoring as if they are beyond questioning. Information is presented as if it would re-appear in another version of history. But, I don’t believe it would. I would love to have someone convince me otherwise, but I have thought a lot about this over the last twenty years.
This does not imply that I believe we should not be using statistical inference or that it isn’t valuable — only that I believe we should change our attitudes toward it. Inference is crucially important to science and decision making. We need to study it and debate it — and not treat the problem as if it is already settled.
Maybe we would have some version of inference based on probabilities (assuming we even get to the notion of probability, which isn’t clear to me either), but I cannot even begin to convince myself that statistical inference as carried out in practice today would look anything like what we have now. I believe the probability is zero that our Statistics textbooks would look as they do today, or even that we would have something called Statistics textbooks.
I suspect those who still equate statistics with probability theory may disagree with me. Yes, there are deductive mathematical foundations, but they are conditional on acceptance of so much more. I am worried about what we are conditioning on, not the mathematics and computing we do after the fact. It is one thing to study and develop the mathematics underlying games of chance, and quite another to apply the work to hard and messy scientific questions in need of inductive inference.
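To make that distinction concrete, here is a toy sketch (my own contrived coin-flipping example, in Python — nothing here comes from Gervais or from any particular textbook). The deductive direction is settled mathematics; the inductive direction has several common answers that are not even the same kind of object, and which one a discipline treats as *the* answer is exactly the sort of thing I doubt would replay the same way:

```python
from scipy.stats import binom, beta

# Deductive direction (settled mathematics): given a fair coin,
# the probability of exactly 7 heads in 10 flips.
p_data_given_model = binom.pmf(7, n=10, p=0.5)

# Inductive direction (where frameworks diverge): given 7 heads
# in 10 flips, what do we say about the coin?  Three common
# answers, three different kinds of object:

# 1. A one-sided p-value: P(7 or more heads | fair coin).
p_value = binom.sf(6, n=10, p=0.5)

# 2. A maximum-likelihood point estimate of the heads probability.
mle = 7 / 10

# 3. A Bayesian posterior mean under a uniform Beta(1, 1) prior --
#    change the prior and this answer changes.
posterior_mean = beta(1 + 7, 1 + 3).mean()

print(f"P(7 heads | fair coin) = {p_data_given_model:.3f}")
print(f"one-sided p-value      = {p_value:.3f}")
print(f"MLE of p               = {mle:.3f}")
print(f"posterior mean of p    = {posterior_mean:.3f}")
```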
A dose of humility toward our methodology
If I’m right (or even just could be right) that statistical inference as practiced today would not reappear, what does this say about how we are practicing Science today? At the very least, we need to stop acting as if we are certain the current version would reappear.
Inference is hard and inference should be hard. Statistical methods can only take us so far. Scientific inference is far larger than statistical inference — we need to stop pretending as if statistical inference is the scientific way to get us scientific inferences.
Again, I am not saying we need to (or even should) give up statistical inference and current statistical methodology. I am just appealing for a more humble view of our methods and a shift in attitudes around how we use them. They are not a law of nature; they are a conceptual human invention whose evolution has been greatly influenced by flukes, social structure, and human psychology. I believe we find ourselves in a difficult situation because they have grown to serve the non-scientific purpose of making inferences and decisions feel easier and more comfortable.
Hear it from Ricky Gervais
The moment came from his thoughts between 3:40 and 4:00, and it obviously resonated with Stephen Colbert as well.
5 Comments
hydrodynamicstability
Thank you for writing this. I too have had thoughts along these lines for years — that statistical inference in its modern form is not historically inevitable, and in fact is highly artificial. I would hope that certain other parts of statistics, such as the principles of good study design (randomization, replication, blocking, blinding, concurrent control, etc.), would be recreated in the thought exercise. I also welcome the rise of Data Science, as it gives us a new opportunity — in our very own civilization! — to see what intelligent people, largely unencumbered by the baggage of statistical training, might create in its place as they grapple with 21st century data problems. In these early days there may be as many examples of flawed solutions as good ones from data science, but this is to be expected in a young discipline.
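(To sketch what I mean about design: the logic of a randomization test falls almost directly out of the act of randomizing itself, which is part of why I would expect it to be reinvented. A minimal example in Python, with made-up numbers:)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outcomes from a small randomized experiment.
treatment = np.array([23.1, 25.4, 22.8, 26.0, 24.7])
control = np.array([21.9, 22.5, 23.0, 21.4, 22.2])
observed = treatment.mean() - control.mean()

# Randomization test: re-randomize the group labels many times and
# count how often a difference at least this large arises by chance.
pooled = np.concatenate([treatment, control])
n_treat = len(treatment)
n_perm = 10_000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[:n_treat].mean() - perm[n_treat:].mean()
    if diff >= observed:
        count += 1

print(f"observed difference: {observed:.2f}")
print(f"randomization p-value: {count / n_perm:.4f}")
```

The justification comes from the design itself, not from a distributional model.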
Meanwhile I too have seen appalling examples of discipline-specific statistical methodology that seemed dogmatic rather than fit for purpose. However, I believe it is conceivable that, ideally, each scientific problem could have a tailored data-analytic method, rather than defaulting to statistical inferential methods of any kind. Beyond the highly controlled arena of phased clinical trials, and perhaps a few disciplines that emphasize empirical validation even more than statistical inference (e.g., GWAS and most of physics), statistical inference sadly seems to do more harm than good. https://doi.org/10.1080/00031305.2018.1518264
MD Higgs
Thanks for the comment. I like your phrase “statistical inference in its modern form is not historically inevitable.” But it also made me realize that as part of this conversation we need to think more about the distinction between (1) statistical inference as justified by theoretical foundations and (2) statistical inference as often carried out in practice (often not well justified by theoretical foundations). I often try to distinguish between these as “statistical inference” and “use of statistical methods and results in practice,” but I realize that language does not adequately convey what I’m after. Even if neither of the two were historically inevitable, the second seems to me to be much more extreme — as evidenced by different disciplines displaying their different dogmatic practices based on the same theoretical foundations.
I think my general cynicism is less about statistical inference in its modern form and more about the lack of humility toward it. I see that lack of humility rearing its head more in (2), but I think it stems from a lack of awareness and humility about (1). I do not think we should give up using statistical inference in science; I just think we should change our attitude toward it. Treat it for what it is — a tool we currently have to help make sense of information through use of particular models and their assumptions — rather than an answer finder and decision-maker. I have no doubt a lot of good has been done in the world with the help of probability and statistical models, but we need a greater awareness of limitations and some help with how to talk about those limitations in practice. The awareness and humility might then filter into how we make inferences using them — from choice of language to more formal justifications.
Unfortunately, I do not share your optimism about methodology typically labeled as “data science,” though I’m trying to get myself there. I see the same underlying problem — a lack of humility toward methods and their results — and maybe even to a greater extent, because there is even less training in INFERENCE and more training in carrying out cool-sounding methods. Uncertainty and assumptions are still ignored, and maybe to a greater degree. However, the types of problems we care about are evolving and we should adapt. I hope part of that adaptation will be taking the time to think through and learn to talk about limitations and assumptions — and spending less time over-selling methods and their results. We shouldn’t forget basic principles of design, the importance of questioning the information going in, and justifying the inferences coming out.
hydrodynamicstability
To me, the limitations of statistical inference are so profound that the inference framework can’t actually address many real scientific problems, except in designed surveys and similar controlled sampling experiments, highly pre-specified randomized clinical trials, and perhaps a few others. For example, Fisher’s sample-to-population inference concept is wholly inapplicable to the analysis of most observational data. (Much of data science is based on observational data, except for A/B testing.) Because of model uncertainty and researcher degrees of freedom, statistical inference is invalid for exploratory research more generally. Statistical modeling remains essential for such work; it’s just that the statistical inferences derived from such models are biased to an unknown extent, and cannot be taken at face value. Hence my emphasis on its use only in disciplines where empirical validation — reliance on the continual assessment of new data, rather than mathematical/computational manipulations of single sets of data to establish “significance” — is woven into the research culture. (A more extended argument for this point of view is given in the paper I linked above.)
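A minimal simulation of the researcher-degrees-of-freedom point (my own contrived Python sketch, not taken from the paper): generate pure noise, try a few reasonable-looking analyses, and report whichever p-value looks best — the nominal error rate no longer holds:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_sims, n = 5000, 30
single_hits = 0    # false positives from one pre-specified test
flexible_hits = 0  # false positives when we pick the best of three

for _ in range(n_sims):
    # Two groups drawn from the SAME distribution: any "significant"
    # result is a false positive by construction.
    x = rng.normal(size=n)
    y = rng.normal(size=n)

    p1 = stats.ttest_ind(x, y).pvalue
    p2 = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue
    # Crude stand-in for data-dependent "outlier removal":
    # drop each group's most extreme values and re-test.
    p3 = stats.ttest_ind(np.sort(x)[1:-1], np.sort(y)[1:-1]).pvalue

    single_hits += p1 < 0.05
    flexible_hits += min(p1, p2, p3) < 0.05

print(f"pre-specified test: {single_hits / n_sims:.3f}")    # near 0.05
print(f"flexible analysis:  {flexible_hits / n_sims:.3f}")  # well above 0.05
```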
I agree about the lack of humility in data science, but I doubt that the statistical inferential framework would necessarily improve their discipline. The very use of probability mathematics to characterize uncertainty can sometimes be very misleading, as argued by Nassim N. Taleb (The Black Swan, 2010), Herbert Weisberg (Willful Ignorance, 2014), John Kay and Mervyn King (Radical Uncertainty, 2020), and others; its monopoly should be challenged. Thus I agree with you about the lack of humility about statistical inference as well.
MD Higgs
First, I apologize for somehow missing the link you included in your first comment and appreciate the motivation to read the paper again: Tong (2019), “Statistical Inference Enables Bad Science; Statistical Thinking Enables Good Science.”
I agree with what you say in your comment, and very much agree in general with the views in the paper (and appreciate the fact it was written and included in the supplemental issue of The American Statistician). In my opinion, it added much needed balance against the search for practical and immediately implementable “alternatives to p-values.”
I did not mean to imply that I think the statistical inferential framework (as I think you are envisioning it) would improve data science — only that I worry about many of the same things in data science that I worry about relative to the use of statistical methods.
Thanks too for getting me thinking more about a reasonable definition for what I think of as “statistical inference” — I did a poor job conveying my view of it in my reply to your first comment. My use of “statistical inference” is much closer to what is described as “statistical thinking” in Tong (2019). Under that umbrella I include experimental and sampling design concepts, discussion of assumptions and limitations, consideration of the appropriate scope of inference, acknowledgment of sources of variability and uncertainty not captured (as far as it is possible to identify them), discussion of what is willfully ignored (appealing to Weisberg 2014), etc. So, when I say “we should not give up statistical inference in science” — I mean we should hold onto the parts that are beneficial to science, not that we should continue with current methodology (or what many people would label “statistical inference”). I am no longer a typical practicing statistician because of my deep disagreement with our use of statistical methods to reach unjustified conclusions and decisions.
Thanks again for the thoughtful and useful comments!
hydrodynamicstability
I admire your career decision to not be a “typical practicing statistician”. I hope others will be inspired to make the same choice. Thanks for the good discussion.