Thee uncertainty


June 5, 2022 | General | 2 Comments

I’m back for a long overdue post as I reflect on my number of years on this planet. I officially started this blog 3 years ago on this day when the name came to me on my annual birthday trail run/hike (the blog picture is from that day – a beautiful Montana May dusting of snow). It marked a hard left turn in my career path, which was followed by many unexpected sharp turns in my life path. The turns keep coming, but my relationship to them keeps changing for the better.

It is hard to sum up the last three years, but if there’s one word that works, it’s uncertainty. So much change, so much unexpected, so much joy, so much pain, and so much learning. The uncertainty explains the lack of consistent posts here, but even when I’m not posting, I credit the existence of this blog with helping me synthesize my thoughts into mental posts, and I still often record thoughts and ideas in drafts. And one of the things the uncertainty has forced me to learn is self-compassion for not publishing many posts – not an easy lesson for me, but a very important one. I now realize that most of the recent ideas I have drafted posts about hit on the theme of uncertainty in some way, so it seems a good place to jump back in.

Before I go on, I want to acknowledge that *uncertainty* is a very complicated and unsettled concept, and today I am certainly (hah!) not digging deep into its foundations or into historical and philosophical arguments about its meaning.

I have been involved in several conversations in the last few months where the term “the uncertainty” has come up; in the context of Statistics, it appears in variations on phrases like “we want to quantify the uncertainty” or “get the uncertainty right.” These are said with a plain “the,” but I think it’s worth playing with “thee” to emphasize my point, because I don’t think the implicit interpretation is much different. I know I have uttered such phrases myself over the years, but I am now trying to be more aware. And I don’t think this is a silly, picky wording issue. It matters if it subconsciously affects how we interpret what a method is capable of, and thus how results are interpreted.

Describing the goal of statistical inference as “getting the uncertainty right” sounds okay on the surface, but it has implications that I worry are not good for science. I suppose it is fine to state as a goal – as long as it is clear that the goal isn’t one that can be fully met – but I don’t see that caveat as part of the conversations. Can trusting the statement that we can “get the uncertainty right” lead to overconfidence in statistical results, to giving them too much authority, or to taking them to represent something they simply don’t, and can’t? Accentuating the “the” by replacing it with “thee” makes the problem more obvious.

It is attractive to quantify. It feels good to feel that we’ve captured “thee” uncertainty using our statistical methods – that we’ve taken messy uncertainty and turned it into a tidy-feeling interval (or some other summary). It not only feels good, it’s expected. An expectation to attempt to quantify uncertainty isn’t necessarily a bad thing, as long as the specific expectation is reasonable. There is a big difference between quantifying “thee uncertainty” and quantifying “some uncertainty.” We talk as if we can get to “thee,” but we’re always really doing “some,” and we don’t even do a good job understanding or communicating what is included in that “some.” We’re usually leaving out a lot of the story – sweeping it under the rug or, more commonly, nonchalantly tossing it into the closet so it’s out of sight.

A crucial question, even if we often can’t answer it easily or satisfactorily, is “What sources of uncertainty are we actually quantifying? And what assumptions are we relying on to do it?” How often do we attempt to convey what sources of uncertainty are actually captured in a standard error, interval, or posterior distribution? The complementary question is just as crucial: “What sources of uncertainty are we NOT quantifying?” This one blows up quickly for most real-world problems, particularly in an observational study setting, but that frustration is important information in and of itself.
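To make the first question concrete, here is a minimal simulation sketch – a toy setting where all of the values are invented and the usual assumptions hold by construction. In that friendly world, a standard 95% t-interval for a mean does its narrow job well: it quantifies sampling variability, and only sampling variability.

```python
# Toy illustration (all values invented): when observations really are
# independent draws from a stable population, a 95% t-interval for the mean
# quantifies sampling variability -- the one source it is built to capture.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_mean, sd, n, reps = 10.0, 2.0, 30, 10_000

covered = 0
for _ in range(reps):
    sample = rng.normal(loc=true_mean, scale=sd, size=n)
    half_width = stats.t.ppf(0.975, df=n - 1) * sample.std(ddof=1) / np.sqrt(n)
    covered += abs(sample.mean() - true_mean) <= half_width

print(f"coverage: {covered / reps:.3f}")  # close to the nominal 0.95
```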

How much is missing vs. captured in reported quantities (meant to convey something about uncertainty in another quantity) of course depends on the context, the data, the model, the question, etc. However, except for very well-controlled laboratory studies with random assignment, or sampling exercises using finite and very homogeneous populations, “thee uncertainty” is really “some sources of uncertainty under a bunch of assumptions.” Somehow the latter doesn’t sound nearly as appealing, but it’s more honest, and I think science would be better off if we were better at acknowledging that. When statisticians imply that any and all uncertainty can be captured by our methodology, and that we can “get the uncertainty right,” we’re already sending a misleading, and in my view harmful, message. Maybe if we did a better job working with researchers to articulate what is represented in an interval, for example, the limitations of methods would be clearer and overstatements of results would be rarer. Maybe we wouldn’t give so much authority to those attractive, tidy-seeming results.
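Continuing the toy sketch from above, one invented wrinkle shows how quickly “thee” becomes “some.” Suppose each hypothetical study carries a shared, unmodeled offset – a miscalibrated instrument, say – that the interval’s arithmetic knows nothing about. The interval still faithfully quantifies within-study sampling variability, but its coverage of the true mean collapses.

```python
# Same toy interval as before, but now each replication of the "study" carries
# a shared, unmodeled offset (imagine a miscalibrated instrument). The interval
# still quantifies only within-study sampling variability, so it badly
# undercovers the true mean. All values are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_mean, sd, n, reps = 10.0, 2.0, 30, 10_000
offset_sd = 1.5  # spread of the unmodeled, study-level source of uncertainty

covered = 0
for _ in range(reps):
    offset = rng.normal(loc=0.0, scale=offset_sd)  # not in the model
    sample = rng.normal(loc=true_mean + offset, scale=sd, size=n)
    half_width = stats.t.ppf(0.975, df=n - 1) * sample.std(ddof=1) / np.sqrt(n)
    covered += abs(sample.mean() - true_mean) <= half_width

print(f"coverage: {covered / reps:.3f}")  # far below 0.95
```

The interval isn’t wrong about what it quantifies; the claim that it quantified “thee uncertainty” was.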

Progress doesn’t come from pretending we have harnessed uncertainty by hiding the mess and chaos driving that uncertainty. Progress comes from acknowledging the mess and our part in it, and learning to change our relationship to the uncertainty. We can’t get rid of the uncertainty; we can only change how we deal with it.

(Note: I wrote the draft of this post on May 18th, 2022)

About Author

MD Higgs

Megan Dailey Higgs is a statistician who loves to think and write about the use of statistical inference, reasoning, and methods in scientific research - among other things. She believes we should spend more time critically thinking about the human practice of "doing science" -- and specifically the past, present, and future roles of Statistics. She has a PhD in Statistics and has worked as a tenured professor, an environmental statistician, director of an academic statistical consulting program, and now works independently on a variety of different types of projects since founding Critical Inference LLC.

2 Comments
  1. Grant Reinman

    Well written post. Uncertainty due to the data acquisition process, measurement process, and model form are difficult to quantify and thus often ignored. I like the way you ended, but I feel that examples of better ways to deal with this ‘messiness’ are needed.

    • MD Higgs

      Thank you for the comment (and sorry for the slow response!). I agree and understand the desire for examples of better ways to deal with the ‘messiness.’ Unfortunately, I am not providing any in this response, at least not yet. But what is perhaps more interesting to me is that I don’t envision such examples being very satisfying to many people – and I find that an important part of a larger discussion. “Better ways to deal with this ‘messiness’” will themselves be messy, as unfortunate as that is for those who would like a cleaner, more defined path forward. There’s no paved bike path that looks the same regardless of the ground it’s laid over, but instead a rough trail that reflects the geology and contours of the specific earth it cuts through, without too much flattening and straightening.

      Part of what’s missing is the willingness (and even excitement) to wrestle with the underlying problem of how to make and justify reasonable inductive inferences in a particular research context. It seems to me that this is where research in an area should start, and that it is the entry point to honoring uncertainty beyond “thee” limited uncertainty that may end up being captured by some combination of design, data collection, model, and interpretation chosen before thinking explicitly about how to make the desired inductive inferences given inherent limitations (including all the sources of uncertainty). Maybe ways to guide researchers through such a process, and examples describing the process for a few problems, would be a good starting point. Thanks again for the comment – I appreciate the chance to think more about it.
