I came across the term “the average American” multiple times recently when reading about results from studies and surveys — and decided to skip ahead on my list of podcast topics to hit this one briefly.
Our reliance on averages — and the default stance that an average is really what we want — gives me heartburn on a daily basis. But, for now, we might as well start the discussion of averages on this blog by trying to get to know this average American. Who are they? Well, let’s try some ice-breaker questions:
What pronoun do they prefer when referring to themselves? Where do they live — city? suburb? rural town? Where do they lie politically in our highly divisive current political structure? How tall are they? What health problems do they have? What does their family structure look like? What type of employment do they have? Where on the earth are their ancestors from? What are their religious beliefs?
Should we keep going?
Let’s be clear — an average “score” calculated over a group of potentially very diverse Americans, or, even worse, a group of not-very-diverse Americans, does not apply to, or define, an average American. Using language as if this score applies to some “normal” person in the United States is misleading at best. It is not just semantics.
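To make the point concrete, here is a toy sketch in Python. The numbers are invented, not from any survey — they just show that when a group has two distinct clusters, the average lands in between and describes no actual member:

```python
# Invented numbers: a "diverse" group with two distinct clusters.
heights_cm = [150, 152, 151, 190, 192, 191]

mean = sum(heights_cm) / len(heights_cm)
print(mean)  # the mean lands between the two clusters

# How far is the *closest* person to the "average" person?
closest_gap = min(abs(h - mean) for h in heights_cm)
print(closest_gap)  # every individual is far from the mean
```

The mean here is a perfectly valid summary of the group, but the “average person” it implies does not exist — which is exactly the gap between a statistic and the language we wrap around it.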
Just a quick post about the blog name. I had been searching for a name for over a year when I came up with this one while on a run on my 45th birthday (picture above is from that trail on that day). Critical Inference certainly doesn’t roll off the tongue easily — and I almost cast it aside because of that. But then I realized that is exactly the type of behavior we need to be pushing back against, and it is directly related to what I want to cover in this blog. The name captures what I want to convey — but it’s not simple or easy, and it takes some justification to make sense of. I shouldn’t shy away from it for those reasons — or apologize for it as I have found myself doing already — I just need to accept and embrace the extra work of justifying it to others. Here’s a visual conveying how the term came into my head that day:
Simply, we need to combine our critical thinking skills with our use of statistical inference in science and decision making. Statistical methods and results are not a substitute for critical thinking. And, it is not enough to say we teach and practice Statistical Inference and Statistical Thinking, and that we care about Critical Thinking. We must care more about what it takes to practice Critical Inference.
This post is to acknowledge the work of Herbert Weisberg in his book Willful Ignorance: The Mismeasure of Uncertainty, published in 2014 by John Wiley & Sons. Anyone relying on statistical methods and results for their work should read at least a few chapters of this book (I suggest starting with Chapters 1, 11, and 12 if you realistically won’t read the whole thing — 2, 3, 9, and 10 are next in line!).
I love the term Willful Ignorance in this context. It does an amazing job capturing a fundamental aspect of how we currently tend to do science and support decision making — and we need to be acknowledging it, and then wrestling with it. This is the first book I have read on the topic of how we use statistical inference in science where I felt “Yes! This is what I have been trying to say!” His ideas are eloquently described, and he includes an impressive amount of research on the history of probability and statistics — which is crucially tied to issues with how we are using statistical inference today. This post is coming early in my blog because I want to acknowledge Weisberg’s work, ideas, and language, and because I intend to build on some of his ideas and foundations in my own work. You may see quotes from his book show up fairly often to begin with!
So, what does he mean by Willful Ignorance? In order to carry out statistical inference by relying on probability-based models, we must get to the point where we are willing to ignore information (both information we currently have and that we don’t). We must willfully ignore in order to proceed. So, what is the problem with this? Willfully ignoring information can be incredibly useful — just as any model is a (hopefully useful) simplification of reality, so are notions of probability and the application of games-of-chance-like mechanisms to scientific work. The problem lies in a general failure to acknowledge the usually implicit decision to base inferences on probability models. In our scientific culture today, scientists start from the assumption that they will use probability models in their work — as a default — without ever thinking about or justifying the assumptions underlying that decision. It is simply accepted as the way to do science. We “check”, usually inadequately, the more mathematical assumptions associated with particular models (e.g., linear regression), but forget to think about the fundamental assumptions underlying the decision to use probability in the first place. Acknowledging the act of willfully ignoring, no matter how uncomfortable, is a step toward making the choice explicit and forcing us to take responsibility for justification of that choice — through an understanding of its limitations, not only its potential benefits. We are consistently jumping the gun with no warm up — instead, we need to stop, slow down, and contemplate where the starting line really is and if we’re really trained for the race we signed up for.
I will delve deeper in future posts, but end here with a couple of quotes from the Preface and Chapter 1 of Weisberg’s book.
I have proposed that willful ignorance is the central concept that underlies mathematical probability. In a nutshell, the idea is that to deal effectively with an uncertain situation, we must filter out, or ignore, much of what we know about it. In short, we must simplify our conceptions by reducing ambiguity. In fact, being able to frame a mathematical probability implies that we have found some way to resolve the ambiguity to our satisfaction. Attempting to resolve ambiguity fruitfully is an essential aspect of scientific research. However, it always comes at a cost: we purchase clarity and precision at the expense of creativity and possibility.
Page xiii, Weisberg (2014). Willful Ignorance: The Mismeasure of Uncertainty. John Wiley & Sons, Inc.
Probability by its very nature entails ambiguity and subjectivity. Embedded within every probability statement are unexamined simplifications and assumptions. We can think of probability as a kind of devil’s bargain. We gain practical advantages by accepting its terms but unwittingly cede control over something fundamental. What we obtain is a special kind of knowledge; what we give up is conceptual understanding. In short, by willingly remaining ignorant, in a particular sense, we may acquire a form of useful knowledge. This is the essential paradox of probability.
Page 6, Weisberg (2014). Willful Ignorance: The Mismeasure of Uncertainty. John Wiley & Sons, Inc.
I made a quick list of about 50 blog topics and it’s so hard to know where to start! This one came up in everyday conversation a couple of times for me yesterday (in non-statistical contexts), and I think it is a fundamental aspect underlying what I see to be mis-uses and mis-characterizations of statistical inference in science and decision making. I also believe it is a substantial root underlying so many other problems in the world. Right or Wrong. Blue or Red. Significant or Not Significant. Where is the acknowledgment of the gray area between the two extremes — where most things lie?
Humans are uncomfortable with the gray area. Humans want answers and strong opinions, and expect each other to give them. Somehow, the use of statistical inference in science evolved over the last 70 years or so to become a way of taking information from data, with its inherent uncertainty, and dichotomizing it into “yes” or “no”, or “significant” or “not significant.” I am oversimplifying a little here (but not as much as I wish I was). Instead of embracing the idea of quantifying a particular type of uncertainty under a set of assumptions and then wrestling with how to use that information relative to a scientific problem, we greatly oversimplify the context by applying inadequately justified thresholds and pretending as if we have a yes or no answer. This is ingrained in scientific culture — and now in other cultures based on data as well. Every scientist and data user should ask themselves these questions: Why do I feel the need for a yes or no answer? Is that really an appropriate end point for this work? Am I oversimplifying? Am I potentially misleading others by presenting the results in the context of a dichotomy? Is this how statistical inference was developed to be used?
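A tiny sketch makes the threshold problem visible. The p-values below are invented for illustration (and 0.05 is just the conventional cutoff): two results carrying nearly identical evidence get opposite labels once we force them through a dichotomy.

```python
# Invented p-values: thresholding turns nearly identical
# evidence into opposite yes/no answers.
ALPHA = 0.05  # the conventional, rarely justified, cutoff
p_values = [0.049, 0.051]

labels = ["significant" if p < ALPHA else "not significant"
          for p in p_values]
print(labels)  # evidence differs by 0.002; labels are opposites
```

The continuum of evidence hasn’t gone anywhere — the dichotomy just hides it, which is exactly the oversimplification described above.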
I leave these questions out there for now — so much to dig into in future posts. For this post, I just ask you to simply pay attention to when you desire to dichotomize (or categorize in general) when perhaps it’s not justified, or even needed. What is your motivation? Even if it makes a task or discussion more challenging, can you try to get around it and embrace the gray area?
On the first day of my 46th year, I finally created this blog. It then took me another three months to actually make it active. Sharing my thoughts in a social media framework is terrifying, but the ideas and the desire for momentum won out. I ran out of good reasons not to do it, so here we go …
See the page ABOUT ME (AND THIS BLOG) to get started, and more coming soon!