Trust is complicated
December 17, 2019 | General | 6 Comments
I’ve been thinking a lot about trust again lately — in science, and in life more generally. As I gather more experiences, I find that trust becomes more complicated. It’s complex in personal relationships with other humans, and complex within the ways we do science (which inherently involve other humans).
I often find myself returning to the concept of trust when I try to dig into why we have come to rely on certain ways of using statistical methods and results — especially those that seem to have obvious flaws that have been publicly identified (by me and many others). My thoughts generally settle on the feeling that people are too willing to trust things they don’t understand. This is counter to a phrase we often hear — that people don’t trust things they don’t understand. The latter makes more logical sense to me, but it isn’t what I have experienced as a statistician in the context of doing science.
I have also come to realize that pointing a finger at issues with trust, or even lack of trust, is far from straightforward. Our ability, and desire, to trust is both a good and a bad thing for science — and trying to analyze it seems to lead to contradictions and further challenges. With that in mind, I use this post just to share my first layer of thoughts on the subject. Like so many of the topics I have been writing about, degree of trust exists on a continuum from no trust to complete trust — and I recognize that in this post I walk dangerously close to falling into another one of our false dichotomies (trust or no trust). I will try to be careful.
I strongly believe we should all try to be more in tune with when and why we exhibit a large (or small) degree of trust without questioning, or even when we definitely know better. And when and why do we make what seem to be reasonable judgements of trust? My current opinion is that we tend to trust too much, particularly in the process of doing science, despite scientists holding on to a reputation of healthy skepticism. Most (not all) of the scientists I have worked with over the years do not, I believe, deserve the healthy-skeptic label, at least not across all aspects of their work. I don’t mean this in a disrespectful way — I am just being honest and acknowledging the impact of our humanness in the process of doing science.
The act of deciding to trust something (or simply not thinking about how much it should be trusted before trusting it to a large degree) is complicated. It depends on so many ancillary aspects of life that may have little, if anything, to do with a logical and rational approach to assessing the degree of trust warranted for a situation. For example, it depends on our mood, on who’s telling us to trust (or not to trust), and on the implications of the decision in terms of making our lives easier or harder. In short, I believe our decisions about how much to trust something are largely self-serving — will it serve us better to place a lot of trust in it, or to place little trust in it? The answer to that question seems to depend heavily on our current situation, and less on a rigorous practice of digging into details and critically evaluating evidence. The more rigorous alternative is simply too onerous a task for an individual to undertake as often as would be needed. Because of this, judgements about trust are often not individualized tasks. Systems are established over time as largely social constructs that seem to give individuals a free pass on rigorously assessing trust — allowing decisions to be made quickly and with little pain. The large degree of trust I observe scientists giving to statistical methods and their results seems to come from such a system.
Heavily relying on statistical inferences for research and decision making requires a substantial amount of trust. Over-relying on simple statistical summary measures without a deep understanding of the information they contain (e.g., p-values, confidence intervals, posterior intervals, effect sizes, etc.) requires an enormous amount of trust. There are so many layers of trust that have to be in place to even get to the obvious layer of assessing trust in the presented results (hopefully described as inferences and not facts). When we start to peel back the layers, it gets overwhelming quickly, and I admit it soon lands me feeling uncomfortable and facing my low tolerance for bullshit.
For this post, I focus on the layer of researchers trusting that they should use particular methods and state their results in a particular way because that is how it has been done by those before them — a very social construct, as I previously mentioned. In this layer, my observation is that healthy questioning and some level of mistrust have largely disappeared in many disciplines. Ways of carrying out a study using statistical methods are presented and treated as if they were facts about the way science should be done. And I don’t think most who use them even know how those methods came to be. There seems to be a trust that some great committee of knowledgeable people got together and made the hard decisions about the “best” way to collect, analyze, and report on data. The degree of trust I see is consistent with this view — but this view is false. This is trusting that we should just trust — and I believe this mentality underlies the use of statistical methods in practice and contributes to (or is at least related to) many of the problems we see and hear about related to their use.
Trusting is far more comfortable and far easier. Struggling with lack of trust is hard work that rarely lets up (if it is honest lack of trust — and not simply superficial trusting of those who tell you not to trust something). I believe this is the reason we develop social systems around trust, but they can backfire in major ways. This backfiring is what I see in the use of Statistics. Instead of encouraging healthy skepticism, people seem to love to trust methods they don’t really understand, or even to trust that they understand methods when in fact they do not (the knowledge needed to carry out a method is not the same as the knowledge needed to understand its limitations). This approach keeps things from becoming too overwhelming.
Take machine learning algorithms — which are all the rage. Many (most?) people using these algorithms do not have a deep understanding of how they turn information from data into inferences or decisions — yet they can easily apply them to data in their research. Many (most?) of the assumptions are hidden from view. Their black-boxiness may be uncomfortable to some, but mainly it appears as an invitation to not have to figure out what’s in the box. It’s an invitation to completely trust the algorithms; to trust those using the algorithms; and to trust those who developed and programmed the algorithms (who may be detached from how they are actually being used). As you may have guessed by now, I strongly believe there is too much trust in these algorithms and far too little critical evaluation and discussion of assumptions. Some are trying to raise awareness and start conversations around the attached ethical issues (e.g., Cathy O’Neil’s book Weapons of Math Destruction), but continuing to trust is so much easier.
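To make the ease of that invitation concrete, here is a minimal sketch (in Python with scikit-learn; the simulated data and all specifics are my own illustration, not taken from any particular study). A handful of lines fits and applies a model whose internal assumptions never surface unless the user goes looking:

```python
# A minimal sketch of how little code it takes to apply a "black box"
# model. Every default left unexamined below (tree depth, split criterion,
# bootstrap resampling, feature subsampling) is an assumption being
# implicitly trusted. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Simulated data standing in for a real research data set.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit, predict, report -- with no prompt to ask what is inside the box.
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

The point isn’t that the defaults are bad; it’s that nothing in this workflow ever asks the user to consider them.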
Thinking about trust in machine learning algorithms is low-hanging fruit in this context. What about the magnitude of trust given to p-values, confidence intervals, effect sizes, linear models, etc.? In my experience as a scientist, there is widespread trust in the idea that much of science should be based on statistical methods, though it is rare that we study why and how the belief in this paradigm came to be. It is far easier to continue to trust in the trust that is already established than to start to distrust and have to figure out what new path to take.
At the risk of repeating myself (again), trust in statistical methods and automatic ways of making conclusions and decisions from the results — without adequate justification for assumptions or even a deep understanding of the method’s limitations — is easy. It’s easy for researchers, it’s easy for those reading the research, and it’s easy for those making downstream decisions.
The efforts I spent as a collaborative statistician and teacher trying to help others form a more realistic view of the limitations of Statistics were usually wasted. People generally seemed to agree with what I was saying, but in the end clearly wanted to just trust what had been done by others before them and follow suit. This led to many hours of reflection on my part as to why it seemed so difficult to get researchers to be more skeptical and to have less trust in approaches that hyperextend the natural limits of statistical inference. All the reflection did not lead me to a satisfactory, settled position. It did lead me to acknowledge the deep connection between trust and ease of living or feelings of comfort. Healthy distrust through questioning and critical thinking is uncomfortable and really hard work. And it is not really possible without a deep understanding of that which you are questioning.
Because I have dedicated my professional life to statistical inference, I of course feel strongly about this topic and have the knowledge and experience to immediately push back against trust without doing a lot of extra work. I had hoped to be able to pass this information on in a productive way to keep others from having to do that work for themselves. But, ironically, that means they must trust my view and the information I’m providing them over the information provided by their previous teachers, their research mentors, grant reviewers, peer reviewers, funding agencies, etc. It is not a question of trusting me or not trusting me — it is a question of weighing whether they should trust me or the rest of the system they live within and must count on for survival. Framed in this way, I don’t take it so personally — but it isn’t any less frustrating.
I catch myself trusting when I shouldn’t every day. Not because I’m not a skeptical person and not because I don’t have the skills to question and critically evaluate, but because I have limited time and energy each day and must proceed with life and work. For example, I tend to trust doctors’ advice more than I know I should — because it’s generally comforting and easier than not doing so. However, I have made a point of trying to be aware of when I am doing this, and I think that is important. It has helped me triage my trust scenarios and put effort into second opinions and reading research when the stakes are higher. I also continue to make my fair share of mistakes in trusting other humans.
In summary, I strongly believe science and decision making can be improved if we start the research process from a place of healthy lack of trust — and then build up trust in our methods as we go. Instead of starting by accepting all the assumptions we need to make and then going back and half-heartedly “checking” them, let’s start from a place of questioning our trust in the assumptions and having to convince ourselves that trust is warranted before proceeding and before trusting anything that comes from using the methods. Starting from a mindful lack of trust should be an integral part of statistical inference and science — despite the discomfort and difficulty it can add to our lives.
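As one small, concrete illustration of that ordering (sketched in Python with SciPy; the data are invented for this example), consider looking at the data and questioning a test’s assumptions before trusting the p-value it hands back:

```python
# A sketch of "convince yourself first": examine the data and question the
# assumptions before trusting the p-value a two-sample t-test reports.
# (Invented data; the specifics are illustrative only.)
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.lognormal(mean=0.0, sigma=1.0, size=30)  # heavily skewed samples,
b = rng.lognormal(mean=0.3, sigma=1.0, size=30)  # far from normal

# The easy, trusting route: run the test and report p.
t_stat, p_value = stats.ttest_ind(a, b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# A first step toward earned trust: do the samples look anything like the
# approximately-normal data the test leans on for samples this small?
print("sample skewness:", stats.skew(a), stats.skew(b))
```

Nothing here amounts to a full assumption check, of course; the point is only the order of operations: question first, trust second.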
6 Comments
Martha Smith
Well put.
thomas marvell
The bottom line is the strength of the quality control system in one’s discipline. If it is adequate, then trust is easier and more justified. Quality control is mainly the peer review system and replication processes. In my area, social science, peer review is often sloppy and replication rare. Even if one finds that a study does not replicate, it is hard to find a publication outlet.
MD Higgs
I do agree about the relevance of the strength of a discipline’s quality control system. But I guess “strength of a system” can have different meanings. The aspect I’m most worried about is the quality of the quality control system. A strong system can be based on misplaced trust (e.g., using p-values as detectors of what should be published) and therefore contribute to and perpetuate underlying problems — and I do think this is part of what is going on in science today. The systems are strong enough that it is difficult to push against them — and they appear trustworthy — but their foundations are weak.
Andrew Gelman
Megan:
I agree with what you write in your post, and I wonder what you think about the trust that econometricians have in causal identification methods such as regression discontinuity. One thing I’ve noticed is sometimes people believe the theory more than what’s in front of their own damn eyes; see discussion here, for example. This is turbocharged trust, and it’s interesting in part because the use of causal identification is in large part motivated by distrust of traditional observational studies.
Any thoughts on this?
MD Higgs
Andrew,
Thanks for the comment. I completely agree, and thanks for pointing to the effective example discussed on your blog. I also appreciate your quote “And you can acquire the attitude that the methods just simply work.” This certainly captures part of the trust issue I am talking about — and I think it’s helpful to recognize it as an attitude, too. Your comment brings up (at least) three additional layers of the complication that I think are important to consider generally.
(1) From a practical perspective, I have repeatedly experienced the lack of willingness to sit with a graphical display of the raw data contrasted with a fitted model. Even in very simple modeling situations, trust seems to go to the fitted model over the raw data time and time again. This has always been very perplexing to me. It is a situation where the greater the lack of understanding about where the fitted model comes from (how it is tied, or not, to information in the data), the greater the trust in the fitted model over the data. I also see that plotting anything beyond default outputs from a statistical software package (like SPSS or Stata) is a serious barrier for researchers to even get to the point of visualizing the raw data alongside the fitted model. Without that ability, the trust comes even easier. A sketch of the kind of display I mean follows below.
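Here is a small sketch (in Python; the data and the straight-line model are invented purely for illustration) of how plotting the raw data next to the fit can make a tidy-looking model’s shortcomings immediately visible:

```python
# A sketch of sitting with raw data alongside a fitted model.
# (Invented data; a straight line is deliberately fit to a curved truth.)
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 60)
y = 0.5 * x**2 - 2 * x + rng.normal(0, 2, 60)  # curved relationship + noise

# Fit a simple (mis-specified) straight line.
slope, intercept = np.polyfit(x, y, 1)

# The fitted line alone looks authoritative; the raw points show where
# it fails (badly, at both ends of the x range).
plt.scatter(x, y, alpha=0.6, label="raw data")
xs = np.linspace(0, 10, 100)
plt.plot(xs, slope * xs + intercept, color="red", label="fitted line")
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```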
(2) We should acknowledge the role of a priori trust in one’s belief in a theory, and how it can bring on “turbocharged trust.” And really this can be less formal than a labeled theory — it can be just a collection of facts or an accepted way of doing things. It doesn’t seem that we ask the question “Why do you trust it?” often enough. I just finished reading Ignorance: How It Drives Science by Stuart Firestein, and he has some great examples across science. I see trust and ignorance as clearly related in science — and both need to be more openly discussed, as well as their connection to each other.
(3) Finally, you make the important point that trust in one method/theory/etc. is often motivated by distrust in others. And the fun part seems to come from how trust in the subsequent method is garnered. Here’s the problem in logic as I see it in practice: recognizing the limitations of one method, and that it should not be blindly trusted, does not imply that another method should be trusted! In other words, the justification for trusting the results of a new or more sophisticated method or theory seems too often to rely on just saying the simpler or more naive one shouldn’t be trusted. This isn’t enough.
Trust can easily be misplaced, and I believe we all misplace it to varying degrees every day to survive and do our work. It is a judgement call. So, the problem isn’t so much that we trust what/where/when we shouldn’t, but that there appears to be a lack of motivation to attempt to analyze and justify reasons for trusting. As with trusting people, we generally need a reason to question trust before we start thinking hard about whether we really should be trusting. I think statisticians (and other scientists) can help motivate this questioning through less marketing and overselling of their methods and more honest relaying of limitations (including of statistical inference in general). We need to do more to motivate skepticism and give users the language needed to openly question the degree of trust in methods, theories, and practices. Your blog is a place where this is provided — and given its popularity, it gives me hope that on some level people really want this.
MD Higgs
Pointing to this related Jan 9, 2020 post by Andrew Gelman: https://statmodeling.stat.columbia.edu/2020/01/09/no-i-dont-think-that-this-study-offers-good-evidence-that-installing-air-filters-in-classrooms-has-surprisingly-large-educational-benefits/