At the very least it is important for decision-makers to be aware that people are prone to overconfidence, and that to assume one is not is to unwittingly fall prey to the bias. Most of us can improve the calibration of our judgements by simply considering the question “How could I be wrong?”
A very interesting paper was just published in Significance, a magazine of the Royal Statistical Society and the American Statistical Association [link to abstract]:
I know I’m right! A behavioural view of overconfidence
Albert Mannes and Don Moore
Abstract. Statistics is all about uncertainty. Why, then, are so few of us uncertain enough? Being far too certain is a near-universal trait: the consequences have sometimes been catastrophic. Albert Mannes and Don Moore outline the ways humans are overconfident in their judgements – and why so many of us think that we can finish decking the patio on time.
Good judgement is surely good for society; and persistent overconfidence surely indicates poor judgement. Behavioural research tends to focus on three forms of overconfidence that occur with some frequency in modern life: overestimation, overplacement, and overprecision.
A hallmark of good judgement is that the assessments a person makes about probabilistic events are well calibrated to the long-run frequency of those events. For example, a meteorologist who claims that there is an 80% chance of rain today is well calibrated if on average it rains on 8 of the 10 days he or she makes this pronouncement.
Being well calibrated – that is, having judgement that on average is neither underconfident nor overconfident – means that the confidence expressed in a judgement corresponds closely with the frequency with which the respondent is actually correct.
The typical research finding, however, is that people’s confidence exceeds their actual performance. For example, the actual number of correct answers for judgements expressed with 70% confidence is less than 7 out of 10. This form of overconfidence is naturally called overestimation.
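The calibration check described above is simple arithmetic: group judgements by stated confidence and compare each group's stated confidence with its observed hit rate. A minimal sketch (the data and function name are hypothetical, for illustration only):

```python
# Hypothetical sketch of a calibration check: for each stated confidence
# level, compute the observed frequency of correct answers.
from collections import defaultdict

def calibration_table(judgements):
    """judgements: list of (stated_confidence, was_correct) pairs."""
    buckets = defaultdict(list)
    for conf, correct in judgements:
        buckets[conf].append(correct)
    # Observed accuracy at each stated confidence level.
    return {conf: sum(hits) / len(hits) for conf, hits in sorted(buckets.items())}

# Made-up example: 10 judgements stated with 70% confidence,
# of which only 6 were actually correct.
data = [(0.7, c) for c in [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]]
print(calibration_table(data))  # {0.7: 0.6} -- confidence exceeds accuracy
```

A well-calibrated judge would show entries where the observed rate matches the stated confidence; the overestimation pattern in the text corresponds to observed rates falling below stated confidence.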
Overprecision refers to “our excessive confidence in what we believe we know, and our apparent inability to acknowledge the full extent of our ignorance and the uncertainty of the world we live in”. To be overprecise is to underestimate the degree to which one’s judgement may err. Subjective beliefs about accuracy are too sharp relative to true accuracy. We believe we are close or spot on far more often than we actually are.
JC comment: ‘overprecision’ is the main criticism that I have had of the IPCC’s particular brand of overconfidence; overprecision also leads us to consider the white area of the Italian Flag.
The costs of being wrong are often asymmetric. Showing up early for a flight is less costly than showing up late and missing it entirely, so uncertainty about the travel time should lead people to depart earlier for the airport. We used this principle to demonstrate overprecision in a laboratory setting. We asked the participants in our studies to estimate the temperatures in the city where they live over several historical dates under three pay-off conditions. (Each person made guesses under all three conditions.) In the first condition, participants were paid a fixed amount (in the form of lottery tickets towards a prize) if their guesses were within a specified margin of error, either over or under the correct answer.
Their answers here, in which the costs of being wrong were symmetric, allowed them (and us) to gauge their knowledge in this domain. In the second condition, they were rewarded only for correctly guessing the temperature or overestimating it within a margin of error. And in the third condition, they were rewarded only for correctly guessing the temperature or underestimating it within a margin of error. For all estimates, participants received trial-by-trial feedback on their errors.
As expected, people biased their estimates in the appropriate direction given their payoffs – when rewarded for overestimation, they adjusted their estimates upward, and when rewarded for underestimation, adjusted them downward. But their adjustments in these conditions were systematically insufficient: given their actual knowledge, participants would have earned significantly more had they made larger adjustments, which would have reduced the frequency of over- or underestimation when those errors had no pay-off.
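The logic of the experiment can be illustrated with a rough simulation (all numbers here are made up, not the authors' actual parameters): when you are rewarded only for answers in a one-sided window above the true value, the best response to noisy knowledge is to shift your guess upward, and an overprecise participant, who believes their error is small, shifts too little.

```python
# Rough simulation (hypothetical parameters) of the one-sided pay-off
# condition: reward only if the guess lands in [truth, truth + MARGIN].
import random

random.seed(0)
MARGIN = 10       # width of the reward window (hypothetical)
NOISE_SD = 8      # true spread of the participant's estimation error
TRIALS = 100_000

def hit_rate(shift, noise_sd):
    """Fraction of trials rewarded when the guess is truth + error + shift."""
    hits = 0
    for _ in range(TRIALS):
        guess_error = random.gauss(0, noise_sd) + shift
        if 0 <= guess_error <= MARGIN:
            hits += 1
    return hits / TRIALS

# An overprecise participant thinks a small upward shift suffices;
# given the true noise, centring the guess in the window pays more.
small_shift = hit_rate(1, NOISE_SD)
better_shift = hit_rate(MARGIN / 2, NOISE_SD)
print(small_shift, better_shift)  # the larger shift wins more often
```

The gap between the two hit rates is the cost of the "insufficient adjustment" the authors describe: acting as if one's knowledge were more precise than it is.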
JC comment: hmmmm . . . does a framework of the precautionary principle bias estimates?
They acted (in their adjustments) as if their knowledge was more precise than it actually was; metaphorically speaking, they consistently missed their flights. Note that their insufficient adjustments could not simply be explained as being anchored on their best guess. Instead, the levels of overprecision for this task were positively correlated with participants’ overprecision using traditional 90% confidence intervals and with their expressed confidence in their knowledge of the domain.
In social situations, there may be a market for overconfidence. Experimental participants in one study were more likely to purchase advice when sellers expressed more confidence in their judgement, holding accuracy constant. As a result, as the sellers of advice competed with each other, their judgements were expressed with increasing confidence over time without becoming more accurate – they were rewarded for being overconfident.
JC comment: there is an implicit expectation that the IPCC’s confidence level will increase with each assessment report. Hence the ‘leaked’ 95% confidence level for attribution from the forthcoming AR5 report, in spite of reduced accuracy of the climate models relative to the last 15+ years of observations and apparent lowering of the climate sensitivity bound to 1.5C.
Overconfidence has proven remarkably resistant to debiasing – perhaps because it does have value for people in certain situations. Nevertheless, the factors that contribute to overconfidence suggest some ways to become better-calibrated judges of our knowledge.
First, accurate and timely feedback improves calibration. Despite reputations to the contrary, meteorologists are quite well calibrated, no doubt in part because they receive regular information about the quality of their forecasts.
JC comment: unfortunately the same is not true of climate modelers. They only receive useful feedback on decadal time scales, but even with this feedback, they don’t seem to see the need for recalibration. The following seems to explain why.