Monday, March 05, 2012

Tetlock podcast: expert predictions

I recently came across this excellent talk (podcast number 84 on the list at the link) by Philip Tetlock about his research on expert prediction.

Putting aside the fox vs hedgehog dichotomy, I think the main takeaway is that "expert" predictions are no better than those of well-informed ordinary people, and barely outperform simple algorithms.

... Tetlock took advantage of getting tenure to start a long-term research project, now 18 years old, to examine in detail the outcomes of expert political forecasts about international affairs. He studied the aggregate accuracy of 284 experts making 28,000 forecasts, looking for patterns in their comparative success rates. Most of the findings were negative: conservatives did no better or worse than liberals; optimists did no better or worse than pessimists. Only one pattern emerged consistently.

“How you think matters more than what you think.”

It’s a matter of judgement style, first expressed by the ancient Greek warrior poet Archilochus: “The fox knows many things; the hedgehog one great thing.” The idea was later expanded by essayist Isaiah Berlin. In Tetlock’s interpretation, Hedgehogs have one grand theory (Marxist, Libertarian, whatever) which they are happy to extend into many domains, relishing its parsimony, and expressing their views with great confidence. Foxes, on the other hand are skeptical about grand theories, diffident in their forecasts, and ready to adjust their ideas based on actual events.

The aggregate success rate of Foxes is significantly greater, Tetlock found, especially in short-term forecasts. And Hedgehogs routinely fare worse than Foxes, especially in long-term forecasts. They even fare worse than normal attention-paying dilettantes, apparently blinded by their extensive expertise and beautiful theory. Furthermore, Foxes win not only in the accuracy of their predictions but also in the accuracy of the likelihood they assign to their predictions; in this they are closer to the admirable discipline of weather forecasters.

The value of Hedgehogs is that they occasionally get right the farthest-out predictions— civil war in Yugoslavia, Saddam’s invasion of Kuwait, the collapse of the Internet Bubble. But that comes at the cost of a great many wrong far-out predictions— Dow 36,000, global depression, nuclear attack by developing nations.

Hedgehogs annoy only their political opposition, while Foxes annoy across the political spectrum, in part because the smartest Foxes cherry-pick idea fragments from the whole array of Hedgehogs.

Bottom line… The political expert who bores you with a cloud of “howevers” is probably right about what’s going to happen. The charismatic expert who exudes confidence and has a great story to tell is probably wrong.

And to improve the quality of your own predictions, keep brutally honest score. Enjoy being wrong, admitting to it and learning from it, as much as you enjoy being right.
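One way to "keep brutally honest score" can be sketched in a few lines. This is a minimal sketch, assuming the Brier score (mean squared error between stated probability and outcome) as the scoring rule; the summary above does not say which rule Tetlock used. The function names and the forecast data are hypothetical, made up purely for illustration.

```python
from collections import defaultdict

def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome (0 or 1).

    0.0 is a perfect score; always answering 0.5 earns 0.25.
    """
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

def calibration_table(forecasts):
    """Map each stated probability (rounded to 0.1) to the observed hit rate."""
    buckets = defaultdict(list)
    for p, outcome in forecasts:
        buckets[round(p, 1)].append(outcome)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}

# Hypothetical data: both forecasters call the same four events, two of which
# actually happen. One states 90% confidence each time, the other 60%.
confident = [(0.9, 1), (0.9, 0), (0.9, 0), (0.9, 1)]
hedged = [(0.6, 1), (0.6, 0), (0.6, 0), (0.6, 1)]

print("confident:", brier_score(confident), calibration_table(confident))
print("hedged:   ", brier_score(hedged), calibration_table(hedged))
```

Lower is better, so the score punishes confident misses much harder than hedged ones, and the table shows how far a stated probability sits from the observed frequency, which is the weather-forecaster kind of discipline described above.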

See also Intellectual honesty: how much do we know?


dwbudd said...

As a corollary, compare the performance of the folks on "intrade" to the pundits on (pick your favourite) chat shows.

David Coughlin said...

Sooo... regularizing a hopeful shot in the dark?

Richard Seiter said...

Have you seen any detailed analyses comparing the performances of intrade and pundits? I really wish there were a mechanism to hold pundits accountable for their statements (both prediction and factual errors). I think being able to do this (even if only public praise/shame) would do wonders for improving the SNR in the media. Tetlock's book makes a number of good points about how hard it is to evaluate prediction accuracy and hold people accountable though (virtually impossible in the climate we have now where people can't even agree on a small common set of "facts").

Did Tetlock talk about his predictive models decisively outperforming all the humans?  I read his book, but haven't gone through the methodology appendix thoroughly enough yet to understand what his models are (and whether they are truly fair and only use a priori data).

LaurentMelchiorTellier said...

My favorite formalization of this is the Dunning–Kruger effect. This kind of generalized social observation is always much more valuable when it's been formalized into a blind, clinical experiment.

The Dunning–Kruger effect is a cognitive bias in which the unskilled suffer from illusory superiority, mistakenly rating their ability much higher than average. This bias is attributed to a metacognitive inability of the unskilled to recognize their mistakes. 

In other words: the less capable you are of getting it right, the less likely you are to cognize post-facto that you were wrong, to say nothing of why you were wrong.

David Coughlin said...

I think that the former are judging themselves against their peers, and the latter are judging themselves against the problem.

Yan Shen said...

"The Dunning–Kruger effect is a cognitive bias in which the unskilled suffer from illusory superiority, mistakenly rating their ability much higher than average."
"Interestingly, the genuinely skilled also tend to underestimate their own relative ability."

IIRC, Steve Hsu mentioned that East Asians were the only group which tended to systematically underrate their own abilities. For instance, studies have shown that African Americans generally tend to have the highest levels of confidence/self-esteem and East Asian Americans the lowest, with European Americans somewhere in the middle of that spectrum.

I suspect that even when adjusted for IQ, East Asians are less confident on average than members of other races.

Yan Shen said...

 See also

These differences are highlighted in a meta-analysis Heine is now completing of 70 studies that examine the degree of self-enhancement or self-criticism in China, Japan and Korea versus the United States and Canada. Sixty-nine of the 70 studies reveal significant differences between the two cultures in the degree to which individuals hold these tendencies, he finds.

In another article in the October 2001 Journal of Personality and Social Psychology (Vol. 81, No. 4), Heine's team looks more closely at how this occurs. First, Japanese and American participants performed a task at which they either succeeded or failed. Then they were timed as they worked on another version of the task. "The results made a symmetrical X," says Heine: Americans worked longer if they succeeded at the first task, while Japanese worked longer if they failed.

There are cultural, social and individual motives behind these tendencies, Heine and colleagues observe in a paper in the October 1999 Psychological Review (Vol. 106, No. 4). "As Western society becomes more individualistic, a successful life has come to be equated with having high self-esteem," Heine says. "Inflating one's sense of self creates positive emotions and feelings of self-efficacy, but the downside is that people don't really like self-enhancers very much."

Conversely, East Asians' self-improving or self-critical stance helps them maintain their "face," or reputation, and as a result, their interpersonal network. But the cost is they don't feel as good about themselves, he says. Because people in these cultures have different motivations, they make very different choices, Heine adds. If Americans perceive they're not doing well at something, they'll look for something else to do instead. "If you're bad at volleyball, well fine, you won't play volleyball," as Heine puts it. East Asians, though, view a poor performance as an invitation to try harder.

Iamexpert said...

Maybe hedgehogs are wrong not because applying grand theories is a bad strategy, but because most grand theories are themselves wrong. If, however, you adopt one of the few grand theories that is correct, your predictions will be correct as well.

Iamexpert said...

I hate how if an expert comes to the same common sense conclusion that billions of other people have come to (dumb people are too dumb to know they're morons) it gets a fancy pretentious name like the "Dunning-Kruger effect" and these two must be credited every time the idea is invoked.

Iamexpert said...

It would be fun to give a random sample of people a test asking them to make predictions on all kinds of important questions (e.g., who will be elected president in 2012?), then wait to see whose predictions come true and correlate prediction-making ability with IQ.

Yan Shen said...

Well isn't the point that most expert predictions are no better than those made by average people because of the underlying complexity of the phenomenon being predicted? But it seems like confidence is inversely correlated with IQ though...

Iamexpert said...

The point is there's no correlation between expertise and prediction ability but does that mean there's no correlation between IQ and prediction ability?

LaurentMelchiorTellier said...

While I agree to an extent, I think you must take into consideration the operative first paragraph of my post: it's one thing to formulate a working hypothesis, and quite another to formalize and prove it in clinical, randomized experiment. 

The former is profoundly vulnerable to errors of overconfidence and confirmation bias, while the latter opens one's convictions to falsification (the scientific method). A "billion morons" may indeed have believed it so, but only Dunning and Kruger believed it on a scientific basis, and allowed you or me to do the same.

Ene Dene said...

Of course the predictions of experts are poor; what did you expect?

It's a complex system, and people are guessing the outcomes.
Imagine the space of all the parameters needed to give an accurate prediction for time t. Even if we knew every parameter, the accuracy of our predictions would fall drastically as t grows, since it's not possible to know the exact values of the parameters. That's the nature of the system, not some kind of inconvenience we just need to learn how to fix.
The real situation is that we know only a small subset of the parameters and have a poor understanding of the interactions between variables, so our predictions are more or less random (even if you guessed right, you were probably lucky). People may think that if you spend much more time than the next guy trying to understand the big system, you'll do better in proportion to the time spent; unfortunately, the improvement probably looks more like a logarithmic increase. And then some layman comes along and gives a better prediction because his parametrization of the problem (in his brain; I don't mean he gives you a better equation) was better than yours, which you've been trying to perfect for the last 20 years.
Statistics is an integral part of any science curriculum, but so should be an understanding of the nature of complex systems.
Unfortunately, I have little faith that the work of P. Tetlock will have any effect on the scientific community. I have no doubt whatsoever that next year we will get a better model, run on a faster computer, of what the climate will look like in 2050.
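The point about accuracy falling as t rises can be illustrated with a toy model. This is a minimal sketch, assuming the logistic map as a stand-in for "a complex system"; the comment names no particular model, and the function names, starting value, and size of the measurement error are hypothetical choices made only for illustration.

```python
# Illustrative assumption: the logistic map x -> r*x*(1-x) with r = 4 stands in
# for "a complex system"; the comment above does not name a specific model.

def logistic_trajectory(x0, steps, r=4.0):
    """Return [x_0, x_1, ..., x_steps] under x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def forecast_errors(x0, measurement_error, steps):
    """Absolute gap, step by step, between the true path and a forecast
    that uses the exact same rule but a slightly mismeasured start."""
    truth = logistic_trajectory(x0, steps)
    forecast = logistic_trajectory(x0 + measurement_error, steps)
    return [abs(a - b) for a, b in zip(truth, forecast)]

# The forecaster knows the rule perfectly; only the initial value is off,
# and only by one part in ten billion.
errors = forecast_errors(x0=0.2, measurement_error=1e-10, steps=80)
for t in (0, 5, 10, 20, 30, 40, 60, 80):
    print(f"t={t:2d}  |forecast - truth| = {errors[t]:.3e}")
```

The short-horizon forecast is essentially exact, while at long horizons the gap becomes as large as the quantity being predicted, even though nothing about the model itself is wrong.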

LaurentMelchiorTellier said...

Speaking of Tetlockian hedgehogs... ^_^

David Coughlin said...

 And likewise, there is no correlation of IQ to modesty.

tractal said...

There is another way to interpret this study. Ideologues with strong priors do a bad job, but experts who avoid totalizing theories do OK. Notice at the bottom of the article that Mid-east regional experts as a bloc predicted many of the problems we ran into in Iraq. They knew a ton of history and ethnic geography and were able to make really important predictions.
The same could be said for George Kennan, and many other non-ideological experts in social science. Kennan looked at a vastly complex system and predicted Soviet geo-strategy and its vulnerabilities. Ultimately, he saw (when no one else did) that the Soviets would be relentlessly expansionist and aggressive, but would eventually succumb to their structural inflexibility. Now, maybe he just got lucky. Everyone is bound to have a good day once in a while. But his predictions were sophisticated, specific, and heterodox. Nonetheless, his ideas were recognized as superior by the expert community, and formed the basis of our grand strategy for half a century. This wasn't just a case of "someone is bound to get it right"; the community of experts recognized the highly unusual ideas of a low-level staffer and his views became the new orthodoxy. A rain dance is bound to work once in a while. But if all that was really dumb luck, we got really, really lucky.

Matthew Carnegie said...

There's an interesting response here: and

It does seem like there is an effect where Japanese, at least, self-enhance their collectivist behaviour and competence relative to the reality:

"Americans self-enhanced on individualistic behaviors, but self-effaced (i.e., indicated that they were less likely than the typical group member to perform the behavior) on collectivistic behaviors.

Japanese, on the other hand, self-enhanced on collectivistic behaviors, but neither self-enhanced nor self-effaced on individualistic behaviors."

There might be something going on with East Asians where they tend to think they're more cooperative, better group coordinators, and more prosocial than they actually are (whether or not they actually are), and tend to believe they behave more harmoniously than they actually do.
That might help explain the soft skills gap in Asians that you talk about, or at least the perception amongst Asian people that their prosocial behaviour is underappreciated. If you think you're more prosocial than you actually are, you might have expectations of social competence that aren't well backed up, or think that prosociality isn't rewarded as much as it ought to be. Asians may also tend more to oversell what they can contribute as a team player, even while self-effacing as individuals.
Might be true even if Asians are more prosocial than Whites, just as individualistic Ashkenazi Jews, who are probably more competent (judging by IQs) on average than East Asians, still overestimate their competence more than EAs.
Also might lead to underreporting of Asperger-like or psychopathic traits in the Asian population, as judged by self assessment (if Asians think they're more prosocial than they are), while Americans underrate their prosociality.
"The results made a symmetrical X," says Heine: Americans worked longer if they succeeded at the first task, while Japanese worked longer if they failed.
In the context of Asian Americans, who are higher achievers than Asian Asians, maybe Asian Americans are kind of intermediate between these populations (which is what is generally found in these kinds of studies AFAIK).
Maybe Asian Americans both have a norm of working longer and harder under failure conditions than White Americans (thus accounting for that self perception) but are more encouraged and motivated by success and work harder after initial success relative to Asian Asians, who might work more to keep up with the group average and then avoid standing out and risking being a tall poppy (eat bitterness until you're either Mr Average or respectable and then give up or get lazy).
"If Americans perceive they're not doing well at something, they'll look for something else to do instead."
Makes me wonder how much our Western "creativity" relative to East Asia is just us sucking at something and so trying to find a lateral solution instead (of course, most attempted lateral solutions tend to suck). Like modern artists not really being good at the skills fine artists have ^_^ .

Yan Shen said...

Not sure why white Americans routinely bash East Asian Americans as being uncreative grinds. But it's definitely the most common meme you hear these days from defensive white Americans as they lose the academic competition against East Asian Americans.

This kind of behavior is of course unfortunate, because you don't hear many East Asian Americans bashing white Americans for being lazy and stupid. And the ones who feel that way probably just keep their thoughts to themselves.

Yan Shen said...

I'm not sure if you followed the story, Laurent, but I think a particularly good example of the Dunning-Kruger effect came last year, when the College Republicans at UC Berkeley decided to hold an affirmative action bake sale, protesting the supposedly reprehensible practice of race-based affirmative action. For the sale, they priced the cookies the highest for white Americans, followed by Asian Americans, Hispanic Americans, and black Americans. Lost amidst the outrage over the UC Berkeley bake sale was the rather curious fact that Asian Americans were the ones who suffered the most from race-based AA, not white Americans!

I mentioned earlier that this was truly one of the saddest things I had witnessed in my own lifetime. College Republican white males, clearly motivated more by perceived ethnic self-interest than by principled universalism, were so oblivious that they didn't even realize that they were the beneficiaries of race-based AA! As an East Asian American, I too sometimes take the viewpoint of someone like commenter Han. In a society that lacks the traditional Confucian virtue of humility and self-criticism, one often witnesses events which can only be described as tragicomedy...

LaurentMelchiorTellier said...

It's nothing to do with the Dunning Kruger effect, really (I think you're substituting D-K for "overinflated self worth", that's not exactly what D-K is). But more broadly, your point is correct that the bake sale price hierarchy of American discrimination would more properly switch AA's with WA's. 

During a recent visit to the US, I had the fortune of meeting a young AA gentleman who had been refused entry to a major university despite a perfect (or near-perfect) SAT score. His mature and humble outlook on the situation earned my respect. Fortunately, his dignified and cogent reaction had caught some interest and landed him in quite a nice situation, which was the reason for my meeting him. We had some interesting conversations which I want to follow up on when next we meet; very decent guy.

I think he's a better model to emulate than Han, TBH. When the Jews eased discrimination out of Harvard, the most obnoxious loudmouths took credit for themselves over the esteem won by the Einstein types, but you don't have to believe them. Synnott's book on the subject presents a more source-driven picture.

