Business & Strategy
Section 1
The Core Idea
Philip Tetlock spent twenty years tracking 28,000 predictions made by 284 experts — political scientists, economists, intelligence analysts, journalists — across domains where their credentials were supposed to confer predictive authority. The study, published as Expert Political Judgment in 2005, is the most comprehensive empirical evaluation of expert forecasting ever conducted. The results were devastating. On average, the experts performed barely better than chance — and notably worse than simple statistical algorithms that extrapolated base rates. A dart-throwing chimp selecting from three options (things will get better, stay the same, get worse) would have matched the experts' calibration. The experts with the highest media profiles and the strongest reputations performed the worst. Credentials did not predict accuracy. Confidence did not predict accuracy. The single strongest predictor of forecasting failure was the expert's certainty that they were right.
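Calibration claims like these are typically scored with the Brier score: the mean squared error between forecast probabilities and what actually happened, where lower is better. The sketch below is illustrative, not Tetlock's actual scoring pipeline; the forecast data is hypothetical, but it shows why a uniform three-option guess sets a fixed baseline that overconfident forecasters can underperform.

```python
# Brier score: mean squared error between forecast probabilities and outcomes.
# Lower is better. A uniform guess over 3 options always scores exactly 2/3.

def brier_score(forecasts, outcomes):
    """Mean multi-category Brier score.

    forecasts: list of [p_better, p_same, p_worse] probability triples
    outcomes:  list of indices (0, 1, or 2) for what actually happened
    """
    total = 0.0
    for probs, actual in zip(forecasts, outcomes):
        total += sum((p - (1.0 if i == actual else 0.0)) ** 2
                     for i, p in enumerate(probs))
    return total / len(forecasts)

# Hypothetical data: an overconfident "expert" vs the dart-throwing baseline.
outcomes = [0, 2, 1, 2, 0, 1]          # what actually happened
expert = [[0.9, 0.05, 0.05]] * 6       # always near-certain of option 0
uniform = [[1/3, 1/3, 1/3]] * 6        # the chimp's strategy

print(brier_score(expert, outcomes))   # overconfidence is punished when wrong
print(brier_score(uniform, outcomes))  # uniform guessing scores 2/3
```

The point the numbers make: the confident forecaster who is right only a third of the time scores worse than the baseline, because the Brier score penalises certainty about the wrong outcome far more than it rewards certainty about the right one.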
Nassim Taleb crystallised the implication in The Black Swan and Antifragile: we do not have an expertise problem. We have a domain problem. Expertise is genuine and powerful in fields with stable rules, tight feedback loops, and repeatable patterns — chess, surgery, short-term weather forecasting, actuarial science. These are what decision researcher Robin Hogarth calls "kind" learning environments: the relationship between action and outcome is visible, consistent, and available for calibration. A surgeon who botches a procedure sees the patient deteriorate. A chess player who makes a bad move loses the game. The feedback is immediate, unambiguous, and corrective.
The expert problem emerges in "wicked" environments — complex systems where causation is opaque, feedback is delayed or nonexistent, outcomes are driven by interactions among thousands of variables, and the same conditions never recur in identical form. Macroeconomics. Geopolitics. Long-term financial markets. Technology adoption curves. In these domains, the machinery of expertise — pattern recognition, causal modelling, scenario construction — still runs. It just produces outputs that are no more reliable than coin flips, dressed in the vocabulary of authority. The economist who predicted the 2008 crisis did not predict the 2020 pandemic crash. The geopolitical analyst who called the fall of the Soviet Union did not foresee 9/11. Each correct prediction looks prescient in isolation. Across the full portfolio of predictions, the hit rate is indistinguishable from noise.
The social cost is not that experts are wrong. Everyone is wrong about complex systems. The cost is that society allocates credibility, capital, and policy influence based on the assumption that expert predictions in these domains are meaningfully better than non-expert guesses — an assumption Tetlock's data comprehensively demolished. We pay economists to forecast GDP growth. We pay strategists to predict competitive landscapes. We pay political analysts to forecast election outcomes. In each case, the credential creates a confidence premium that the track record does not support. The expert's authority comes not from demonstrated predictive accuracy but from the institutional expectation that expertise should confer predictive accuracy — an expectation that persists because nobody tracks the predictions systematically enough to expose the gap.
Tetlock identified two cognitive profiles among his experts. "Hedgehogs" — experts who knew one big thing and applied it everywhere — were the worst forecasters. They were confident, media-friendly, and spectacularly unreliable. "Foxes" — experts who drew from multiple frameworks, updated frequently, and expressed uncertainty — performed materially better, though still modestly. The fox's advantage was not superior intelligence. It was the willingness to treat every prediction as provisional and to revise when new evidence contradicted the prior view. The hedgehog's disadvantage was not stupidity. It was the psychological investment in a single explanatory framework that made disconfirming evidence feel like a personal attack rather than useful data.
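The fox's habit of revising when evidence contradicts the prior can be sketched as Bayesian updating. This is an illustrative model, not a claim about how Tetlock's foxes literally computed; the prior and likelihoods below are hypothetical numbers chosen to show the mechanism.

```python
# Illustrative Bayesian update: a "fox" revises a probability estimate as
# evidence arrives; a "hedgehog" would keep the prior regardless.

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after one piece of evidence."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)

# Hypothetical scenario: prior belief 0.7 that a forecasted event will occur,
# then three pieces of evidence, each twice as likely if the event will NOT.
p = 0.7
for _ in range(3):
    p = bayes_update(p, likelihood_if_true=0.2, likelihood_if_false=0.4)

print(round(p, 3))  # belief has fallen well below the original 0.7
```

The mechanism, not the arithmetic, is what distinguishes the profiles: the fox treats each disconfirming observation as an input that shrinks the posterior, while the hedgehog's framework absorbs the same observations without moving.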
The expert problem is not an argument against expertise. It is a precision instrument for distinguishing the domains where expertise delivers genuine predictive power from the domains where it delivers only the performance of predictive power — and for recognising that the second category is far larger than our institutions acknowledge.