General Thinking & Meta-Models
Section 1
The Core Idea
When a measure becomes a target, it ceases to be a good measure.
The sentence is thirteen words long and describes a failure mode so pervasive that it operates, undetected, inside virtually every organisation that uses quantitative targets, which is to say virtually every organisation.
Charles Goodhart, a British economist advising the Bank of England, first articulated this principle in a 1975 paper on monetary policy. His original formulation was narrower — "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes" — and was aimed specifically at the problem of using monetary aggregates as policy targets. If the central bank targeted M3 money supply growth, the relationship between M3 and inflation that had made M3 a useful indicator would break down, because financial institutions would find ways to shift activity outside the measured category.
The anthropologist Marilyn Strathern generalised Goodhart's observation in 1997 into the version that has become canonical: "When a measure becomes a target, it ceases to be a good measure." The expansion was decisive. Goodhart was talking about monetary economics. Strathern was talking about everything.
The mechanism is straightforward and ruthless. A metric is chosen because it correlates with something you care about — customer satisfaction, educational achievement, code quality, national productivity. The metric is useful precisely because people aren't trying to manipulate it. Then you make it a target. You attach rewards to it, or punishments for missing it.
At that moment, the humans in the system redirect their intelligence and effort from producing the underlying outcome to producing the number. The correlation between the metric and the thing it was supposed to measure begins to decay. Not because people are malicious, but because they are rational. The metric is legible, concrete, and attached to consequences. The underlying goal is ambiguous, multidimensional, and hard to verify. Rational agents, operating under time and resource constraints, will always converge on the legible target.
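The decay can be made concrete with a toy simulation. The sketch below is purely illustrative, and every quantity in it is an invented assumption rather than anything from the sources discussed here: each agent splits effort between real work and gaming the number, the metric counts both, and the metric-outcome correlation collapses once effort shifts toward gaming.

```python
# Toy model of Goodhart's Law. All quantities are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
N = 1_000  # number of agents (factories, schools, teams)

def metric_outcome_correlation(gaming_share: float) -> float:
    """Correlation between the measured metric and the true outcome when
    agents devote `gaming_share` of their effort to gaming the number."""
    ability = rng.normal(1.0, 0.3, N)   # skill at the real work
    cunning = rng.normal(1.0, 0.3, N)   # skill at gaming, independent of ability
    real_work = ability * (1 - gaming_share)
    gaming = cunning * gaming_share * 3.0  # assume gaming moves the metric 3x more cheaply
    outcome = real_work + rng.normal(0, 0.1, N)           # what you actually want
    metric = real_work + gaming + rng.normal(0, 0.1, N)   # what you measure
    return np.corrcoef(metric, outcome)[0, 1]

# Before the metric is a target, nobody games it and it tracks the outcome.
print(f"metric as measure: r = {metric_outcome_correlation(0.0):.2f}")  # ~0.90
# Once it becomes a target, rational agents shift effort and the link decays.
print(f"metric as target:  r = {metric_outcome_correlation(0.7):.2f}")  # ~0.10
```

The point of the sketch is only the shape of the result: under targeting, the measured number keeps rising even as its informational value evaporates.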
Soviet factories provide the most cited historical illustration. When Moscow set production targets by weight, factories produced absurdly heavy chandeliers and nails the size of railroad spikes. When targets shifted to unit count, factories produced millions of tiny, useless nails. The factories hit every target. The economy got neither the chandeliers nor the nails it needed. The planners weren't stupid. They were fighting a structural law: the act of targeting a proxy for quality predictably severs the proxy from quality.
British colonial India surfaced the same dynamic in a different guise. The British government in Delhi, concerned about the number of venomous cobras, offered a bounty for every dead cobra presented. Enterprising citizens began breeding cobras for the income. When the authorities discovered the farms and cancelled the bounty, the breeders released their now-worthless snakes, increasing the cobra population beyond its original level. The "cobra effect," as it's now known, is Goodhart's Law operating through the gap between what was measured (dead cobras) and what was desired (fewer living cobras). The metric responded perfectly to the incentive. The outcome moved in the opposite direction.
The phenomenon isn't confined to command economies or colonial misadventures. The No Child Left Behind Act of 2001 measured school quality by standardised test scores in reading and maths. Schools responded rationally: they narrowed curricula to tested subjects, eliminated art, music, and science instruction, and focused resources on students near the proficiency cutoff — the "bubble kids" — whose score improvements would most efficiently move the metric. A 2007 RAND Corporation study found that score gains on state-specific tests were two to five times larger than gains on the National Assessment of Educational Progress, which measured the same skills but carried no institutional stakes. The schools were producing scores, not learning. Goodhart's Law explains the gap.
In technology, the pattern recurs with metronomic regularity. When YouTube optimised for watch time in 2012, its recommendation algorithm began surfacing increasingly extreme and sensationalist content — because outrage, conspiracy, and emotional escalation held eyeballs longest. Watch time rose. The quality of the user experience — the thing watch time was supposed to correlate with — degraded.
When Facebook optimised for engagement, its algorithm learned that divisive political content generated more clicks, comments, and shares than any other category. Engagement rose. Social cohesion, and Facebook's own brand equity, fell.
In each case, the metric was originally a reasonable proxy. The moment it became the target, the system optimised for the proxy at the expense of the thing the proxy was supposed to represent.
The financial sector generates the highest-stakes examples. When banks targeted quarterly earnings-per-share growth in the 2000s, executives discovered that share buybacks — funded by debt, with no corresponding improvement in underlying business performance — were the fastest way to move the number. EPS rose because the denominator (shares outstanding) shrank, not because the numerator (earnings) grew.
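The arithmetic behind that sentence is worth spelling out. The figures below are hypothetical, chosen only to make the mechanism visible: earnings stay perfectly flat, the share count shrinks, and reported EPS "grows" anyway.

```python
# Hypothetical buyback arithmetic; every figure here is invented for illustration.
earnings = 1_000_000_000        # net income, flat year over year: $1.0B
shares_before = 500_000_000     # shares outstanding before the buyback
buyback = 50_000_000            # 10% of shares repurchased, say with borrowed cash
shares_after = shares_before - buyback

eps_before = earnings / shares_before  # $2.00
eps_after = earnings / shares_after    # ~$2.22

growth = eps_after / eps_before - 1
print(f"EPS 'growth' with zero earnings growth: {growth:.1%}")  # ~11.1%
```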
Between 2003 and 2023, S&P 500 companies spent over $8 trillion on buybacks, and a substantial portion of that spending was driven not by genuine capital allocation logic but by executive compensation tied to EPS targets. The metric rewarded financial engineering. The underlying businesses received the investment that was left over.
The deepest version of Goodhart's Law operates at the level of incentive design. Every KPI dashboard, every quarterly OKR, every compensation structure tied to measurable outputs is a Goodhart's Law experiment. The question is never whether gaming will occur. Gaming is the predicted response of intelligent agents to any explicit target. The question is whether the game that emerges from the target system produces outcomes aligned with what you actually want, or outcomes that merely produce numbers you like looking at while the real performance deteriorates underneath.