Mathematics & Probability
Section 1
The Core Idea
How much of your capital should you risk on a bet you believe is in your favour? The intuitive answer, as much as possible, leads to ruin: stake everything on every bet and the first loss wipes you out, and over a long enough sequence a loss is certain. The correct answer was derived in 1956 by John L. Kelly Jr., a physicist at Bell Labs, and it changed how rational actors should think about sizing any commitment under uncertainty.
Kelly's formula is deceptively simple. For a binary bet with probability p of winning and probability q = 1 − p of losing, where a win pays b-to-1, the optimal fraction of capital to wager is: f* = (bp − q) / b. The formula can be restated even more compactly: f* = edge / odds, where edge is the expected profit per dollar risked and odds is the payoff on a winning bet. If you have a 60% chance of winning an even-money bet, the Kelly fraction is (1 × 0.6 − 0.4) / 1 = 20% of your capital. Not 100%. Not 50%. Twenty percent.
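The formula above can be written as a few lines of Python. This is a minimal sketch: the function name kelly_fraction and its input checks are illustrative choices, not part of Kelly's paper. A negative result means the bet has no edge and the sensible stake is zero.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Return f* = (b*p - q) / b for a binary bet.

    p: probability of winning (assumed known exactly, which is rarely true in practice)
    b: net payoff per unit staked on a win (b-to-1 odds)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p  # probability of losing the stake
    return (b * p - q) / b


# The worked example from the text: a 60% chance on an even-money bet.
print(f"{kelly_fraction(0.6, 1.0):.2%}")
```

With p = 0.6 and b = 1 the function returns the 20% quoted in the text, up to floating-point rounding.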
The number shocks people who encounter it for the first time. A bet where you win 60% of the time feels overwhelmingly favourable, the kind of edge most investors, gamblers, and founders would mortgage their house for. Yet the formula says to risk only a fifth of your bankroll. The reason is the asymmetry between multiplicative gains and multiplicative losses. A 50% loss requires a 100% gain to recover. A 75% loss requires a 300% gain. Betting too large on any single outcome, even one with a genuine, quantifiable edge, exposes you to sequences of losses that compound geometrically toward zero. The Kelly criterion identifies the bet size at which long-run growth peaks; bet far enough beyond it and growth turns into ruin.
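The recovery arithmetic follows from one identity: after losing a fraction L of capital, the gain needed to get back to the starting point is L / (1 − L). A short sketch, with gain_to_recover as a hypothetical helper name:

```python
def gain_to_recover(loss: float) -> float:
    """Fractional gain needed to return to starting capital after a fractional loss."""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1); a 100% loss cannot be recovered")
    # After the loss, capital is (1 - loss); we need (1 - loss) * (1 + gain) = 1.
    return loss / (1.0 - loss)


for loss in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"lose {loss:.0%} -> need {gain_to_recover(loss):.0%} to recover")
```

The required gain grows without bound as the loss approaches 100%, which is why large drawdowns are so much harder to repair than small ones.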
What Kelly actually maximised was not expected wealth but the expected logarithm of wealth — the geometric growth rate. This distinction is the entire intellectual content of the model. The arithmetic expected value of a bet tells you what happens on average across a population of bettors at a single moment. The geometric growth rate tells you what happens to a single bettor across many sequential bets over time. In any system where outcomes are multiplicative — where each result changes the base from which the next result is calculated — these two quantities diverge. The arithmetic average overstates the individual's prospects. The geometric growth rate captures the actual compounding trajectory. Kelly's contribution was proving that the bet size which maximises the geometric growth rate is unique, calculable, and always smaller than the bet size which maximises expected value.
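The text states the result without deriving it. The standard argument, sketched here for the binary bet defined earlier (same symbols p, q, b and f; the name g for the expected log-growth per bet is introduced for illustration), sets the derivative of the expected logarithm of wealth to zero:

```latex
\begin{align*}
g(f) &= p\,\ln(1 + bf) + q\,\ln(1 - f), \qquad 0 \le f < 1 \\
g'(f) &= \frac{pb}{1 + bf} - \frac{q}{1 - f} = 0 \\
pb\,(1 - f) &= q\,(1 + bf) \\
bp - q &= bf\,(p + q) = bf \\
f^* &= \frac{bp - q}{b}
\end{align*}
```

Because g''(f) = −pb²/(1 + bf)² − q/(1 − f)² is negative everywhere on the interval, the stationary point is the unique maximum. Expected wealth, by contrast, is linear in f and is maximised by staking everything whenever the edge is positive.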
The formula emerged from information theory, not gambling. Kelly was working on the problem of signal transmission over noisy channels, building on Claude Shannon's foundational work at Bell Labs. Kelly noticed that a gambler receiving partially reliable information about the outcome of a horse race faces a mathematically identical problem to a communications engineer receiving a partially reliable signal over a noisy line. The optimal strategy in both cases is to allocate resources in proportion to the reliability of the information: bet more when the signal is strong, less when it's weak, and nothing when the signal is pure noise. The connection between information theory and optimal wagering was not a metaphor. It was a mathematical identity.
Ed Thorp, then a mathematics instructor at MIT who had independently developed card-counting systems for blackjack, recognised the formula's significance immediately. Thorp applied Kelly sizing first at the blackjack table, where his card-counting system provided a quantifiable edge that fluctuated hand by hand, and then in financial markets through his hedge fund Princeton Newport Partners. The results were extraordinary: nineteen consecutive years of positive returns, with annualised performance of roughly 19% before fees, achieved not through superior prediction of market direction but through superior sizing of exposure to identified mispricings. The same mispricings were visible to other sophisticated market participants. The difference was that Thorp sized his positions to maximise geometric growth, while many competitors sized to maximise expected return and were eventually eliminated by adverse sequences that their sizing could not survive.
The Kelly criterion's most counterintuitive property is what happens when you exceed it. Betting exactly the Kelly fraction maximises long-run geometric growth. Betting less than Kelly (say, half-Kelly) reduces the growth rate but also reduces the volatility of the path, producing a smoother equity curve with smaller drawdowns. This is why most professional practitioners use a fractional Kelly approach. Betting more than Kelly is worse: it lowers the geometric growth rate just as under-betting does, but it raises volatility and drawdown risk instead of reducing them. At roughly twice the Kelly fraction, the geometric growth rate falls to about zero (exactly zero in the continuous approximation, slightly below zero for a discrete binary bet): you should expect to neither grow nor shrink over the long run, while experiencing enormous swings. Beyond that point the geometric growth rate is negative, and wealth shrinks toward zero almost surely as the number of bets grows. The overbettor loses money despite having a genuine edge, because the sizing converts the edge into a liability.
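The shape of the growth curve can be checked directly for the 60% even-money example. The sketch below evaluates the expected log-growth per bet, g(f) = p ln(1 + bf) + q ln(1 − f), at several multiples of the Kelly fraction. The function name growth_rate and the chosen multiples are illustrative assumptions, and the bet is assumed to be the simple binary one described earlier.

```python
import math


def growth_rate(f: float, p: float, b: float) -> float:
    """Expected log-growth per bet when staking fraction f of capital."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    q = 1.0 - p
    return p * math.log(1.0 + b * f) + q * math.log(1.0 - f)


p, b = 0.6, 1.0
f_star = (b * p - (1.0 - p)) / b  # Kelly fraction, 0.2 for this bet
g_max = growth_rate(f_star, p, b)

for multiple in (0.5, 1.0, 2.0, 2.5):
    g = growth_rate(multiple * f_star, p, b)
    print(f"{multiple:>3}x Kelly: g = {g:+.5f} per bet ({g / g_max:+.0%} of maximum)")
```

For this bet, half-Kelly keeps about three quarters of the maximum growth rate, twice Kelly lands very slightly below zero rather than exactly on it, and 2.5 times Kelly is clearly negative. The exact zero at twice Kelly belongs to the continuous approximation, where the growth curve is a parabola.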
This is the insight at the heart of Kelly: the relationship between bet size and growth rate is not monotonic. There is a peak, the Kelly fraction, and both sides of the peak slope downward, but the right side (overbetting) slopes into ruin while the left side (underbetting) merely slopes into slower growth. The asymmetry means that the penalty for betting too large is categorically worse than the penalty for betting too small. A bettor who uses half the Kelly fraction captures about 75% of the maximum growth rate. A bettor who uses double the Kelly fraction captures essentially none of it. The errors are not symmetric, which is why practitioners who have survived long enough to write about their experience so often advocate sizing conservatively relative to the theoretical optimum.
The framework extends beyond gambling and investing into any domain where you allocate a scarce, depletable resource to opportunities with uncertain outcomes. A founder deciding how much runway to burn on a product bet. An executive deciding how much of the R&D budget to concentrate on a single initiative. A professional deciding how much career capital to invest in a risky pivot. In each case, the Kelly logic applies: there is an optimal allocation that maximises the long-run growth rate, and that allocation is always smaller than intuition suggests, because intuition is calibrated to expected value rather than geometric growth.
The history of financial blowups is overwhelmingly a history of Kelly violations. Long-Term Capital Management leveraged Nobel Prize–winning models at 25:1 and was destroyed in 1998 by the same convergence trades that Thorp had executed profitably for decades at fraction-of-Kelly sizing. Bear Stearns and Lehman Brothers held mortgage-backed securities with genuine positive expected returns but sized their exposure so far beyond any Kelly-compatible fraction that a correlated adverse move — the 2008 subprime crisis — converted portfolio-level mark-to-market losses into institutional death. In each case, the operators had correctly identified an edge. In each case, the sizing of their commitment to that edge was the variable that determined whether the edge built wealth or destroyed the institution. The edge was right. The fraction was wrong. And in multiplicative systems, a wrong fraction overwhelms a right edge with mathematical certainty.
The Kelly criterion does not tell you what to bet on. It tells you how much — and in that distinction lies the difference between a strategy that compounds across decades and a strategy that looks brilliant until the sequence of outcomes that exposes its sizing arrives. Every sequence arrives eventually. The Kelly fraction is what determines whether you are still at the table when it does.