Mathematics & Probability
Section 1
The Core Idea
In 1654, a French nobleman named Antoine Gombaud — the Chevalier de Méré — posed a gambling problem to Blaise Pascal. The question was deceptively simple: if two players must abandon a game of chance before it is finished, how should the stakes be divided fairly, given each player's probability of winning from the current position? Pascal wrote to Pierre de Fermat. The exchange of letters that followed created the mathematical framework that now underlies every rational decision made under uncertainty. The framework is probability theory — the formal language for quantifying what we do not know and computing what we should do about it.
Before Pascal and Fermat, uncertainty was the domain of fate, divination, and gut instinct. After them, uncertainty became calculable. The revolution was not in predicting the future — probability theory cannot do that — but in replacing the question "what will happen?" with the far more useful question "what are the relative likelihoods of each thing that could happen, and what does that imply for how I should act?" The shift from prophecy to probability is the foundational intellectual move of modern decision-making.
The mathematics is built on three axioms, formalised by Andrei Kolmogorov in 1933. First, every event has a probability between zero and one. Second, the probabilities of all possible outcomes sum to one — something must happen. Third, the probability that any one of a set of mutually exclusive events occurs is the sum of their individual probabilities. From these axioms — simple enough to fit on an index card — the entire apparatus of modern statistics, actuarial science, financial engineering, quantum mechanics, and artificial intelligence is derived. The axioms say nothing about the world. They define a consistent language for reasoning about uncertainty, and the language turns out to be the most powerful analytical tool humans have ever constructed.
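The axioms are concrete enough to verify mechanically for any finite sample space. A minimal sketch using a fair die — the example is ours, not the text's:

```python
from fractions import Fraction

# A fair six-sided die as a finite probability space (illustrative example).
space = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """P(event) = sum of the probabilities of the outcomes it contains."""
    return sum(space[outcome] for outcome in event)

# Axiom 1: every probability lies between zero and one.
assert all(0 <= p <= 1 for p in space.values())

# Axiom 2: something must happen -- the whole space has probability one.
assert prob(space.keys()) == 1

# Axiom 3: disjoint events add. {1, 2} and {5} share no outcomes.
assert prob({1, 2, 5}) == prob({1, 2}) + prob({5})
```

Using exact fractions rather than floats keeps the axiom checks free of rounding noise.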
The core operation of probability theory is the expected value calculation: multiply each possible outcome by its probability, then sum the results. A bet that pays $100 with probability 0.3 and loses $40 with probability 0.7 has an expected value of (0.3 × $100) + (0.7 × −$40) = $30 − $28 = $2. The expected value is not the outcome you expect to see on any individual trial — you will never win $2 on this bet. You will win $100 or lose $40. The expected value is the average outcome per trial across a large number of repetitions — the quantity that determines whether the bet creates or destroys wealth over time.
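The calculation above reduces to a two-line function. The bet values are the ones from the text:

```python
def expected_value(outcomes):
    """Expected value: multiply each payoff by its probability, then sum."""
    return sum(payoff * p for payoff, p in outcomes)

# The bet from the text: +$100 with probability 0.3, -$40 with probability 0.7.
bet = [(100, 0.3), (-40, 0.7)]
print(expected_value(bet))  # roughly 2.0, floating-point rounding aside
```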
This distinction between individual outcomes and expected values is where most people's intuition fails. The brain evaluates bets by imagining specific scenarios — "I could win $100" or "I might lose $40" — and weighting them by emotional salience rather than probability. Probability theory replaces scenario thinking with distributional thinking: not "what could happen?" but "what is the full distribution of what could happen, and what do the mathematics of that distribution imply?" The distribution contains more information than any single scenario, and decisions made on the basis of the distribution are systematically superior to decisions made on the basis of the most vivid or most feared scenario.
Conditional probability — how the probability of an event changes given new information — is where the framework becomes genuinely powerful. Thomas Bayes's theorem, published posthumously in 1763, provides the mathematics: P(A|B) = P(B|A) × P(A) / P(B). The formula tells you how to update your beliefs when evidence arrives. If you initially believe there is a 10% probability that a startup will achieve product-market fit, and you then observe that the startup's week-over-week retention is 85% — a metric that is present in 60% of companies that achieve product-market fit but only 5% of those that don't — Bayes's theorem tells you exactly how much to increase your probability estimate. The calculation is mechanical. The discipline is in doing the calculation rather than substituting narrative or intuition.
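The update for the startup example is indeed mechanical. A sketch, with P(B) expanded over the two hypotheses:

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) via Bayes's theorem.

    P(E) is expanded over the hypothesis and its complement:
    P(E) = P(E|H) P(H) + P(E|not H) P(not H).
    """
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# The numbers from the text: 10% prior on product-market fit; 85%+ retention
# seen in 60% of companies that achieve it and 5% of those that don't.
posterior = bayes_update(0.10, 0.60, 0.05)
print(f"{posterior:.3f}")  # 0.571 -- one observation moves the estimate past 50%
```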
The law of large numbers completes the framework's practical foundation. It states that as the number of independent trials increases, the average outcome converges toward the expected value with probability approaching certainty. This is the mathematical justification for insurance companies, casinos, index funds, and any business model built on pooling independent risks. No individual trial is predictable. The aggregate of many trials is. The gap between individual unpredictability and aggregate predictability is the fundamental insight of probability theory, and it is the structural advantage exploited by every entity that operates at sufficient scale to let the law of large numbers do its work.
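The convergence can be watched directly in simulation, using the $2-expected-value bet from earlier. The seed is arbitrary:

```python
import random

random.seed(0)  # arbitrary seed, fixed only for reproducibility

def average_payoff(trials):
    """Simulate the bet (+$100 at p=0.3, -$40 at p=0.7) and return
    the mean payoff per trial."""
    total = sum(100 if random.random() < 0.3 else -40 for _ in range(trials))
    return total / trials

# The sample mean drifts toward the expected value of $2 as trials grow.
for n in (100, 10_000, 1_000_000):
    print(n, average_payoff(n))
```

No individual trial is predictable, but the printed averages tighten around $2 as n increases — the aggregate predictability the paragraph describes.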
Probability theory does not eliminate uncertainty. It domesticates it. It converts "I don't know what will happen" into "here is the distribution of what might happen, here is how likely each outcome is, here is what the distribution implies I should do, and here is how I should update my beliefs as new information arrives." The framework does not require that you know the probabilities precisely — imprecise probability estimates, honestly held and rigorously updated, produce better decisions than the intuitive certainties they replace. An investor who honestly assigns "somewhere between 20% and 40%" to a thesis and sizes accordingly will outperform the investor who assigns "I'm highly confident" and bets the maximum, because the first investor's framework has room for being wrong and the second investor's does not.
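One standard way to make "sized accordingly" concrete — not named in the text — is the expected log growth of a bankroll: the same positive-edge bet compounds under cautious sizing and destroys wealth under near-maximum sizing. The bet parameters below are illustrative:

```python
from math import log

def log_growth(fraction, p_win=0.3, payout=2.5):
    """Expected log growth of a bankroll per round when a fixed fraction is
    staked on a bet that pays `payout` times the stake with probability
    `p_win` and loses the stake otherwise. Numbers are illustrative."""
    return p_win * log(1 + payout * fraction) + (1 - p_win) * log(1 - fraction)

# Per-dollar edge is positive: 0.3 * 2.5 - 0.7 = +0.05. Yet sizing decides
# everything: a 2% stake compounds, while a 99% stake shrinks the bankroll
# toward ruin despite the identical positive expected value.
print(log_growth(0.02))  # small but positive
print(log_growth(0.99))  # strongly negative
```

This is the geometric-growth logic behind why the cautious, honestly-uncertain investor in the paragraph outperforms the maximally confident one.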
The framework's most radical implication is that the quality of a decision is independent of its outcome. A bet with an 80% probability of success that fails was still the correct bet if the probability estimate was well-calibrated and the sizing was appropriate. Outcome-based evaluation — judging decisions by their results — is the default mode of human cognition and the enemy of probabilistic thinking. The probabilistic thinker evaluates the process: were the probabilities estimated honestly? Was the base rate incorporated? Was the position sized for the expected value and the variance? If yes, the decision was correct regardless of the outcome, because over hundreds of such decisions, the correct process produces superior aggregate results with mathematical certainty.
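The closing claim can be checked by simulation: a single outcome of a well-calibrated 80% bet says almost nothing about the process, while the aggregate over hundreds of such decisions reveals it. The payoff units are illustrative:

```python
import random

random.seed(2)  # arbitrary seed, fixed only for reproducibility

def run(decisions):
    """Outcomes of repeated decisions made with the same correct process:
    a well-calibrated 80% chance of +1 unit, 20% chance of -1 unit."""
    return [1 if random.random() < 0.8 else -1 for _ in range(decisions)]

single = run(1)[0]         # may well be -1 despite the good process
aggregate = sum(run(500))  # expected around +300 over hundreds of decisions
print(single, aggregate)
```

Judging the process by `single` is outcome-based evaluation; `aggregate` is where the quality of the process becomes visible.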
The intellectual descendants of Pascal and Fermat's correspondence now manage trillions of dollars of capital, power every search algorithm on the internet, guide the dosing of every drug approved by the FDA, and underpin the statistical models that predict weather, elections, and epidemics. The framework's ubiquity is itself evidence of its power: no competing method for reasoning about uncertainty has displaced it in any domain where decisions must be made and outcomes can be measured. Three and a half centuries after two mathematicians exchanged letters about a gambling problem, the framework they created remains the only rigorous language for thinking clearly about what we do not know.