The law of large numbers says that as you repeat a random experiment more and more times, the average of the results converges to the expected value. Sample a few times and the average can be far from the true mean; sample thousands of times and the average gets close. The theorem was formalised by Jacob Bernoulli (1713) and refined by Kolmogorov; it's the reason casinos and insurers make money. They don't need to win every bet or every policy — they need enough volume so that the average outcome is predictable. Variance gets averaged out. The house edge compounds; the law does the work.
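The convergence described above can be watched directly in a small simulation. The sketch below is illustrative only: the fair six-sided die, the function name `running_means`, and the seed are assumptions chosen for the example, not anything from the original text.

```python
import random

def running_means(n_rolls, seed=0):
    """Return the running average after each of n_rolls fair die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one roll of a fair six-sided die
        means.append(total / i)     # average of the first i rolls
    return means

means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    # The expected value of a fair die is 3.5
    print(f"n={n:>7}: running mean = {means[n - 1]:.3f}")
```

Early averages can sit well away from 3.5, while the later ones settle close to it. Any single roll remains as unpredictable as ever.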
The practical implication: small samples are noisy, large samples are stable. A startup with 10 customers can report an NPS of 90 or an NPS of 20, and neither is a reliable signal of the underlying satisfaction distribution. With 1,000 customers, the sample mean is a decent estimate. The same logic applies to A/B tests (run them long enough), to hiring (one great or bad hire doesn't define your process), and to investing (a few trades don't prove edge). When someone generalises from a handful of cases, they are ignoring the law of large numbers. When someone insists on more data before deciding, they're invoking it.
Two caveats. First, the law says the average converges; it doesn't say any single outcome is predictable. You can still have a terrible run in a fair game. Second, the law assumes independent (or weakly dependent) trials with a fixed distribution. If the process changes over time or trials are correlated, convergence can be slow or fail. Use the law to justify collecting more data when the sample is small and to be sceptical of conclusions drawn from few observations.
The law also explains why diversification works when bets are independent: the portfolio's average return converges to the weighted average of expected returns, and variance shrinks. When bets are correlated — same sector, same factor — the effective n is smaller and the law helps less. One bad event can hit the whole portfolio. So the law supports "many independent bets" but doesn't support "many correlated bets" as a way to reduce risk.
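The "effective n" point can be made precise under a simplifying assumption that is not in the text above: n equally weighted bets, each with variance σ², and the same pairwise correlation ρ between any two of them. Under that assumption, the variance of the average return is:

```latex
\operatorname{Var}\!\left(\bar{X}_n\right)
  = \frac{1}{n^2}\left[\, n\sigma^2 + n(n-1)\rho\sigma^2 \,\right]
  = \sigma^2\left[\frac{1-\rho}{n} + \rho\right]
  \;\xrightarrow[n\to\infty]{}\; \rho\,\sigma^2 ,
\qquad
n_{\text{eff}} = \frac{n}{1+(n-1)\rho}
```

With ρ = 0 this reduces to σ²/n, which shrinks toward zero as bets are added. With ρ > 0 the variance has a floor of ρσ², so adding more correlated bets stops helping once n is large.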
Section 2
How to See It
The law of large numbers shows up whenever averages over many trials are more stable than averages over few. Look for: a random process, a summary statistic (mean, rate), and the question "how many observations do we need before we trust the number?" The diagnostic is asking: is this conclusion based on a sample large enough for the average to have converged?
Business
You're seeing the law of large numbers when a SaaS company reports NPS or churn from a cohort of 50 customers. The number is noisy: a few detractors or one churned logo can swing it. With 500 or 5,000 customers, the same metric stabilises. The company that makes strategy from small-sample NPS is overreacting to noise.
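To get a feel for how much a small-cohort NPS can swing, here is a rough simulation. The true mix of 50% promoters, 20% detractors, and 30% passives (a true NPS of 30) is an invented assumption, as are the function names. The sketch also assumes each customer answers independently.

```python
import random
import statistics

def simulate_nps(n_customers, p_promoter, p_detractor, rng):
    """Survey n_customers once and return the NPS in points (-100 to 100)."""
    score = 0
    for _ in range(n_customers):
        u = rng.random()
        if u < p_promoter:
            score += 1          # promoter
        elif u < p_promoter + p_detractor:
            score -= 1          # detractor; passives contribute 0
    return 100 * score / n_customers

def nps_spread(n_customers, n_surveys=200, seed=1):
    """Standard deviation of NPS across repeated hypothetical surveys."""
    rng = random.Random(seed)
    results = [simulate_nps(n_customers, 0.5, 0.2, rng) for _ in range(n_surveys)]
    return statistics.pstdev(results)

spreads = {n: nps_spread(n) for n in (50, 500, 5_000)}
for n, sd in spreads.items():
    print(f"cohort of {n:>5}: NPS typically moves by about {sd:.1f} points")
```

Under these assumptions the spread shrinks roughly with the square root of the cohort size, so ten times the customers gives about a third of the noise.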
Technology
You're seeing the law of large numbers when an A/B test is stopped after 100 users per variant. The observed difference might be 10%, but the confidence interval around it is wide. Running to 10,000 per variant narrows the interval and separates signal from noise. Ship decisions on small tests at your peril.
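The effect of sample size on the interval can be sketched with a normal-approximation (Wald) interval for the difference in conversion rates. The conversion counts are made up for illustration (10% vs 11%, a 10% relative lift), and `diff_confidence_interval` is a hypothetical helper, not a library function.

```python
from math import sqrt
from statistics import NormalDist

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald interval for the difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Standard error of the difference between two independent proportions
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Same observed rates, different sample sizes (illustrative numbers)
small = diff_confidence_interval(10, 100, 11, 100)
large = diff_confidence_interval(1_000, 10_000, 1_100, 10_000)
print("100 per variant:    ", small)
print("10,000 per variant: ", large)
```

With 100 users per variant the interval comfortably includes zero. With 10,000 per variant and the same observed rates it is ten times narrower. The Wald interval is a rough approximation at small n, which is part of the point.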
Investing
You're seeing the law of large numbers when an equal-weighted fund has 20 positions and one of them goes to zero. The impact on the portfolio is 5%. With 200 positions, one failure is 0.5%. Diversification is the law of large numbers applied to a portfolio: more independent bets smooth the average return. The caveat: correlated bets don't diversify; they fail together.
Markets
You're seeing the law of large numbers when a market maker or insurer prices on expected value. They don't need to win each trade or each policy; they need volume so that the realised average converges to the theoretical mean. The law is why market makers can profit with thin margins: scale turns variance into predictability.
Section 3
How to Use It
Decision filter
"Before trusting a rate, average, or proportion, ask: how many observations is this based on? If the sample is small, the number is noisy. Get more data, or treat the number as a range (e.g. confidence interval), not a point estimate. Don't over-interpret small samples."
As a founder
Metrics from early customers or few experiments are unstable. NPS, conversion, retention — all bounce around with small n. Avoid big strategy pivots on tiny samples. Run more experiments, wait for more conversions, or explicitly quantify uncertainty (e.g. error bars). The law says: scale the number of trials before you scale the conclusion.
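One way to put error bars on a rate is a Wilson score interval, which behaves better than the simple normal approximation when n is small. This is a minimal sketch, not the only reasonable choice. The function name and the example counts (6 of 20 versus 300 of 1,000 conversions) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a proportion observed as successes out of n."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Same observed 30% rate, very different certainty
lo_small, hi_small = wilson_interval(6, 20)
lo_large, hi_large = wilson_interval(300, 1_000)
print(f"6/20:     {lo_small:.3f} to {hi_small:.3f}")
print(f"300/1000: {lo_large:.3f} to {hi_large:.3f}")
```

Reporting the range rather than "30%" makes it obvious when a metric is too unstable to justify a pivot.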
As an investor
Portfolio performance over a few years is a small sample. One great or terrible fund doesn't prove process. The law says that edge (or lack of it) shows up over many bets. Evaluate process and position sizing, not just outcomes from a handful of investments. Same for company metrics: early traction can be noise; wait for volume or triangulate with other evidence.
As a decision-maker
When someone says "we've seen X in the data," ask for n. If n is small, treat the finding as suggestive, not definitive. Require larger samples for high-stakes decisions. Use the law to push back on overconfidence from small samples and to justify investment in more data collection when the stakes are high.
Common misapplication: Assuming the law applies to a single trajectory. The law says the average over many trials converges. Your one company, your one portfolio, your one run — that's one path. It can deviate from the mean for a long time. The law justifies trusting aggregate statistics with large n; it doesn't guarantee your single path will look like the average.
Second misapplication: Ignoring dependence. If trials are correlated (e.g. same customer over time, same market), the "effective" n is smaller than the raw count. The law assumes independence (or weak dependence). Correlated data converges more slowly; adjust your required sample size accordingly.
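A common rough adjustment for this is the design effect used in cluster sampling. It assumes equal-sized clusters (for example, the same number of observations per customer) and a single intra-cluster correlation. Both are simplifying assumptions, and the function and parameter names below are hypothetical.

```python
def effective_sample_size(n_clusters, obs_per_cluster, intra_cluster_corr):
    """Approximate effective n for clustered data via the design effect.

    design effect = 1 + (m - 1) * rho, where m is observations per cluster
    and rho is the correlation between observations in the same cluster.
    """
    if not 0 <= intra_cluster_corr <= 1:
        raise ValueError("intra_cluster_corr must be between 0 and 1")
    total_obs = n_clusters * obs_per_cluster
    design_effect = 1 + (obs_per_cluster - 1) * intra_cluster_corr
    return total_obs / design_effect

# 100 customers, each observed for 12 months (illustrative numbers)
for rho in (0.0, 0.1, 0.5, 1.0):
    n_eff = effective_sample_size(100, 12, rho)
    print(f"rho={rho:.1f}: 1,200 raw observations behave like about {n_eff:.0f}")
```

At rho = 0 every observation counts. At rho = 1 the twelve monthly readings per customer carry no extra information, and the effective n falls back to the 100 customers.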
Section 4
The Mechanism
Section 5
Founders & Leaders in Action
Ed Thorp
Mathematician, author of Beat the Dealer; applied probability to blackjack and finance
Thorp used the law of large numbers implicitly: with a positive edge per hand, the key was playing enough hands for the edge to overcome variance. His Kelly-based bet sizing ensured he could survive the variance while the law did its work over thousands of trials. The law justified "we need volume"; Kelly justified "we need to size so we're still around when volume arrives."
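The interplay between edge and sizing can be sketched with the standard Kelly formula for a simple win/lose bet. The 51% win probability and even-money payout are illustrative assumptions, not figures from Thorp's play, and the function names are hypothetical.

```python
from math import log

def kelly_fraction(p_win, net_odds):
    """Kelly fraction of bankroll for a bet paying net_odds per unit staked."""
    if not 0 < p_win < 1 or net_odds <= 0:
        raise ValueError("need 0 < p_win < 1 and net_odds > 0")
    q = 1 - p_win
    # f* = (b*p - q) / b, floored at zero when there is no edge
    return max(0.0, (net_odds * p_win - q) / net_odds)

def expected_log_growth(fraction, p_win, net_odds):
    """Expected log growth of bankroll per bet when staking a fixed fraction."""
    return p_win * log(1 + fraction * net_odds) + (1 - p_win) * log(1 - fraction)

f_star = kelly_fraction(0.51, 1.0)             # small edge at even money
g_kelly = expected_log_growth(f_star, 0.51, 1.0)
g_overbet = expected_log_growth(0.10, 0.51, 1.0)
print(f"Kelly fraction: {f_star:.3f}")
print(f"growth per bet at Kelly: {g_kelly:.5f}, at a 10% stake: {g_overbet:.5f}")
```

In this toy setup, staking about 2% of the bankroll grows it slowly over many hands. Staking 10% on the same positive edge has negative expected log growth, so the bankroll tends to shrink before volume can do its work.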
Warren Buffett
Chairman of Berkshire Hathaway
Buffett has often said that you need very few good decisions in investing, but they need to be big and right. The law of large numbers is more salient for his insurance operations: Berkshire prices policies on expected value and relies on large numbers of independent policies so that realised losses converge to the mean. In investing, he focuses on high-conviction bets rather than diversifying into many small ones; the law still applies to the aggregate of his decisions over decades.
Section 6
Visual Explanation
Law of Large Numbers — As sample size n grows, the sample mean (solid line) converges to the true mean μ (dashed). Small samples are noisy; large samples are stable. Variance of the mean shrinks as σ²/n.
Section 7
Connected Models
The law of large numbers sits at the heart of statistics and inference. The models below either reinforce it (regression to the mean, central limit theorem), stand in tension with it (law of small numbers, gambler's fallacy), or extend it (statistical significance, sampling).
Reinforces
Regression to the Mean
Regression to the mean says extreme outcomes tend to be followed by outcomes closer to the average. It is a close cousin of the law of large numbers: an extreme result usually owes something to luck, and luck is unlikely to repeat, so the next observation tends to sit nearer the long-run mean. Nothing pulls outcomes back; the extreme was simply unrepresentative. Both models say: don't over-interpret extremes; the average reasserts itself.
Reinforces
Central Limit Theorem
The central limit theorem says the distribution of the sample mean approaches normal as n grows. That gives you the shape of the convergence; the law of large numbers says the mean of that distribution is the true mean. Together they justify confidence intervals: the sample mean is approximately normal with mean μ and variance σ²/n.
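Written out, and assuming independent observations with finite variance σ² (estimated in practice by the sample standard deviation), the approximation and the resulting 95% interval are:

```latex
\bar{X}_n \;\approx\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right)
\qquad\Longrightarrow\qquad
\text{95\% CI:}\quad \bar{x}_n \;\pm\; 1.96\,\frac{\sigma}{\sqrt{n}}
```

Because the half-width scales with 1/√n, halving the interval takes roughly four times as many observations.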
Tension
Law of Small Numbers
The law of small numbers (Kahneman's term) is the cognitive bias of expecting small samples to behave like large ones — of seeing patterns and stability where there's only noise. The law of large numbers says small samples are unreliable. The tension: we're wired to trust small samples; the math says wait for more data.
Tension
Gambler's Fallacy
The gambler's fallacy is believing that past outcomes affect the next one (e.g. "red is due"). The law of large numbers says the average converges; it doesn't say the next trial is influenced by history. Early deviations are diluted by later trials, not cancelled out. The tension: people confuse "the average will converge" with "the next outcome will correct." The law relies on independent trials; the fallacy denies that independence.
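A quick simulation makes the distinction concrete. It uses a simplified two-colour wheel with a 50% chance of red and no green pocket, which is an assumption for clarity. The function name is hypothetical.

```python
import random

def red_rate_after_black_streak(n_spins, streak_len, p_red=0.5, seed=0):
    """Fraction of reds on spins that immediately follow streak_len or more blacks."""
    rng = random.Random(seed)
    blacks_in_a_row = 0
    follow_ups = 0
    reds = 0
    for _ in range(n_spins):
        is_red = rng.random() < p_red
        if blacks_in_a_row >= streak_len:
            # The previous streak_len spins were all black: is red "due"?
            follow_ups += 1
            reds += is_red
        blacks_in_a_row = 0 if is_red else blacks_in_a_row + 1
    return reds / follow_ups, follow_ups

rate, count = red_rate_after_black_streak(1_000_000, streak_len=5)
print(f"after 5+ blacks in a row: red came up {rate:.3f} of the time ({count} cases)")
```

The rate after a long black streak stays close to 50%. The overall average still converges, but only because the streak becomes a vanishing share of a growing total.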
Section 8
One Key Quote
"As the number of observations is increased, the estimate becomes ever closer to the true value."
— Jacob Bernoulli, Ars Conjectandi (1713)
Bernoulli's formulation: more trials, better estimate. The law doesn't say when you're close enough — that's a decision about acceptable error and required n. It says that the path is clear: if you want a more accurate average, add more observations.
Section 9
Analyst's Take
Faster Than Normal — Editorial View
Small samples are noise. We regularly see strategy, hiring, or product decisions driven by a handful of data points: NPS from 30 customers, conversion from one week of traffic, "we tried that and it didn't work" from one experiment. The law of large numbers says: treat those numbers as unstable. Get more data or state the uncertainty. Don't bet the company on n=20.
The law is about averages, not single outcomes. Your one company, your one fund, your one career — that's one path. The law says the average over many such paths converges. It doesn't say your path will be average. So use the law to justify "we need more bets" or "we need more data," not to predict your single outcome.
Dependence reduces effective n. If your "trials" are 100 users but they're the same 100 users over time, or 100 sales from the same market, the effective sample size is smaller. Correlation means the average converges more slowly. Adjust your required n when trials aren't independent.
Casinos and insurers are the law in practice. They don't need to win every bet or every policy. They need volume so that the average outcome is predictable. The same logic applies to market makers, marketplaces, and any business that aggregates many small uncertain events. Scale is how variance becomes predictability.
Section 10
Test Yourself
Is this mental model at work here?
Scenario 1
A founder concludes that enterprise sales don't work after two failed pilots. She pivots the company to SMB.
Scenario 2
An insurer prices auto policies so that expected loss per policy is $800. It writes 100,000 policies. Annual losses are close to $80M.
Scenario 3
A trader has a positive edge per trade. She makes 10 trades and loses on 7. She concludes her edge is wrong.
Scenario 4
A/B test: variant B has 12% conversion vs A's 10% after 50,000 users per variant. The team ships B.
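Two of the scenarios above can be checked with short calculations. For Scenario 3 the true win probability is not given, so 55% is an assumption made here. Scenario 4 uses a pooled two-proportion z-score. Both helper names are hypothetical.

```python
from math import comb, sqrt

def prob_at_least_k_losses(n_trades, k_losses, p_win):
    """Binomial probability of k_losses or more losses in n_trades independent trades."""
    p_loss = 1 - p_win
    return sum(
        comb(n_trades, k) * p_loss**k * p_win**(n_trades - k)
        for k in range(k_losses, n_trades + 1)
    )

def two_proportion_z(p_a, n_a, p_b, n_b):
    """Pooled z-score for the difference between two observed proportions."""
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Scenario 3: assumed 55% win rate, 7 or more losses in 10 trades
p_bad_run = prob_at_least_k_losses(10, 7, 0.55)
# Scenario 4: 10% vs 12% conversion at 50,000 users per variant
z_ab = two_proportion_z(0.10, 50_000, 0.12, 50_000)
print(f"chance of 7+ losses in 10 trades with a 55% edge: {p_bad_run:.3f}")
print(f"z-score for the A/B difference: {z_ab:.1f}")
```

Under the assumed edge, a 7-loss run in 10 trades happens about one time in ten, so it says little about whether the edge is real. The A/B difference sits roughly ten standard errors from zero, which a sample of that size can support.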
Section 11
Summary & Further Reading
Summary: The law of large numbers says the average of many repeated trials converges to the expected value. Small samples are noisy; large samples are stable. Use it to demand more data before trusting rates and averages, to justify diversification and scale, and to avoid over-interpreting few observations. Don't confuse the law with prediction of a single outcome — it's about the behaviour of the mean. Pair with regression to the mean (extremes revert), central limit theorem (shape of the distribution), and law of small numbers (the bias of trusting small samples).
The first proof of the law of large numbers. Bernoulli showed that the proportion of successes in repeated trials converges to the probability of success.
Kahneman's "law of small numbers" — the bias of expecting small samples to behave like large ones. The psychological counterpart to the mathematical law.
Accessible treatment of the law of large numbers and the central limit theorem with examples and exercises.
Leads-to
Statistical Significance
Statistical significance asks whether an observed difference could be due to chance. The logic rests on the law of large numbers: with enough data, the sample mean is close to the true mean, so differences that persist are likely real. Larger n → smaller standard error → easier to detect true effects.
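The chain "larger n, smaller standard error, easier detection" can be turned around to ask how large n must be. The sketch below uses the usual normal-approximation formula for comparing two proportions. The 5% significance level, 80% power, and example rates are conventional but assumed values, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_n_per_variant(p_base, p_variant, alpha=0.05, power=0.8):
    """Approximate users per variant needed to detect p_base vs p_variant (two-sided)."""
    if p_base == p_variant:
        raise ValueError("rates must differ to define a detectable effect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    variance_sum = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p_variant - p_base) ** 2)

n_two_points = required_n_per_variant(0.10, 0.12)
n_one_point = required_n_per_variant(0.10, 0.11)
print(f"10% vs 12%: about {n_two_points} users per variant")
print(f"10% vs 11%: about {n_one_point} users per variant")
```

Halving the effect you want to detect roughly quadruples the required sample, because the effect size enters the formula squared.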
Leads-to
Sampling
Sampling is drawing a subset to estimate a population. The law of large numbers says that as the sample size grows, the sample mean converges to the population mean. So sampling theory is "how large does n need to be for the sample to be reliable?" — the law provides the guarantee that such an n exists.
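One standard way to see why such an n exists is Chebyshev's inequality. It assumes independent draws with finite variance σ², with ε as the tolerated error and δ as the tolerated probability of missing by more than ε (both symbols introduced here for illustration):

```latex
P\!\left(\left|\bar{X}_n - \mu\right| \ge \varepsilon\right)
  \;\le\; \frac{\sigma^2}{n\,\varepsilon^2}
\qquad\Longrightarrow\qquad
n \;\ge\; \frac{\sigma^2}{\delta\,\varepsilon^2}
\ \text{ guarantees }\
P\!\left(\left|\bar{X}_n - \mu\right| \ge \varepsilon\right) \le \delta
```

The bound is conservative, so in practice the required n is usually much smaller, but it shows that any chosen accuracy can be reached with enough observations.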