The Central Limit Theorem (CLT) says that the sum (or average) of many independent random variables with finite variance, once centred and scaled, tends toward a normal distribution, largely regardless of the shape of the original distributions. Add enough dice rolls, sample means, or measurement errors, and their aggregate approaches the familiar bell curve. That is why the normal distribution appears everywhere: not because the world is Gaussian at the source, but because we often look at aggregates.
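A minimal simulation sketch of this claim, assuming Exponential(1) draws as the skewed source (an illustrative choice, not something the text specifies). The helper names `skewness` and `sample_means` are hypothetical. Skewness is 0 for a symmetric bell curve, so it should shrink toward 0 as more draws go into each average.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return statistics.fmean(((v - mu) / sigma) ** 3 for v in values)

def sample_means(n, trials, rng):
    """Return `trials` sample means, each taken over n draws from Exponential(1)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 10, 100)}

for n, s in skew_by_n.items():
    print(f"n={n:>3}  skewness of the sample mean ~ {s:.2f}")
```

For this source distribution the skewness of the mean falls roughly like 2/√n, so a single draw is strongly lopsided while the mean of 100 draws is close to symmetric.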
The practical import is twofold. First, you can use normal approximations for sums and averages when the sample size is large enough — which underpins confidence intervals, hypothesis tests, and much of applied statistics. Second, the theorem tells you that averages are more stable than individual outcomes. A single draw can be wild; the mean of 100 draws is much more predictable. That is why we care about sample size: more observations pull the sample mean toward the population mean and shrink the spread.
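The stability claim can be written down directly. The sketch below assumes independent, identically distributed draws X_1, ..., X_n with mean μ and finite standard deviation σ (symbols introduced here for illustration):

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n},
\qquad
\operatorname{SE}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}},
\qquad
\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \;\xrightarrow{d}\; \mathcal{N}(0, 1)
```

With n = 100 the standard error is σ/10, so the mean of 100 draws is ten times less spread out than a single draw. Halving the spread again takes four times as much data.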
The caveat: "large enough" depends on how non-normal the underlying distribution is. Thin-tailed data converges quickly; fat-tailed or skewed data may need many more observations. And the CLT is about sums and averages — it does not say that individual outcomes or maxima become normal. The worst case in 100 trials is not normally distributed.
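A rough sketch of the last point, again assuming Exponential(1) draws as a stand-in for skewed outcomes (an assumption for illustration; `skewness` is a hypothetical helper). It compares the mean of 100 draws with the largest of the same 100 draws across many repetitions.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return statistics.fmean(((v - mu) / sigma) ** 3 for v in values)

rng = random.Random(1)  # fixed seed so the run is repeatable
trials, n = 5000, 100

means, maxima = [], []
for _ in range(trials):
    draws = [rng.expovariate(1.0) for _ in range(n)]
    means.append(statistics.fmean(draws))  # the object the CLT describes
    maxima.append(max(draws))              # the worst case, which it does not

skew_mean = skewness(means)
skew_max = skewness(maxima)
print(f"skewness of the mean of {n} draws: {skew_mean:.2f}")
print(f"skewness of the max of {n} draws:  {skew_max:.2f}")
```

The mean comes out close to symmetric, while the maximum stays clearly right-skewed however many repetitions are run. Maxima are governed by extreme-value theory rather than the CLT.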
Section 2
How to See It
You see the Central Limit Theorem when someone invokes "large sample" behaviour, assumes normality for an average, or reasons that more data reduces the spread of an estimate. The diagnostic: aggregation of many independent (or weakly dependent) quantities, and a claim about the distribution of the aggregate.
Business
You're seeing Central Limit Theorem when a team reports that average revenue per user (ARPU) has a confidence interval that narrows as they add more users. The mean of many users is approximately normal even if individual spend is skewed; the CLT justifies the interval. Sample size is doing the work.
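As a hedged illustration of the narrowing interval, the sketch below simulates per-user spend from a lognormal distribution (made-up data, chosen only because it is skewed) and computes a normal-based 95% interval for the mean. The function name `mean_ci` and the parameters are assumptions for this example.

```python
import math
import random
import statistics

def mean_ci(values, z=1.96):
    """Return (mean, half_width) of a normal-based interval: mean +/- z * s / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    return mean, z * std_err

rng = random.Random(42)  # fixed seed so the run is repeatable
half_widths = {}
for n in (100, 10_000):
    # Simulated, skewed per-user spend; not real revenue data.
    spend = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    arpu, half = mean_ci(spend)
    half_widths[n] = half
    print(f"n={n:>6}  ARPU ~ {arpu:.2f} +/- {half:.2f}")
```

Going from 100 to 10,000 users shrinks the half-width by roughly a factor of ten, in line with the 1/√n scaling of the standard error.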
Technology
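You're seeing Central Limit Theorem when mean latency over many requests is reported with a standard error or a normal-based interval. The average of many independent latencies tends toward normal, so the CLT supports normal-based bounds for the mean. It does not describe individual requests: mean ± standard deviation summarises their spread, not the uncertainty of the mean, and the tail of max latency may not be normal.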
Investing
You're seeing Central Limit Theorem when a quant says that the average return over many periods is approximately normal even if single-period returns are fat-tailed, which holds only if those returns have finite variance and are not too dependent across periods. Portfolio returns are weighted sums of many positions; the CLT can justify normal approximations for some risk metrics when positions are numerous and not too correlated.
Markets
You're seeing Central Limit Theorem when pollsters report margin of error: the sample proportion is an average of binary outcomes, and the CLT says the sampling distribution of that proportion is approximately normal for large n. The "n=1000 gives ±3%" logic rests on the CLT.
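The arithmetic behind that rule of thumb can be sketched as follows, assuming simple random sampling and the usual 95% multiplier z = 1.96. The function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion p with n respondents."""
    return z * math.sqrt(p * (1.0 - p) / n)

# p = 0.5 is the worst case: it maximises p * (1 - p).
for n in (100, 1000, 10_000):
    print(f"n={n:>6}  margin of error ~ +/-{100 * margin_of_error(0.5, n):.1f} points")
```

At n = 1000 this gives about ±3.1 points. Because the margin scales with 1/√n, halving it requires four times as many respondents.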
Section 3
How to Use It
Decision filter
"When you're looking at an average or a sum of many independent (or weakly dependent) quantities, the CLT says the distribution of that aggregate is approximately normal for large n. Use that for inference (intervals, tests) and for intuition: more data stabilises the mean. Check that n is large enough and that you're not conflating the distribution of the mean with the distribution of individual outcomes."
As a founder
Use the CLT when interpreting metrics. Averages (conversion rate, LTV, NPS) become more reliable as sample size grows — the spread of the estimate shrinks. Don't over-interpret a single week's average when n is small; the CLT tells you that stability comes with volume. When someone says "the average is X," ask for n and the spread; the CLT tells you how much to trust the number.
As an investor
When evaluating performance, distinguish the distribution of average returns from the distribution of single returns. The CLT applies to averages; it does not make extreme single outcomes less likely. A fund can have near-normal average returns over time while still having fat-tailed drawdowns. Use the CLT where it applies (e.g. standard error of the mean) and avoid applying it to maxima or single events.
As a decision-maker
When making decisions from sample data, lean on the fact that means converge and that confidence intervals narrow with n. The CLT justifies using normal-based intervals when n is large and observations are not too dependent. When the underlying distribution is very skewed or fat-tailed, require larger n or use methods that don't assume normality.
Common misapplication: Assuming the CLT makes everything normal. It applies to sums and averages, not to individual draws, maxima, or minima. Second misapplication: Using normal approximations when n is small or when the underlying distribution is heavy-tailed. The CLT is asymptotic; "large enough" can be very large for skewed or fat-tailed data.
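One example of a method that does not assume normality is the percentile bootstrap, sketched below. This is an illustration chosen here, not a method the text prescribes. The function name `bootstrap_mean_ci` and the sample values are made up, and the bootstrap is itself unreliable for very small or very heavy-tailed samples.

```python
import random
import statistics

def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (no normality assumption)."""
    rng = random.Random(seed)
    k = len(values)
    # Resample with replacement, record each resample's mean, then sort.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=k)) for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return boot_means[lo_idx], boot_means[hi_idx]

sample = [1, 1, 2, 2, 3, 3, 4, 5, 8, 60]  # small, skewed, made-up data
low, high = bootstrap_mean_ci(sample)
print(f"mean = {statistics.fmean(sample):.1f}, bootstrap 95% interval ~ ({low:.1f}, {high:.1f})")
```

Unlike a mean ± 1.96 standard errors interval, the bootstrap interval is free to be asymmetric around the mean, which is what a small skewed sample usually calls for.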
Thorp's work on blackjack and later on quantitative investing relied on understanding the distribution of sums (e.g. card counts, returns). The CLT and related limit theorems underpin the reasoning that sample means converge and that risk can be quantified when you have enough independent trials. His emphasis on edge and sample size is consistent with using the CLT to separate signal from noise.
Renaissance's systematic approach depends on large numbers of trades and signals. The CLT is in the background: averages of many independent (or weakly dependent) outcomes are more predictable. The firm's edge relies on having enough data and enough bets for statistical regularity to show up — the same idea the CLT formalises.
Section 6
Visual Explanation
Central Limit Theorem: sum (or average) of many independent draws tends toward normal, regardless of the shape of each draw.
Section 7
Connected Models
The Central Limit Theorem sits with other models about distributions, sampling, and inference. The grid below shows what reinforces it, what creates tension, and what it leads to.
Reinforces
Law of Large Numbers
The LLN says the sample mean converges to the population mean as n grows. The CLT adds that the distribution of the sample mean is approximately normal around that limit. Together they justify "more data → more stable and predictable mean."
Reinforces
Standard Deviation & Normal Distribution
The CLT explains why the normal distribution is ubiquitous for averages and sums. Standard deviation measures spread; for sample means, the standard error (σ/√n) shrinks with n — the CLT tells you the shape while the LLN tells you the centre.
Tension
[Variance](/mental-models/variance)
Variance must be finite for the classical CLT to apply. Sums and averages of infinite-variance draws (e.g. some power laws) do not converge to normal at any n. The tension: even when variance is finite, fat-tailed domains can converge so slowly that the CLT does not kick in at any practical n, so you can't assume normality.
Tension
Regression to the Mean
Regression to the mean is about extreme values tending to be followed by closer-to-average values. The CLT is about the distribution of averages becoming normal. Related but different: regression to the mean describes successive observations of a noisy quantity; the CLT describes the sampling distribution of the mean.
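The finite-variance condition in the Variance entry above can be seen in a small simulation. The sketch below uses the standard Cauchy distribution as the infinite-variance example (an assumption for illustration; the text names only "some power laws"). It compares how the interquartile range of the sample mean behaves for normal and Cauchy draws. The helper names are hypothetical.

```python
import math
import random
import statistics

def iqr(values):
    """Interquartile range, a spread measure that exists even without a variance."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def iqr_of_means(draw, n, trials=1000):
    """IQR across `trials` sample means, each over n calls to draw()."""
    return iqr([statistics.fmean(draw() for _ in range(n)) for _ in range(trials)])

rng = random.Random(7)  # fixed seed so the run is repeatable
normal_draw = lambda: rng.gauss(0.0, 1.0)
# Standard Cauchy via the inverse CDF: tan(pi * (U - 1/2)) for uniform U.
cauchy_draw = lambda: math.tan(math.pi * (rng.random() - 0.5))

results = {}
for n in (1, 400):
    results[n] = (iqr_of_means(normal_draw, n), iqr_of_means(cauchy_draw, n))
    print(f"n={n:>3}  IQR of mean: normal ~ {results[n][0]:.2f}, Cauchy ~ {results[n][1]:.2f}")
```

For normal draws the spread of the mean shrinks by about √400 = 20. For Cauchy draws it does not shrink at all: the mean of n standard Cauchy draws is itself standard Cauchy, so more data buys no stability.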
Section 8
One Key Quote
"The central limit theorem is the reason we can use the normal distribution for so many things — not because nature is normal, but because we are looking at sums and averages."
— George Pólya
The world is full of skewed and fat-tailed distributions. We often care about aggregates. The CLT says those aggregates behave in a predictable, normal way under the right conditions. The quote separates the source (often not normal) from the object we observe (often approximately normal).
Section 9
Analyst's Take
Faster Than Normal — Editorial View
Trust means more with more data. The CLT is the formal reason: the distribution of the mean tightens as n grows. When someone reports an average, ask for n and the standard error. Small n means wide uncertainty even if the point estimate looks precise.
Don't apply the CLT to the wrong thing. It applies to sums and averages of many draws, not to a single outcome, a maximum, or a minimum. The worst of 100 trials is not normally distributed. The average of 100 trials is approximately normal, provided the premises hold. Use the right object.
Check the premises. Finite variance, independence (or weak dependence), and large enough n. In fat-tailed or highly correlated settings, the CLT may not hold at any practical sample size. When in doubt, use methods that don't assume normality or require larger n.
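To put a number on the independence premise, the sketch below uses the standard variance formula for the average of n quantities that share a standard deviation σ and a common pairwise correlation ρ ≥ 0. The equal-correlation setup and the function names are simplifying assumptions for illustration. It addresses only how much averaging reduces spread, not whether the result is normal.

```python
import math

def sd_of_average(sigma, n, rho):
    """SD of the mean of n variables with common SD sigma and pairwise correlation rho."""
    return sigma * math.sqrt((1.0 - rho) / n + rho)

def effective_n(n, rho):
    """Number of independent draws that would give the same SD of the mean."""
    return n / (1.0 + (n - 1) * rho)

for rho in (0.0, 0.1, 0.5):
    print(
        f"rho={rho:.1f}  SD of average of 50 ~ {sd_of_average(1.0, 50, rho):.3f}"
        f"  effective n ~ {effective_n(50, rho):.1f}"
    )
```

With ρ = 0.5, fifty positions carry about as much averaging power as two independent ones, so a large headcount of correlated quantities is not a large n in the CLT's sense.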
Section 10
Test Yourself
Is this mental model at work here?
Scenario 1
A team reports 'average session length 5.2 minutes' with n=10,000. They use a normal-based 95% confidence interval.
Scenario 2
An analyst says 'the maximum daily loss in 100 days is approximately normal.'
Scenario 3
Polling shows 52% support with n=1000. The margin of error is reported as ±3%.
Scenario 4
Returns are highly correlated across assets. A quant assumes the portfolio return is normal because 'we have 50 positions.'
Section 11
Summary & Further Reading
Summary: The Central Limit Theorem says the distribution of the sum (or average) of many independent, finite-variance random variables tends to normal. Use it to justify normal approximations for sample means and to understand why more data stabilises estimates. It does not apply to single outcomes or maxima, and it needs care with heavy-tailed or dependent data.
Accessible treatment of sampling distributions and the role of the CLT in inference.
Leads-to
[Sampling](/mental-models/sampling)
Sampling is drawing a subset to estimate a population quantity. The CLT justifies using the sample mean and its standard error for inference when the sample is large enough and observations are not too dependent.
Leads-to
Statistical Significance
Tests and confidence intervals often assume (or approximate) normality of the test statistic. The CLT is the reason many statistics (e.g. sample mean, sample proportion) are approximately normal under the null, enabling standard significance levels and p-values.