Replication Crisis Mental Model…

Q: What is Replication Crisis?

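The replication crisis is the discovery that many published research findings do not hold when other researchers repeat the study. As a mental model, it means treating a single finding, especially a surprising one, as provisional until it has been replicated.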

Q: How do you apply Replication Crisis?

To apply Replication Crisis, check how much evidence a decision really rests on. Ask whether the finding has been replicated, whether the analysis was pre-registered, and what the base rate for replication is in that domain. Treat first findings as hypotheses and give more weight to replicated or out-of-sample evidence.

Q: What category does Replication Crisis fall under?

Replication Crisis falls under the General Thinking & Meta-Models category of mental models. Other models in this category can be found on the General Thinking & Meta-Models hub page.

Q: Why is Replication Crisis important?

Replication Crisis is important because single studies, A/B tests, and backtests are often treated as definitive when they may be noise, p-hacking, or flukes. Understanding this model helps you calibrate confidence in evidence instead of accepting or rejecting it wholesale.

Contents

1. The Core Idea
2. How to See It
3. How to Use It
4. The Mechanism
5. Founders & Leaders in Action
6. Visual Explanation
7. Connected Models
8. One Key Quote
9. Analyst's Take
10. Test Yourself
11. Summary & Further Reading

Section 1

The Core Idea

The replication crisis is the discovery that many published research findings do not hold when other researchers try to repeat the same study. In psychology, medicine, economics, and other fields, high-profile results have failed to replicate. The causes include publication bias (negative results don't get published), p-hacking (researchers try many analyses until one is "significant"), small samples (underpowered studies produce fragile results), and incentive misalignment (careers reward novel, positive findings). The crisis doesn't mean all science is wrong — it means that a single study, especially a surprising one, should be treated as provisional until replicated.
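How small samples and publication bias combine can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not a model of any real literature: the true effect (0.2), the sample size (20 per group), the known unit variance, and the rule that only positive, significant results get "published" are all hypothetical choices made for illustration.

```python
import random
import statistics
from math import sqrt


def simulate_published_effects(true_effect=0.2, n_per_group=20,
                               n_studies=2000, z_crit=1.96, seed=0):
    """Run many small two-group studies and 'publish' only the
    positive, statistically significant ones (hypothetical filter)."""
    rng = random.Random(seed)
    all_estimates, published = [], []
    # Standard error of a difference in means, assuming unit variance.
    se = sqrt(2.0 / n_per_group)
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        all_estimates.append(diff)
        if diff / se > z_crit:  # the publication filter
            published.append(diff)
    return all_estimates, published


all_estimates, published = simulate_published_effects()
print(f"true effect:               {0.2:.2f}")
print(f"mean of all estimates:     {statistics.fmean(all_estimates):.2f}")
print(f"mean of published results: {statistics.fmean(published):.2f}")
print(f"share published:           {len(published) / len(all_estimates):.1%}")
```

Because only estimates above the significance threshold pass the filter, the published average overstates the true effect, so a faithful replication would tend to find something smaller.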

The mental model applies beyond academia. In business and investing, "studies show" and "research says" often rest on single papers or internal analyses that were never replicated. The discipline is to treat first findings as hypotheses, not truths; to prefer replicated and pre-registered work; and to be suspicious of results that are too good, too neat, or from small samples. When you base decisions on evidence, ask: has this been replicated? Was the analysis pre-registered? What's the base rate for replication in this domain?
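The base-rate question can be made concrete with a small Bayes calculation. The sketch below is a simplified version of the kind of argument Ioannidis makes (it leaves out his bias term), and the inputs are hypothetical: `prior` is the assumed share of tested hypotheses that are actually true, `power` is the chance a study detects a true effect, and `alpha` is the false-positive rate.

```python
def positive_predictive_value(prior: float, power: float,
                              alpha: float = 0.05) -> float:
    """Probability that a statistically significant finding is a true effect."""
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical domain in which 1 in 10 tested hypotheses is true.
well_powered = positive_predictive_value(prior=0.10, power=0.80)
underpowered = positive_predictive_value(prior=0.10, power=0.20)
print(f"well-powered study: {well_powered:.2f}")
print(f"underpowered study: {underpowered:.2f}")
```

Under these assumed inputs, a significant result from a well-powered study is true about 64% of the time, and one from an underpowered study about 31% of the time. The lower the prior and the power, the less a single significant result should move you.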

Section 2

How to See It

The replication crisis is in play whenever someone treats a single finding as definitive, and the model is being applied whenever someone asks "has this been replicated?" The diagnostic: are we relying on one study, one experiment, or one internal analysis? Look for claims that cite a single paper, a single A/B test, or a single backtest. When the finding is surprising or convenient, extra scepticism is warranted. The pattern: first result → excitement → replication attempt → often failure. The discipline is to wait for replication or to treat the first result as a hypothesis to be tested.

Business

You're seeing Replication Crisis when a company bases a product or strategy decision on one successful A/B test or one cohort analysis. The result might be real — or it might be noise, p-hacking, or a fluke. Without replication (run the test again, or in another segment), the finding is provisional. The same applies to "best practices" drawn from a single case study. One company's success might be replicable; it might not. The crisis mindset: treat single findings as hypotheses.
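Rerunning an A/B test and checking whether the result holds can be sketched as follows. This is one possible check, not a prescribed procedure: it uses a pooled two-proportion z-test, the conversion counts are made up, and the rule "significant in both runs, same direction" is an assumed definition of replication.

```python
from math import erfc, sqrt


def two_proportion_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_b - rate_a) / se
    # Two-sided tail probability of a standard normal.
    return z, erfc(abs(z) / sqrt(2.0))


def replicates(first, second, alpha=0.05):
    """True only if both runs are significant and point the same way."""
    (z1, p1), (z2, p2) = first, second
    return p1 < alpha and p2 < alpha and (z1 > 0) == (z2 > 0)


# Hypothetical data: control vs. variant conversions out of 1,000 users each.
first_run = two_proportion_test(100, 1000, 130, 1000)
rerun = two_proportion_test(100, 1000, 105, 1000)
print(f"first run p = {first_run[1]:.3f}, rerun p = {rerun[1]:.3f}")
print("replicated:", replicates(first_run, rerun))
```

In this made-up case the first run clears the 0.05 threshold and the rerun does not, so the lift stays a hypothesis rather than a basis for a major decision.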

Technology

You're seeing Replication Crisis when a model or algorithm is validated on one dataset or one time period and then deployed. Out-of-sample and out-of-time validation are forms of replication. When a quant strategy or ML model hasn't been tested on holdout data or new periods, it may not replicate. The crisis in ML and data science is the same: many published or internal results don't generalise. Replication means testing on new data and new contexts.

Investing

You're seeing Replication Crisis when an investor or analyst cites "research shows" or a single backtest to support a strategy. Backtests can be overfit; single studies can be wrong. The discipline is to ask: has this been replicated in other markets, periods, or by other researchers? What's the track record of this type of finding? The replication crisis in finance is well documented — many published factor and strategy results don't hold up out of sample.
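Why a single backtest can mislead is easy to show with pure noise. In the sketch below, every "signal" is random and has no relationship to returns by construction; the number of signals, the number of periods, and the half-and-half split are hypothetical choices. The best signal is picked on the first half of the data and then checked on the second half.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def best_signal_in_and_out(n_signals=200, n_periods=200, seed=1):
    """Pick the best of many random signals in-sample, then score the
    same signal out of sample. Returns (in_sample_corr, out_of_sample_corr)."""
    rng = random.Random(seed)
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
    signals = [[rng.gauss(0.0, 1.0) for _ in range(n_periods)]
               for _ in range(n_signals)]
    half = n_periods // 2

    def in_sample(signal):
        return pearson(signal[:half], returns[:half])

    best = max(signals, key=in_sample)  # the "backtest winner"
    return in_sample(best), pearson(best[half:], returns[half:])


in_corr, out_corr = best_signal_in_and_out()
print(f"in-sample correlation:     {in_corr:.2f}")
print(f"out-of-sample correlation: {out_corr:.2f}")
```

The winner looks predictive in-sample only because it was selected from many candidates. The holdout half plays the role of a replication attempt.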

Markets

You're seeing Replication Crisis when policy or regulation is based on a handful of studies that haven't been replicated. Evidence-based policy is good; policy based on unreplicated findings is risky. The same applies to industry standards or benchmarks that rest on single studies. The mental model: prefer meta-analyses and replicated work; treat single studies as suggestive, not conclusive.

Section 3

How to Use It

Decision filter

"Treat single findings as provisional. Prefer replicated, pre-registered, or out-of-sample evidence. When someone says 'research shows' or 'the data says,' ask: how many studies? Were they replicated? Was the analysis pre-registered? Adjust your confidence and your decisions accordingly."

As a founder

When you run experiments or analyses, don't treat the first significant result as the truth. Replicate: run the test again, or in another segment or period. Pre-register analyses where possible so you're not p-hacking. When you cite external research to support a decision, check whether the finding has been replicated; in psychology and medicine, many headline results have failed to replicate. The first mistake is building strategy on a single study or a single internal test. The second mistake is ignoring negative replications because they're less exciting. Update your beliefs when replication fails.

As an investor

Portfolio companies and research often present single-study or single-backtest evidence. Ask: has this been replicated? For strategies and factors, what's the out-of-sample record? For product and growth claims, were the experiments run more than once? The replication crisis in asset pricing and factor investing means many published alphas don't hold. Apply the same scepticism to internal and external research. Prefer evidence that has been stress-tested across time, markets, or teams.

As a decision-maker

When evidence is presented to support a decision, grade it. Single study, no replication, surprising result? Low confidence. Replicated, pre-registered, or consistent across contexts? Higher confidence. Don't let a striking finding override the base rate: in several fields, a large share of first findings have failed to replicate. Build a habit of asking "has this been replicated?" and of treating first findings as hypotheses. That reduces the chance of betting on false positives.
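The grading habit described above can be written down as a tiny rubric. The `Evidence` type, its field names, and the two-level output are hypothetical, chosen only to mirror the wording of the paragraph; a real rubric would likely need more nuance.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record of what backs a claim."""
    independent_replications: int = 0
    pre_registered: bool = False
    consistent_across_contexts: bool = False


def grade(evidence: Evidence) -> str:
    """Return a coarse confidence grade for a piece of evidence."""
    stress_tested = (
        evidence.independent_replications > 0
        or evidence.pre_registered
        or evidence.consistent_across_contexts
    )
    return "higher confidence" if stress_tested else "low confidence"


print(grade(Evidence()))                            # single unreplicated study
print(grade(Evidence(independent_replications=2)))  # replicated twice
```

Writing the rubric down makes the grading explicit, so a striking headline result does not quietly skip the check.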

Common misapplication: Dismissing all research as unreliable. The replication crisis doesn't mean nothing replicates — it means we should distinguish between single findings and replicated bodies of work. Use the crisis to calibrate confidence, not to reject evidence altogether.

Second misapplication: Demanding replication for every small decision. Some decisions are low-stakes or time-sensitive; you'll often act on the best available evidence. Reserve strong replication standards for high-stakes, repeatable decisions. The model is a calibration tool, not a veto on all single-study evidence.

Section 4

The Mechanism

Section 5

Founders & Leaders in Action

Charlie Munger, Vice Chairman, Berkshire Hathaway, 1978–2023

Munger has long warned about incentive-caused bias and the reliability of reported results. His point: when people are rewarded for certain outcomes, they'll produce them — including in research and analysis. The replication crisis is incentive-caused bias in academia: careers reward novel, positive findings, so the literature is skewed. Munger's discipline is to ask "what are the incentives?" and to treat single findings with scepticism when the incentives favour positive or surprising results. The same applies to business: don't trust one backtest or one case study without asking whether it could be selection or gaming.

[Jim Simons](/people/jim-simons), Founder, Renaissance Technologies, 1982–present

Renaissance's edge depends on strategies that replicate out of sample and across time. Simons has emphasised that in quantitative investing, most ideas don't work when tested rigorously — they're overfit or flukes. The replication standard is built in: if a signal doesn't hold on holdout data or new periods, it's discarded. The replication crisis in finance — many published factors and alphas fail out of sample — is why firms like Renaissance treat in-sample results as provisional. The discipline: test on new data; don't deploy until it replicates.

Section 6

Visual Explanation

Replication Crisis: Many first findings fail when repeated. Treat single studies as provisional; prefer replicated, pre-registered, or out-of-sample evidence.

Section 7

Connected Models

The replication crisis sits with models of evidence, bias, and inference. The connections below either describe the same problem (publication bias, p-hacking), the mindset that amplifies it (confirmation bias), or the tools that address it (RCT, scientific method, significance).

Reinforces

Scientific Method

The scientific method is hypothesis, test, revise. Replication is the "test" that separates real findings from artefact. When we skip replication or don't value it, we weaken the method. The replication crisis is a reminder that the method requires replication and that incentives have to reward it.

Reinforces

Publication Bias

Publication bias is the tendency to publish positive results and not negative or null results. It's a direct cause of the replication crisis: the published literature is skewed toward findings that are more likely to be false positives. Fixing publication bias — publishing replications and null results — is part of fixing the crisis.

Reinforces

P-hacking

P-hacking is trying many analyses or specifications until one is "significant." It produces false positives that won't replicate. The replication crisis is partly a p-hacking crisis. The fix is pre-registration (commit to the analysis before seeing the data) and replication (run the pre-registered analysis again).
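The arithmetic behind p-hacking is short. The sketch below assumes the analyses are independent and that there is no real effect, which is a simplification: real specifications are usually correlated, so the true inflation is typically smaller but still substantial.

```python
def chance_of_false_positive(n_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests of a true
    null hypothesis comes out 'significant' at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_analyses


for k in (1, 5, 20):
    print(f"{k:>2} analyses -> {chance_of_false_positive(k):.1%}")
```

With one analysis the false-positive chance is the nominal 5%. With twenty independent tries it is roughly 64%, which is why committing to one analysis in advance matters.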

Reinforces

Confirmation Bias

Confirmation bias is the tendency to seek and accept evidence that supports our view. The replication crisis is exacerbated when we prefer striking, positive findings and downplay replications that fail. The discipline is to update when replication fails and to treat first findings as hypotheses, not confirmations.

Leads-to

Randomized Controlled Experiment

RCTs are the gold standard for causal evidence. They're also expensive and can be run only once in some settings. The replication crisis says: when possible, replicate the RCT or run it in another context. Single RCTs are stronger than single observational studies, but replication still raises confidence.

Tension

Statistical Significance

Statistical significance (p < 0.05) is often used as a gate for publication. The replication crisis shows that many "significant" results don't replicate, partly because p-hacking and publication bias make the published p-values optimistic. Significance on its own is not sufficient; replication and pre-registration are the supplements.
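How often "significant" findings should be expected to replicate can be estimated with one line of arithmetic. This is a rough sketch with assumed inputs: `ppv` is the share of significant findings that are true, `power` is the power of the replication study, and a replication counts as successful if it is significant at `alpha`, regardless of direction.

```python
def expected_replication_rate(ppv: float, power: float,
                              alpha: float = 0.05) -> float:
    """Expected share of significant findings that come out significant
    again in one replication attempt."""
    true_findings_replicating = ppv * power
    false_findings_replicating = (1.0 - ppv) * alpha
    return true_findings_replicating + false_findings_replicating


# Hypothetical literature: 64% of significant findings are true.
print(f"{expected_replication_rate(ppv=0.64, power=0.80):.2f}")
# Even if every finding were true, low-powered replications often fail.
print(f"{expected_replication_rate(ppv=1.00, power=0.50):.2f}")
```

Under these assumptions about half of the findings replicate in both cases (0.53 and 0.50), so a failed replication is itself evidence to weigh, not automatic proof that the original was false.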

Section 8

One Key Quote

"When the same scientific question is subjected to independent replication, the proportion of findings that are confirmed is often surprisingly low."
— John Ioannidis, Why Most Published Research Findings Are False (2005)

Ioannidis's paper was an early formal argument that many published findings are false. The large replication efforts of the 2010s found low replication rates in several fields, which is consistent with that argument. The practitioner's job: assume that a single finding might not replicate; prefer replicated and pre-registered work; and calibrate confidence and decisions accordingly.

Section 9

Analyst's Take

Faster Than Normal — Editorial View

Ask "has this been replicated?" When someone cites a study or an internal result, that's the first question. Single findings are provisional. Replicated findings (same result in another sample, period, or team) deserve more weight. In many domains, the base rate for replication is low — use that to calibrate.

Pre-register when you can. When you're running an analysis or an experiment, state the hypothesis and the analysis plan before you see the full results. That reduces p-hacking and makes the result more interpretable. Pre-registration doesn't guarantee truth, but it reduces the chance that the finding is an artefact of flexible analysis.
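For internal work, a lightweight form of pre-registration is to record a fingerprint of the analysis plan before looking at the results. The sketch below is one possible approach, not an established standard: the plan fields and their values are hypothetical, and the fingerprint is only useful if it is stored somewhere it cannot be quietly changed (for example, a timestamped message or commit).

```python
import hashlib
import json


def fingerprint_plan(plan: dict) -> str:
    """Return a SHA-256 hex digest of an analysis plan.

    The plan is serialised with sorted keys so that the same content
    always yields the same digest, whatever the key order."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical plan, written down before the experiment is unblinded.
plan = {
    "hypothesis": "New checkout page raises conversion",
    "primary_metric": "conversion_rate",
    "test": "two-sided two-proportion z-test",
    "alpha": 0.05,
    "sample_size_per_arm": 1000,
}
print(fingerprint_plan(plan))
```

If the metric or test is changed after the fact, the digest no longer matches the recorded one, which makes flexible analysis visible rather than silent.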

Don't dismiss all research. The replication crisis is a calibration tool, not a reason to reject evidence. Some findings replicate; meta-analyses and replicated bodies of work are valuable. The discipline is to distinguish between single, surprising findings and evidence that has been stress-tested. Use the former as hypotheses; use the latter with appropriate confidence.

Apply the same standard to internal work. Backtests, A/B tests, and internal analyses can suffer from the same ills: selection, p-hacking, underpowering. Replicate internal findings when the decision is high-stakes. Treat the first significant result as a hypothesis to be confirmed.

Section 10

Test Yourself

Is this mental model at work here?

Scenario 1

A company bases a major product decision on one A/B test that showed a significant lift. They don't rerun the test or check other segments.

Scenario 2

A team pre-registers their analysis plan before seeing the data, runs the analysis, and then replicates it on a holdout sample.

Scenario 3

An analyst dismisses all published research as unreliable because 'most findings don't replicate.'

Section 11

Summary & Further Reading

Summary: The replication crisis is the finding that many published research results do not hold when replicated. Causes include publication bias, p-hacking, small samples, and incentives for novelty. Use the model by treating single findings as provisional; preferring replicated, pre-registered, or out-of-sample evidence; and asking "has this been replicated?" when basing decisions on research. Don't dismiss all evidence — calibrate confidence. Connected ideas include scientific method, publication bias, p-hacking, confirmation bias, and RCTs.

Further Reading

Why Most Published Research Findings Are False — John Ioannidis (2005)

Article

The foundational paper. Ioannidis argues mathematically that under plausible assumptions, most published findings are false. Later replication projects produced results consistent with the argument.

Estimating the reproducibility of psychological science — Open Science Collaboration (2015)

Article

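The Reproducibility Project: replications of 100 psychology studies were attempted, and about a third (36%) produced a statistically significant result, against 97% of the original studies. The paper that made the replication crisis headline news.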

The Checklist Manifesto — Atul Gawande (2009)

Book

Gawande on reducing error in complex tasks. Replication and pre-registration are checklists for research quality. The same mindset: make the process explicit to reduce failure.

Thinking, Fast and Slow — Daniel Kahneman (2011)

Book

Kahneman on biases and heuristics. He has written and spoken extensively on the replication crisis in psychology. The book provides the cognitive basis for why we're drawn to striking findings and why replication is essential.
