Correlation vs Causation
Correlation describes variables moving together in data; causation claims that changing one variable changes the other, which requires a mechanism and, ideally, a credible identification strategy such as a controlled experiment. Confounding is the usual reason a correlation misleads: a third variable drives both.
Key Differences
| Dimension | Correlation | Causation |
|---|---|---|
| Claim | X and Y co-move in data | Changing X changes Y through a pathway |
| Evidence | Observational data suffices | Mechanism plus tests (randomized A/B tests, natural experiments, etc.) |
| Risk | Spurious relationships from confounds | Overfitting narratives to noise |
| Action | Hypothesis generation | Intervention design |
| Slogan | Co-movement | Counterfactual |
When to use Correlation
- Early exploration, dashboards, and feature mining
- When you need cheap signals before expensive experiments
When to use Causation
- Pricing, policy, and safety decisions
- When incentives reward gaming the metric, since a correlated proxy can stop tracking the outcome once people optimise for it
Frequently Asked Questions
How does correlation vs causation play out in startups?
Founders live on correlations first: funnel metrics, cohort curves, survey responses. Committing budget is causation territory; you need experiments or credible identification, or you end up optimising noise. Confounding is rampant (seasonality, channel mix, macro conditions).
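To make the gap between a dashboard correlation and an experiment concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the `engaged` confounder, the adoption and conversion probabilities, and the `TRUE_LIFT` value are invented, not measured. It compares a naive "users with the feature vs users without" difference against the same difference under random assignment.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical true causal effect of the feature on conversion probability.
TRUE_LIFT = 0.02

def converts(engaged, has_feature):
    """Simulate one conversion; engagement raises the rate far more than the feature."""
    p = 0.05 + (0.20 if engaged else 0.0) + (TRUE_LIFT if has_feature else 0.0)
    return rng.random() < p

def diff_in_rates(rows):
    """Conversion rate of feature users minus conversion rate of non-users."""
    treated = [converted for has_feature, converted in rows if has_feature]
    control = [converted for has_feature, converted in rows if not has_feature]
    return sum(treated) / len(treated) - sum(control) / len(control)

observational = []
randomized = []
for _ in range(200_000):
    engaged = rng.random() < 0.3  # confounder: drives both adoption and conversion

    # Observational world: engaged users self-select into the feature.
    adopts = rng.random() < (0.8 if engaged else 0.1)
    observational.append((adopts, converts(engaged, adopts)))

    # Experimental world: a coin flip assigns the feature, independent of engagement.
    assigned = rng.random() < 0.5
    randomized.append((assigned, converts(engaged, assigned)))

naive = diff_in_rates(observational)
experimental = diff_in_rates(randomized)

print(f"naive observational lift: {naive:.3f}")
print(f"randomized lift:          {experimental:.3f}")
print(f"true lift (by design):    {TRUE_LIFT:.3f}")
```

Under these assumed numbers the naive comparison credits the feature with most of the effect of engagement, while the randomized comparison lands close to the lift built into the simulation. Randomization works here because the coin flip cannot be influenced by engagement, so the two groups differ only in the feature.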
What is a confounding factor?
A variable that influences both the presumed cause and the outcome, producing a misleading association. Classic example: ice cream sales correlate with drowning rates because summer drives both.
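The ice cream example can be sketched as a small simulation. The numbers below are assumptions chosen for illustration (temperature distribution, slopes, noise levels), not real sales or drowning data. The sketch measures the raw correlation between the two series, then the correlation that remains after the linear effect of temperature is removed from each (a partial correlation).

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [y - my - slope * (z - mz) for z, y in zip(zs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Confounder: daily temperature. Both outcomes depend on it and on
# independent noise; neither outcome depends on the other.
temperature = [rng.gauss(20, 8) for _ in range(n)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 5) for t in temperature]

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(residuals(ice_cream, temperature),
                     residuals(drownings, temperature))

print(f"raw correlation:                     {raw_r:.3f}")
print(f"correlation after removing temperature: {adjusted_r:.3f}")
```

With this setup the raw correlation is clearly positive even though ice cream has no effect on drowning in the simulated data, and it falls to roughly zero once temperature is accounted for. Adjusting like this only helps when the confounder is known and measured, which is one reason experiments are preferred when they are feasible.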