What is Delphi Method?

Aggregate expert judgment through anonymous, iterative rounds that reduce groupthink and the dominance of the loudest voice.

How does Delphi Method work?

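The Delphi Method runs a panel of experts through anonymous questionnaire rounds: collect independent estimates, feed back the aggregated results together with anonymised reasoning, and let panellists revise until the estimates stabilise. The step-by-step section below works through the process with a concrete example.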

When should you use Delphi Method?

Use the Delphi Method when a decision depends on a forecast that no dataset or model can supply, and the relevant knowledge is spread across several experts. It is particularly valuable for high-stakes or irreversible decisions where the cost of being wrong is significant.

Delphi Method: How to Use It…

Use this when you need a credible forecast on something nobody can measure directly — emerging technology timelines, market size in five years, regulatory trajectories, geopolitical risk. The Delphi Method structures anonymous, iterative rounds of expert judgment to produce convergent estimates without the distortions of face-to-face debate, status hierarchies, and the loudest voice in the room.

Section 1

What This Tool Does

Put ten smart people in a room and ask them to estimate when autonomous vehicles will capture 20% of new car sales. What happens is predictable and depressing. The most senior person speaks first — or the most confident, which is often worse — and their number becomes the anchor. The group clusters around it. Dissenters self-censor because disagreeing with the VP of Strategy in front of peers carries career risk that disagreeing with a forecast does not. Someone with genuine domain expertise in battery technology stays quiet because the conversation has already moved to regulatory timelines, which is the CEO's hobby horse. The group converges on a number that feels like consensus but is actually one person's guess with nine people's implicit endorsement. This is not collective intelligence. It's collective capitulation.

Olaf Helmer and Norman Dalkey at the RAND Corporation understood this in the early 1950s, when the U.S. Air Force needed forecasts about Soviet nuclear capabilities and the available data was, to put it gently, insufficient. They couldn't run experiments. They couldn't build models — the variables were too uncertain and too political. What they had was experts: physicists, intelligence analysts, military strategists, each holding a piece of the puzzle but none holding the whole picture. The question was how to combine those pieces without the social dynamics that corrupt group judgment.

Their answer was elegant in its simplicity. Remove the room. Give each expert a questionnaire. Collect the responses anonymously. Aggregate the results — medians, interquartile ranges, distributions. Feed the aggregate back to the panel, along with the anonymised reasoning behind outlier positions. Then ask everyone to revise their estimates in light of the group data. Repeat. Two rounds, sometimes three, rarely more than four. The core mechanism is controlled feedback without social pressure — experts learn what others think and why, but never who thinks it, which means they can update their beliefs based on arguments rather than authority. The method was classified for nearly a decade. When RAND finally published it in 1963, it carried the name of the Oracle at Delphi — a fitting allusion to prophecy derived from structured consultation.

What makes the Delphi Method more than a glorified survey is the iteration. A single anonymous poll captures initial impressions. The feedback-and-revision cycle is where the real work happens. Experts who were anchored on a narrow frame see the range of estimates and realise their confidence was unwarranted. Outliers who hold genuinely novel information get a mechanism to explain their reasoning without the social penalty of being the contrarian in a room full of nodding heads. The group doesn't converge because of conformity pressure — it converges because information flows. When it works, the final-round estimate is measurably more accurate than the first-round estimate, and substantially more accurate than the average of unstructured group discussions. When it doesn't work, the reasons are almost always procedural: bad panel selection, poorly framed questions, or too few rounds to let the information actually circulate.

The Delphi Method occupies a specific niche in the decision-making toolkit. It is not for problems where data exists and models can be built — use the models. It is not for problems where a single expert clearly knows more than everyone else — just ask that expert. It is for the genuinely uncertain, the multi-dimensional, the problems where no individual has enough information but a structured collective might. Technology forecasting. Strategic planning under deep uncertainty. Policy design where the consequences are long-term and the evidence base is thin. These are the domains where human judgment, properly aggregated, remains the best instrument available.

Section 2

How to Use It — Step by Step

Each step below gives the instructions first, then a worked example built around one question: "When will generative AI reduce the cost of producing a feature-length animated film by 50% or more?"

Step 1 — Design

Define the question, select the panel, and set the parameters

The question must be specific enough to produce a falsifiable answer. "What will AI do to entertainment?" is a conversation starter, not a Delphi question. You need a measurable outcome, a defined threshold, and a timeframe or request for a timeframe estimate. Panel selection is the single highest-leverage decision in the entire process. You need 10–30 experts with genuinely diverse vantage points on the question — not 15 people from the same discipline who read the same papers. Include practitioners, researchers, adjacent-domain experts, and at least two people whose perspective you expect to be contrarian. Decide in advance: how many rounds (typically 2–4), what aggregation metrics you'll report (median, interquartile range, distribution), and what constitutes sufficient convergence to stop.

Worked example

Animated film cost reduction

Question: "By what year will generative AI tools reduce the total production cost of a feature-length animated film (comparable to a mid-budget Pixar or DreamWorks release) by 50% or more, relative to 2023 costs?" Panel: 18 experts — 4 animation studio technical directors, 3 AI researchers specialising in generative video, 2 film producers with budgeting expertise, 3 VFX pipeline engineers, 2 entertainment industry analysts, 2 independent animators already using AI tools, 1 IP/copyright attorney, 1 labour economist focused on creative industries. Parameters: 3 rounds, 10 business days per round, median and IQR reported after each round.

Step 2 — Elicit

Distribute Round 1 questionnaire and collect anonymous responses

Each panellist answers independently. No discussion, no shared channel, no way to see who else is on the panel. The questionnaire should include the core forecast question plus 2–4 supporting questions that surface the reasoning behind the estimate. "What is the single biggest bottleneck to achieving this?" "What would have to be true for this to happen before 2028?" "What probability do you assign to this never happening?" These reasoning questions are as important as the number — they generate the qualitative material that drives revision in later rounds. Collect all responses before anyone sees any results.

Worked example

Round 1 collection

All 18 panellists submit independently. Year estimates range from 2027 to 2042. Median: 2032. IQR: 2029–2036. Bottleneck responses cluster around three themes: (1) creative direction and "taste" still require human judgment that AI can't replicate, (2) copyright uncertainty around AI-generated content will slow studio adoption, (3) current generative video quality is insufficient for theatrical release but improving rapidly. Two outliers predict 2027–2028 — both are independent animators already shipping AI-assisted short films. One outlier predicts "never" — the IP attorney, citing unresolved copyright liability.
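The anonymity requirement in this step can be enforced mechanically rather than by convention. The sketch below is one minimal way to do it, assuming the facilitator holds raw responses keyed by panellist name. The function name `anonymise`, the record fields, and the sample names are all hypothetical. Each response gets a random token, the shared records are shuffled, and the token-to-name key stays with the facilitator so that Round 2 revisions can be linked to Round 1 answers.

```python
import secrets


def anonymise(responses):
    """Split raw responses into shareable records and a private key.

    responses: dict mapping panellist name -> dict of answers.
    Returns (published, key). `published` holds records that carry only a
    random token; `key` maps token -> name and is kept by the facilitator.
    """
    published, key = [], {}
    for name, answers in responses.items():
        token = secrets.token_hex(4)
        while token in key:  # regenerate on the (unlikely) collision
            token = secrets.token_hex(4)
        key[token] = name
        published.append({"id": token, **answers})
    # Shuffle so record order does not reveal submission order.
    secrets.SystemRandom().shuffle(published)
    return published, key


# Hypothetical Round 1 responses.
raw = {
    "Alice Example": {"estimate": 2030, "bottleneck": "copyright"},
    "Bob Example": {"estimate": 2035, "bottleneck": "video quality"},
}
published, key = anonymise(raw)
```

Only `published` should ever feed the feedback report. Free-text answers may still identify their author through wording, so a human pass over the reasoning summaries is still needed.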

Step 3 — Feedback

Share anonymised aggregate results and outlier reasoning

Compile the statistical summary: median, IQR, full distribution. Then — and this is the step that separates Delphi from a poll — include anonymised summaries of the reasoning behind outlier positions. Don't identify who said what. Do present the strongest arguments from the tails of the distribution. The early outlier who predicted 2027 should have their reasoning presented as clearly and persuasively as the consensus view. The "never" prediction should be presented with its full logic chain. The goal is to give every panellist access to information and perspectives they didn't have in Round 1. Explicitly invite revision: "In light of this information, please revise your estimate if you wish. If you choose not to revise, please explain why the new information does not change your view."

Worked example

Round 1 feedback report

The report shows the distribution histogram, median (2032), and IQR (2029–2036). Outlier reasoning summaries: "Early estimate rationale: Current AI tools already reduce storyboarding time by 80% and background generation by 60%. The remaining bottleneck — character consistency and emotional performance — is a solvable technical problem, not a fundamental limitation. Comparable quality gaps in image generation closed in 18 months." "Late/never estimate rationale: Studios face strict liability for copyright infringement in AI-generated content. Until case law establishes safe harbour for AI-assisted production, no major studio will risk a $200M release on tools with unresolved IP status. Legal resolution typically takes 5–10 years from first major litigation."
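The statistical half of a feedback report like this one takes only a few lines to produce. The sketch below is a minimal illustration, assuming year estimates arrive as integers and "never" answers as `None`. The function name and the sample estimates are hypothetical, chosen only so that the result matches the median and IQR quoted above. It uses the inclusive quartile method from Python's `statistics` module, which is one of several reasonable conventions.

```python
from statistics import median, quantiles


def summarise_round(estimates):
    """Summarise one round: median, quartiles, tail estimates, 'never' count.

    estimates: list of year estimates (int), with None meaning "never".
    """
    numeric = sorted(x for x in estimates if x is not None)
    q1, _, q3 = quantiles(numeric, n=4, method="inclusive")
    return {
        "median": median(numeric),
        "q1": q1,
        "q3": q3,
        # Estimates outside the IQR: their reasoning goes into the report.
        "tails": [x for x in numeric if x < q1 or x > q3],
        "never_count": len(estimates) - len(numeric),
    }


# Hypothetical Round 1 estimates (not the article's actual panel data).
sample = [2027, 2028, 2029, 2031, 2032, 2034, 2036, 2040, 2042, None]
report = summarise_round(sample)
print(report)
```

The `tails` list tells the facilitator whose reasoning to summarise for the next round. "Never" answers are counted separately so they do not distort the numeric summary.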

Step 4 — Iterate

Collect revised estimates and repeat if convergence is insufficient

Panellists submit Round 2 responses. Typically, the IQR narrows by 20–40% per round as experts integrate new information. If the IQR is still wide after Round 2, run Round 3 with updated feedback. Stop when one of three conditions is met: the IQR has stabilised (no meaningful narrowing between rounds), you've reached your pre-set maximum rounds, or the panel has bifurcated into two distinct clusters with irreconcilable reasoning — which is itself a valuable finding. Don't force consensus. A bimodal distribution with clear reasoning for each mode is more useful than a false median.

Worked example

Rounds 2 and 3

Round 2: Median shifts to 2031. IQR narrows to 2029–2034. The "never" outlier revises to 2038, noting that the early-adopter reasoning about technical progress was persuasive but maintaining that legal barriers will delay major studio adoption by 5+ years beyond technical feasibility. Three panellists who initially estimated 2035+ pull their estimates to 2032, citing the independent animators' evidence of current capability. Round 3: Median holds at 2031. IQR: 2029–2033. Convergence is sufficient. The panel has reached a stable estimate with clear reasoning for the remaining spread.
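The three stopping conditions from this step can be written down as an explicit rule. The sketch below is an illustration under stated assumptions, not a standard algorithm: the 10% "no meaningful narrowing" threshold, the gap-based test for a split panel, and all function names are hypothetical choices. The IQR values are the ones reported in the worked example; the individual Round 3 estimates are invented for the demonstration.

```python
def relative_narrowing(prev_iqr, curr_iqr):
    """Share by which the IQR width shrank between two rounds (0.25 = 25%)."""
    prev_width = prev_iqr[1] - prev_iqr[0]
    curr_width = curr_iqr[1] - curr_iqr[0]
    if prev_width == 0:
        return 0.0
    return (prev_width - curr_width) / prev_width


def is_bifurcated(estimates, gap_share=0.5, min_cluster=3):
    """Crude split test: one gap covers at least `gap_share` of the total
    spread and leaves at least `min_cluster` estimates on each side."""
    ordered = sorted(estimates)
    spread = ordered[-1] - ordered[0]
    if spread == 0:
        return False
    for i in range(min_cluster, len(ordered) - min_cluster + 1):
        if (ordered[i] - ordered[i - 1]) / spread >= gap_share:
            return True
    return False


def stop_reason(iqr_history, latest_estimates, max_rounds, min_narrowing=0.10):
    """Return the reason to stop, or None if another round is warranted."""
    if is_bifurcated(latest_estimates):
        return "panel has split into two clusters"
    if (len(iqr_history) >= 2
            and relative_narrowing(iqr_history[-2], iqr_history[-1]) < min_narrowing):
        return "IQR has stabilised"
    if len(iqr_history) >= max_rounds:
        return "maximum rounds reached"
    return None


# IQRs reported in the worked example, Rounds 1 to 3.
history = [(2029, 2036), (2029, 2034), (2029, 2033)]
# Hypothetical Round 3 estimates, for illustration only.
round3 = [2028, 2029, 2030, 2031, 2031, 2032, 2033, 2038]
print(stop_reason(history, round3, max_rounds=3))
```

On the worked example's numbers the IQR width goes from 7 years to 5 to 4, a narrowing of about 29% and then 20%, both inside the typical 20–40% band. Fixing the threshold before Round 1 keeps the facilitator from stopping early just because the median looks convenient.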

Step 5 — Synthesise

Compile the final report with estimates, reasoning, and residual uncertainty

The deliverable is not a single number. It's a structured forecast: the final median, the final IQR (representing the range of informed disagreement), the key drivers identified by the panel, the primary sources of residual uncertainty, and the conditions under which the estimate would shift dramatically in either direction. Include the reasoning that survived all rounds — the arguments that panellists found persuasive enough to revise their estimates. This reasoning is often more valuable than the number itself, because it tells decision-makers what to monitor.

Worked example

Final synthesis

Central estimate: 2031 (median). Range of informed disagreement: 2029–2033 (IQR). Key drivers: (1) Rate of improvement in generative video consistency and emotional performance, (2) resolution of copyright liability for AI-generated content, (3) willingness of studios to adopt hybrid human-AI pipelines before full automation is possible. Accelerators: A major court ruling establishing safe harbour for AI-assisted content could pull the estimate to 2028–2029. Decelerators: A high-profile copyright lawsuit resulting in strict liability could push it to 2035+. What to monitor: First theatrical release produced with >50% AI-generated assets; first definitive copyright ruling on AI-generated visual content.
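Because the deliverable is a structured forecast rather than a single number, it helps to store it as a structure. The sketch below shows one possible shape, assuming a plain Python dataclass. The class name and field names are hypothetical, and the values are taken from the final synthesis above in shortened form.

```python
from dataclasses import dataclass, field


@dataclass
class DelphiForecast:
    """Container for a final Delphi report (illustrative field names)."""
    question: str
    median: int
    iqr: tuple  # (lower quartile, upper quartile)
    drivers: list = field(default_factory=list)
    accelerators: list = field(default_factory=list)
    decelerators: list = field(default_factory=list)
    monitor: list = field(default_factory=list)

    def headline(self):
        """One-line summary that always carries the range with the median."""
        low, high = self.iqr
        return f"{self.median} (IQR {low}-{high})"


forecast = DelphiForecast(
    question="Year generative AI cuts animated feature cost by 50% or more",
    median=2031,
    iqr=(2029, 2033),
    drivers=["generative video consistency", "copyright liability",
             "studio adoption of hybrid pipelines"],
    accelerators=["court ruling establishing safe harbour"],
    decelerators=["lawsuit resulting in strict liability"],
    monitor=["first theatrical release with >50% AI-generated assets",
             "first definitive copyright ruling on AI-generated visuals"],
)
print(forecast.headline())
```

Building the headline from both the median and the IQR makes it harder for the range of informed disagreement to get dropped when the forecast is quoted elsewhere.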

Section 3

When It Works Best

Ideal Conditions for the Delphi Method

Dimension	Best fit
Problem type	Questions where empirical data is insufficient, models are unreliable, and expert judgment is the best available input. Technology timelines, market evolution, regulatory trajectories, geopolitical risk assessments. The common thread: genuine uncertainty that no single expert can resolve alone.
Information distribution	Most powerful when relevant knowledge is distributed across multiple experts in different domains. If one person clearly knows more than everyone else, just ask them. Delphi earns its overhead when the answer requires synthesising perspectives that no individual holds — the technologist's view of what's possible, the regulator's view of what's permissible, the economist's view of what's profitable.
Social dynamics	Essential when the panel includes significant power differentials — a CEO and junior analysts, a famous professor and early-career researchers, a client and their consultants. Anonymity neutralises hierarchy. The method is less necessary when all participants are genuine peers with no career incentive to defer.
Time horizon	Forecasts beyond 2–3 years, where trend extrapolation breaks down and structural discontinuities become plausible. For next-quarter revenue estimates, use your financial model. For "when will quantum computing break RSA encryption," use Delphi.

Section 4

When It Breaks Down

Failure Modes

Failure pattern	What goes wrong	What to use instead
Homogeneous panel	If all panellists share the same training, read the same sources, and operate in the same industry bubble, iteration doesn't add information — it just amplifies shared blind spots. Fifteen AI researchers will converge on a technically optimistic timeline that ignores regulatory, economic, and cultural barriers. The median looks precise. It's precisely wrong.	Deliberately recruit from adjacent domains; include at least 2–3 panellists whose expertise is orthogonal to the core question
Vague questions	Ambiguous questions produce ambiguous answers that converge on nothing meaningful. "When will AI transform healthcare?" — each panellist interprets "transform" differently, so the estimates aren't measuring the same thing. Apparent convergence masks definitional disagreement.	Pre-test the question with 2–3 people outside the panel; if they interpret it differently, rewrite until the interpretation is unambiguous
Conformity pressure through feedback	The feedback mechanism that makes Delphi work can also kill it. If panellists interpret the aggregate as "the right answer" rather than "what others currently think," they converge toward the median not because they've updated their beliefs but because they don't want to be the outlier. The result is artificial consensus — groupthink by mail.

The most dangerous failure mode is the homogeneous panel, because it's the hardest to detect from inside the process. Everything looks right. The rounds proceed smoothly. The IQR narrows. The reasoning is coherent. The final estimate feels authoritative. But if the panel was drawn from a single epistemic community — all technologists, all investors, all academics in the same subfield — the convergence reflects shared assumptions, not validated judgment. The RAND Corporation's original Delphi studies on Soviet military capability worked because the panels included physicists, intelligence analysts, military strategists, and political scientists. Each group saw different constraints. The physicist knew what was technically possible; the political scientist knew what was politically likely; the intelligence analyst knew what the observable evidence suggested. Remove any one perspective and the forecast degrades. The protection is simple but requires discipline: before finalising your panel, list the distinct perspectives the question demands, then verify that each perspective has at least two representatives. If your panel has twelve names and they all attended the same three conferences last year, start over.
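The coverage check described above is easy to automate before invitations go out. The sketch below is a minimal version, assuming each panellist is tagged with exactly one perspective. The function name and the decision to treat every role as its own perspective are assumptions. The counts are the 18-person panel from the animated film example.

```python
def coverage_gaps(panel_counts, required, minimum=2):
    """Return the required perspectives that have fewer than `minimum`
    representatives, mapped to their current count."""
    return {
        perspective: panel_counts.get(perspective, 0)
        for perspective in required
        if panel_counts.get(perspective, 0) < minimum
    }


# Panel composition from the worked example in Step 1.
panel = {
    "animation technical director": 4,
    "generative video researcher": 3,
    "film producer": 2,
    "VFX pipeline engineer": 3,
    "industry analyst": 2,
    "independent animator": 2,
    "IP/copyright attorney": 1,
    "labour economist": 1,
}
print(coverage_gaps(panel, required=list(panel)))
```

Under this strict reading, the example panel is thin on legal and labour-economics perspectives, each held by a single person. Whether that matters depends on how perspectives are grouped; the point of the check is to force that decision to be made deliberately.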

Section 5

Visual Explanation

Section 6

Pairs With

The Delphi Method produces a structured forecast. What you do with that forecast — and how you prepare the question it answers — depends on the tools you pair it with.

Use before

Reframing

The quality of a Delphi output is bounded by the quality of the question. Reframing forces you to interrogate whether you're asking the right question before you recruit a panel and spend four weeks collecting answers. "When will EVs dominate?" is a different question from "When will the total cost of ownership of an EV fall below an equivalent ICE vehicle in the US?" — and the second one produces a usable forecast.

Use before

Cynefin Framework

Delphi works in the "complicated" and "complex" domains of Cynefin — where expert judgment adds value because the system isn't fully knowable through data alone. In the "obvious" domain, just look at the data. In the "chaotic" domain, act first and sense later. Cynefin tells you whether Delphi is the right tool before you invest weeks in running it.

Use after

Scenario Planning

Delphi gives you a calibrated range. Scenario Planning takes that range and builds narrative futures around the key uncertainties. The Delphi output — "2029–2033, depending on copyright resolution and technical progress" — becomes the input for three scenarios: early resolution, delayed resolution, and fragmented resolution. Now you can stress-test your strategy against each.

Use after

Decision Matrix

Section 7

Real-World Application

Shell — long-range energy forecasting in the 1970s oil crisis

The scenario

In the late 1960s, Royal Dutch Shell faced a forecasting problem that no financial model could solve. The company needed to make capital allocation decisions — refinery investments, exploration commitments, tanker fleet sizing — with payback horizons of 15–25 years. The dominant industry assumption was that oil prices would remain stable and supply would grow predictably. Shell's planning team, led by Pierre Wack, suspected this assumption was fragile but couldn't prove it with data. The question wasn't what oil prices would be next year. It was whether the entire structure of the global oil market could shift in ways that would invalidate a generation of infrastructure investments.

How the tool applied

Shell's planning group used a modified Delphi process as one input into their broader scenario planning methodology. They assembled panels that deliberately crossed disciplinary boundaries — petroleum geologists, Middle Eastern political analysts, economists, military strategists, and energy policy experts. The panels were asked not for point forecasts but for conditional estimates: "If OPEC nations were to restrict supply as a political instrument, what is the plausible range of price impact?" "What is the probability that OPEC coordination succeeds for more than six months?" The anonymised, iterative structure allowed the political analysts — who understood the growing nationalism in oil-producing states — to present reasoning that the geologists and economists would have dismissed in a face-to-face meeting as "too political." The iteration forced the technical experts to engage with geopolitical reasoning they would normally have filtered out.

What it surfaced

The Delphi-informed panels produced estimates that diverged sharply from industry consensus. Where most oil companies assumed stable prices through the 1970s, Shell's panels identified a plausible scenario in which coordinated OPEC action could triple or quadruple prices within months. The key insight came from the intersection of two expert domains: political analysts who understood that newly independent oil states had both the motivation and the emerging coordination mechanisms to restrict supply, and economists who modelled the price elasticity of oil demand and showed that even modest supply restrictions would produce dramatic price spikes because short-term demand was highly inelastic. Neither group alone would have produced the forecast. The Delphi structure forced the synthesis.
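The economists' elasticity argument can be made concrete with a first-order approximation. The elasticity value and the size of the supply cut below are illustrative assumptions, not figures from Shell's panels. Here \varepsilon is the short-run price elasticity of demand, Q is quantity, and P is price.

```latex
\frac{\Delta P}{P} \;\approx\; \frac{1}{\varepsilon}\,\frac{\Delta Q}{Q}
\qquad\text{so, assuming } \varepsilon = -0.1 \text{ and } \frac{\Delta Q}{Q} = -0.05:
\qquad
\frac{\Delta P}{P} \;\approx\; \frac{-0.05}{-0.1} \;=\; +0.5
```

Under these assumed numbers, a 5% cut in supply implies a price rise of roughly 50%. The closer the elasticity is to zero, the larger the price move for the same cut, which is why a modest restriction can produce a dramatic spike. The approximation is only indicative for changes this large.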

Section 8

Analyst's Take

Faster Than Normal — Editorial View

The Delphi Method is simultaneously one of the most validated forecasting techniques in the research literature and one of the most butchered in practice. The validation is real: reviews of the experimental literature generally find that structured, anonymous, iterative expert judgment outperforms both unstructured group discussion and individual expert forecasts, particularly for long-range, multi-factor questions. The butchering is equally real. Most organisations that claim to use Delphi run a single anonymous survey, call it "a Delphi study," and skip the iteration entirely. That's not Delphi. That's SurveyMonkey with pretensions. The iteration is the method. Without feedback and revision, you're just averaging first impressions, which is precisely the kind of shallow aggregation that Helmer and Dalkey designed the process to transcend.

The failure mode I see most often among founders and investors is panel selection driven by prestige rather than perspective diversity. The instinct is to recruit the most impressive names — the professor with the most citations, the executive with the biggest title, the investor with the highest-profile portfolio. Impressive panels produce impressive-looking reports. They do not necessarily produce accurate forecasts. What you actually need is coverage: does the panel, collectively, see the question from every relevant angle? A junior regulatory analyst at the FDA may contribute more to a biotech timeline forecast than a Nobel laureate in chemistry, because the binding constraint is regulatory, not scientific. Prestige and relevance are different axes. Optimise for relevance.

The highest-leverage modification I've encountered is what practitioners call the "real-time Delphi" — collapsing the multi-week round structure into a continuous, asynchronous digital process where panellists can see the evolving aggregate and revise their estimates at any time. Murray Turoff at the New Jersey Institute of Technology pioneered this variant. It preserves anonymity and iteration while cutting elapsed time from weeks to days. The tradeoff is that you lose the clean separation between rounds, which makes it harder to track how information flows through the panel. But for most commercial applications — where the question is "should we enter this market in 2025 or 2027" rather than "when will fusion energy be commercially viable" — the speed gain is worth the analytical cost. Run it on a shared dashboard. Let experts update as they think. Watch the distribution shift in real time. It's the same cognitive mechanism as classical Delphi, compressed into a format that matches how modern teams actually work.
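The core of a real-time variant is a running aggregate that updates whenever any panellist revises. The sketch below is a toy in-memory model of that idea, not a description of any particular tool. The class name, method names, and sample values are hypothetical, and it assumes panellists are identified only by anonymous tokens.

```python
from statistics import median


class RealTimeDelphi:
    """Keep each panellist's latest estimate and the running group median."""

    def __init__(self):
        self._latest = {}   # anonymous id -> current estimate
        self.history = []   # group median after every submission

    def submit(self, anon_id, estimate):
        """Record a new or revised estimate and return the updated median."""
        self._latest[anon_id] = estimate  # a revision overwrites the old value
        current = median(self._latest.values())
        self.history.append(current)
        return current


panel = RealTimeDelphi()
panel.submit("a", 2030)
panel.submit("b", 2034)
panel.submit("a", 2032)  # panellist "a" revises after seeing the aggregate
panel.submit("c", 2031)
print(panel.history)
```

The `history` list partly recovers what the real-time format gives up: without discrete rounds, the sequence of medians is the only record of how the distribution moved. Logging a timestamp with each submission would make that trace more useful.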

Section 9

Top Resources

An Experimental Application of the Delphi Method to the Use of Experts — Norman Dalkey & Olaf Helmer (1963)

Primary source

The original RAND paper that introduced the method to the public after nearly a decade of classified use. Dense, methodological, and surprisingly readable. Dalkey and Helmer lay out the rationale for anonymity, iteration, and controlled feedback with a clarity that most subsequent textbooks fail to match. Start here to understand what the inventors actually intended — which is often quite different from how the method is practiced today.

The Delphi Method: Techniques and Applications — Harold Linstone & Murray Turoff, eds. (1975/2002)

Book

The definitive reference work, originally published in 1975 and updated in 2002 as a free digital edition. Covers classical Delphi, policy Delphi, real-time Delphi, and dozens of application case studies across technology forecasting, healthcare, education, and public policy. Turoff's chapters on the real-time variant are particularly valuable for anyone adapting the method to modern digital environments.

Thinking, Fast and Slow — Daniel Kahneman (2011)

Book

Not about Delphi specifically, but essential for understanding the cognitive biases the method is designed to neutralise. Kahneman's work on anchoring, the availability heuristic, and overconfidence explains why unstructured group forecasting fails so reliably, and why anonymity and iteration are genuine cognitive interventions rather than procedural overhead. Chapters 21 and 22, on statistical prediction and expert intuition, are directly relevant.

Superforecasting: The Art and Science of Prediction — Philip Tetlock & Dan Gardner (2015)

Book

Tetlock's research on the Good Judgment Project demonstrates that structured forecasting processes — including Delphi-like aggregation of diverse perspectives — consistently outperform individual experts, even famous ones. The book provides the empirical evidence base for why the Delphi Method works: cognitive diversity, calibrated uncertainty, and iterative updating are the three ingredients that separate good forecasts from confident guesses.

Only the Paranoid Survive — Andrew Grove (1996)

Book

Grove's account of Intel's strategic inflection points illustrates exactly the kind of decision environment where Delphi earns its keep. His description of how Intel navigated the shift from memory chips to microprocessors — a decision made under deep uncertainty about market evolution, competitor behaviour, and technology trajectories — is a masterclass in why structured expert judgment matters when the data runs out and the models break.
