In the summer of 2002, the United States Joint Forces Command conducted Millennium Challenge 2002, the most expensive war game in American military history, designed to validate the Pentagon's new doctrine of network-centric warfare before the anticipated invasion of Iraq. The Blue Team, representing the United States, deployed the full suite of advanced intelligence, surveillance, and precision-strike capabilities that would define twenty-first-century American warfighting. The Red Team, representing an unnamed Middle Eastern adversary, was commanded by retired Marine Corps Lieutenant General Paul Van Riper. Within the first two days, Van Riper sank the Blue Team's carrier battle group (sixteen ships, including an aircraft carrier) using three tactics: motorcycle couriers instead of radio communication, which defeated signals intelligence; swarm attacks by small boats packed with explosives, which overwhelmed missile defense systems; and cruise missiles launched from shore positions hidden in civilian infrastructure. The Blue Team suffered casualties that, in a real conflict, would have exceeded 20,000 personnel.
The exercise was halted. The simulation was reset. Van Riper was given restrictions on his tactics. The Blue Team was allowed to refloat its ships. The war game continued to its predetermined conclusion: a Blue Team victory that validated the Pentagon's doctrine. Van Riper resigned in protest, later telling reporters that the exercise had been rigged to confirm the answer the Pentagon wanted rather than reveal the answer it needed.
The story of Millennium Challenge 2002 is the definitive case study in why Red Teaming exists — and what happens when organizations refuse to accept its findings. Red Teaming is the deliberate practice of assigning a dedicated group to attack your own plans, strategies, and assumptions from the adversary's perspective. The Red Team's job is not to offer constructive feedback, suggest improvements, or propose alternatives. It is to break your plan. To find the failure modes you missed, the vulnerabilities you assumed away, the scenarios you didn't model because they contradicted your thesis. The Red Team succeeds when your plan fails — in the simulation rather than in reality.
The practice originated in the Catholic Church, not the military. During the canonization process, the Vatican appointed an Advocatus Diaboli — the Devil's Advocate — whose formal responsibility was to argue against sainthood by presenting every possible objection, every piece of negative evidence, every reason the candidate was unworthy. The role was not adversarial in spirit; it was adversarial in method. The Church understood that a process designed to confirm sainthood would confirm sainthood whether the candidate deserved it or not. Only a process that actively tried to disprove the conclusion could be trusted to validate it.
The military adopted the concept because warfare punishes confirmation bias with body bags. The Israel Defense Forces institutionalized Red Teaming after the 1973 Yom Kippur War, when a surprise Egyptian-Syrian attack nearly destroyed the country. Israeli intelligence had dismissed indicators of imminent attack because they contradicted the prevailing assessment, called "the Concept," that Egypt would not go to war without first achieving air superiority. The intelligence was available. The analysts had it. The institutional framework filtered it out because it contradicted the conclusion the system had already reached. After 1973, the IDF created the Ipcha Mistabra unit, named for a Talmudic phrase meaning "on the contrary": a dedicated team whose sole mission was to construct the strongest possible case against the prevailing intelligence assessment. The unit existed not because Israeli intelligence was incompetent but because competent analysts operating within a consensus framework will systematically suppress disconfirming evidence unless the institution forces them not to.
The principle translates directly to business, investing, and any domain where decisions are made under uncertainty by groups of humans subject to cognitive bias. Every strategic plan, every investment thesis, every product roadmap is a hypothesis about how the future will unfold. The natural organizational process — the one that operates by default in every company, every fund, every government agency — is to gather evidence that supports the hypothesis, dismiss evidence that contradicts it, and move forward with increasing confidence toward a conclusion the organization reached before the analysis began. Red Teaming is the institutional mechanism that interrupts this process. It assigns someone the explicit job of attacking the plan, not because the plan is necessarily wrong, but because only a plan that survives a determined attack can be trusted.
The deepest insight of Red Teaming is not tactical but epistemological: you cannot validate your own thinking by thinking more about it in the same way. The biases that produced your plan are the same biases that evaluate your plan. Confirmation bias doesn't announce itself. Narrative construction feels like analysis. Groupthink feels like consensus. The only reliable corrective is an external perspective that is structurally incentivized to disagree — not a colleague who raises a concern in a meeting and backs down when the CEO pushes back, but a dedicated function with the organizational authority to present the worst case without career consequences. The Red Team is not an opinion. It is a process. It is the institutionalization of doubt in organizations that naturally select for certainty.
Section 2
How to See It
Red Teaming reveals itself through its absence more clearly than through its presence. The signature is not a specific organizational structure but a specific organizational failure: decisions made with high confidence that collapse on contact with reality in ways that were foreseeable but unforeseen — because nobody in the decision process was assigned the task of foreseeing them.
Look for the structural conditions that demand Red Teaming: high-stakes decisions where the cost of being wrong is severe, environments where the decision-makers have strong prior beliefs, organizations where dissent carries career risk, and plans that depend on specific assumptions about adversary behavior, market response, or technology trajectory. Where all four converge, the absence of Red Teaming is almost certainly producing overconfident plans that will fail in predictable ways.
Military
You're seeing Red Team when a military organization assigns a dedicated unit to war-game its own operational plans from the enemy's perspective before execution. Before Operation Neptune Spear — the 2011 raid on Osama bin Laden's compound in Abbottabad — the CIA formed a "Red Cell" specifically tasked with challenging the intelligence assessment that bin Laden was in the compound. The assessment team estimated 60–80% confidence. The Red Cell argued for lower confidence, identifying alternative explanations for the observed patterns — a high-value drug lord, a retired Pakistani intelligence officer, a wealthy recluse. The Red Cell didn't prove bin Laden wasn't there. It forced the decision-makers to confront the possibility that he wasn't, which shaped the operational planning: the raid was designed to succeed regardless of whether bin Laden was present, with contingencies for every scenario the Red Cell had surfaced.
Business
You're seeing Red Team when a company formally assigns a team to attack its own product, strategy, or security posture before a competitor or the market does. Microsoft's internal security team runs continuous Red Team operations against its own Azure cloud infrastructure — dedicated hackers whose job is to breach Microsoft's own systems using the same techniques an actual adversary would employ. Every vulnerability the Red Team discovers is a vulnerability a real attacker won't get to exploit first. The practice is not a suggestion box or a bug bounty. It is a permanent adversarial function operating inside the organization with the explicit mission of finding ways to break what the organization has built.
Investing
You're seeing Red Team when an investment fund systematically constructs the bear case for every position before committing capital. Bridgewater Associates under Ray Dalio institutionalized what Dalio called "radical transparency" — a system where every investment thesis was subjected to structured disagreement before execution. The process wasn't collegial debate. It was adversarial stress-testing: analysts were expected to identify every assumption in the thesis and construct the strongest possible case for why each assumption was wrong. The Red Team function was embedded in the culture rather than assigned to a specific unit — every person in the room was expected to red-team every idea, regardless of who proposed it or their seniority.
Cybersecurity
You're seeing Red Team when an organization hires penetration testers to attack its own network, facilities, or processes using real-world adversarial techniques. Red Team engagements in cybersecurity go beyond standard vulnerability scanning — they simulate the full kill chain of a sophisticated attacker, including social engineering, physical intrusion, and supply chain compromise. The Red Team's success is the organization's intelligence: every breach they achieve reveals a failure mode that exists in the real threat environment. The organizations that invest most heavily in Red Teaming — banks, defense contractors, critical infrastructure operators — do so because they understand that the absence of detected breaches is not evidence of security. It is evidence that nobody has tried hard enough to break in.
Section 3
How to Use It
Decision filter
"Before committing to this plan, have I assigned someone the explicit job of destroying it? Not improving it — destroying it. If the only people who have evaluated this strategy are the people who built it, I am validating my assumptions with my assumptions. The question is not whether this plan can succeed. It is whether it can survive the most capable adversary I can simulate — and whether I've given that adversary permission to win."
As a founder
Founders are structurally prone to skipping Red Teaming because they are selected for conviction. The ability to project absolute certainty about an uncertain future is what attracts co-founders, employees, and investors. That same conviction makes founders constitutionally resistant to the adversarial stress-testing that Red Teaming requires. The founder who builds a $50 million Series C pitch around a market thesis does not naturally invite a team to spend two weeks constructing the strongest possible case that the thesis is wrong.
Jeff Bezos addressed this by embedding adversarial process into Amazon's decision-making architecture. The six-page narrative memo required before every significant meeting is a structural Red Team: it forces the proposer to articulate every assumption explicitly, which makes each assumption a target for challenge. The "disagree and commit" principle is a cultural Red Team: it gives every person in the room institutional permission to disagree with the most senior person present — and requires the senior person to listen before the commitment phase begins. The Bar Raiser program in hiring is a Red Team applied to talent decisions: a designated evaluator from outside the hiring team whose sole job is to identify reasons not to hire the candidate, counterbalancing the hiring manager's confirmation bias toward candidates they've already invested time in evaluating.
The operational discipline: before every irreversible decision — a major product launch, a market entry, a significant hire, a fundraising strategy — assign one person or team the explicit task of building the strongest case against the decision. Not a devil's advocate who raises concerns in a meeting and defers when challenged. A dedicated function with time, resources, and organizational permission to present findings that the decision-maker may not want to hear. The Red Team's output is not a recommendation. It is a stress test. The decision-maker retains authority. But the decision is now informed by the best attack the organization could construct against its own plan.
As an investor
Every investment thesis is a hypothesis about the future — and the natural process of constructing a thesis is to gather supporting evidence until conviction reaches a threshold. Red Teaming inverts this: after constructing the bull case, formally construct the strongest possible bear case using the same analytical rigor. Not a paragraph of "risks and mitigations" at the end of the investment memo. A complete adversarial analysis that identifies the specific assumptions most likely to be wrong, the scenarios under which the thesis fails entirely, and the early indicators that would signal the bear case is materializing.
Charlie Munger built his entire analytical method around this principle. His concept of "inversion" — always invert, always think about the problem backward — is Red Teaming applied to individual cognition. Instead of asking "How does this investment succeed?" Munger asks "How does this investment fail?" and works backward from every failure mode to assess whether the current evidence rules it out. The discipline is not pessimism. It is epistemological hygiene: the realization that your bull case was constructed by a brain subject to confirmation bias, narrative construction, and the sunk-cost fallacy of analytical effort already invested — and that the only corrective is to force the same brain to construct an equally rigorous case for the opposite conclusion.
The practical implementation: for every significant position, write the bear case memo before committing capital. Assign it to the analyst least involved in constructing the bull case. Give the bear case the same page count, the same analytical depth, and the same presentation time as the bull case. The positions that survive this process are the ones worth holding. The positions that don't are the ones that would have failed with your capital at risk.
As a decision-maker
Inside large organizations, Red Teaming is most needed where it is hardest to implement — at the senior leadership level, where the decisions are most consequential and the organizational incentives to confirm the leader's existing view are strongest. Every layer of management between the leader and the front line has a career incentive to tell the leader what they want to hear. The information environment at the top of an organization is the most systematically distorted in the entire company.
Andy Grove institutionalized Red Teaming at Intel through a culture of "constructive confrontation" — structured disagreement where the quality of the argument mattered more than the rank of the person making it. When Grove was evaluating the decision to exit the memory chip business, he didn't surround himself with advisors who confirmed the decision was right. He deliberately sought the strongest internal arguments for staying in memory — from engineers who had built their careers on memory technology, from sales leaders who had deep customer relationships in the memory market, from financial analysts who could construct scenarios where memory remained profitable. The decision to exit was made after the Red Team had exhausted its best case for staying. That's not decisiveness despite disagreement. It's decisiveness because of disagreement — confidence earned by surviving the strongest attack rather than confidence assumed by avoiding it.
The institutional requirement: Red Teaming must be a permanent function, not an occasional exercise. A Red Team assembled the week before a board meeting is theater. A Red Team that operates continuously — challenging strategic assumptions quarterly, stress-testing financial projections against adverse scenarios monthly, war-gaming competitive threats annually — produces the organizational immune system that catches threats before they become crises.
Common misapplication: Treating Red Teaming as devil's advocacy in meetings. A person who raises a concern, gets pushback, and drops it is not a Red Team. A Red Team has time, resources, and institutional authority to build a complete adversarial case. The difference is the difference between a friend saying "are you sure about this?" and a prosecutor building a full case against the defendant. The former is social. The latter is structural. Only the structural version reliably surfaces the information the decision-maker needs.
Second misapplication: Running Red Team exercises and then ignoring the findings when they contradict the preferred plan. Millennium Challenge 2002 is the canonical example: the Red Team destroyed the Blue Team's fleet, and the exercise was reset to produce the desired outcome. The value of Red Teaming is zero if the institution isn't willing to act on unwelcome results. The test of organizational commitment to Red Teaming is not whether the exercise is conducted but whether the findings change the plan.
Section 5
Founders & Leaders in Action
The leaders who practice Red Teaming most effectively share a paradoxical trait: supreme confidence in their judgment combined with systematic distrust of their own reasoning process. They believe they're right — and they build institutions designed to tell them when they're wrong. The discipline is not humility in the conventional sense. It is operational awareness that the cognitive machinery producing their convictions is the same machinery that produces everyone else's delusions, and that only an adversarial process can distinguish between the two.
The pattern across these cases is consistent: the leader who institutionalizes Red Teaming makes better decisions not because the Red Team is always right, but because surviving the Red Team's attack forces the plan to confront its own weaknesses before reality does. The leaders who fail are invariably the ones who surround themselves with confirmers, dismiss dissent as disloyalty, and discover their plan's vulnerabilities only when the market, the enemy, or the customer reveals them at maximum cost.
Jeff Bezos
Bezos built Amazon's decision architecture around structural Red Teaming — not a designated adversarial unit but a set of institutional processes that forced every significant proposal to survive adversarial challenge before receiving commitment. The six-page narrative memo is the most visible mechanism: by requiring every proposal to be written as a structured argument rather than presented as bullet-point slides, the format forces every assumption into the open where it can be attacked. A PowerPoint presentation conceals logical gaps behind confident delivery. A six-page memo exposes them because the reader processes the argument at their own speed, identifies the leaps in logic, and arrives at the meeting with specific challenges.
The "disagree and commit" principle is a Red Team operating norm: every person at the table has the institutional obligation to voice disagreement before the commitment phase. Bezos didn't want compliance. He wanted the strongest available objection to surface before the decision was irreversible. The cultural expectation was explicit: if you disagree and stay silent, you have failed the organization. The disagreement phase is the Red Team. The commitment phase is the execution. Separating the two — structurally, temporally, and culturally — meant that adversarial challenge and organizational alignment were sequential rather than contradictory.
The Bar Raiser program applied Red Teaming to the most consequential repeated decision in a scaling company: hiring. Every interview loop included a designated Bar Raiser — an experienced interviewer from outside the hiring team — whose explicit job was to identify reasons not to hire the candidate. The Bar Raiser counterbalanced the hiring manager's natural confirmation bias toward candidates they'd already invested time in, candidates who looked like existing team members, or candidates who told the interviewer what they wanted to hear. The Bar Raiser's incentive was institutional quality, not team-level urgency. That structural separation of incentives is the essence of Red Teaming: the adversarial function must be insulated from the pressures that bias the primary decision-maker.
Charlie Munger
Vice Chairman, Berkshire Hathaway, 1978–2023
Munger built his entire analytical method around what amounts to personal Red Teaming — the systematic practice of attacking his own investment theses before committing capital. His concept of "inversion" is the cognitive equivalent of a one-person Red Team: instead of asking "Why will this investment succeed?" Munger asks "Why will this investment fail?" and catalogs every pathway to permanent capital loss before evaluating the upside.
The discipline was more rigorous than casual contrarianism. Munger maintained a checklist of cognitive biases — confirmation bias, social proof, commitment and consistency bias, availability heuristic — and explicitly reviewed each one against his current thesis. The checklist was a Red Team protocol: a structured process for identifying the specific ways his own mind could be misleading him about the quality of the evidence. He wasn't checking whether the thesis was wrong. He was checking whether his brain was constructing the illusion that the thesis was right through mechanisms he knew were unreliable.
Munger's most famous analytical principle — "I never allow myself to have an opinion on anything that I don't know the other side's argument better than they do" — is Red Teaming applied to intellectual honesty. Before defending a position, Munger required himself to construct the strongest possible opposing argument. If he couldn't steelman the opposition, he didn't trust his own conclusion. The discipline is epistemological: you don't know what you think you know until you've tried to prove yourself wrong with the same intensity you used to prove yourself right. Most investors stop at conviction. Munger tested conviction against the most capable opposition he could construct — which happened to be his own mind operating under adversarial instructions.
Andy Grove
Grove's "constructive confrontation" culture at Intel was Red Teaming embedded in organizational DNA rather than delegated to a specialized unit. The principle was explicit: in any meeting, the quality of the argument outweighed the rank of the person making it. Engineers were expected to challenge vice presidents. Analysts were expected to challenge the CEO. The cultural norm was that agreement without challenge was suspect — a sign that the room had converged on a comfortable conclusion rather than a correct one.
The most consequential application was Intel's decision to exit the memory chip business in 1985. Grove didn't make the decision based on his own analysis and then seek confirmation. He structured the evaluation as a deliberate adversarial process: the strongest advocates for staying in memory were given resources, time, and organizational support to build the best possible case for their position. The arguments for staying were real — memory was Intel's founding technology, it still generated significant revenue, and the Japanese pricing advantage might prove temporary. Grove heard the full Red Team case for staying, engaged with it seriously, and concluded that the case for exiting was stronger. The decision was better because it had survived the strongest available attack.
Grove later formalized this as the "strategic inflection point" framework: the recognition that by the time disconfirming evidence is overwhelming, it's too late to act on it. The Red Team's function is to surface weak disconfirming signals early — before they become strong enough for the consensus to acknowledge — and force the organization to engage with them when action is still possible. Grove's paranoia wasn't temperamental. It was institutional: a permanent state of readiness to hear and act on adversarial information before the market delivered it at a price Intel couldn't afford.
Ed Catmull
Co-founder & President, Pixar, 1986–2018
Catmull built the Braintrust at Pixar — the most successful creative Red Team in entertainment history. The Braintrust was a group of senior creative leaders who reviewed every Pixar film at regular intervals during production. Their mandate was specific and adversarial: identify everything that isn't working. Not suggest fixes. Not propose alternatives. Diagnose the failures with surgical precision and leave the solutions to the film's director.
The structural design was critical. The Braintrust had no authority over the director. It could not mandate changes, override creative decisions, or impose its preferences. Its power was purely diagnostic: it surfaced the problems the director couldn't see because the director was too close to the material. This separation of diagnostic authority from decision authority is the key architectural principle that makes Red Teaming work in any domain. The Red Team identifies vulnerabilities. The decision-maker retains control. The Red Team's value is in the quality of its attack, not in its power to dictate the response.
Catmull's deepest insight was that every Pixar film was terrible in its early stages, and that the Braintrust's function was to accelerate the process of making it less terrible by surfacing failures that internal teams couldn't see. The creative team's confirmation bias (their attachment to characters, scenes, and story arcs they'd spent months developing) was structurally identical to the strategic planner's confirmation bias toward plans they'd spent months building. The Braintrust broke that confirmation loop by introducing external perspectives with no emotional investment in the existing work. The result: Pixar released eleven consecutive commercially and critically successful films between 1995 and 2010, from Toy Story to Toy Story 3, a record unmatched in entertainment history. The Red Team didn't make the films great. It prevented the team's blind spots from making them mediocre.
Winston Churchill
Churchill understood Red Teaming as a survival necessity for wartime leadership — the recognition that the information reaching the Prime Minister was filtered through military and intelligence bureaucracies whose institutional incentives biased them toward confirming the prevailing strategy rather than challenging it. His response was to build multiple adversarial channels that bypassed the official reporting structure.
The Statistical Office under Lord Cherwell was Churchill's personal Red Team against institutional optimism: a direct pipeline of raw data — aircraft production numbers, shipping losses, food stocks, casualty figures — that circumvented the military establishment's tendency to present favorable summaries. Churchill demanded the raw numbers specifically because he knew that summarized information was filtered information, and filtered information was confirmation-biased information. The Statistical Office didn't interpret data. It delivered it. The interpretation was Churchill's — made against the backdrop of official assessments he could now test against unmediated evidence.
Churchill also maintained a practice that amounted to personal Red Teaming: he wrote memos to himself arguing against his own preferred courses of action. Before committing to strategic decisions — the North Africa campaign, the timing of D-Day, the allocation of bomber resources — Churchill forced himself to construct the case for alternatives he was inclined to reject. The practice was not performative humility. It was operational epistemology: the recognition that the Prime Minister's conviction, however well-founded, was produced by the same cognitive machinery that produced every other leader's overconfidence, and that only a deliberate adversarial process — even one conducted internally — could distinguish genuine strategic insight from narrative construction. The decisions Churchill made after red-teaming his own reasoning were not always right. But they were informed by the strongest objections available, which meant the failures were failures of uncertainty, not failures of unexamined assumptions.
Section 7
Connected Models
Red Teaming does not operate in isolation. It intersects with frameworks that explain why human reasoning fails under pressure, how organizations process dissent, and what structural responses emerge when leaders take adversarial analysis seriously. The strongest practitioners understand Red Teaming not as a standalone exercise but as one component of a decision architecture that acknowledges and compensates for the systematic limitations of human judgment.
The six connections below represent the most analytically productive relationships. Two frameworks reinforce Red Teaming by describing parallel mechanisms of adversarial truth-seeking. Two create tension by representing commitments and mindsets that Red Teaming simultaneously requires and threatens. Two describe the strategic outcomes that disciplined Red Teaming naturally produces — the structural adaptations that effective decision-makers build when they take their own fallibility seriously.
Reinforces
Principle of Falsification
Karl Popper's Principle of Falsification — the criterion that a hypothesis is scientific only if it can be disproven — is the epistemological foundation on which Red Teaming rests. Falsification says: don't ask whether your hypothesis is supported by the evidence; ask whether any conceivable evidence could prove it wrong. If no evidence could disprove it, it's not knowledge — it's faith. Red Teaming is falsification operationalized: instead of waiting for reality to disprove your strategic hypothesis, you assign a team to attempt the disproof under controlled conditions.
The reinforcement is direct and structural. Falsification provides the intellectual principle: only claims that have survived genuine attempts at disproof deserve confidence. Red Teaming provides the institutional mechanism: a dedicated function whose job is to attempt the disproof with the same rigor the original team used to construct the proof. Munger's inversion method is falsification applied to investing. The Braintrust is falsification applied to storytelling. The IDF's Ipcha Mistabra unit is falsification applied to intelligence analysis. In every domain, the underlying logic is identical: confidence earned by surviving attack is fundamentally different from confidence assumed by avoiding it. The former is knowledge. The latter is hope with better formatting.
Reinforces
Map vs Territory
Map vs Territory — Alfred Korzybski's principle that every representation of reality is a lossy compression of reality itself — reinforces Red Teaming by explaining why the adversarial function is necessary. Every strategic plan is a map. Every financial model is a map. Every intelligence assessment is a map. The map is always incomplete, sometimes distorted, and occasionally wrong in ways its creator cannot detect — because the biases that shaped the map are invisible to the mapmaker.
Red Teaming is the systematic attempt to identify where the map diverges from the territory. The Red Team examines the plan and asks: where have the mapmakers' assumptions, blind spots, and narrative biases produced a representation that looks like reality but isn't? Where has the terrain changed since the map was drawn? Where is the map most likely to be wrong in ways the mapmakers cannot see because they're trapped inside their own cartographic framework? The Israeli intelligence failure before the 1973 war was a map-territory failure: the intelligence community's map of Egyptian intentions was coherent, evidence-supported, and wrong — because the mapmakers had constructed it within a framework ("the Concept") that filtered out evidence incompatible with its conclusions. The Ipcha Mistabra unit exists to identify exactly these map-territory divergences before they become strategic surprises.
Section 8
One Key Quote
"The exercise was meant to validate a foregone conclusion. The moment the Red Team started winning, they changed the rules. That's not a test. It's a rehearsal."
— Paul Van Riper, Millennium Challenge 2002 post-exercise interview
Section 9
Analyst's Take
Faster Than Normal — Editorial View
Red Teaming is the mental model I consider most structurally important for any founder, investor, or leader — not because it is the most intellectually sophisticated, but because it addresses the one failure mode that intelligence, experience, and effort cannot correct: the systematic inability of human beings to objectively evaluate their own reasoning. You cannot think your way out of confirmation bias by thinking harder. You cannot eliminate groupthink by asking the group to try harder not to groupthink. The only reliable corrective is an external adversarial process that is structurally designed to find what you missed — and that has the institutional authority to force you to look at it.
The core principle is deceptively simple: assign someone the job of breaking your plan before you execute it. The practice is extraordinarily difficult because it requires leaders to voluntarily submit their best thinking to attack, accept the results without defensiveness, and modify the plan based on findings they didn't want to hear. Every cognitive and social instinct in human psychology resists this process. Confirmation bias resists it because the Red Team produces disconfirming evidence. Status hierarchies resist it because the Red Team challenges the leader's judgment. Sunk-cost psychology resists it because the Red Team may reveal that months of strategic work was built on faulty assumptions. The organizations that practice Red Teaming effectively are not the ones where it feels natural. It never feels natural. They are the ones where the institutional architecture forces it to happen despite every instinct to avoid it.
The diagnostic I use most frequently when evaluating organizations: what happens when someone disagrees with the CEO? In organizations without Red Team culture, disagreement is social — it depends on the personal courage of individuals to challenge authority, which means it occurs sporadically, weakly, and usually not at all when the stakes are highest. In organizations with Red Team culture, disagreement is structural — it is assigned, expected, resourced, and protected by institutional design. The difference is the difference between hoping someone will pull the fire alarm and installing a fire detection system. Hope is not a strategy.
The failure mode I observe most frequently is Red Teaming as theater. Organizations that conduct adversarial reviews but lack the institutional willingness to act on unwelcome findings are performing the ritual without practicing the discipline. The investment committee that commissions a bear case memo, reads it, and approves the position unchanged has not red-teamed the investment. It has constructed the appearance of rigor while preserving the reality of confirmation. The test is not whether the Red Team exercise was conducted. It is whether the Red Team's findings changed the plan. If the answer is "never," the Red Team is decoration.
Section 10
Test Yourself
Red Teaming is frequently confused with ordinary criticism, devil's advocacy, or pessimism. The model is analytically specific: genuine Red Teaming involves a structurally dedicated adversarial function with resources, authority, and institutional protection — not casual disagreement in a meeting that evaporates under social pressure. These scenarios test whether you can distinguish real Red Team dynamics from their superficial imitations, and whether you can identify the organizational conditions that make Red Teaming effective versus theatrical.
Is Red Teaming at work here?
Scenario 1
A startup CEO presents the Series C strategy to the board. One board member raises concerns about the competitive threat from a well-funded incumbent. The CEO responds with three counterarguments. The board member says "Fair enough," and the discussion moves on. The strategy is approved unanimously.
Scenario 2
Before committing $200 million to a new market entry, a company assigns a three-person team two weeks to construct the strongest possible case that the market entry will fail. The team identifies four critical assumptions in the strategy, constructs failure scenarios for each, and presents findings to the executive committee. Two of the four assumptions are revised based on the team's analysis. The market entry proceeds with modified projections and an additional $30 million contingency reserve.
Scenario 3
An investment fund requires every analyst to include a "Risks" section in their investment memos. The section typically runs one to two paragraphs and lists three to four risk factors with brief mitigations. No analyst has ever recommended against an investment based on their own risk analysis, and no investment has ever been rejected based on the risks section of the memo.
Section 11
Top Resources
The strongest writing on Red Teaming spans military doctrine, cognitive psychology, intelligence analysis, and organizational design. The intellectual arc runs from Janis's original groupthink research through the modern adversarial methodology taught at the U.S. Army's University of Foreign Military and Cultural Studies, informally known as Red Team University. Start with Janis for the cognitive failure Red Teaming was designed to prevent, advance to Kahneman for the psychological mechanisms that make Red Teaming necessary, read Zenko for the most comprehensive treatment of institutional Red Teaming across domains, study Catmull for the most successful creative application, and finish with Grove for Red Teaming applied to corporate strategy.
Red Team: How to Succeed by Thinking Like the Enemy, by Micah Zenko
The definitive survey of Red Teaming across military, intelligence, and business domains. Zenko, a Council on Foreign Relations fellow, examines how organizations from the CIA to the U.S. Army to Fortune 500 companies have implemented adversarial analysis — and why most implementations fail. The book's most valuable contribution is its taxonomy of Red Team failure modes: exercises conducted without institutional authority, findings ignored when they contradict leadership preferences, and Red Teams staffed with personnel too junior to mount credible adversarial challenges. Essential for understanding not just why Red Teaming works but why it so frequently doesn't.
Groupthink: Psychological Studies of Policy Decisions and Fiascoes, by Irving L. Janis
The foundational research on the cognitive failure that Red Teaming was designed to prevent. Janis's case studies — the Bay of Pigs invasion, the escalation of the Vietnam War, the failure to anticipate Pearl Harbor — demonstrate how cohesive groups with strong leadership systematically suppress dissent, dismiss contradictory evidence, and converge on decisions that no individual member would have endorsed alone. The book establishes the empirical basis for why adversarial processes are necessary: group decision-making under pressure reliably degrades in specific, predictable ways that only structural intervention can correct.
Thinking, Fast and Slow, by Daniel Kahneman
The most rigorous treatment of the cognitive biases that make Red Teaming necessary for any decision-maker operating under uncertainty. Kahneman's research on confirmation bias, the planning fallacy, overconfidence, and anchoring explains the specific psychological mechanisms that Red Teaming counteracts. The chapter on "the illusion of validity" is directly applicable: Kahneman demonstrates that the subjective confidence a decision-maker feels about a judgment has almost no correlation with the judgment's accuracy — a finding that makes adversarial stress-testing not optional but arithmetically necessary.
Creativity, Inc., by Ed Catmull with Amy Wallace
The best practical guide to implementing Red Teaming in creative and strategic contexts. Catmull's account of building and maintaining Pixar's Braintrust — the adversarial review process that stress-tested every Pixar film — provides the most detailed operational blueprint available for creating a Red Team function that is rigorous without being destructive. The key insight: the Braintrust's power came from its structural design (diagnostic authority without decision authority), not from the brilliance of its members. The architecture is replicable in any organization willing to separate the function of identifying problems from the function of solving them.
Only the Paranoid Survive, by Andrew S. Grove
Grove's account of navigating Intel through strategic inflection points is the most compelling case study in Red Teaming applied to corporate strategy. The book's core argument — that the most important strategic threats are ambiguous when action is required and obvious only in retrospect — establishes why permanent adversarial analysis is essential for organizational survival. Grove's "constructive confrontation" culture at Intel, where the quality of the argument mattered more than the rank of the person making it, remains the gold standard for embedding Red Team dynamics into organizational DNA rather than delegating them to a specialized unit.
Red Team — An adversarial function that stress-tests plans by attacking them before reality does, converting hidden vulnerabilities into known risks
Extreme Ownership — Jocko Willink's principle that leaders must take complete responsibility for every outcome in their domain — creates a fundamental tension with Red Teaming. Extreme Ownership demands that the leader own the plan, the execution, and the results without qualification. Red Teaming demands that the leader submit the plan to adversarial attack and remain genuinely open to the possibility that it's fundamentally flawed.
The tension is psychological and operational. The leader who takes extreme ownership has invested their identity in the plan's success. The Red Team's job is to demonstrate the plan's failure. Receiving a devastating Red Team critique of a plan you've staked your credibility on requires holding two contradictory states simultaneously: total ownership of the plan and genuine openness to its destruction. Most leaders resolve the tension by weakening one side — either they soften their ownership (hedging, qualifying, distancing from the plan before the Red Team attacks it) or they dismiss the Red Team's findings (defending the plan's assumptions rather than genuinely engaging the critique). The leaders who navigate this tension successfully — Bezos, Grove, Catmull — are the ones who understand that extreme ownership includes owning the plan's vulnerabilities, not just its strengths. Ownership without vulnerability analysis is not leadership. It is ego with a strategy deck.
Tension
[Growth vs Fixed Mindset](/mental-models/growth-vs-fixed-mindset)
Carol Dweck's Growth vs Fixed Mindset framework creates tension with Red Teaming because the practice triggers exactly the psychological response that a fixed mindset produces. A Red Team attack on your plan feels like an attack on your competence. The fixed mindset interprets the Red Team's findings as evidence of personal failure: "My plan was broken, therefore I am inadequate." The growth mindset interprets the same findings as information: "My plan had vulnerabilities I didn't see, and now I can address them."
The tension is that Red Teaming requires a growth mindset to be effective — but the experience of having your plan systematically dismantled naturally activates fixed-mindset responses in most people. Defensiveness, rationalization, and dismissal of the Red Team's findings are not strategic failures. They are predictable psychological responses to perceived threat. The organizational design challenge is creating conditions under which Red Team findings are processed through a growth-mindset frame rather than a fixed-mindset frame. Catmull's Braintrust at Pixar achieved this by separating the diagnostic from the prescription: the Braintrust identified problems but did not dictate solutions, which meant the director's creative authority remained intact even as the plan's weaknesses were exposed. The message was not "you failed" but "here are the obstacles between where you are and where you need to be." That reframing — from judgment to information — is the bridge between Red Teaming and growth mindset.
Leads-to
[Margin of Safety](/mental-models/margin-of-safety)
Red Teaming, once its findings are taken seriously, leads directly to the demand for Margin of Safety. The Red Team's output is a catalog of vulnerabilities — the specific ways the plan can fail, the assumptions most likely to be wrong, the scenarios the original planners didn't model. The natural organizational response to this catalog is to build buffers: additional capital to survive scenarios where revenue projections are wrong, additional time to accommodate delays the plan assumed away, additional contingency plans for competitive responses the original strategy dismissed.
The causal chain is specific: Red Teaming identifies vulnerabilities → identified vulnerabilities create demand for protection → protection manifests as margins of safety in every dimension the Red Team flagged. Churchill's Red Teaming of his own war strategy led directly to the maintenance of strategic reserves — fighter aircraft held back from daily combat, food stocks maintained beyond projected need, diplomatic relationships cultivated beyond immediate necessity. Munger's Red Teaming of investment theses led directly to his insistence on a margin of safety in every position: buying only when the price is significantly below assessed value, because the Red Team exercise has demonstrated exactly how many ways the value assessment could be wrong. The Red Team doesn't build the margin. It creates the honest acknowledgment of risk that makes the margin obviously necessary.
Leads-to
Reversible vs Irreversible Decisions
Red Teaming, systematically practiced, produces a fundamental shift in how organizations classify decisions — from classifying by size or cost to classifying by reversibility. The logic is straightforward: the Red Team's most valuable contribution is identifying the ways a plan can fail. If the plan is reversible, failure is recoverable — the Red Team's findings are informative but not existential. If the plan is irreversible, failure is permanent — and the Red Team's findings become the most important intelligence the decision-maker will receive.
The leads-to relationship reshapes resource allocation: organizations that practice Red Teaming learn to invest adversarial analysis disproportionately in irreversible decisions. Bezos's Type 1 / Type 2 framework is the organizational expression of this principle. Type 1 decisions (irreversible, one-way doors) receive maximum Red Team scrutiny because the cost of failure cannot be recovered. Type 2 decisions (reversible, two-way doors) receive minimal adversarial analysis because the cost of being wrong is bounded by the ability to reverse course. Red Teaming teaches organizations to distinguish between decisions that deserve weeks of adversarial analysis and decisions that deserve none. It also teaches that misallocating adversarial analysis (too much on reversible decisions, too little on irreversible ones) is itself a strategic error.
The most underappreciated application of Red Teaming is in hiring. Every hiring process is subject to confirmation bias — interviewers form impressions within the first five minutes and spend the remaining fifty-five minutes confirming them. Bezos's Bar Raiser program is the most rigorous solution I've studied: a designated adversarial evaluator from outside the hiring team, with no stake in filling the role, whose explicit job is to identify reasons not to hire. The Bar Raiser doesn't need to block every hire. The Bar Raiser needs to force the hiring team to confront the evidence they're unconsciously suppressing. The quality difference between organizations that red-team their hiring and those that don't is visible within two years and irreversible within five.
The cognitive prerequisite is the hardest part. Red Teaming requires the leader to hold two beliefs simultaneously: "I believe this plan is correct" and "I accept the genuine possibility that this plan is fundamentally flawed." That cognitive tension — committed conviction alongside authentic openness to being wrong — is the rarest psychological trait in leadership. Most leaders resolve the tension by dropping one side. The overconfident leader drops openness and dismisses the Red Team. The indecisive leader drops conviction and lets the Red Team paralyze the organization. The exceptional leader — Bezos, Munger, Grove, Catmull — holds both: total conviction in the plan and total willingness to hear why it's wrong. They understand that the Red Team isn't an obstacle to good decisions. It is the mechanism that distinguishes good decisions from confident mistakes.
One final point that connects every case study in this analysis: the organizations that red-team effectively don't make fewer mistakes than organizations that don't. They make cheaper mistakes. The Red Team catches the catastrophic failures — the Millennium Challenge carrier group that would have been lost in actual combat, the intelligence assessment that would have produced a strategic surprise, the product plan that would have consumed two years and $50 million before failing in the market. The Red Team converts category-five failures into category-two failures by identifying them when correction is still possible. The cost of the Red Team function is a rounding error compared to the cost of a single catastrophic decision that a thirty-minute adversarial review would have caught. Every organization pays for Red Teaming. The question is whether they pay for it proactively, through a deliberate adversarial function — or reactively, through the full cost of the failures that function would have prevented.
Scenario 4
A film studio convenes a group of senior creative leaders to review a film in production. The group watches a rough cut, then provides detailed, specific feedback on story structure, character motivation, and pacing problems. The film's director listens, takes notes, and spends the next three months reworking the second act based on the group's diagnosis. The group has no authority to override the director's decisions.