On January 12, 2014, two experienced Southwest Airlines pilots were descending toward Branson, Missouri, after dark. The flight had been routine. The weather was good. The destination was programmed correctly into the aircraft’s systems. Air traffic control had cleared them for the approach. Everything, by all accounts, was normal.
As the plane descended, the pilots transitioned from an instrument-guided approach to a visual one. Below them, runway lights appeared in the darkness. The alignment looked right. The approach felt right. The confidence was complete.
The wheels touched down.
Only after heavy braking did the problem reveal itself. The runway was far shorter than expected, and the aircraft came to a stop just before the end of the pavement, with little margin to spare.
The pilots hadn’t landed at Branson Airport at all. They had landed at a nearby, much smaller airport, M. Graham Clark Downtown Airport, by mistake.
What makes this story unsettling isn’t that something exotic or chaotic happened. There was no equipment failure. No bad weather. No rogue behavior. No dramatic lapse in training. These were competent professionals operating in a routine environment.
The error occurred at the exact moment when everything felt most certain.
Certainty Is Not the Same as Correctness
It’s tempting to summarize this story as “mistakes happen.” Aviation accidents are often reduced to a list of contributing factors: lighting conditions, airport proximity, human error. But that framing misses the deeper lesson.
This wasn’t a one-off mistake. It was a repeatable failure mode: unchecked certainty under time compression.
The Southwest crew had procedures available to cross-check their visual identification against their navigation displays. The correct airport was depicted on their cockpit screens the entire time. The NTSB found that once the crew visually acquired what they thought was the runway, they stopped referencing those displays. This is the uncomfortable truth about procedural safeguards: they only work if people keep using them under pressure. Which is why the design question isn’t just “do we have a checklist?” It’s “what makes the checklist unskippable at the exact moment skipping feels justified?”
With a plausible runway in sight, the pilots’ brains did what human brains are very good at doing: they stopped looking for disconfirming evidence. The visual matched the expectation. The expectation reinforced the visual. Confidence closed the loop.
This is expectation bias and confirmation bias working together. Once you think you know what you’re seeing, you selectively notice evidence that supports that belief and ignore what contradicts it. The problem is not that this bias exists. The problem is that it feels like clarity.
Aviation learned this lesson the hard way, decades ago. That’s why modern cockpits are filled with checklists, callouts, cross-checks, and redundancy. Pilots are trained not to trust a single sense, a single system, or a single moment of confidence, especially during high-speed, high-consequence phases of flight.
Now contrast that with leadership.
Strategy, product decisions, and organizational bets are almost always made with imperfect information. The data is partial. The signals are noisy. The environment is changing faster than our models. And yet, we routinely make decisions that carry enormous downstream consequences based on what “looks right” in the moment.
Experience doesn’t protect us from this. In many cases, it makes it worse. The more experienced you are, the more coherent your mental models become. The more patterns you’ve seen, the easier it is to snap new situations into familiar shapes. Confidence becomes efficient. It also becomes dangerous.
The pilots who landed on the wrong runway weren’t inexperienced; they were experienced enough to stop checking.
The real risk in leadership isn’t being wrong; it’s becoming confident too early.
The Product Leader’s “Visual Approach”
In aviation, a visual approach means transitioning from instrument-based navigation to visually identifying the runway. It’s not inherently unsafe, but it requires additional verification precisely because visual cues can be misleading.
Product organizations make this transition all the time, often without noticing.
A product “visual approach” happens when teams move from instrumented verification (data, experiments, structured learning) into narrative-driven reasoning. The dashboards fade into the background. The anecdotes get louder. The decision begins to feel obvious.
You’ve seen this pattern, even if you haven’t named it. A team says, “Users love it,” based on a handful of enthusiastic conversations, while adoption data is thin or ambiguous. A roadmap item gains momentum because a competitor shipped something similar, not because the underlying problem has been validated. A strategy hardens because a senior leader endorsed it early, and now revisiting it feels like dissent rather than diligence.
Sometimes it shows up in ambiguous markets. Two customer segments look similar at first glance. Early traction exists in both. The signals overlap. Instead of slowing down to differentiate, the team commits to one path and retrofits the story to justify it.
In each case, the move away from verification feels reasonable. It feels efficient. It feels like progress. And that’s the trap.
Many product failures don’t come from bad ideas. They come from stopping verification too soon. The moment a narrative becomes internally consistent is often the moment teams stop asking, “What would prove this wrong?”
Why Red-Teaming Disappears When You Need It Most
At this point, someone usually says, “That’s why we encourage healthy debate,” or “That’s why we hire smart people who challenge assumptions.” That sounds good, but it rarely works.
Red-teaming (actively testing assumptions and searching for disconfirming evidence) cannot be a personality trait. If it depends on someone being brave, contrarian, or stubborn enough to speak up, it will vanish under pressure.
When timelines compress, when stakes rise, when launch dates approach, organizations don’t become more reflective. They become more decisive. Consensus hardens. Momentum takes over. The social cost of slowing things down increases.
If skepticism is optional, it will be skipped. This is why red-teaming has to be procedural, not a cultural aspiration. It has to be something the system does, not something individuals occasionally attempt.
Properly designed, it’s a service to the decision. It exists to protect teams from the very human tendency to confuse confidence with correctness.
Aviation doesn’t ask pilots to “be more skeptical.” It gives them checklists. Product organizations should take the hint.
Recall the Branson crew. Their navigation displays showed the correct airport. Their standard operating procedures called for verifying the visual identification against those displays. Yet once a plausible runway came into view, the instruments went unconsulted. The checklist existed. The cross-check existed. Under the quiet pressure of a routine approach, both were abandoned.
This is what makes procedural design hard. It is not enough to have the procedure. The procedure has to survive the exact conditions that make people want to skip it.
The Forced Cross-Check Toolkit
The goal of red-teaming is not to slow decisions down indiscriminately. It’s to introduce just enough friction to prevent irreversible errors. What follows are practical mechanisms that create disconfirming evidence on purpose.
Not all of these need to be used all the time. Think of them as tools, not doctrine.
1. The Disconfirming Evidence Check (DEC)
Before committing to a major decision, ask a simple question:
What would have to be true for this to be wrong?
Then assign someone (not the primary advocate) to actively look for that evidence. This is not a rhetorical exercise. That person’s job is not to agree. It’s to try to falsify the assumption. If they can’t, confidence increases legitimately. If they can, you’ve learned something before it’s expensive. Most teams only collect supporting evidence. The DEC forces symmetry.
2. The Two-Runway Test
When two options look similar early on (two markets, two architectures, two positioning strategies), require a one-page answer to this question:
How will we know which runway we’re actually on?
That page should include: distinct signals that differentiate the options, how long it should take for those signals to emerge, and a clear decision rule for what happens next. Without this, teams tend to rationalize whatever outcome occurs as “what we expected.”
3. The Time-Boxed Pre-Mortem
Run a pre-mortem, but keep it short. Set a timer for 15–20 minutes and ask:
It’s six months from now and this failed. Why?
Capture the top three reasons. Then convert each into a concrete test or mitigation. If you can’t mitigate a risk, at least acknowledge it explicitly. Pre-mortems work because they temporarily suspend optimism and social pressure. The time box prevents them from becoming unstructured sprawl.
4. Kill Criteria Up Front
Most teams define success criteria. Very few define stopping criteria. Before committing, explicitly state:
What evidence would cause us to stop?
What signals would tell us this is not working?
Who has the authority to call it?
Without kill criteria, projects tend to continue by inertia. Stopping feels like failure rather than discipline.
5. Independent Verification
Separate the advocate from the validator for the most critical assumption. The person who wants something to be true should not be the same person validating whether it is true. This separation doesn’t require a new org structure. It requires role clarity. Aviation learned long ago that self-verification is fragile under pressure.
6. The Assumption Ledger
Maintain a lightweight ledger tracking each assumption, its confidence level, evidence type (data, anecdote, experiment), and next validation date.
This sounds bureaucratic until you try it. The act of writing assumptions down exposes how many are based on belief rather than evidence. It also prevents outdated assumptions from quietly hardening into “facts.”
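If it helps to picture it, here is a minimal sketch of what a ledger entry might look like if you kept it as a tiny script rather than a spreadsheet; the field names and the example assumption below are hypothetical illustrations, not a prescribed format.

```python
# Hypothetical sketch of an assumption ledger; a spreadsheet or shared doc works just as well.
from dataclasses import dataclass
from datetime import date

@dataclass
class Assumption:
    statement: str         # the belief the plan depends on
    confidence: str        # "high", "medium", or "low"
    evidence_type: str     # "data", "anecdote", or "experiment"
    next_validation: date  # when this assumption gets re-checked
    owner: str             # who is accountable for validating it

ledger = [
    Assumption(
        statement="Mid-market buyers will complete onboarding without sales help",
        confidence="medium",
        evidence_type="anecdote",
        next_validation=date(2025, 3, 1),
        owner="growth PM",
    ),
]

# Flag anything overdue for re-validation so beliefs don't quietly harden into "facts".
for a in ledger:
    if a.next_validation <= date.today():
        print(f"Re-validate: {a.statement} (current evidence: {a.evidence_type})")
```

The tooling is beside the point; what matters is that every assumption carries a confidence level, an evidence type, and a date on which someone has to look at it again.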
7. The Slow-Down-at-the-Brink Rule
This is counterintuitive but critical. The closer you are to launch, irreversible commitment, or public positioning, the more explicit your verification should become, not less. This is exactly the moment when teams are most tempted to say, “We’re too far along to question this.” It’s also the moment when mistakes become hardest to unwind. Final approach is when checklists matter most.
Doing This Without Killing Velocity
The immediate objection to all of this is speed. Yes, these practices introduce friction. That’s the point. The question is not whether to have friction, but where to put it. Not every decision deserves heavy red-teaming. Proportionality matters. Reserve the strongest checks for bets that are high-downside, high-ambiguity, and hard to reverse.
For everything else, templates and short rituals beat open-ended debate. A 10-minute DEC is often more effective than an hour-long meeting. A one-page Two-Runway Test beats a dozen slides of narrative justification.
The deeper challenge is cultural. Red-teaming must be framed as service, not defiance. The goal is not to challenge authority, but to strengthen decisions. When leaders model this, by inviting disconfirming evidence and rewarding clarity over agreement, it becomes safe to slow down at the right moments.
Speed doesn’t come from skipping checks. It comes from skipping surprises.
Landing on the Right Runway
The unsettling thing about the Southwest incident is how normal it felt right up until the end. The pilots didn’t feel reckless. They felt confident. That’s what makes it such a powerful metaphor for leadership. The moment something “looks right” is often the moment verification matters most.
Confidence should not trigger commitment. It should trigger constraints.
Red-teaming isn’t about being negative or pessimistic. It’s about acknowledging that humans are very good at convincing themselves they’re right, especially when they’re under pressure and moving fast.
The best leaders don’t rely on smart people to notice mistakes in time. They build systems that assume confidence will arrive before correctness, and they design for that reality.
So here’s the real question:
Where are you most certain right now? And what, exactly, is proving you right or wrong?
Because landing safely isn’t about believing you’re on the right runway. It’s about checking.
