Klein: Trust Your Gut (Sometimes)
Why Gary Klein’s Research on Expert Intuition Explains When Your Best People Are Right and When They Are Dangerously Wrong
The previous article argued that your organisation’s decisions scatter more than anyone believes, and that noise, the random variability in professional judgment, is at least as damaging as bias. The natural response is to structure everything: rubrics, algorithms, checklists, mechanical aggregation. Remove the human. Remove the variability. This is half right and half catastrophic. Gary Klein spent thirty years studying people who make life-or-death decisions under time pressure, people whose intuition works, and his research shows that the impulse to replace expert judgment with process will destroy exactly the capability your organisation needs most.
Klein is a cognitive psychologist who founded the Naturalistic Decision Making movement. Where Kahneman studied decision-making in the laboratory, Klein studied it in burning buildings, intensive care units, military command posts, and offshore oil platforms. Where Kahneman found systematic error, Klein found systematic competence. The same cognitive mechanism, System 1 pattern recognition, produces both. The difference is not in the person. It is in the environment. This distinction, which Klein and Kahneman eventually agreed on after years of adversarial collaboration, is the most useful framework in the decision science literature for anyone trying to figure out which of their organisation’s experts to trust and which to overrule.
1. Recognition-Primed Decision: How Experts Actually Decide
Klein’s central finding is that experts do not decide the way decision theory says they should. They do not generate multiple options, weigh them against criteria, and select the best. They recognise the situation, generate a single course of action based on pattern recognition, mentally simulate it to check whether it will work, and act. If the simulation reveals a problem, they modify the action or generate the next most plausible option. The process is serial (one option at a time), not parallel (comparing multiple options simultaneously).
Klein calls this the Recognition-Primed Decision model. He discovered it by studying fireground commanders: people who make decisions about where to send crews into burning buildings, with lives at stake, under extreme time pressure, with incomplete and changing information. These commanders almost never compared options. They looked at the fire, recognised a pattern from their experience, knew what to do, and did it. When Klein asked them to explain their decisions, they often could not articulate the reasoning. They said things like “it just felt right” or “I could see it was going to go bad.” This is not mysticism. It is pattern recognition operating below conscious articulation but above random guessing.
The model has three levels. At Level 1, the situation is immediately recognised and the action is obvious: the experienced firefighter sees a backdraft pattern and orders evacuation without deliberation. At Level 2, the situation requires diagnosis: the pattern is not immediately clear, so the expert gathers more information and tests candidate explanations, matching features against known cases or building a story of how the situation arose, until one fits. At Level 3, the situation is complex enough that the expert must evaluate a course of action by imagining its consequences, modify it if the simulation reveals problems, and iterate. Even at Level 3, the process is not comparison. It is generation, simulation, and modification of a single line of action.
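The serial character of the process is easier to see written out as a loop. What follows is a minimal sketch, not Klein's own formalisation: the function name, the simulate and modify callbacks, and the fireground actions are all hypothetical, chosen only to illustrate "one option at a time, first workable option wins".

```python
from typing import Callable, Iterable, Optional


def recognition_primed_decision(
    candidates: Iterable[str],
    simulate: Callable[[str], bool],
    modify: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the first action that survives mental simulation.

    candidates: actions in the order experience suggests them
                (most typical first); they are never compared side by side.
    simulate:   True if the imagined run-through reveals no problem.
    modify:     an adjusted version of a failing action, or None if the
                action cannot be repaired.
    """
    for action in candidates:
        if simulate(action):
            return action              # first workable option wins
        adjusted = modify(action)
        if adjusted is not None and simulate(adjusted):
            return adjusted            # repaired option is good enough
    return None                        # nothing in the library fits


# Hypothetical fireground example; the actions and rules are illustrative only.
unsafe = {"interior attack"}
repairs = {"interior attack": "interior attack with roof ventilation"}

choice = recognition_primed_decision(
    candidates=["interior attack", "exterior attack"],
    simulate=lambda a: a not in unsafe,
    modify=lambda a: repairs.get(a),
)
print(choice)  # interior attack with roof ventilation
```

Note that "exterior attack" is never evaluated. The loop stops at the first action that works, which is the point of the model: the expert is satisficing, not optimising.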
For the series, this matters because it describes how the best people in your organisation actually work. The senior architect who looks at a system design and says “that won’t scale” is not guessing. They are recognising a pattern from hundreds of systems they have seen before. The domain expert who reads a specification and says “that’s not how we do it” is not being obstructive. They are matching the specification against a library of domain situations built over years. The experienced leader who walks into a struggling team and senses something is wrong before anyone has said a word is reading cues that their pattern library can decode and their conscious mind cannot yet articulate.
2. The Pattern Library: What Expertise Actually Is
Klein’s research redefines expertise. It is not superior analytical ability. It is a richer, more accurate library of situation-action patterns built through experience. Simon estimated that expertise requires roughly 50,000 chunks of domain knowledge, accumulated over approximately ten years of sustained practice. Klein’s fieldwork is consistent with this: the expert’s advantage is not that they think harder but that they see more. They perceive cues that novices miss. They recognise patterns that novices have never encountered. They generate expectations about what will happen next, and when those expectations are violated, they know something has changed.
Four elements activate simultaneously when an expert recognises a situation: cues (what they notice in the environment), expectancies (what they predict will happen next), goals (what they are trying to achieve), and actions (what to do about it). These do not fire sequentially. They fire as a package. The firefighter does not first perceive the cue, then predict the trajectory, then identify the goal, then select the action. They perceive the situation and know what to do, in a single cognitive act. This is what “intuition” means when it works: not a feeling disconnected from evidence, but compressed expertise recognising a familiar pattern and activating the appropriate response.
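One way to picture the "package" is as a single record retrieved in one lookup, with the expectancies doubling as a tripwire. The sketch below is an illustration under stated assumptions, not a cognitive model: the Pattern class, the overlap-based matching rule, and the example library are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Pattern:
    """The four elements, stored and retrieved together."""
    cues: FrozenSet[str]
    expectancies: FrozenSet[str]
    goal: str
    action: str


def recognise(observed: Set[str], library: List[Pattern]) -> Optional[Pattern]:
    """Return the pattern sharing the most cues with what is observed."""
    best = max(library, key=lambda p: len(p.cues & observed), default=None)
    if best is None or not (best.cues & observed):
        return None                    # nothing familiar: no intuition to offer
    return best


def violated_expectancies(pattern: Pattern, events: Set[str]) -> Set[str]:
    """Expectancies that have not shown up: the signal that something changed."""
    return set(pattern.expectancies - events)


# Illustrative library; cues and actions are made up for the example.
library = [
    Pattern(frozenset({"smoke from kitchen", "flames at stove"}),
            frozenset({"water knocks the flames down"}),
            "extinguish the fire", "attack from the doorway"),
    Pattern(frozenset({"hot floor", "little visible flame"}),
            frozenset({"heat increases from below"}),
            "protect the crew", "withdraw and reassess"),
]

match = recognise({"smoke from kitchen", "flames at stove"}, library)
print(match.action)                                        # attack from the doorway
print(violated_expectancies(match, {"flames persist"}))    # {'water knocks the flames down'}
```

The single lookup returns cues, expectancies, goal, and action at once. The unmet expectancy is what tells the expert that the pattern they matched may be the wrong one.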
The implication for organisations is that expert judgment is not a soft skill to be tolerated. It is an asset to be cultivated, protected, and deployed strategically. The organisation that replaces expert judgment with checklists in domains where expertise is valid has destroyed its most valuable decision-making resource. The organisation that defers to expert judgment in domains where expertise is invalid has handed its future to confident pattern-matchers operating in an environment that does not reward pattern-matching.
The question, as always, is which domains are which.
3. When Intuition Works: High-Validity Environments
The Kahneman-Klein adversarial collaboration, published in 2009 after years of argument, produced a resolution that is more useful than either position alone. They agreed: intuition is trustworthy when two conditions are met.
First, the environment must be sufficiently regular that patterns exist to be learned. Chess is regular: the same positions recur and the rules do not change. Firefighting is regular: fire behaviour follows physical laws, and while each fire is different, the patterns are learnable. These are high-validity environments. There is a stable, underlying structure that rewards pattern recognition.
Second, the decision-maker must have had prolonged practice with valid feedback. The feedback must be prompt (you learn quickly whether your decision was right), clear (the outcome is unambiguous), and connected to the decision (you can attribute the outcome to your choice, not to luck or other factors). A chess player gets immediate, unambiguous feedback after every move. A surgeon gets feedback within hours: the patient recovers or does not. A firefighter gets feedback within minutes: the building behaves as predicted or it does not.
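The two conditions can be written down as a checklist that a team could apply to any domain where it is tempted to defer to a gut call. This is a rough sketch only: the class, the field names, and the example environments are hypothetical, and the judgments behind each True or False remain human judgments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DecisionEnvironment:
    name: str
    regular: bool                  # stable patterns exist to be learned
    prolonged_practice: bool       # the expert has had long exposure
    feedback_prompt: bool          # outcomes arrive quickly
    feedback_clear: bool           # outcomes are unambiguous
    feedback_attributable: bool    # outcomes can be tied to the decision


def unmet_conditions(env: DecisionEnvironment) -> List[str]:
    """List which of the conditions for trustworthy intuition fail."""
    checks = [
        ("environment is regular", env.regular),
        ("prolonged practice", env.prolonged_practice),
        ("feedback is prompt", env.feedback_prompt),
        ("feedback is clear", env.feedback_clear),
        ("feedback is connected to the decision", env.feedback_attributable),
    ]
    return [label for label, ok in checks if not ok]


def intuition_is_trustworthy(env: DecisionEnvironment) -> bool:
    return not unmet_conditions(env)


# Illustrative assessments, not measured data.
chess = DecisionEnvironment("chess", True, True, True, True, True)
slow_domain = DecisionEnvironment(
    "domain with delayed, ambiguous feedback", True, True, False, False, True
)

print(intuition_is_trustworthy(chess))   # True
print(unmet_conditions(slow_domain))     # ['feedback is prompt', 'feedback is clear']
```

Returning the list of failed conditions, not just a yes or no, keeps the conversation on what would have to change for the expert's intuition to become reliable.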
When both conditions are met, intuition is not just acceptable. It is superior to analytical methods. The expert operating in a high-validity environment with years of valid feedback will consistently outperform the checklist, the algorithm, and the committee. This is Klein’s core finding, and it has been replicated across domains from military command to intensive care nursing to chess.
Evans’s knowledge crunching produces high-validity environments by design. When developers and domain experts work together iteratively, testing the model against reality and refining it through feedback, they are building the conditions Klein describes: a domain with learnable regularities and prompt, clear feedback. The domain expert who has been through months of knowledge crunching has valid intuition about the domain model. Their judgment about what the specification should say is trustworthy, because it has been calibrated by the exact process Klein’s research describes.
4. When Intuition Fails: Low-Validity Environments
Kahneman’s contribution to the collaboration was equally important. He insisted, and Klein agreed, that many professional environments do not meet the two conditions. The environment is irregular, the feedback is delayed, or the feedback is ambiguous. In these environments, expert intuition is unreliable regardless of the expert’s experience or confidence.
Stock picking is a low-validity environment: the market is too complex and too influenced by other actors for patterns to be reliably learnable. Political prediction is a low-validity environment: the feedback is delayed by years and confounded by countless variables. Long-range strategic forecasting is a low-validity environment: the outcome depends on factors the forecaster cannot observe or control.
AI strategy is a low-validity environment. The technology changes faster than any executive can accumulate valid experience. The feedback is delayed by months or years. The feedback is ambiguous: when an AI initiative fails, it is rarely clear whether the failure was caused by the strategy, the implementation, the technology, the culture, or the timing. The executive who says “I have a gut feeling about where AI is heading” is exhibiting exactly the confident pattern-matching that Kahneman’s research shows is unreliable in environments this novel.
This does not mean the executive’s judgment is worthless. It means their judgment about AI strategy should be treated differently from the domain expert’s judgment about specification quality. The first operates in a low-validity environment and should be structured, tested, and challenged. The second operates in a high-validity environment and should be trusted, protected, and amplified. The organisation needs both. The decision architecture must distinguish between them.
Beer’s System 3* (the audit channel) is the architectural mechanism for making this distinction. The audit channel provides direct, unfiltered access to what is actually happening. In Klein’s terms, it tests whether the environment is providing valid feedback. If the audit reveals that the domain expert’s intuitions are consistently confirmed by the AI-generated output, you are in a high-validity environment and the expert’s judgment should be trusted. If the audit reveals that the strategic forecast is consistently wrong, you are in a low-validity environment and the judgment should be structured.
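In its simplest form, the audit is a running tally of how often a class of judgments was later confirmed. The sketch below assumes such a log exists. The function name, the 0.8 threshold, and the minimum of 20 cases are arbitrary illustrative choices, not figures from Beer, Klein, or Kahneman.

```python
from typing import Iterable


def audit_verdict(
    confirmed: Iterable[bool],
    threshold: float = 0.8,   # assumed cut-off; tune to the stakes involved
    min_cases: int = 20,      # assumed minimum before drawing any conclusion
) -> str:
    """Classify a stream of judgments by how often outcomes confirmed them."""
    results = list(confirmed)
    if len(results) < min_cases:
        return "insufficient evidence"
    rate = sum(results) / len(results)
    return "trust" if rate >= threshold else "structure"


# Illustrative logs: True means the later outcome confirmed the judgment.
spec_reviews = [True] * 27 + [False] * 3         # 90% confirmed
strategic_forecasts = [True] * 9 + [False] * 21  # 30% confirmed

print(audit_verdict(spec_reviews))          # trust
print(audit_verdict(strategic_forecasts))   # structure
print(audit_verdict([True] * 5))            # insufficient evidence
```

A high confirmation rate is evidence of a valid environment, not proof. A short lucky streak in an irregular domain can look the same, which is why the sketch refuses to give a verdict on a handful of cases.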
5. The Pre-Mortem: Klein’s Most Practical Tool
Klein’s most widely adopted contribution is the pre-mortem. The method is simple: before a decision is implemented, the team imagines it has already been implemented and has failed. Each member independently writes down the reasons for the failure. The results are collected and discussed.
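The collection step matters more than it looks: the reasons must be written independently before anyone sees anyone else's. A minimal sketch of the tally, assuming the answers have already been grouped into short, comparable phrases (the participants and reasons below are invented for illustration):

```python
from collections import Counter
from typing import Dict, List, Tuple


def tally_premortem(responses: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Count how many participants independently named each failure reason."""
    counts: Counter = Counter()
    for reasons in responses.values():
        counts.update(set(reasons))    # one vote per person per reason
    return counts.most_common()        # most widely shared concerns first


# Hypothetical answers, written individually before any discussion.
responses = {
    "participant_a": ["unclear ownership", "training skipped"],
    "participant_b": ["unclear ownership"],
    "participant_c": ["data quality", "unclear ownership"],
}

for reason, votes in tally_premortem(responses):
    print(f"{votes} x {reason}")
```

Reasons named by several people independently are the strongest signal, but the single-vote items deserve reading too: they are often the ones that would never have survived a group discussion.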
The pre-mortem works because it inverts the cognitive dynamics that Kahneman identified. WYSIATI (What You See Is All There Is) suppresses awareness of what could go wrong, because the plan is coherent and the team is committed. The pre-mortem gives explicit permission to name what could go wrong, bypassing the social pressure to agree. Overconfidence is reduced because the team has been asked to generate failure narratives, not success narratives. And because the exercise is individual before it is collective, it captures the disagreement that Kahneman’s decision hygiene requires: the independent judgment that group dynamics would otherwise suppress.
For AI transformation, the pre-mortem has a specific application. Before deploying an AI-assisted workflow, before rolling out a specification-driven development process, before restructuring teams around domain boundaries, imagine it has failed. What went wrong? The answers will surface the assumptions the plan relies on but has never tested. They will name the dependencies the plan assumes but has never verified. And they will reveal the political objections that will emerge once the plan threatens the people whose roles depend on the current architecture.
The pre-mortem is Argyris made structural. Argyris showed that defensive routines suppress the information the organisation needs. The pre-mortem creates a structured context in which the undiscussable becomes not just discussable but expected. It is not a cultural intervention. It is an architectural one. And it works in organisations that would resist Argyris’s deeper prescription, because it does not require anyone to change their defensive routines. It requires only that they answer a question.
6. Klein’s Limits
Klein must be read with his limitations visible. His model is descriptive, not normative: it describes how experts do decide, not how they should. The expert whose pattern library contains bad patterns will execute those patterns with the same speed and confidence as the expert whose library is good. RPD does not distinguish between valid and invalid expertise. The Kahneman-Klein resolution does, but only by stepping outside Klein’s framework and asking about the environment.
Klein’s model also struggles with genuinely novel situations. If no pattern exists in the expert’s library, RPD has nothing to work with. The expert in an entirely new domain, the experienced insurance underwriter encountering AI-generated risk models for the first time, has no relevant patterns. Their intuition will default to the closest available analogue, which may be dangerously wrong. Taleb’s Black Swan territory is precisely where Klein’s model has least to offer and where structured, humble, experimental approaches have most value.
The deepest tension is with Simon. Simon says: design the decision environment so the right premises reach the right people. Klein says: trust the expert who has been calibrated by the right environment. These are not contradictory. They are complementary, and the organisation needs both. Design the environment (Simon) to produce experts with valid pattern libraries (Klein), and then trust those experts to decide. The architecture creates the conditions for expertise. The expertise produces the decisions. Neither works without the other.
(An Organisational Prompt is something you can do now.)
Run a pre-mortem on your next AI decision.
Before you approve the next initiative or the next team restructuring, gather the people who will be affected. Tell them: “Imagine it is six months from now and this has failed completely. What went wrong?” Give them five minutes to write independently. Then read the answers aloud. The things they write will be the things they already know but have not been able to say. The pre-mortem does not require courage. It requires only a question and five minutes of silence. The information that emerges will be more valuable than the analysis that preceded it, because the analysis was constructed by people who wanted the plan to succeed, and the pre-mortem was constructed by people who were given permission to imagine it failing.
Further Reading
Gary Klein: Sources of Power: How People Make Decisions - The foundational text on naturalistic decision-making. The RPD model, the fireground studies, and the argument that expertise is pattern recognition, not analysis. The most important book on how experts actually decide.
Gary Klein: Seeing What Others Don’t: The Remarkable Ways We Gain Insights - How breakthroughs happen: by challenging assumptions, making connections, and noticing contradictions. The insight research that complements the RPD model.
Gary Klein: Streetlights and Shadows: Searching for the Keys to Adaptive Decision Making - Ten claims about how we should make decisions, and the research that challenges each one. The best Klein book for a reader sceptical of the “trust intuition” message.
Daniel Kahneman and Gary Klein: Conditions for Intuitive Expertise: A Failure to Disagree - The adversarial collaboration. When intuition works and when it does not. The single most useful paper in the decision science literature for practitioners.
I write about the industry and its approach in general. None of the opinions or examples in my articles necessarily relate to present or past employers. I draw on conversations with many practitioners and all views are my own.

