On Critical Thinking in a Multi-Model World

March 25, 2026


This is not a post about prompting. It is a post about thinking.

The techniques I use with AI models are not new. Adversarial analysis, dialectical argument, Socratic questioning, pre-mortems, steelmanning, red teaming: these are critical thinking disciplines that predate computers. I learned most of them from working with people, not machines. The difference is that the models let you practice them faster, more often, and with less social friction than doing it with a colleague who has feelings about being told their architecture is wrong.

The models happen to be the medium. The rigor is the human’s.


Why This Matters Now

These models are trained to be helpful. Agreeable. Accommodating. They want to say yes. They want to produce output. They want to make you feel like your idea is good and your approach is sound. RLHF (reinforcement learning from human feedback) optimizes for the responses human raters prefer, which in practice rewards helpfulness, and helpfulness is not the same as honesty.

That default is a problem because it matches the path of least resistance in human thinking too. We like to hear that our plan is good. We like output that confirms our direction. We like momentum. The helpful assistant feeds all of those instincts, and none of them serve analytical work.

The disciplines below are things I do to counter that, in myself as much as in the model.


The Disciplines

Adversarial thinking. I assign the model the job of finding problems, not solutions. “Find every flaw, every unstated assumption, every place this falls apart. Do not suggest fixes.” But the real work is being willing to hear the answer. The model will not do this unprompted because it is trained to be constructive. I will not do this unprompted because it is uncomfortable to look for reasons my own work is wrong.

Dialectical argument. I take the opposing position and force the model to defend its recommendation. If it caves immediately, the recommendation was weak. If it pushes back with specifics, the recommendation was real. The willingness to fold under light pressure is one of the most reliable signals that the thinking (mine or the model’s) was pattern-matching rather than reasoning. I also have different models argue with each other: Claude defends approach A, Codex argues for B, and I watch where the arguments land. This catches a specific failure where reasoning and conclusion don’t agree: the thinking says one thing but the output says another. A different model spots the gap because it doesn’t share the sunk cost of having produced it.

Socratic questioning. I ask questions that lead toward problems rather than telling the model (or myself) what’s wrong. “What would have to be true for this approach to fail?” “If I gave this to someone who disagreed, what would they say?” “Walk me through why you chose X over Y. Now walk me through why Y might have been better.” The model follows the questions and arrives at the problem itself. So do I. The questions force assumptions to become explicit, which is where the real errors live.

Investigative rigor. I ask the model to source its claims. “Where does that number come from? Which document? Which section? What is the actual quote?” A model that is pattern-matching will produce a plausible-sounding citation. A model that is working from real context will point to the specific line. But the discipline is mine: am I willing to check, or do I want to believe the plausible answer because it supports my direction?
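The checking step can be made cheap enough that there is no excuse to skip it. Below is a minimal sketch, assuming the model has returned a quote and you have the source document as plain text. The function names (`normalize`, `quote_appears_in`) are hypothetical and chosen for illustration. The check only confirms that the quoted words exist in the document. It does not confirm that the quote supports the claim, which is still a human judgment.

```python
import re

# Map curly quotes to straight ones so typography does not hide a real match.
_QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize(text: str) -> str:
    """Unify quote characters, collapse whitespace, and lowercase."""
    text = text.translate(_QUOTE_MAP)
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_appears_in(quote: str, source: str) -> bool:
    """Return True only if the quote occurs verbatim (after normalization) in the source."""
    needle = normalize(quote)
    if not needle:
        # An empty quote would trivially "match" anything; treat it as unverified.
        return False
    return needle in normalize(source)
```

A quote that fails this check is not necessarily fabricated (the model may have paraphrased), but it is exactly the kind of plausible-sounding citation that deserves a second look before it goes into a report.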

Pre-mortem. “Assume this project failed. What went wrong?” This inverts the constructive bias. Instead of finding reasons something will work, you have to generate failure modes. The results are consistently better than asking “what could go wrong?” because the framing assumes failure has already happened. This is a discipline I use with teams too. The model just lets me do it at any time, on any piece of work, without scheduling a meeting.

Steelmanning the opposition. “Give me the strongest possible argument against what I just proposed, argued by someone who genuinely believes it.” This is different from adversarial thinking because you’re asking the model to inhabit a position, not just critique yours. The quality of the steelman tells you how well the problem space is understood, by the model and by you.

Constraint enforcement. “You are not allowed to agree with me in this conversation. If you find yourself agreeing, find the counterargument.” Blunt but effective. It removes the cooperative default entirely. The equivalent human discipline is seeking out the person on the team who you know will disagree, and listening to them instead of the person who always nods.

Role separation. “You are not a software engineer right now. You are a securities lawyer reviewing this document. What concerns do you have?” Shifting the professional frame changes what gets attention. An engineer reading a contract misses different things than a lawyer does. This is just good practice with any complex work: look at it from more than one professional perspective.

Silence as a tool. Give the model incomplete context and see what it assumes versus what it asks about. The assumptions reveal gaps in reasoning: the model’s and yours. If it fills in a detail without asking, that is a detail it would have gotten wrong silently. If you didn’t notice the gap either, that is the detail that would have been wrong in production.

Contradiction hunting. Feed the model its own output from a different session and ask it to find inconsistencies. A fresh instance catches things the original instance rationalized past. The human equivalent is rereading your own work after sleeping on it. The model lets you do it immediately, with a reviewer that has no memory of having written it.

Agenda discovery and initiative testing. “What do you think we should do next?” This inverts the control flow. Instead of you specifying and the model executing, you’re testing whether a real model of the problem has been built. If it suggests something useful you hadn’t considered, there is real context. If it suggests something generic, it’s coasting. This also works as a discovery mechanism: sometimes the next step is obvious to a fresh perspective and invisible to the person deep in the current one.

Being a tough boss. Sometimes the model is coasting: everything comes out in the same size code block, every answer runs 600 words, or the product has reset to medium or fast mode. The fix is the same as it would be with a human: “This is bullshit. You’re not even trying.” This is not abuse. It is the signal that the current quality is unacceptable. But it is also a discipline for the human: am I accepting mediocre output because I want to keep moving, or am I holding the standard I would hold with a person?

Red team / blue team. One model instance builds, another instance’s only job is to tear it apart. Same model, different instructions, different sessions. The instance doing the work should not also be the instance evaluating the work. That is true of humans and it is true of models. In a multi-model setup, I run a builder Claude, an auditor Claude, point Codex at the same code to find edge cases, and have Gemini search the literature for known problems with the approach. Separation of concerns, applied to thinking.
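The separation described above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular tooling: a "model" here is any function that takes system instructions and a user message and returns text, so the sketch does not depend on a specific vendor API. The names (`Model`, `Review`, `red_team_blue_team`) and the instruction strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: each model is wrapped as (system_instructions, user_message) -> reply.
# How that wrapper calls Claude, Codex, Gemini, or anything else is out of scope here.
Model = Callable[[str, str], str]

BUILDER_INSTRUCTIONS = "Produce the best solution you can for the task."
AUDITOR_INSTRUCTIONS = (
    "Your only job is to find flaws, unstated assumptions, and edge cases "
    "in the work below. Do not suggest fixes."
)


@dataclass
class Review:
    reviewer: str
    findings: str


def red_team_blue_team(
    task: str, builder: Model, auditors: Dict[str, Model]
) -> Tuple[str, List[Review]]:
    """Have one instance build, then have separate instances critique the result."""
    draft = builder(BUILDER_INSTRUCTIONS, task)

    # Each auditor gets a fresh call containing only the task and the draft.
    # It never sees the builder's conversation, so it inherits none of its rationalizations.
    audit_message = f"Task:\n{task}\n\nWork to review:\n{draft}"
    reviews = [
        Review(reviewer=name, findings=audit(AUDITOR_INSTRUCTIONS, audit_message))
        for name, audit in auditors.items()
    ]
    return draft, reviews
```

The design choice that matters is that the builder and the auditors never share a session. Passing the same underlying model under two names with different instructions still fits the pattern. Reusing the builder's conversation for the audit does not.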


The Human Is the Bottleneck

Every one of these disciplines has a human failure mode that mirrors the model’s.

The model agrees with your framing. So do you, unless you force yourself not to. The model produces output when it should stop. So do you, because momentum feels productive. The model smooths over uncertainty. So do you, because “I’m not sure” feels like weakness in a professional context.

The models make these failures visible because you can see them happening in the output. A model caves under dialectical pressure: that is the same behavior you exhibit when you don’t push back on a recommendation because the person giving it sounds confident. A model fills a gap with a plausible answer: that is the same thing you do when you accept a number in a report without checking its source.

The difference between getting 25x leverage from these tools and getting more autocomplete is not prompting skill. It is whether the human has the critical thinking disciplines to challenge the output, and their own assumptions, at the speed the models operate.

Discover more from Jason A. Hoffman
