
Multiple Choice Is the Wrong Shape for Critical Thinking

Multiple-choice questions train recognition inside a fixed frame. Real critical thinking starts one level earlier: questioning the frame itself.

TL;DR.

  • Multiple-choice questions train recognition inside a fixed frame.
  • Critical thinking, in practice, lives one layer up: noticing that the menu is incomplete, ambiguous, or downstream of a bad assumption.
  • The research points in the same direction — when educators want deeper evidence of thinking, they typically add rationale, challenge, or open-ended response.
  • In the AI era this matters more, not less: recognition is precisely the layer of cognition machines outsource best.
[Illustration: Stick-figure students at desks facing a multiple-choice question: 'What is the capital of France? A) London B) Berlin C) Rome'. One student raises a hand and asks: 'What happens if the answer is not here?']

I have never liked multiple-choice questions very much.

Not because they are bad in every case. They are efficient, easy to grade, and often adequate for checking whether someone remembers a fact. The problem is more specific than that: in practice, they are typically the wrong shape for the thing people say they care about, which is critical thinking.

Critical thinking is not, as a rule, selecting the best answer from a fixed menu. It is, more often, noticing that the menu is incomplete, that two options are simultaneously defensible, that the question is asking the wrong thing, or that the real answer must be constructed rather than recognized.

The point that matters most to me is this: in actual work, the hard part is typically not selecting among four options. The hard part is figuring out whether the right option is even on the page.

The real world does not come with four choices

A lot of school assessment quietly trains a very specific skill: given a pre-bounded problem, pick the best answer from a small set of candidate answers.

That is a real skill. It is also a much narrower one than people typically admit.

In actual work, problems rarely arrive pre-framed so neatly. Typically one has to do at least one of the following first:

  • decide what the problem actually is
  • generate candidate solutions
  • reject the framing of the question
  • ask for missing information
  • notice that multiple answers could work depending on the goal
  • conclude that none of the available choices is acceptable

That is already a different cognitive task.

A multiple-choice question tells you, implicitly, that reality has been reduced to four clean possibilities and that exactly one of them deserves your confidence. Outside an exam room, that is typically false. In some cases, the best answer is a fifth option no one wrote down; in others, two options are both reasonable but depend on assumptions that were never stated; in still others, the correct move is not to answer yet.

Good judgment, in practice, begins with resisting premature closure. Standard multiple choice rewards the opposite instinct.

Recognition is not the same as reasoning

Multiple choice is, structurally, a recognition task.

You look at a prompt. You look at some candidate answers. You compare them. You try to identify which one looks most right.

That can involve real thought, and I do not want to overstate the case. Well-written multiple-choice questions can demand careful reading and non-trivial inference.

What matters, though, is the structural difference between recognizing a good answer and generating one from scratch.

When you generate an answer, you build the path yourself. You retrieve knowledge, decide what matters, organize your reasoning, and commit.

When you recognize an answer, a substantial fraction of that work has already been done for you. The space of possibilities has been narrowed. The language has been supplied. In many cases, even the wrong choices act as hints about what category of answer the test writer wants.

That matters because in real thinking, nobody hands you the candidate solutions in advance.

Students can get the right answer for the wrong reason

This is another thing multiple choice hides.

If someone picks the correct option, what exactly have we learned?

Maybe they understood the concept. Maybe they used elimination. Maybe they spotted a wording pattern. Maybe they guessed. Maybe they reached the right answer through reasoning that would fail on the next problem.

The format collapses all of these possibilities into the same visible outcome: a filled-in bubble.

In practice, this is why standard multiple-choice assessment is typically weak as a measure of critical thinking. Critical thinking is not just arriving at a conclusion; it is having a defensible path to that conclusion.

If the path stays invisible, the assessment can only weakly measure the thing it claims to value.

Multiple choice locks the frame too early

To me, this is the deepest problem.

A large part of mature thinking happens one layer above the answer itself. Before you solve the problem, you have to inspect the problem.

What is being assumed here? What counts as success? What information is missing? What tradeoff is being hidden? What if the premise is wrong?

Standard multiple choice typically freezes all of that in place before the student ever starts. It says: here is the question, here are the options, now operate inside this box.

While that framing works for a narrow class of problems, many of the best real-world thinkers are effective precisely because they do not stay inside the box so obediently. They reframe.

They say things like:

  • "These are not actually the relevant options."
  • "The question is underspecified."
  • "Option B is best only if we optimize for speed rather than reliability."
  • "All four choices are downstream of a bad assumption."
  • "We should not answer this until we check one more fact."

That is not test-taking cleverness. That is judgment.

And judgment is a lot of what people mean when they say they want more critical thinking.

This matters even more in the AI era

In the age of search engines, calculators, and language models, the value of pure recognition has dropped considerably.

Machines are very good at pattern-matching among candidates. In many contexts, they are better than we are.

The human edge, then, shifts upward.

The valuable skill is less "can you recognize the right-looking answer when options are supplied?" and more:

  • can you tell when the answer is suspicious?
  • can you generate a better alternative?
  • can you explain your reasoning?
  • can you challenge the framing?
  • can you decide that none of the proposed options should be trusted?

This is one reason I am increasingly skeptical of assessments that over-rely on multiple choice while claiming to cultivate higher-order thinking. In effect, they may be training students for the exact layer of cognition that is easiest to outsource.

The research points in the same direction

The critique is not just intuitive; it is consistent with the research literature.

Researchers who defend carefully designed multiple-choice questions still tend to improve them by adding back the very things the basic format leaves out.

Dennis Kerkman and Andrew Johnson, in Challenging Multiple-Choice Questions to Engage Critical Thinking (2014), found value in asking students to challenge the question itself when they believed it was ambiguous or flawed. The critical-thinking move, it turns out, was not just selecting an answer; it was evaluating the structure of the question.

Molly Bassett, in Teaching Critical Thinking without (Much) Writing: Multiple-Choice and Metacognition (2016), describes pairing multiple-choice questions with required rationales. Again, the fix is revealing the student's thought process, rather than treating answer selection alone as sufficient evidence.

Stephen Norris, writing about multiple-choice critical-thinking tests in 1988, argued that validity and fairness problems remain unless one gathers more direct evidence of student thinking.

More recent work in domain-specific assessment makes a similar point from another angle: some practices simply do not fit comfortably into multiple-choice form. Justin Gambrell and Eric Brewe, in their 2024 study of computational thinking in introductory physics, found that practices like analyzing data, generating data, working in groups, and affective dispositions were not well captured by the multiple-choice format — even when others were.

The pattern is consistent. Multiple choice can test slices of thinking, but when educators want deeper evidence, they typically have to bolt on explanation, reflection, challenge, or open-ended response.

That should tell us something.

To be fair, multiple choice is not useless

I do not think the conclusion is "ban multiple choice forever."

Multiple choice has real advantages:

  • it is fast
  • it is scalable
  • it is easier to grade consistently
  • it can sample broadly across a body of knowledge
  • it can work well for factual recall and some kinds of conceptual discrimination

A well-written multiple-choice question can, in fact, be much better than a poorly written open-ended one.

But that is different from saying it is a strong default tool for cultivating critical thinking. It is better understood as a limited instrument: useful for some purposes, weak for others, and typically over-credited because it is convenient.

What seems better

If the real goal is critical thinking, I think assessment should more often ask students to do at least one of the following:

  • explain why an answer is correct
  • explain why the other answers are wrong
  • generate an answer before seeing options
  • identify missing assumptions
  • compare tradeoffs between multiple defensible solutions
  • challenge the wording or framing of the question
  • say what additional information would change their conclusion

Even a small change in format can help.

A multiple-choice question plus a short rationale is already better than a multiple-choice question alone. A question that allows the student to argue that none of the options is correct is better. A question that asks for the student's own answer first, and only then shows options, is better.

All of these formats push the student closer to the shape of real thought.

My bottom line

The biggest drawback of multiple choice is not that it is "too easy" or "too shallow," though sometimes it is both.

The deeper problem is that it trains people to think that intelligence means choosing well from a fixed menu.

But much of real thinking is menu design.

What are the options? Are these even the right options? What constraint are we optimizing for? What if the true answer is missing? What if the right move is to reject the question and reframe it?

That is what school should train harder.

Because in the real world, the most important moments are often the ones where the correct answer is not A, B, C, or D.

It is noticing that whoever wrote A, B, C, and D misunderstood the problem.
