TL;DR.
- Multiple-choice questions train recognition inside a fixed frame.
- Critical thinking lives one layer up: noticing that the menu is incomplete, ambiguous, or downstream of a bad assumption.
- The research points the same way — when educators want deeper evidence of thinking, they add rationale, challenge, or open-ended response.
- In the AI era this matters more: recognition is the layer of cognition most easily outsourced to machines.
Multiple-choice questions aren't bad. They're efficient, easy to grade, and fine for checking whether someone remembers a fact. The problem is more specific: they're the wrong shape for what people say they care about, which is critical thinking.
Critical thinking is rarely selecting the best answer from a fixed menu. It's noticing that the menu is incomplete, that two options are simultaneously defensible, that the question is asking the wrong thing, or that the real answer must be constructed rather than recognized.
In actual work, the hard part isn't selecting among four options. It's figuring out whether the right option is even on the page.
The real world does not come with four choices
School assessment quietly trains a narrow skill: given a pre-bounded problem, pick the best answer from a small set.
Real problems rarely arrive pre-framed. Before you can answer, you typically have to do one or more of the following: (i) decide what the problem actually is, (ii) generate candidate solutions, (iii) reject the framing, (iv) ask for missing information, (v) notice that multiple answers work depending on the goal, or (vi) conclude that none of the choices on offer is acceptable.
A multiple-choice question tells you, implicitly, that reality has been reduced to four clean possibilities and that exactly one deserves your confidence. Outside an exam room, that's usually false. Sometimes the best answer is a fifth option no one wrote down. Sometimes two options are both reasonable but depend on unstated assumptions. Sometimes the correct move is not to answer yet.
Good judgment begins with resisting premature closure. Standard multiple choice rewards the opposite instinct.
Recognition is not reasoning
Multiple choice is structurally a recognition task. You read a prompt, compare candidate answers, and identify which one looks most right.
That can involve real thought — well-written questions demand careful reading and non-trivial inference. But there's a structural difference between recognizing a good answer and generating one from scratch.
When you generate an answer, you build the path yourself. You retrieve knowledge, decide what matters, organize your reasoning, and commit. When you recognize an answer, a substantial fraction of that work has already been done for you. The space of possibilities has been narrowed, the language has been supplied, and the wrong choices often act as hints about what category of answer the test writer wants.
In real thinking, nobody hands you the candidate solutions in advance.
Students can get the right answer for the wrong reason
If someone picks the correct option, what have we learned?
Maybe they understood the concept. Maybe they used elimination. Maybe they spotted a wording pattern. Maybe they guessed. Maybe they reached the right answer through reasoning that would fail on the next problem.
The format collapses all of these into the same visible outcome: a filled-in bubble. Critical thinking isn't just arriving at a conclusion; it's having a defensible path to that conclusion. If the path stays invisible, the assessment can only weakly measure what it claims to value.
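A rough back-of-the-envelope sketch makes the point concrete. It assumes an idealized student who truly knows some fraction of the items and guesses uniformly at random on the rest; real test-takers also use elimination and wording cues, so this is an illustration, not a measurement model. The function names are hypothetical, chosen for this example only.

```python
from fractions import Fraction


def expected_score(known: Fraction, options: int = 4) -> Fraction:
    """Expected fraction correct for a student who knows `known` of the
    items and guesses uniformly at random on the rest (assumption)."""
    return known + (1 - known) * Fraction(1, options)


def p_knew_given_correct(known: Fraction, options: int = 4) -> Fraction:
    """Bayes' rule: probability the student actually knew an item,
    given only that the correct bubble was filled in."""
    return known / expected_score(known, options)


if __name__ == "__main__":
    for known in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
        print(
            f"knows {known}: expected score {expected_score(known)}, "
            f"P(knew | correct) {p_knew_given_correct(known)}"
        )
```

Under these assumptions, a student who knows half the material is expected to score 5/8 on four-option items, and one in five of their correct answers is a lucky guess. The filled-in bubble looks the same either way.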
Multiple choice locks the frame too early
A large part of mature thinking happens one layer above the answer itself. Before you solve the problem, you inspect it.
What is being assumed? What counts as success? What information is missing? What tradeoff is being hidden? What if the premise is wrong?
Standard multiple choice typically freezes all of that in place before the student ever starts. Here is the question, here are the options, now operate inside this box.
Many of the best real-world thinkers are effective precisely because they don't stay inside the box. They reframe:
- "These aren't actually the relevant options."
- "The question is underspecified."
- "Option B is best only if we optimize for speed rather than reliability."
- "All four choices are downstream of a bad assumption."
- "We shouldn't answer this until we check one more fact."
That's judgment. And judgment is a lot of what people mean when they say they want more critical thinking.
This matters more in the AI era
With search engines, calculators, and language models, the value of pure recognition has dropped considerably. Machines are very good at pattern-matching among candidates — often better than we are.
The human edge shifts upward. The valuable skill is less "can you recognize the right-looking answer when options are supplied?" and more:
- Can you tell when the answer is suspicious?
- Can you generate a better alternative?
- Can you explain your reasoning?
- Can you challenge the framing?
- Can you decide that none of the proposed options should be trusted?
Assessments that over-rely on multiple choice while claiming to cultivate higher-order thinking may be training students for the exact layer of cognition easiest to outsource.
The research points in the same direction
Researchers who defend carefully designed multiple-choice questions still tend to improve them by adding back the very things the basic format leaves out.
Dennis Kerkman and Andrew Johnson, in "Challenging Multiple-Choice Questions to Engage Critical Thinking" (2014), found value in asking students to challenge the question itself when they believed it was ambiguous or flawed. The critical-thinking move wasn't selecting an answer; it was evaluating the structure of the question.
Molly Bassett, in "Teaching Critical Thinking without (Much) Writing: Multiple-Choice and Metacognition" (2016), describes pairing multiple-choice questions with required rationales, so that the student's thought process is revealed rather than answer selection being treated as sufficient evidence.
Stephen Norris, writing in 1988 about multiple-choice critical thinking tests, argued that validity and fairness problems remain unless one gathers more direct evidence of student thinking.
More recent work makes a similar point from another angle. Justin Gambrell and Eric Brewe, in their 2024 study of computational thinking in introductory physics, found that practices like analyzing data, generating data, and working in groups, along with affective dispositions, were not well captured by the multiple-choice format.
The pattern is consistent. Multiple choice can test slices of thinking, but when educators want deeper evidence, they have to bolt on explanation, reflection, challenge, or open-ended response.
To be fair, multiple choice is not useless
Multiple choice is fast, scalable, and consistent to grade. It samples content broadly and works well for factual recall and some conceptual discrimination. A well-written multiple-choice question can be much better than a poorly written open-ended one.
But that's different from saying it's a strong default for cultivating critical thinking. It's a limited instrument: useful for some purposes, weak for others, and typically over-credited because it's convenient.
What seems better
If the real goal is critical thinking, assessment should more often ask students to:
- explain why an answer is correct
- explain why the other answers are wrong
- generate an answer before seeing options
- identify missing assumptions
- compare tradeoffs between multiple defensible solutions
- challenge the wording or framing of the question
- say what additional information would change their conclusion
Even small format changes help. A multiple-choice question plus a short rationale is better than the question alone. A question that lets a student argue that none of the options is correct is better. A question that asks for the student's own answer first, and only then shows options, is better.
My bottom line
The deeper problem with multiple choice isn't that it's too easy or too shallow, though sometimes it's both. It's that it trains people to think intelligence means choosing well from a fixed menu.
Much of real thinking is menu design. What are the options? Are these even the right options? What constraint are we optimizing for? What if the true answer is missing? What if the right move is to reject the question and reframe it?
In the real world, the most important moments are often the ones where the correct answer isn't A, B, C, or D. It's noticing that whoever wrote A, B, C, and D misunderstood the problem.
References
- Bassett, Molly H. "Teaching Critical Thinking without (Much) Writing: Multiple-Choice and Metacognition." Teaching Theology & Religion 19, no. 1 (2016).
- Gambrell, Justin, and Eric Brewe. "Analyzing interviews on computational thinking for introductory physics students: Toward a generalized assessment." Physical Review Physics Education Research 20, no. 1 (2024).
- In'nami, Yo, and Rie Koizumi. "A meta-analysis of test format effects on reading and listening test performance: Focus on multiple-choice and open-ended formats." Language Testing 26, no. 2 (2009).
- Kerkman, Dennis, and Andrew Johnson. "Challenging Multiple-Choice Questions to Engage Critical Thinking." InSight: A Journal of Scholarly Teaching 9 (2014): 92-97.
- Norris, Stephen P. "Controlling for Background Beliefs When Developing Multiple-Choice Critical Thinking Tests." Educational Measurement: Issues and Practice 7, no. 3 (1988): 5-11.