{{ briefLead }}
| Score | Lane | Next move |
|---|---|---|
| {{ band.range }} | {{ band.label }} | {{ band.shortAction }} |
This chart splits the seven items into correct answers, intuitive lure picks, and other misses so the total score is easier to interpret.
Use it to separate first-instinct capture from other reasoning slips before you compare this pass to another run.
This section-level chart places the classic trio and the extension quartet side by side so the trap families are easier to compare.
A balanced score here suggests the two sections behaved similarly. A gap points to the section that deserves the next review pass.
The table keeps the keyed answer, intuitive lure, and reset move together so you can review exactly which trap family fired on each item.
| # | Item | Section | Your response | Intuitive lure | Keyed answer | Status | Reset move |
|---|---|---|---|---|---|---|---|
| {{ row.id }} | {{ row.short }} | {{ row.sectionLabel }} | {{ row.responseLabel }} (changed {{ row.changedCount }} time{{ row.changedCount === 1 ? '' : 's' }}) | {{ row.lureLabel }} | {{ row.correctLabel }} | {{ row.statusLabel }} | {{ row.checkMove }} |
These notes keep the CRT focus narrow: what the trap was, what the keyed answer was, and which reset move is worth repeating.
{{ card.context }}
{{ card.lead }}
{{ card.detail }}
Cognitive reflection is the ability to stop an answer that feels obvious, test it, and only then commit. The Cognitive Reflection Test was built around that moment. Each item looks short and familiar, yet the first response that comes to mind is often wrong, so the useful signal is whether a slower check takes over in time.
A seven-item version broadens the original three-question set with four additional problems that keep the same basic conflict while covering more trap families. Instead of leaning only on the famous bat-and-ball, machines, and lily-pad items, the expanded set also samples combined rates, ordinal counting, buy-sell netting, and changing percentage bases.
That wider coverage helps because a short reflection score can otherwise be fragile. Someone may handle the famous classic items yet still slip when the same conflict appears in a less familiar form. Another person may look stronger than they really are simply because the best-known items have already been seen before.
CRT-7 is still a brief screen rather than a full psychometric profile. It can support teaching, self-review, and quick comparison of reasoning traps, but it should not be treated as a diagnosis, a hiring filter, or a complete statement about intelligence.
Frederick's original CRT became influential because the wrong answers are not random mistakes. The lure answers tend to dominate when people rely on the first appealing interpretation instead of rebuilding the relationship in the prompt. The seven-item expansion was developed to preserve that same override problem while reducing dependence on only three famous questions.
In the expansion study, Toplak, West, and Stanovich reported that the added four-item form behaved similarly to the original and that combining both sets into a seven-item test produced stronger reliability than the three-item form alone. That is why a split between the classic trio and the extension quartet can be informative: it shows whether the reflective habit is holding across different kinds of bait.
This assessment classifies every final response into one of three buckets: Correct, Intuitive lure, or Other miss. The first bucket means the keyed answer survived. The second means the standard tempting answer won. The third means the obvious lure was avoided, but the setup, arithmetic, or final comparison still drifted off course. Lure avoidance is then summarized as the share of items that did not end in the lure bucket.
| Item | Keyed answer | Intuitive lure | Main trap family |
|---|---|---|---|
| Bat and Ball | 5 cents | 10 cents | Difference equation |
| Machines | 5 minutes | 100 minutes | Rate scaling |
| Lily Pads | 47 days | 24 days | Doubling sequence |
| John and Mary | 4 days | 9 days | Combined rates |
| Jerry | 29 students | 30 students | Ordinal indexing |
| Pig Trade | $20 | $10 | Cash-flow accounting |
| Simon | Same amount of money | 5% more money | Percentage base shift |
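The three-bucket classification and the lure-avoidance share described above can be sketched as a small function. The keyed answer and lure for the example come from the Bat and Ball row in the table; the type and function names are illustrative, not the tool's actual code.

```typescript
// Outcome buckets used by the assessment, as described in the text above.
type Outcome = "Correct" | "Intuitive lure" | "Other miss";

interface Item {
  keyed: string; // keyed (correct) answer
  lure: string;  // standard tempting answer
}

// Classify one final response against an item's keyed answer and lure.
function classify(item: Item, response: string): Outcome {
  if (response === item.keyed) return "Correct";
  if (response === item.lure) return "Intuitive lure";
  return "Other miss";
}

// Lure avoidance: share of items that did NOT end in the lure bucket.
function lureAvoidance(outcomes: Outcome[]): number {
  const avoided = outcomes.filter((o) => o !== "Intuitive lure").length;
  return avoided / outcomes.length;
}

// Example using the Bat and Ball row from the table above.
const batAndBall: Item = { keyed: "5 cents", lure: "10 cents" };
console.log(classify(batAndBall, "10 cents")); // "Intuitive lure"
```

Note that an "Other miss" still counts toward lure avoidance: the obvious lure was resisted even though the final answer drifted elsewhere, which is exactly the distinction the outcome split is meant to surface.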
| Section | Items | What a weak section often points to |
|---|---|---|
| Classic trio | 3 | The best-known CRT lures are still catching the first response |
| Extension quartet | 4 | The reflective habit is less stable once the conflict appears in newer formats |
Response format matters too. The classic CRT was open-ended, whereas this version uses four guided options per item. That makes scoring and debriefing more consistent, but it also means the result should be compared cautiously with published work that used open response formats or different scoring conventions.
The strongest first pass is a fresh one. Answer all seven items without looking up familiar solutions, then set Prior exposure honestly before deciding what the score means. The total becomes much easier to trust when the items still feel new, and much easier to misuse when several answers were already known.
After the last answer, read the result in layers instead of jumping straight from 5/7 or 6/7 to a broad conclusion. This tool is built for that slower review:
- CRT-7 Reflection Brief gives the headline score, band, section totals, lure count, and session context.
- CRT-7 Outcome Split shows whether misses were mostly intuitive lures or other errors.
- CRT-7 Section Outcome Map compares the classic trio with the extension quartet side by side.
- Response table and Debrief notes keep the keyed answer, trap family, and reset move together for each item.

Debrief depth is the practical control that changes how much explanation you want after scoring. Compact brief is enough for a quick summary. Teaching notes is better when you want the reasoning line and the reset move spelled out for each miss. If you need a durable record, the response table can be exported as CSV or DOCX, and the full result can be copied or downloaded as JSON.
The most common reading mistake is to treat every finished state as equally meaningful. A live run can show rechecked items and answers rescued after a rethink. A replayed link or audit sample can still show the final score and item outcomes, but it cannot reconstruct the original revision history. That difference matters when the question is not just "How many were right?" but also "Did the correction happen in real time?"
1. Begin Assessment and answer the seven items in one sitting. Use the first serious answer you can defend, not a guessed classroom memory.
2. Complete every item so the progress bar reaches 100%, then set Prior exposure and Debrief depth before interpreting the pass. Those controls do not change the keyed answers, but they do change how cautiously the score should be read and how much explanation is shown.
3. Start with CRT-7 Reflection Brief, then check Current score lane, the weaker section, and the lure count before drilling into any single item.
4. Open the Response table and Debrief notes for review, then export only what you need with Copy CSV, Download CSV, Export DOCX, Copy JSON, or Download JSON.

The raw score runs from 0 to 7, but the total is only the first pass at meaning. A lure-heavy result points to fast-answer capture. A result with few lures but several other misses usually means the person slowed down enough to resist the bait, yet still lost the thread on setup, arithmetic, or percent reasoning. Those are different coaching problems even when the headline score matches.
| Score | Lane | Best next read |
|---|---|---|
| 0-1 | Heavy first-impression pull | Work on pausing and writing the governing relationship before choosing an answer. |
| 2-3 | Mixed override | Check which trap families still survive the second look instead of chasing the total alone. |
| 4-5 | Mostly reflective | Use the section split and item notes to find the last unstable question types. |
| 6-7 | Strong override | Treat the result cautiously if the items were familiar, because memory can inflate a near-perfect pass. |
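The score lanes above amount to a simple lookup by raw score. This is a sketch of that mapping under the band boundaries shown in the table; the function name is illustrative.

```typescript
// Map a raw CRT-7 score (an integer from 0 to 7) to its lane label,
// following the band boundaries in the table above.
function scoreLane(score: number): string {
  if (!Number.isInteger(score) || score < 0 || score > 7) {
    throw new RangeError("CRT-7 scores are integers from 0 to 7");
  }
  if (score <= 1) return "Heavy first-impression pull";
  if (score <= 3) return "Mixed override";
  if (score <= 5) return "Mostly reflective";
  return "Strong override";
}

console.log(scoreLane(5)); // "Mostly reflective"
```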
The section split adds another important clue. A stronger classic trio than extension quartet often means the best-known items are no longer the hardest part, while the newer traps still need deliberate checking. The reverse pattern suggests that the traditional CRT lures remain sticky even when the added items are going better. An even split usually means the reflective habit is landing at about the same level across both families.
Live-process signals are worth attention when they exist. Rescued after rethink means an initially wrong answer was corrected before the run finished, which is directly relevant to what CRT items are trying to capture. Rechecked without a rescue means the person revisited an item but the revision did not turn into a correct answer. Replay and audit states preserve the final score, section rows, and item outcomes, but they should be read as static records rather than full process traces.
Prior exposure remains the strongest qualifier on almost any strong score. A fresh pass is the cleanest signal. A partly familiar pass can still show which trap families are stable or unstable. A heavily familiar pass is better used as a teaching review than as a benchmark of spontaneous reflection.
Suppose a respondent finishes at 5/7, solves the classic trio cleanly, and misses two extension items. The headline score looks solid, but the section map shows that the newer rate, ordinal, cash-flow, or percentage traps still need work. That is a better next target than generic advice to "be more careful."
Imagine two people both finishing at 4/7. One has three intuitive lures. The other has one lure and two other misses. The first pattern says the fast answer is still winning too often. The second says the lure was often resisted, but the later setup or arithmetic still broke down. The tool's lure split matters because those two 4/7 results should not be coached the same way.
A shared result link opens with a finished score, section totals, and full debrief cards, yet there is no rescued-after-rethink count. That is not a scoring error. It means the state was replayed from saved answers rather than observed live, so the tool can restore outcomes but not the original sequence of answer changes. Use the record for review, not for process analysis.
No. A high score says the person handled seven short reflection traps on this pass. That can be useful, but it is not a complete measure of intelligence, judgment, diagnosis, or long-run decision quality.
The result panel appears only after all seven items are answered. If it is missing, use the question navigator to find and finish any item without a check mark so the progress bar reaches 100%.
Those metrics depend on a live run. If the result was reopened from a saved URL or loaded as an audit sample, the final answers can still be scored, but the original revision history is not available to rebuild.
Routine scoring stays in the browser and there is no server-side scoring helper for this assessment. Exported CSV, DOCX, or JSON files and shared result links can still reveal answers and interpretation settings, so treat saved results as sensitive assessment notes.
Only cautiously. The stems and keyed answers are aligned with CRT literature, but this assessment uses four guided options per item. That makes review cleaner and scoring more consistent, yet it is still a different response format from the classic open-ended administration.