This chart keeps the three scored outcomes together so it is obvious whether the run was mostly clean overrides, classic lure captures, or other misses.
{{ interpretationLead }}
{{ interpretationFollowThrough }}
| Held up on this pass | Review next |
|---|---|
| {{ row.heldUpLabel }}<br>{{ row.heldUpNote }} | {{ row.reviewLabel }}<br>{{ row.reviewNote }} |
| Score | Signal | Interpretation use |
|---|---|---|
| {{ row.score }} | {{ row.signal }} | {{ row.useCase }} |
| # | Item | Your answer | Outcome | Correct | Check habit | Explanation |
|---|---|---|---|---|---|---|
| {{ row.id }} | {{ row.short }} | {{ row.answerDisplay }} | {{ row.statusLabel }} | {{ row.correctLabel }} | {{ row.checkHabit }} | {{ row.explanation }} |
Export the scored result, settings, and per-item outcomes in a machine-readable format.
Cognitive reflection is the habit of stopping an answer that feels obvious, checking it, and only then deciding whether it really fits the problem. That matters because many reasoning errors come from accepting a tempting first response too quickly, not from lacking the information needed to solve the problem. This assessment uses the classic three-item Cognitive Reflection Test (CRT-3) to show how often that first response survives a second look.
The package does more than count correct answers. Every response can land as Correct, as an Intuitive lure, or as an Other miss. That distinction is useful because a lure answer is not just wrong. It is the particular wrong answer the item was built to invite. Separating lure picks from other mistakes gives the result more teaching value than a bare score out of three.
Before the questions begin, you can choose how harshly lure picks should affect the summary, what style of coaching should appear afterward, and how detailed the review should be. Once all three items are answered, the package reports a reflection index from 0 to 100, a correct-lure-miss breakdown, a level label, a short recommendation list, a per-item review table, a small split chart, and a JSON record of the run.
That makes the tool useful for self-observation, classroom discussion, and short reasoning drills. A learner can see why the bat-and-ball answer felt plausible, and a trainer can compare whether an equation-first prompt helps more than a final sanity check.
The output still needs restraint. CRT items are widely circulated, repeated exposure changes what later scores mean, and this package does not adjust for prior familiarity or provide population norms. Scoring stays in the browser, but the current answer state is also written into the page link, so a copied URL can reproduce the response pattern.
For a first run, leave Lure penalty weight at 2x, choose Balanced coaching, and keep Review depth at Forensic detail. Answer the three items in one sitting and do not rush. The package has no timer, so speed is not part of the result.
The best way to read the result is to treat Correct, Lures, and Other misses as the primary signal, then use Reflection Index as the weighted summary. If arithmetic traps caught you, try Equation-first on a later run. If the mistake was accepting a plausible answer without checking the quantities, Sanity-check first is usually the better follow-up.
This package is a good fit for reflection and teaching. It is a poor fit for high-stakes decisions about intelligence, diagnosis, or suitability. A practical next step is to open Trap Review, find the exact item style that caught you, and practice that habit after a short break rather than chasing the headline index alone.
The CRT-3 bundle uses three fixed multiple-choice items: bat and ball, machines and widgets, and lily pads. Each item has one keyed correct answer and one predefined lure answer. The scoring model therefore distinguishes between a response that falls for the classic trap and a response that is wrong for some other reason. That is why the result surface exposes both Lures and Other misses instead of only a score.
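That three-way classification can be sketched as follows. The type, field, and function names are illustrative assumptions, not the package's actual API; the bat-and-ball answers in the sample data are the standard CRT values.

```typescript
type Outcome = "Correct" | "Intuitive lure" | "Other miss";

// Each item carries one keyed correct answer and one predefined lure.
// Field names here are assumptions for illustration.
interface CrtItem {
  keyed: string; // the keyed correct answer
  lure: string;  // the tempting wrong answer the item was built to invite
}

// Bat and ball: the ball costs 5 cents; "10 cents" is the classic lure.
const batAndBall: CrtItem = { keyed: "5 cents", lure: "10 cents" };

function classify(item: CrtItem, answer: string): Outcome {
  if (answer === item.keyed) return "Correct";
  if (answer === item.lure) return "Intuitive lure";
  return "Other miss"; // wrong, but not the predefined trap
}
```

This is why the result surface can report Lures and Other misses separately: the second comparison distinguishes the trap answer from every other wrong answer.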
The first derived quantity is the correct count C, which ranges from 0 to 3. The second is the lure count L, also from 0 to 3. A user-selected lure weight w can be 1, 2, or 3. The package turns those counts into a reflection index by awarding three points per correct answer, subtracting w points per lure pick, dividing by 9, clamping to the 0 to 1 range, and scaling to 0 through 100. A heavier lure weight lowers the index when trap answers appear, but it does not change which responses were scored as correct.
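A minimal sketch of that computation, reconstructed from the worked index values of 44 and 33 reported later in this section; the function name and rounding behavior are assumptions:

```typescript
// Reflection index: each correct answer contributes 3 points, each lure
// pick subtracts w points; the net is divided by 9, clamped to [0, 1],
// and scaled to 0..100. Rounding to the nearest integer is assumed.
function reflectionIndex(correct: number, lures: number, weight: 1 | 2 | 3): number {
  const raw = (3 * correct - weight * lures) / 9;
  const clamped = Math.min(1, Math.max(0, raw));
  return Math.round(clamped * 100);
}
```

With 2 correct and 1 lure at weight 2 this yields 44; raising the weight to 3 yields 33, matching the worked examples. Note that correct answers can at most offset, never exceed, the 0 to 100 bounds because of the clamp.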
The Level label is a separate rule set. It depends only on how many answers were correct: 3 correct gives Consistently reflective, 2 gives Mostly reflective, 1 gives Mixed intuitive/reflective, and 0 gives Intuition-dominant. That means Level and Reflection Index can point in slightly different directions. Two users can both be labeled Mostly reflective while showing different reflection index values if one of them selected a lure and the other made a non-lure mistake.
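The level rule can be sketched as a pure function of the correct count; the constant and function names are assumptions:

```typescript
// Level depends only on the correct count; the lure weight never enters here.
const LEVELS = [
  "Intuition-dominant",         // 0 correct
  "Mixed intuitive/reflective", // 1 correct
  "Mostly reflective",          // 2 correct
  "Consistently reflective",    // 3 correct
] as const;

function levelLabel(correct: 0 | 1 | 2 | 3): string {
  return LEVELS[correct];
}
```

Because nothing about lures appears in this function, two runs with the same correct count always share a level label even when their reflection index values differ.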
The remaining settings affect interpretation rather than scoring. Coaching mode changes the wording of the recommended next steps. Review depth controls whether the package stops at the main table or also lists lure picks and misses in separate panes. The answer state is encoded into the r parameter as three characters, with one position per question and - for unanswered slots. If that code is malformed, the package ignores it.
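The r-code handling can be sketched with a single validation step. The parameter name r comes from the text; the function name and return shape are assumptions:

```typescript
// The r code packs the three answer slots into three characters:
// digits 0-3 select an option, "-" marks an unanswered slot.
// Anything else is malformed and the whole code is ignored.
function parseAnswerCode(r: string): Array<number | null> | null {
  if (!/^[0-3-]{3}$/.test(r)) return null; // malformed: start fresh
  return r.split("").map((ch) => (ch === "-" ? null : Number(ch)));
}
```

For example, `parseAnswerCode("12-")` restores the first two answers and leaves the third slot empty, while a trimmed code such as `"12"` is rejected outright rather than partially applied.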
The reflection index is best read as a weighted trap-avoidance score, not as a direct synonym for the correct-answer count.
| Symbol | Meaning | Range or source |
|---|---|---|
| C | Number of correct answers | Integer from 0 to 3 |
| L | Number of predefined lure answers selected | Integer from 0 to 3 |
| w | Lure penalty weight | 1, 2, or 3 |
| Rindex | Reflection Index | Integer from 0 to 100 |
Suppose a user gets 2 items correct and chooses 1 lure answer with Lure penalty weight = 2x. The weighted index works out to ((3 × 2) − (2 × 1)) / 9 × 100 ≈ 44.
The same answer pattern still earns the Mostly reflective level because the level rule is based on 2 correct answers, not on the weighted index.
| Correct answers | Level | Interpretation |
|---|---|---|
| 3 | Consistently reflective | All three items matched their keyed answers on this run. |
| 2 | Mostly reflective | Reflective checking succeeded more often than it failed. |
| 1 | Mixed intuitive/reflective | The run showed both successful checking and clear trap exposure. |
| 0 | Intuition-dominant | Initial intuitive responses drove the whole result. |
| Status | What it means | Why it matters |
|---|---|---|
| Correct | The chosen option matches the keyed answer for that item. | Raises both the correct count and the reflection index. |
| Intuitive lure | The chosen option matches the item's predefined tempting wrong answer. | Lowers the reflection index according to the selected lure weight. |
| Other miss | The answer is wrong but is not the item's lure answer. | Lowers the correct count without adding lure penalty. |
Set the scoring context before you answer the first item.
The r code still has three positions and only uses 0 to 3 or -; invalid codes are ignored.

Read Correct, Lures, and Reflection Index together. Level is a rule on the correct count only: 3 correct means Consistently reflective, 2 means Mostly reflective, 1 means Mixed intuitive/reflective, and 0 means Intuition-dominant. Reflection Index is different: it falls when lure answers are weighted more heavily, so it tells you how costly those trap responses were under the chosen scoring strictness.
The main false-confidence warning is familiarity. A strong result after repeated exposure can reflect remembered items as much as reflective thinking. A practical verification step is to note whether the questions felt familiar before you started, then use Trap Review to decide whether the run taught you something new or simply confirmed what you already knew.
A user leaves Lure penalty weight at 2x, answers 2 items correctly, and picks 1 intuitive lure. The package reports Correct: 2/3, Lures: 1, Other misses: 0, Reflection Index: 44, and Level: Mostly reflective. That shows why the weighted index and the level should be read together.
If the answer pattern is identical but Lure penalty weight is raised to 3x, the user still gets Level: Mostly reflective because the correct count is unchanged. The reflection index drops to 33 because the lure now carries a larger penalty; the package is interpreting the same lure pick more harshly.
Suppose someone copies the page URL, trims part of the query string, and sends it to a classmate. If the r code no longer contains exactly three characters from 0 to 3 or -, the package ignores it and the assessment opens without restored answers. The corrective path is to share a complete link or use an exported result instead of a hand-edited URL.
No. This package is a short cognitive reflection check built around three well-known trap questions. It does not diagnose anything and it does not claim to measure intelligence broadly.
Level depends only on how many answers were correct. Reflection Index also subtracts weighted lure picks. A user can therefore keep the same level label while the index moves up or down with the selected lure weight.
Scoring does not change. The package simply reveals extra panes that list intuitive lure picks and all missed items, which is useful when you want to inspect the exact error pattern.
Routine scoring stays in the browser, but the response state is still written into the page link. That means a shared URL can expose the answer pattern even though the package does not need a server helper to score the test.
The r parameter has to contain exactly three characters, each one either 0, 1, 2, 3, or -. If the code is malformed, the package ignores it and starts with unanswered items.