LLM 101 · part 7
[LLM 101] How to spot AI hallucinations — three red flags before you verify
❯ cat --toc
- Plain version: you cannot read confidence as accuracy
- Prelude
- AI is not broken when it hallucinates — this is how it works
- Red flag 1: numbers outside the possible range → fabricated
- Red flag 2: details too specific to plausibly be remembered
- Red flag 3: re-ask the question and watch for drift
- What to do after a flag fires
- One line takeaway
TL;DR
AI delivers wrong answers in the same confident tone as right ones. Three red flags let you raise your guard before you start verifying — a number that exceeds the possible range, a detail too specific to plausibly be remembered, and a re-asked question that returns a different answer. Hit any of them, treat the answer as fabricated. The opening story: I handed ChatGPT a portfolio screenshot and it returned "META P&L +205%" in a column whose mathematical ceiling is +100%.
Plain version: you cannot read confidence as accuracy
The most common mistake: "It sounds so sure, it must be right."
It's the opposite. AI uses the exact same tone whether it knows the answer or is making one up. Tone is its default register, not a signal of accuracy. Confidence is not proof.
This post covers three red flags you can catch without opening Google — symptoms that should raise your guard before you start verifying. For the verification step itself (how to Google fast, how to ask AI for sources, how to cross-check across models), see Ask AI Right Express Part 5.
Prelude
I handed ChatGPT a screenshot of a portfolio — about fifteen positions, each with a profit-and-loss percentage column. ChatGPT returned a careful-sounding analysis, even gave it an "84/100 desk-trader grade." I nodded approvingly.
Then I read the numbers it had listed: "META P&L +205%," "GOOGL +155%," "TSM +142%."
I stared for three seconds. That column's mathematical ceiling is +100%. There is no arithmetic path that produces +205% in that field. None.
But ChatGPT had stated it without hedging. No "I'm not sure," no warning emoji, no "please verify." It baked those numbers into the analysis and then offered recommendations built on top of them.
This is what an AI hallucination looks like at its most typical: not malicious deception, just AI doing what AI does. The last post covered why you'd run AI on your own machine. This one tackles a more upstream question: how do you spot that AI is making things up before you bother verifying?
AI is not broken when it hallucinates — this is how it works
Mechanism first.
What ChatGPT and Claude do is not "look stuff up." They predict the next token — the next chunk of text — given everything that came before. They were trained on trillions of words with one objective: given a prefix, guess what comes next. The entire loop is autocomplete on steroids.
Here's the catch. When you ask something the model has no real data on, it still keeps going. Its job is to continue the text, not to evaluate whether it knows. There is no built-in "I don't know" button. Unless the model was trained to emit one in specific situations, the default behavior is to keep generating.
And the generated text is tonally indistinguishable from a true answer. The training data is full of both true and false statements written in identical sentence shapes. The model learns the shape, not the truth.
So you get:
- Papers that don't exist, by authors who don't exist, in perfect citation format
- Numbers that violate basic math, delivered with full confidence
- Correct website names paired with fabricated content from those sites
- An ISBN, a statute number, a precise date — all invented, all plausible-looking
This is a structural property of any system that runs on "predict the next token." It's not one vendor doing worse — the phenomenon is called AI hallucination, and the whole field deals with it.
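The "no built-in I don't know button" point can be made concrete with a toy sampler. This is a minimal sketch, not how any vendor's model is implemented: the vocabulary, the scores, and the function names `softmax` and `next_token` are all made up for illustration. It only shows that turning scores into probabilities always yields a full distribution, so sampling always returns some token, even when the scores are flat because nothing stands out.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that always sum to 1."""
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


def next_token(vocab, logits, rng):
    """Sample one token; there is no code path that returns 'nothing'."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical four-token vocabulary (illustration only).
VOCAB = ["+205%", "+12%", "-3%", "+48%"]

# Flat scores stand in for "the model has no real data on this".
FLAT_LOGITS = [0.0, 0.0, 0.0, 0.0]

rng = random.Random(0)
print(softmax(FLAT_LOGITS))                 # every option gets 0.25
print(next_token(VOCAB, FLAT_LOGITS, rng))  # still prints one of the tokens
```

In this toy setup, an "I don't know" can only come out if it is itself one of the likely continuations, which is why it has to be trained in rather than assumed.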
Red flag 1: numbers outside the possible range → fabricated
My +205% case lives here. When a number violates a physical or logical ceiling, the AI is making it up.
A few common impossibilities:
| If you see... | Why it can't be real |
|---|---|
| P&L above +100% (in a column capped at +100% by definition) | The ceiling is +100%, so nothing higher can appear |
| Probability > 100% or < 0% | Probability runs from 0% to 100% |
| A website with 10 billion unique monthly visitors | World population is about 8 billion, and unique visitors can't exceed people |
| A PDF citing page 437 (when the file has 200 pages) | Page number exceeds the file |
| A paper published in 2030 | Future papers don't exist yet |
How to use it: every time AI hands you a number, run a five-second sanity check — is it physically or logically possible? Is there a ceiling? A floor? A unit mismatch?
Five seconds. One instinct. If it doesn't pencil out, treat it as fabricated.
My +205% case was caught exactly this way — no Googling, no fact-checking, no external tool. Just looking at the number and noticing it can't exist.
Why would AI invent something so obvious? Because it's matching the shape of "what a P&L percentage looks like," not doing arithmetic. It pieced together "three-digit number + percent sign + plus or minus" and never checked whether the resulting shape was legal for that column.
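The five-second sanity check amounts to a range test. Here is a minimal sketch, assuming you already know the bounds for the field: the function name `out_of_range` and the -100% floor are my assumptions for illustration, while the +100% ceiling and the three figures come from the story above.

```python
def out_of_range(readings, low, high):
    """Return the readings that fall outside the closed range [low, high]."""
    return {
        name: value
        for name, value in readings.items()
        if not (low <= value <= high)
    }


# The three figures ChatGPT reported, in percent.
reported = {"META": 205.0, "GOOGL": 155.0, "TSM": 142.0}

# Assumption for this sketch: the column is bounded at -100% and +100%.
# The +100% ceiling is from the story; the -100% floor is assumed.
flagged = out_of_range(reported, low=-100.0, high=100.0)
print(flagged)  # all three tickers are flagged
```

The same function covers the other rows of the table if you swap the bounds, for example `low=0.0, high=1.0` for a probability or `low=1, high=200` for a page number in a 200-page file.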
Red flag 2: details too specific to plausibly be remembered
The second flag is subtler, but once you start noticing it you'll see it everywhere.
AI is generally accurate on broad concepts and frequently wrong on hyper-specific details.
| Question shape | Typical AI behavior |
|---|---|
| "What's quantum mechanics roughly about?" | Usually accurate — the concept appears constantly in training data |
| "Cite the seminal 1947 quantum-physics paper, with author and journal volume" | High odds of fabrication — that level of detail is rarely retained |
| "What is Python?" | Accurate |
| "What does section 7.3.2 of the Python standard library docs cover?" | Likely fabricated |
Why this happens: AI has consumed a lot of text, so broad concepts come out smooth. But when you ask it to nail a specific section, page, ISBN, or author down to the comma, the precise detail usually wasn't retained. The model still has to continue the text, so it generates something that looks like the right shape.
How to use it: when AI gives you a very precise detail (specific date, statute number, page reference, ISBN, full author name, financial figure), pause and ask — is this detail plausibly remembered? The more precise and the more niche, the higher your guard should go.
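If you want a mechanical nudge for that pause, a few regular expressions can point at the kinds of detail worth a second look. This is a rough sketch under loud assumptions: the pattern set `DETAIL_PATTERNS` and the function `precise_details` are hypothetical, the patterns are far from complete, and a match only means "this is precise", not "this is wrong".

```python
import re

# Hypothetical patterns for illustration; real answers will need more of them.
DETAIL_PATTERNS = {
    "isbn": re.compile(r"\bISBN[:\s]*[\d-]{10,17}\b", re.IGNORECASE),
    "section number": re.compile(r"\b\d+(?:\.\d+){2,}\b"),
    "year": re.compile(r"\b(?:1[89]|20)\d{2}\b"),
    "page reference": re.compile(r"\b(?:page|pp?\.)\s*\d+", re.IGNORECASE),
}


def precise_details(text):
    """Map each detail kind to the matching snippets found in text."""
    found = {}
    for kind, pattern in DETAIL_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[kind] = matches
    return found


print(precise_details("What is Python?"))
# {}
print(precise_details(
    "What does section 7.3.2 of the Python standard library docs cover?"
))
# {'section number': ['7.3.2']}
```

The two example inputs are the question shapes from the table above: the broad one produces nothing to check, the hyper-specific one surfaces the section number.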
Real case: in May 2025, Anthropic's outside counsel at Latham & Watkins used Claude to format a citation in a court declaration. For a real underlying paper, Claude returned the wrong title and authors; opposing counsel surfaced it at the May 13 hearing, and the judge struck the affected paragraph. The lawyer defending Anthropic, using Anthropic's own model, still got burned — not because the tool isn't capable, but because hallucination is structural.
Red flag 3: re-ask the question and watch for drift
The third flag is the simplest and probably the most useful.
Open a fresh chat (ChatGPT has "Temporary chat" in the top bar; Gemini has a similar option) and ask the same question, word for word. Then compare the two answers.
| Result | What it means |
|---|---|
| The two answers match on details (name, year, statute number, figure) | More likely that the model actually retained it |
| The same detail comes back differently (different name, year off by one) | It's guessing — wasn't retained, fills the gap differently each time |
Why this works: AI generation involves randomness. When the model truly knows something, repeated generations converge on the same answer (the high-probability one dominates). When it doesn't know, each generation pieces together a "plausible enough" version, and that version varies each time.
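A little arithmetic shows the intuition. If each fresh chat samples independently from the same distribution over candidate answers, the chance that two samples agree is the sum of the squared probabilities. The two distributions below are invented for illustration, and real models are far messier, so treat this as a sketch of the idea rather than a measurement.

```python
def agreement_probability(probs):
    """Chance that two independent samples from probs pick the same answer."""
    return sum(p * p for p in probs)


# Hypothetical distributions over four candidate answers (illustration only).
retained = [0.97, 0.01, 0.01, 0.01]  # one answer dominates
guessing = [0.25, 0.25, 0.25, 0.25]  # nothing stands out

print(round(agreement_probability(retained), 4))  # 0.9412
print(round(agreement_probability(guessing), 4))  # 0.25
```

With these made-up numbers, a retained answer repeats about 94% of the time, while a pure guess among four options repeats only a quarter of the time.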
How to use it: important details are worth thirty seconds of re-asking. If you see drift, jump back to flag 1 and flag 2, or skip straight to Google.
⚠️ Caveat: two matching answers do not prove correctness. The model could have stably memorized something incorrect from training data. This filter catches "the model is guessing"; it does not certify "the model is right."
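Comparing two answers by eye works fine, but the comparison can also be sketched in a few lines. The example below is a simplification under stated assumptions: it only compares numbers, the function name `drifting_numbers` is hypothetical, and both sample answers are invented for illustration.

```python
import re

# Numbers, with optional decimals or a percent sign, are the details compared.
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")


def drifting_numbers(first_answer, second_answer):
    """Return numbers that appear in only one of the two answers."""
    first = set(NUMBER.findall(first_answer))
    second = set(NUMBER.findall(second_answer))
    return sorted(first ^ second)


# Hypothetical answers from two fresh chats (invented for illustration).
answer_a = "The paper was published in 1947 in volume 72."
answer_b = "The paper was published in 1948 in volume 72."

print(drifting_numbers(answer_a, answer_b))  # ['1947', '1948']
```

A non-empty result is the "year off by one" kind of drift from the table. An empty result only says the numbers matched across the two chats.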
What to do after a flag fires
If any of the three signals trips, treat the answer as fabricated and run verification:
- Sanity check fails → don't trust it. Ask the AI "walk me through how you got that number, step by step" — often it'll catch itself.
- Detail too specific → copy the detail into Google. If you can't find it in 30 seconds, treat it as nonexistent.
- Re-ask drifts → already a guessing signal. Fall back to the sanity check and the Google search above to verify.
The full verification toolkit — asking AI for clickable sources, opening the link to check the actual quoted text, cross-checking across vendors, plus a copy-paste prompt — lives in Ask AI Right Express Part 5: three 30-second checks.
Division of labor: this post is about symptom recognition (raising your guard); Part 5 is about the verification work itself (doing the checking). They complement, they don't duplicate.
One line takeaway
AI's confidence isn't proof — when a number exceeds the range, a detail is too precise, or the answer drifts on a re-ask, treat it as fabricated until you verify.
Next post: how AI "remembers" what you said — conversation memory, real long-term memory, and how both differ from the context window.
This is Part 7 of the "LLM 101" series. Previous: Why run AI locally. Related: What is the context window, How to choose a model.
FAQ
- What is an AI hallucination?
- An AI hallucination is when an AI confidently gives you a completely wrong answer — invented names, invented citations, invented numbers, invented URLs. It does not tell you it is guessing, and the tone is identical to when it is right. This is a structural property of how AI works, not a bug.
- Why does AI sound confident even when it is wrong?
- Because AI does not look anything up. It predicts which word is most likely to come next given what came before. When it has no real data on a topic, it still has to keep going, so it fills in the most plausible-sounding continuation. Plausible-sounding and true are not the same thing, but the surface tone is identical.
- Which model hallucinates the least — ChatGPT, Gemini, or Claude?
- There is no single winner. Rankings shift by domain, by question style, and by version. Each new release usually pushes hallucination rates down, but none of them hit zero. What matters more than picking the safest model is building the habit of verifying things that matter.
- Does AI sound less confident when it is making something up?
- No. This is the biggest trap. AI uses the same firm tone whether it knows the answer or is filling in plausible text. Confidence is the default register, not a truth signal. Reading firmness as accuracy is the most common way people get burned.
- What is the fastest way to spot AI hallucination?
- Three signals: (1) Numbers that exceed the physical or logical range, for example a percentage above 100% in a field capped at 100%. (2) Details too precise to plausibly be remembered, such as statute numbers, ISBNs, and specific dates buried inside niche topics. (3) Drift on a re-ask: ask the same question again in a fresh chat, and if precise details shift, the model is guessing. Hit any of the three, treat the answer as fabricated until you verify.