The Day The Robot Dreamed (On AI Hallucination)
by Dan Roque | Reading Time: 10 minutes | In Future of Work
Today, I want to invite you to step away from the headlines.
Let’s step up to the virtual chalkboard, pick up a piece of chalk, and look at
the actual gears turning. Specifically, we’re going to talk about the
"ghost in the machine": AI Hallucinations.
Why should you care? Because if your job involves a
computer, you are likely already using—or will soon use—these tools.
Understanding why an AI "lies" isn't just a fun trivia fact for
coders; it’s a foundational literacy for the modern professional. We’re moving
past the "doom and hype" to understand the mechanism, the risks, and
surprisingly, the hidden benefits of these digital fictions.
AI is a tool we need to learn, not a magic trick to fear. To
use it safely, we first have to understand the difference between madness and
math.
It’s Not Madness, It’s Math
The first thing we need to do is stop using the word
"know." An AI doesn't know that George Washington was the
first U.S. President in the way you do. It calculates that the word
"Washington" is the most statistically probable sequence to follow
"George" in that specific context.
When an AI hallucinates, it isn't "losing its
mind." It’s simply doing exactly what it was built to do: predicting
the next word. Researchers at OpenAI and Lilian Weng (on her Lil’Log blog) sort these failures into two flavors: in-context (contradicting the information you just gave it) and extrinsic (inventing "facts" that nothing in its training data supports).
Sketching the Board: How Hallucinations are Born
- Probability at All Costs: The model is a pattern-completion engine. If the signal is weak, it doesn't naturally hit a "stop" button. It fills the gap with the most plausible-sounding word to keep the pattern going.
- The "Bullshit" Factor: Philosopher Harry Frankfurt defined "bullshit" as speech produced with indifference to the truth. Because AI is fundamentally a statistical engine, it is indifferent to reality. It cares about fluency, not factuality.
- The Knowledge Trap: Here is the technical deep-dive. Not all data is created equal. Research summarized on Lil’Log shows that models learn "Highly Known" facts (common knowledge) quickly but pick up "Unknown" examples (new or rare facts) substantially more slowly. When we force a model to learn "Unknowns" through fine-tuning, its tendency to hallucinate actually increases: it has "memorized" the shape of an answer without "understanding" the grounding, so it guesses.
The Analogy: Think of a hallucination like a human
filling a memory gap with a reconstructed story. If you can’t remember what you
had for lunch three Tuesdays ago, your brain might offer a
"plausible" memory of a sandwich because you often eat sandwiches.
The AI does the same, but with the entire internet's worth of
"statistically probable sandwiches."
This is why "plausible-sounding falsehoods" are so
dangerous—they wear the costume of truth so well that we forget they are just
math-driven guesses. And as the legal world is finding out, those guesses have
a high price tag.
957 Reasons to Double-Check
Let’s move from the theory to the courtroom. I’m sketching
the legal landscape here using the AI Hallucination Cases Database
compiled by Damien Charlotin. As of February 2026, this database has tracked 957
cases where AI-generated fabrications met the cold reality of a judge’s
gavel.
When probabilistic patterns meet high-stakes professions,
the results aren't just "glitches"—they are career-ending events.
Surprising Outcomes from the Database:
- The Default Judgment: In the 2026 case Flycatcher v. Affable Avenue, a lawyer used NotebookLM and vLex to draft submissions. When the AI fabricated case citations, the court found the lawyer had acted in bad faith or with "conscious avoidance." The result? The court struck the filings and entered a Default Judgment against the client.
- The $5,000 Fine: In the now-infamous Mata v. Avianca case, lawyers were fined for standing by fictitious case precedents that ChatGPT insisted were real.
- "Arguments
Deemed Waived": In Landmark Development Group v. LuPardus,
courts threw out entire legal arguments because they were built on
AI-generated "gibberish."
- The Bereavement Blunder: Air Canada was ordered to pay damages after its chatbot hallucinated a bereavement-fare refund policy. The airline tried to argue the bot was a "separate legal entity," a defense the tribunal firmly rejected.
So What? These cases represent a massive responsibility
shift. In the past, machines handled execution (doing exactly what
we told them). Now, machines handle generation, but the human must
handle the judgment. If you outsource your judgment to the machine, you
are the one who pays the fine—or loses the case.
The Incentive Problem: Teaching to the Test
Why do these models keep "lying"? Because we are
effectively teaching them to guess. OpenAI’s research from September
2025 highlights that our current "scoreboards" reward a lucky guess
more than a humble "I don't know."
Imagine a student taking a multiple-choice test with no
"negative marking." If they leave a question blank, they get 0. If
they guess, they might get lucky. Naturally, the student guesses every time. AI
models are currently ranked on Accuracy Rate (how many right answers
they get), which pushes them to be "Confident Guessers."
The Scoreboard: Confident Guesser vs. Humble Thinker
Look at this data from OpenAI’s SimpleQA evaluations:
- The "Guesser" (o4-mini): Abstains from answering only 1% of the time. This sounds great, until you see the 75% error rate. It's almost always guessing.
- The "Thinker" (gpt-5-thinking-mini): Abstains 52% of the time. It only speaks when it's sure, resulting in a much lower 26% error rate.
The Fix: We need to change the grading system. If we
want fewer hallucinations, we must reward Abstention (saying "I
don't know") and penalize confident errors more heavily than silence.
Until we "fix the scoreboard," the math will always favor the gamble.
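A back-of-the-envelope calculation makes the incentive visible. The numbers in this sketch are invented, but the arithmetic is the whole argument:

```python
# Invented numbers, real incentive: expected score for answering vs.
# abstaining under two grading schemes.

def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected points for answering; abstaining always scores 0."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

p = 0.25  # the model is only 25% sure of its answer

# Scoreboard 1: plain accuracy. A wrong answer costs nothing, so
# guessing (expected 0.25) always beats abstaining (0). Guess!
print(expected_score(p, wrong_penalty=0.0))   # 0.25

# Scoreboard 2: confident errors are penalized. Now guessing has a
# negative expected value, and "I don't know" is the rational move.
print(expected_score(p, wrong_penalty=1.0))   # -0.5

# In general, answering only pays when p_correct > penalty / (1 + penalty),
# so the heavier the penalty, the more honest abstention becomes.
```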
The Counterintuitive Twist: Hallucination as Discovery
Here is where the chalkboard gets interesting:
Hallucinations aren't always a bug. In the right hands, they are Artificial
Imagination.
In 2024, David Baker shared the Nobel Prize in Chemistry for
computational protein design. His lab used AI "hallucinations" to dream up ten
million brand-new proteins that do not exist in nature. While the Nobel
Committee carefully avoided the word "hallucination," referring to it
instead as "imaginative protein creation," the underlying
mechanism is the same: the model filling gaps in a sequence.
Where Hallucination is a Feature:
- Medical Innovation: Researchers at Caltech used AI to design "sawtooth" catheter geometries that prevent bacterial growth—a shape no human engineer had considered.
- Weather Forecasting: Scientists generate thousands of "hallucinated" variations of weather patterns to find the subtle variables that lead to extreme storms.
- Creative Arts: In gaming and VR, these "imagination" engines generate surreal, dream-like environments and novel art styles.
The key difference? Constraints. A chatbot is unconstrained—it
can say anything. Scientific discovery uses physics-constrained
hallucinations, where the AI’s "imagination" must still follow the
rules of biology or weather patterns.
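To picture the difference, think of constrained discovery as generate-and-filter. The sketch below illustrates the idea only; it is not the pipeline Baker's lab or the Caltech team actually used, and both functions are hypothetical stand-ins:

```python
import random

# A sketch of "constrained hallucination" as generate-and-filter.
# Both functions are hypothetical stand-ins: generate_candidate()
# plays the generative model dreaming up designs; satisfies_physics()
# plays the hard domain check (protein stability, fluid dynamics...).

def generate_candidate() -> float:
    """Stand-in for a model 'imagining' a novel design parameter."""
    return random.uniform(-10.0, 10.0)

def satisfies_physics(candidate: float) -> bool:
    """Stand-in for the constraint reality imposes on the dream."""
    return 0.0 <= candidate <= 1.0  # only a narrow band is physically valid

# Dream freely, keep only what survives the constraint.
kept = [c for c in (generate_candidate() for _ in range(10_000))
        if satisfies_physics(c)]
print(f"kept {len(kept)} of 10,000 imagined candidates")
```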
From Execution to Judgment
So, how do you use these tools at your desk without becoming
Case #958 in the database? You move from being a "user" to being a manager.
Pro-Tips for Your AI Workflow:
- The Indirect Query: Don't just ask, "Is this a real paper?" The AI will likely say yes to please you. Instead, ask, "Who are the authors of this paper and what are three other things they've written?" Hallucinations fall apart when you ask for auxiliary details.
- Explicitly Allow Uncertainty: End your prompts with: "If you are unsure or don't have the data, please say 'I don't know' instead of guessing."
- Chain-of-Verification (CoVe): Ask the model to draft a response, then ask it to "identify the facts in your response and design a verification question for each one." Finally, have it revise the answer based on those checks (see the sketch after this list).
- Grounding (RAG): Whenever possible, provide the "source of truth." Use Retrieval-Augmented Generation to force the AI to look at a specific PDF or database before it speaks.
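Here is what that Chain-of-Verification loop can look like in code. This is a minimal sketch: ask() is a hypothetical helper you would wire to your chat API of choice, and the prompts are starting points, not magic words.

```python
# A minimal sketch of a Chain-of-Verification loop. ask() is a
# hypothetical helper -- connect it to whatever chat API you use.

def ask(prompt: str) -> str:
    """Hypothetical wrapper around your LLM provider's chat endpoint."""
    raise NotImplementedError("wire this to your API of choice")

def chain_of_verification(question: str) -> str:
    # Step 1: get a first draft, hallucinations and all.
    draft = ask(question)

    # Step 2: have the model enumerate its own factual claims and
    # design one verification question per claim.
    checks = ask(
        "List each factual claim in the text below, and for each claim "
        f"write one question that would verify it.\n\n{draft}"
    )

    # Step 3: answer the verification questions separately, so the
    # model can't simply restate its original guess.
    answers = ask(f"Answer each question on its own merits:\n\n{checks}")

    # Step 4: revise the draft in light of the checks, and explicitly
    # permit abstention (see "Explicitly Allow Uncertainty" above).
    return ask(
        f"Original draft:\n{draft}\n\nVerification Q&A:\n{answers}\n\n"
        "Rewrite the draft, correcting anything the checks contradict. "
        "If you are unsure, say 'I don't know' instead of guessing."
    )
```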
The Tool in Your Hand
AI hallucinations are not a mysterious glitch; they are the
inevitable shadow cast by the way these models are built. They are the
"creative gap-filling" of a system designed to keep the conversation
going at all costs.
But as we’ve seen, this shadow is manageable. Whether it’s
through better evaluation metrics that reward humility or through
"Human-in-the-Loop" verification, the future of AI isn't about
finding a perfectly truthful machine—it’s about becoming a more skeptical,
expert human.
AI is a tool we need to learn, not a magic trick to fear.
When you understand the math, the "magic" of the hallucination
disappears, leaving you with something far more useful: a powerful, if
occasionally imaginative, partner in your work.
Works Cited
Charlotin, Damien. “AI Hallucination Cases Database.” Damien Charlotin, n.d., https://www.damiencharlotin.com/hallucinations/. Accessed 18 Feb. 2026.
IBM. “What Are AI Hallucinations?” IBM, n.d., https://www.ibm.com/think/topics/ai-hallucinations. Accessed 18 Feb. 2026.
Kalai, Adam, et al. “Why Language Models Hallucinate.” OpenAI, 5 Sept. 2025, https://openai.com/index/why-language-models-hallucinate/. Accessed 18 Feb. 2026.
Weng, Lilian. “Extrinsic Hallucinations in LLMs.” Lil’Log, 7 July 2024, https://lilianweng.github.io/posts/2024-07-07-hallucination/. Accessed 19 Feb. 2026.
“Hallucination (Artificial Intelligence).” Wikipedia, Wikimedia Foundation, 19 Feb. 2026, https://en.wikipedia.org/w/index.php?title=Hallucination_(artificial_intelligence)&oldid=1338960629. Accessed 19 Feb. 2026.
OpenAI. GPT-5 System Card. 13 Aug. 2025, https://cdn.openai.com/papers/gpt-5-system-card.pdf. Accessed 18 Feb. 2026.
Kalai, Adam Tauman, et al. “Why Language Models Hallucinate.” arXiv, 4 Sept. 2025, https://arxiv.org/abs/2509.04664. Accessed 20 Feb. 2026.
“Mata v. Avianca, Inc., No. 1:2022cv01461 — Document 54 (S.D.N.Y. 2023).” Justia US Law, 22 June 2023, https://law.justia.com/cases/federal/district-courts/new-york/nysdce/1%3A2022cv01461/575368/54/. Accessed 20 Feb. 2026.
Cecco, Leyland. “Air Canada Ordered to Honor Refund Policy Invented by Chatbot.” The Guardian, 16 Feb. 2024, https://www.theguardian.com/technology/2024/feb/16/air-canada-chatbot-lawsuit. Accessed 20 Feb. 2026.
Dajose, Lori. “Aided by AI, New Catheter Design Prevents Bacterial Infections.” Caltech, 5 Jan. 2024, https://www.caltech.edu/about/news/aided-by-ai-new-catheter-design-prevents-bacterial-infections. Accessed 21 Feb. 2026.
Frankfurt, Harry G. On Bullshit. Princeton UP, 2005.
Weary_Reply. “What AI Hallucination Actually Is, Why It Happens, and What We Can Realistically Do About It.” Reddit, r/artificial, n.d., https://www.reddit.com/r/artificial/comments/1pjgh5w/what_ai_hallucination_actually_is_why_it_happens/. Accessed 21 Feb. 2026.