The lawyer trusted ChatGPT. It gave him six case citations. He submitted them to the federal court. None of the cases existed.
This happened in May 2023 and it made international news. The judge sanctioned the lawyers involved. The profession had a collective moment of panic. But here is the strange part: the AI did exactly what it was designed to do.
Confident and Wrong
An AI hallucination occurs when a language model generates information that sounds completely plausible but happens to be false. Sometimes slightly false. Sometimes entirely fabricated.
The term itself is borrowed from psychology. Humans hallucinate when their brains perceive things that aren’t there. AI hallucinations work similarly, except the model produces text that has no grounding in reality while presenting it with unwavering confidence.
A user on Hacker News named diputsmonro captured this perfectly: “All responses are hallucinations. Some hallucinations happen to overlap the truth.”
That sounds provocative. It also happens to be technically accurate. Every output from a language model is a prediction about what words should come next. Some predictions align with facts. Some do not. The model itself cannot tell the difference.
The Architecture Explains Everything
Language models do not store facts in the way a database stores records. They learn statistical patterns. They learn that certain words tend to follow certain other words in certain contexts. They learn that questions about history are often followed by dates. They learn that citations contain author names, journal titles, and years in parentheses.
When you ask an LLM for a citation, it generates one. It produces text that matches the pattern of what a citation looks like based on millions of examples it absorbed during training. Whether that citation corresponds to a real paper that exists in the physical world is a question the model has no mechanism to answer.
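To make that concrete, here is a toy sketch in Python. It swaps the neural network for a crude bigram table built from a made-up corpus of case names, so it is nothing like a production LLM in scale or mechanism, but the generation loop has the same shape: pick a statistically plausible next token, append it, repeat. Notice what is missing: any step that checks whether the output exists anywhere.

```python
# Toy illustration: generation is pattern continuation, not fact lookup.
# A real LLM uses a neural network instead of bigram counts, but the loop
# below has the same shape -- and the same absence of any existence check.
import random
from collections import defaultdict

# A tiny made-up "training corpus" of citation-shaped text.
corpus = (
    "smith v jones 2015 . brown v board 1954 . "
    "roe v wade 1973 . smith v board 2015 ."
).split()

# Count which token tends to follow which (a crude stand-in for training).
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, length=5):
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))  # a statistically plausible next token
    return " ".join(out)

# May print e.g. "smith v board 1954 ." -- citation-shaped, but no such case
# appears in the corpus. The recombination is the whole point.
print(generate("smith"))
```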
This is not a software bug. This is the fundamental architecture.
A commenter named zdragnar explained the core problem on Hacker News: “the model itself doesn’t know the difference, and will proclaim bullshit with the same level of confidence.”
That confidence is the killer. Humans calibrate their trust based on how certain someone sounds. We evolved in an environment where confident claims usually came from people who had direct knowledge. An AI tuned to produce responses that human raters prefer learns to sound confident, because confidence gets rewarded.
Why Training Makes It Worse
Here is something counterintuitive. The way we train language models actively encourages hallucination.
Training involves showing the model millions of examples and rewarding it when its predictions match what actually came next in the training data. The model gets points for being right. It gets zero points for saying “I don’t know.” Like a student who realizes that a blank answer scores nothing while a guess might earn credit, the model learns that guessing beats admitting uncertainty.
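A toy scoring rule makes that incentive visible. Real training optimizes a smooth cross-entropy loss over a probability distribution rather than this crude match-or-miss check, and the prompt below is invented for illustration, but the reward structure is the same.

```python
# Sketch of why the training signal rewards guessing. Credit is earned only
# when the prediction matches the token that actually appeared in the data.
# (Real training uses cross-entropy, but the incentive it creates is the same.)

training_example = ("The capital of Freedonia is", "Fredville")  # made-up target

def score(prediction: str, target: str) -> float:
    return 1.0 if prediction == target else 0.0

target = training_example[1]
print(score("Fredville", target))      # 1.0 -- a confident correct guess pays off
print(score("Fredburg", target))       # 0.0 -- a wrong guess costs nothing extra...
print(score("I don't know", target))   # 0.0 -- ...and honesty scores exactly the same
```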
Research from Lilian Weng at OpenAI notes that during fine-tuning, models learn examples containing genuinely new knowledge more slowly than examples consistent with what they already know. Worse, once those new facts are eventually learned, “they increase the model’s tendency to hallucinate.”
The model gets better at producing text that looks like it contains facts. It does not get better at distinguishing real facts from plausible patterns.
There is also a data problem. Internet text is the most common training source. As one technical analysis put it, “Data crawled from the public Internet is the most common choice and thus out-of-date, missing, or incorrect information is expected.” The model treats accurate and inaccurate text identically. Both are just patterns to learn.
The Social Silence Problem
Human conversations have an interesting property. When people do not know something, they usually stay quiet. Comment sections and forums contain mostly confident assertions. Nobody posts “I have no idea about this topic.” Silence contains no text to learn from.
A Hacker News user named mike_hearn identified this pattern: “The trouble is that the training sets contain few examples of people expressing uncertainty because the social convention on the internet is that if you don’t know the answer, you don’t post.”
Models learn from text that exists. Text that does not exist teaches nothing. The corpus is biased toward confidence and away from appropriate uncertainty. The model inherits that bias.
The Boundary Problem
A person knows the boundary between memory and imagination. You can recall where you parked your car while recognizing that you are imagining what might be in the glove compartment. These feel different.
Language models have no such boundary.
Mort96 articulated this on Hacker News: “The distinction between ‘this is information I truly think I know’ and ‘this is something I made up’ doesn’t exist in LLMs.”
Everything the model produces comes from the same process. Reciting well-established facts involves predicting tokens. Inventing plausible nonsense involves predicting tokens. Same mechanism. Same confidence level. No internal signal that distinguishes one from the other.
This is why hallucinations are so dangerous in practice. There is no tell. No hesitation. No subtle marker that separates reliable output from fabrication.
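You can see the absence of that signal in the only numbers the model attaches to its output. The scores below are invented for illustration, but the point holds for any softmax-based model: what comes out is a probability distribution over tokens, with no second channel recording where those probabilities came from.

```python
# The model's "confidence" is just a normalized score over candidate tokens.
# There is no separate field for "recalled" versus "invented". Scores invented
# for illustration.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for candidate next tokens after the prompt
# "The paper was published in the journal ..."
scores_for_memorized_title = [4.1, 1.0, 0.3]  # a title seen thousands of times
scores_for_invented_title = [4.1, 1.0, 0.3]   # a title about to be made up

print(softmax(scores_for_memorized_title))  # roughly [0.94, 0.04, 0.02]
print(softmax(scores_for_invented_title))   # identical distribution, identical "confidence"
```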
Why Fixing This Is Hard
Some problems in AI are engineering challenges. Throw more compute at them, refine the training process, and improvements follow. Hallucination is different.
Multiple researchers have examined whether hallucinations can be eliminated from current architectures. The emerging consensus is sobering. A commenter named calf suggested the problem might be “formally unsolvable and should be rendered as absurd as someone claiming the Halting Problem is solvable.”
That sounds extreme. The technical argument goes roughly like this: language models are statistical approximators. They cannot fully capture all computable functions. They will always be interpolating between training examples rather than accessing ground truth. Some wrong interpolations are inevitable.
Better models hallucinate less frequently. They do not hallucinate zero percent of the time. The curve approaches but never reaches zero.
There are mitigation strategies. Retrieval-augmented generation gives models access to external documents, which helps ground responses in actual sources. Chain-of-thought prompting forces models to show their reasoning, which sometimes catches errors before they compound. Human verification remains the most reliable detector.
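For the curious, here is the retrieval-augmented shape stripped down to a sketch. The retriever is a toy keyword-overlap search over an in-memory list, and call_llm is a placeholder for whatever model API a real system would use; both are assumptions for illustration, not any particular product. Real systems swap in a vector database and a proper prompt template, but the structure is the same: fetch trusted text first, then constrain the model to it.

```python
# Minimal sketch of retrieval-augmented generation (toy retriever, stub LLM call).

DOCUMENTS = [
    "Refunds are available within 30 days of purchase with a receipt.",
    "International shipping takes 7 to 14 business days.",
    "Support hours are 9am to 5pm Eastern, Monday through Friday.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the question."""
    words = set(question.lower().split())
    ranked = sorted(DOCUMENTS, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a real model API call."""
    raise NotImplementedError("plug in your model client here")

def answer_with_sources(question: str) -> str:
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    prompt = (
        "Answer using ONLY the sources below. If they do not contain the answer, "
        "say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```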
But these are workarounds. They reduce the rate without eliminating the phenomenon. The architectural limitation persists.
The Implications Nobody Talks About
Most discussions of hallucination end with practical tips. Check your sources. Verify citations. Do not trust blindly. That advice is correct and also misses something deeper.
We are building infrastructure on top of systems that have a nonzero rate of confident fabrication. Not systems that are sometimes uncertain. Systems that are always confident and sometimes wrong in ways indistinguishable from when they are right.
Every industry automating with LLMs is implicitly accepting this. Legal research. Medical triage. Financial analysis. Customer support. Code generation. The efficiency gains are real. So is the embedded hallucination rate.
Elcritch, commenting on LLM code generation, observed that “LLMs will just outright lie to make their jobs easier in one section while in another area generate high quality code.” The same model, the same prompt, inconsistent reliability. Not because something went wrong. Because that is how the system works.
What Hallucinations Teach Us
Hallucinations reveal something about the nature of language that humans rarely confront.
A sentence can be grammatically perfect, semantically coherent, stylistically appropriate, and completely false. The structures of language do not require truth. Persuasive prose does not need to correspond to reality. Authority in text is a performance, not a guarantee.
Humans use context to detect deception. We know the speaker. We know their track record. We know what incentives might motivate them to mislead. We apply skepticism calibrated to the situation.
AI outputs arrive without that context. No track record with this specific query. No incentives we can model. No relationship history. Just text that sounds exactly like text produced by an expert who checked their facts.
The burden shifts entirely to the reader. Every claim becomes suspect until verified independently. Every citation needs checking. Every statistic needs sourcing. The efficiency of AI generation gets partially consumed by the verification overhead.
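As one concrete slice of that overhead: a small script can at least confirm whether a DOI in a generated bibliography resolves against the public Crossref API. This sketch only covers works that carry DOIs; case law, books, and bare URLs each need their own lookup, which is exactly the point.

```python
# Check whether model-generated DOIs resolve via the public Crossref API.
# Crossref returns HTTP 200 for registered DOIs and 404 for unknown ones.
import requests

def doi_exists(doi: str) -> bool:
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

generated_dois = [
    "10.1038/nature14539",       # a real DOI (LeCun et al., "Deep learning", 2015)
    "10.9999/made.up.citation",  # the kind of string a model can invent
]

for doi in generated_dois:
    status = "found" if doi_exists(doi) else "NOT FOUND -- verify by hand"
    print(f"{doi}: {status}")
```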
The Uncomfortable Equilibrium
The models will keep improving. Hallucination rates will keep declining. More sophisticated training will penalize overconfidence. Better architectures may eventually incorporate something like uncertainty quantification.
But the fundamental dynamic remains. These systems predict patterns. Patterns do not equal truth. Some predictions will always land outside the boundaries of fact.
Maybe the real lesson is not about AI at all.
Humans have always operated in environments where confident claims sometimes prove false. We developed institutions to manage this: peer review, editorial oversight, legal discovery, scientific replication. Trust but verify. Consider the source. Check the original.
AI hallucinations do not introduce a new problem. They amplify an old one. They produce plausible-sounding claims at a volume and speed that overwhelm our traditional verification processes.
The lawyer who submitted fake citations did not fail because he used AI. He failed because he trusted without verifying. That failure was possible before ChatGPT existed. It was just slower to commit.
The uncomfortable truth is that hallucinations force us to remember something we have been able to forget: that fluency is not accuracy, that confidence is not correctness, and that the relationship between words and truth has always been more tenuous than we like to admit.
Every sentence you have ever read, including this one, could be wrong.
The question was never whether to trust. It was always how to verify.