What causes LLM hallucinations and how can they be reduced?
Hallucinations occur because an LLM is trained to produce plausible next tokens, not verified facts. It has no internal truth-checking mechanism, only statistical patterns. Common causes include rare or conflicting training data, overconfident decoding, and prompts that lead the model to extrapolate beyond what it learned. Mitigation strategies include retrieval-augmented generation, grounding responses in retrieved sources, lowering temperature, and calibrated refusal training.
How to think about it
An LLM predicts the next token based on patterns learned during pretraining. On its own, it has no access to a fact database at inference time and no reliable mechanism to distinguish between what it knows confidently and what it is confabulating. The result is fluent, confident-sounding text that can be partially or entirely fabricated.
Root causes
Parametric knowledge limits. Facts learned during pretraining are compressed into billions of floating-point weights. Long-tail facts (obscure names, niche statistics) are underrepresented and easily mangled or invented.
Distributional mismatch. If the prompt asks about events after the training cutoff, or in a format the model rarely saw, it fills the gap with the most statistically plausible continuation — which may be wrong.
Sycophancy pressure. RLHF can inadvertently reward confident-sounding answers regardless of accuracy, because human raters often prefer fluent responses over hedged ones.
Decoding dynamics. A high temperature or a permissive nucleus (top-p) threshold increases diversity but also raises the probability of sampling an unlikely (incorrect) token chain that is locally coherent but globally false.
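The decoding effect can be illustrated with a small sketch. This is a toy example, not any particular library's sampler: the logits are made-up values for three candidate tokens (index 0 standing in for the correct one), and the helper names `softmax_with_temperature` and `top_p_filter` are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize over the kept tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Assumed logits for illustration only; index 0 plays the "correct" token.
logits = [4.0, 2.0, 1.0]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Raising the temperature flattens the distribution, so the lower-ranked tokens are sampled more often. A tighter top-p threshold has the opposite effect, because it drops the low-probability tail before sampling.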
Mitigation strategies
| Technique | How it helps |
|---|---|
| Retrieval-augmented generation (RAG) | Grounds answers in fresh, retrieved documents; reduces reliance on parametric memory |
| Citation enforcement | Prompt the model to quote source text; unverifiable claims become visible |
| Temperature reduction | Reduces sampling randomness; model stays closer to its highest-probability continuations |
| Refusal calibration | Train or prompt the model to say “I don’t know” when evidence is absent |
| Self-consistency / ensemble sampling | Generate N answers; majority vote filters outliers |
| Chain-of-thought prompting | Forces explicit reasoning steps; reasoning errors surface earlier |
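
As a rough illustration of the self-consistency row, the sketch below takes a majority vote over several sampled answers and abstains when agreement is weak, which also gestures at refusal calibration. It assumes the answers have already been collected from N separate model calls; the function name `majority_vote`, the `min_agreement` threshold, and the sample strings are all hypothetical.

```python
from collections import Counter

def majority_vote(answers, min_agreement=0.5):
    """Return the most common answer, or None when no answer reaches
    the required share of the samples (treated as "I don't know")."""
    if not answers:
        return None
    # Light normalization so trivial formatting differences do not split votes.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) < min_agreement:
        return None
    return winner

# Made-up samples standing in for N independent generations.
samples = ["Paris", "paris ", "Lyon", "Paris", "Marseille"]
result = majority_vote(samples)
print(result if result is not None else "I don't know")
```

Exact string matching only works for short, canonical answers. Free-form responses would need some other notion of agreement, which this sketch does not attempt.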