datarekha
NLP & LLMs · Medium · Asked at OpenAI, Anthropic, Google, Meta

What causes LLM hallucinations and how can they be reduced?

The short answer

Hallucinations occur because an LLM is trained to produce plausible next tokens, not verified facts: it has no built-in truth-checking mechanism, only learned statistical patterns. Common causes include rare or conflicting training data, sampling settings that admit low-probability tokens, and prompts that push the model to extrapolate beyond what it learned. Mitigation strategies include retrieval-augmented generation (RAG), grounding responses in retrieved sources, lowering the sampling temperature, and calibrated refusal training.

How to think about it

An LLM predicts the next token based on patterns learned during pretraining. By default it has no access to a fact database at inference time and no reliable way to distinguish what it knows confidently from what it is confabulating. The result is fluent, confident-sounding text that can be partially or entirely fabricated.

Root causes

Parametric knowledge limits. Facts learned during pretraining are compressed into billions of floating-point weights. Long-tail facts (obscure names, niche statistics) are underrepresented and easily mangled or invented.

Distributional mismatch. If the prompt asks about events after the training cutoff, or in a format the model rarely saw, it fills the gap with the most statistically plausible continuation — which may be wrong.

Sycophancy pressure. RLHF can inadvertently reward confident-sounding answers regardless of accuracy, because human raters often prefer fluent responses over hedged ones.

Decoding dynamics. A high temperature or a permissive nucleus (top-p) threshold increases diversity but also raises the probability of sampling an unlikely token. Once such a token is chosen, the continuation can stay locally coherent while being globally false.
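To make the temperature effect concrete, here is a minimal sketch of temperature-scaled softmax over a toy next-token distribution. The logits and token names are made up for illustration and do not come from any real model; the helper name `softmax_with_temperature` is also hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens (illustrative only).
tokens = ["Paris", "Lyon", "Berlin"]
logits = [5.0, 2.0, 1.0]

for temperature in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    summary = ", ".join(f"{tok}={p:.3f}" for tok, p in zip(tokens, probs))
    print(f"T={temperature}: {summary}")
```

In this toy setup, raising the temperature moves probability mass from the top-ranked token to the lower-ranked ones, which is why a sampled continuation becomes more likely to start from a low-probability token.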

Mitigation strategies

| Technique | How it helps |
| --- | --- |
| Retrieval-augmented generation (RAG) | Grounds answers in fresh, retrieved documents; reduces reliance on parametric memory |
| Citation enforcement | Prompt the model to quote source text; unverifiable claims become visible |
| Temperature reduction | Reduces sampling randomness; model stays closer to its highest-confidence continuations |
| Refusal calibration | Train or prompt the model to say “I don’t know” when evidence is absent |
| Self-consistency / ensemble sampling | Generate N answers; majority vote filters outliers |
| Chain-of-thought prompting | Forces explicit reasoning steps; reasoning errors surface earlier |
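Self-consistency and refusal calibration can be combined in a few lines. The sketch below assumes the N answers have already been sampled from a model and are plain strings; the function name `majority_answer` and the 0.6 agreement threshold are illustrative choices, not a standard.

```python
from collections import Counter


def majority_answer(samples, min_agreement=0.6):
    """Return the most common answer, or None when agreement is too low.

    `samples` stands in for N independently sampled model answers.
    `min_agreement` is an arbitrary illustrative threshold.
    """
    if not samples:
        return None
    # Light normalization so "1889" and "1889 " count as the same answer.
    normalized = [s.strip().lower() for s in samples]
    answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) < min_agreement:
        return None  # Abstain instead of returning a weakly supported answer.
    return answer


# Hard-coded stand-ins for sampled answers (no model is called here).
sampled = ["1889", "1889", "1887", "1889 ", "1889"]
result = majority_answer(sampled)
print(result if result is not None else "I don't know")
```

Majority voting only filters errors that vary between samples. If the model makes the same mistake consistently, the vote will agree on the wrong answer, so this works best alongside retrieval or citation checks.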
