Hallucinations are not a bug. They're the predictable result of training models to always produce confident answers, even when the honest response would be "I'm not sure."
That distinction matters more than it might seem.
The term "hallucination" describes when an LLM generates content that sounds plausible but is factually wrong or unsupported by source material. This covers everything from fabricated citations to subtle distortions of provided documents. Lakera's analysis breaks this into two categories: factuality errors (stating incorrect facts) and faithfulness errors (misrepresenting sources the model was given). The first means the model made something up. The second means it had the right information but mangled it anyway. Both are hallucinations, but they require different fixes.
Rewarding Fluency Over Truth
LLMs are trained on objectives that reward confident-sounding responses. When a model says "I don't know," it typically gets penalized during training. When it produces a fluent, authoritative-sounding answer, it gets rewarded. Whether that answer is true doesn't factor in.
Lakera describes this as an incentive problem. The model is a "sophisticated prediction engine" optimizing for statistically probable text, not factually verified content. It's doing exactly what it was trained to do. The training just didn't prioritize truth.
The Frontiers in Artificial Intelligence survey adds another layer: some hallucinations are prompting-induced (caused by vague or ambiguous prompts) while others are model-intrinsic (baked into the training and architecture). You can fix the prompt. You can't prompt your way out of architectural limitations.
Task Type Changes Everything
The most important finding from recent benchmarks is that hallucination rates vary dramatically based on task type. This isn't a uniform problem.
Analysis of 2026 benchmark data shows top models achieving 0.7-1.5% hallucination rates on grounded summarization tasks, a significant improvement from the 1-3% rates seen in 2024. For constrained tasks where the model has access to source documents, the problem is largely solved.
Now the uncomfortable part: on open-ended reasoning tasks, newer models are actually performing worse. The OpenAI o3 series shows 33-51% hallucination rates on factual-recall benchmarks like PersonQA, where the model must rely on its training data rather than provided context.
Bigger and more capable does not mean more reliable on these tasks.
So hallucinations become a situational risk rather than a blanket concern. The question isn't "does this model hallucinate?" but "does this model hallucinate on this specific type of task?"
Across the research, the effectiveness ranking of mitigation techniques is fairly clear. RAG grounding provides the biggest reduction: giving the model relevant documents to work from, rather than relying on parametric knowledge, reduces hallucinations by 40-71% depending on the study. Glean's enterprise analysis notes that organizations with high-quality underlying data achieve 95%+ accuracy, while those with poor data quality see accuracy drop to 60-70%. Garbage in, garbage out still applies.
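The grounding idea is simple enough to sketch. Here's a minimal illustration using a toy keyword-overlap retriever; a production system would use an embedding index, and `call_llm` is a hypothetical stand-in for whatever model API you use.

```python
# Minimal sketch of RAG grounding: retrieve relevant passages and
# instruct the model to answer only from them. The retriever is a toy
# keyword-overlap scorer, not a real search component.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(documents, key=lambda d: -len(q_terms & set(d.lower().split())))
    return scored[:k]

def grounded_prompt(query: str, documents: list[str]) -> str:
    """Build a prompt that restricts the model to retrieved context."""
    context = "\n\n".join(retrieve(query, documents))
    return (
        "Answer using ONLY the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "The refund policy allows returns within 30 days of purchase.",
    "Shipping is free on orders over $50.",
]
prompt = grounded_prompt("What is the refund policy?", docs)
# answer = call_llm(prompt)  # hypothetical model call
```

The instruction to refuse when the context is silent matters as much as the retrieval itself: it gives the model an explicit alternative to fabricating an answer.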
Chain-of-thought prompting roughly halves error rates. The same Glean analysis found that chain-of-thought reduced hallucinations from 38.3% to 18.1%. Forcing the model to show its reasoning seems to reduce confident bullshitting. This is free and easy to implement.
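In practice this can be as simple as wrapping the question with a reasoning instruction and parsing out the conclusion. A sketch, where the "Final answer:" marker is an assumed convention (not any library's API):

```python
# Sketch of chain-of-thought prompting: ask the model to reason step by
# step before answering, then parse the final line out of the response.

def cot_prompt(question: str) -> str:
    return (
        f"{question}\n\n"
        "Think through this step by step, showing your reasoning. "
        "Then give your conclusion on a line starting with 'Final answer:'."
    )

def extract_final_answer(response: str) -> str:
    """Pull the conclusion out of a step-by-step response."""
    for line in reversed(response.splitlines()):
        if line.startswith("Final answer:"):
            return line.removeprefix("Final answer:").strip()
    return response.strip()  # fall back to the whole response

# response = call_llm(cot_prompt("..."))  # hypothetical model call
```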
Targeted fine-tuning shows dramatic results in controlled studies. One study cited by Lakera found 90-96% reduction using hallucination-focused training datasets. Medical QA research showed prompt-based mitigation reducing rates from 53% to 23%. The catch: this requires significant investment and domain expertise.
Human oversight remains essential for high-stakes decisions. The arXiv survey documents various detection approaches (uncertainty estimation, consistency checking, retrieval verification, internal probing) but none are reliable enough to fully automate quality control. Human-in-the-loop workflows aren't a failure of the technology; they're acknowledgment of its current limits.
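One of those detection approaches, consistency checking, is easy to sketch: sample the same question several times at nonzero temperature and flag answers the model can't reproduce. The answer list below would come from repeated model calls; the 0.7 threshold is illustrative.

```python
from collections import Counter

# Sketch of consistency checking: answers the model gives unreliably
# across samples are routed to human review instead of auto-accepted.

def agreement(answers: list[str]) -> float:
    """Fraction of samples that match the most common answer."""
    if not answers:
        return 0.0
    _, count = Counter(answers).most_common(1)[0]
    return count / len(answers)

def needs_review(answers: list[str], threshold: float = 0.7) -> bool:
    """Flag low-agreement answers for a human-in-the-loop check."""
    return agreement(answers) < threshold
```

Low agreement doesn't prove a hallucination, and high agreement doesn't rule one out; that gap is exactly why the survey concludes none of these methods can fully automate quality control.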
None of these are silver bullets. But stacked together, they make hallucinations a manageable engineering problem rather than an existential one.
Teaching Models to Say "I'm Not Sure"
The mature approach isn't eliminating hallucinations but managing them through calibrated uncertainty. Instead of training models to always sound confident, the goal becomes models that signal when they're uncertain.
This means confidence scores attached to outputs. It means models that say "I found three sources that partially address this, but none directly answer your question" rather than synthesizing a confident-sounding response from fragments. It means treating LLM outputs as starting points for verification, not final answers. Lakera's analysis frames this as an industry shift from a "zero hallucinations" goal to transparent confidence scoring. The former is impossible given how these systems work. The latter is achievable and genuinely useful.
If you're building on LLMs, the practical implications are straightforward.
Match the task to the risk profile. Grounded summarization with quality source documents is reliable. Open-ended factual recall is not. Design your system accordingly.
Implement RAG for anything factual. Don't rely on the model's training data for facts that matter. Provide the documents. This single change addresses most production hallucination issues.
Use chain-of-thought prompting because it's free and measurably reduces errors. Build verification into your workflow because for high-stakes outputs, assuming the model may be wrong isn't pessimism; it's engineering.
And watch the confidence signals. As models increasingly support calibrated uncertainty, use those signals. A model that says "I am 60% confident" is more useful than one that's always 100% confident and wrong 30% of the time.
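Using those signals can be as simple as routing on the score. A sketch, with illustrative thresholds (tune them against your own risk tolerance and measured calibration, not these numbers):

```python
# Sketch of routing on a calibrated confidence score. Thresholds are
# illustrative placeholders, not recommendations.

def route(answer: str, confidence: float) -> str:
    """Decide how to surface an answer given the model's confidence."""
    if confidence >= 0.9:
        return answer                                # surface directly
    if confidence >= 0.6:
        return f"{answer} (confidence {confidence:.0%}; verify before use)"
    return "Uncertain; escalating to human review"   # don't surface at all
```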
Our read: The hallucination problem isn't going away. But the framing has shifted from "when will AI stop lying?" to "how do we build reliable systems on probabilistic foundations?" That's a more productive question, and we're making real progress on answers.