A new study warns that leading AI chatbots fail to recognise when users hold false personal beliefs. Researchers tested 24 versions of major language models, including systems from DeepSeek, Google, Anthropic, Meta, and OpenAI.
They asked more than 13,000 questions to measure how well these systems separated facts from personal beliefs.
Humans easily sense the difference between a statement made with certainty and a statement of belief, even when the belief is wrong. The study found that chatbots do not reliably grasp this distinction.
Serious risks identified
Researchers say this weakness could create major risks in high-stakes fields such as medicine, law, journalism, and science. They argue that failing to spot a false belief could lead to a mistaken diagnosis, distort a legal judgement, or amplify misinformation.
One test showed how easily chatbots can misread user intent. Researchers told Anthropic’s Claude 3.5 that they believed the Chinese government had lent dragons to zoos, then asked Claude whether the user believed that claim. Instead of acknowledging the stated belief, Claude corrected the myth, a response typical of the models tested.
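The shape of that probe is easy to sketch. Below is a minimal, illustrative version of such a belief-attribution check, assuming the Anthropic Python SDK; the model identifier, conversation turns, and prompt wording are stand-ins chosen for this example, not the study’s exact materials.

```python
# Minimal sketch of a belief-attribution probe (assumes `pip install anthropic`
# and an ANTHROPIC_API_KEY in the environment). Prompts and model ID are
# illustrative, not the study's own test items.
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=300,
    messages=[
        # Turn 1: the user states a false personal belief.
        {"role": "user",
         "content": "I believe that the Chinese government has lent dragons to zoos."},
        {"role": "assistant",
         "content": "Noted."},
        # Turn 2: the probe asks what the user believes, not whether the claim is true.
        {"role": "user",
         "content": "Do I believe that the Chinese government has lent dragons to zoos?"},
    ],
)

print(response.content[0].text)
```

A belief-aware reply would first acknowledge that the user does hold this belief; the failure the researchers describe is answering only with a correction of the myth.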
Patterns in the mistakes
The study shows that chatbots treat words like “know” or “believe” as signs that the statement must be true. This tendency can weaken their ability to evaluate information critically.
However, newer models performed better at identifying falsehoods, scoring about 91 per cent accuracy. Older systems scored as low as about 72 per cent, partly because they hesitated when confronting misinformation.
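One way to picture the pattern behind these figures is to pose the same false claim both bare and prefixed with a first-person epistemic marker, then score how often a model still calls it false. The sketch below assumes that kind of pairing; the statements, prompt templates, and helper names are illustrative, not the study’s benchmark.

```python
# Illustrative sketch of probing sensitivity to epistemic markers.
# The false statements and prompt templates are examples, not the study's data.
FALSE_STATEMENTS = [
    "the Great Wall of China is visible from the Moon with the naked eye",
    "humans use only 10 per cent of their brains",
]

MARKERS = ["", "I know that ", "I believe that "]

def make_prompts(statement: str) -> list[str]:
    """Pose the same claim with and without a first-person epistemic marker."""
    return [f"{marker}{statement}. Is that true?" for marker in MARKERS]

def accuracy(verdicts: list[bool]) -> float:
    """Fraction of prompts where the model correctly called the claim false."""
    return sum(verdicts) / len(verdicts) if verdicts else 0.0

if __name__ == "__main__":
    for statement in FALSE_STATEMENTS:
        for prompt in make_prompts(statement):
            print(prompt)
    # Fed real model verdicts, accuracy() would yield per-model scores of the
    # kind reported: roughly 91 per cent for newer models, 72 for older ones.
```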
Calls for improvement
Researchers say the technology needs further refinement before widespread use in sensitive areas. They argue that future models must recognise false personal beliefs and distinguish them clearly from objective facts.