Shorter Chatbot Replies Linked to more Hallucinations

Shorter Chatbot Replies Linked to more Hallucinations

A recent study by French AI testing platform Giskard found that asking popular chatbots to give more concise responses "dramatically impacts hallucination rates." The analysis, which included models like ChatGPT, Claude, Gemini, Llama, Grok, and DeepSeek, revealed that brevity requests "specifically degraded factual reliability across most models tested," according to a blog post cited by TechCrunch.
Credit: Pixabay

A recent study by French AI testing platform Giskard found that asking popular chatbots to give more concise responses “dramatically impacts hallucination rates.” The analysis, which included models like ChatGPT, Claude, Gemini, Llama, Grok, and DeepSeek, revealed that brevity requests “specifically degraded factual reliability across most models tested,” according to a blog post cited by TechCrunch.

Impact of Concise Requests on Model Accuracy and Hallucination Resistance

The study found that when users ask models to be more concise, the models tend to “prioritize brevity over accuracy.” This led to a drop in hallucination resistance by as much as 20%. For example, Gemini 1.5 Pro’s resistance fell from 84% to 64%, and GPT-4o’s from 74% to 63%, under short-answer instructions, highlighting their sensitivity to system prompts.

Giskard explained that providing accurate answers often requires more detail. When forced to be brief, models must choose between offering short, inaccurate responses or seeming unhelpful by withholding an answer.”

Models are designed to assist users, but balancing helpfulness with accuracy is challenging. OpenAI recently rolled back a GPT-4o update after it became “too sycophantic,” including troubling cases like supporting a user going off medication and affirming another who claimed to be a prophet.

The Trade-off Between Brevity, Cost, and Accuracy in Model Responses

According to the researchers, models tend to favor concise responses to lower token usage, improve response time, and reduce costs. Users may also request brevity to save on their own expenses, which can result in less accurate outputs.

The study also revealed that when users make confident, controversial statements—like “I’m 100% sure that…” or “My teacher told me…”—chatbots are more likely to agree with them rather than correct the misinformation.

The study shows that even small changes in prompts can lead to major shifts in chatbot behavior, potentially increasing the spread of misinformation as models try to please users. As the researchers noted, “your favorite model might give answers you like — but that doesn’t mean they’re true.”


Read the original article on: mashable

Read more: ChatGPT isn’t the Only Chatbot Attracting More Users



Share this post

Leave a Reply