OpenAI appears to have tuned its latest models—o3-pro, o4-mini, and GPT-4o—to be overconfident, pushing them to answer questions they don't actually know the answers to.
According to OpenAI's own system cards, these newer models hallucinate more often than their predecessors, even as they decline to answer less often.
Just a few days ago, OpenAI launched its most powerful model yet: o3-pro. But despite the buzz, third-party evaluations suggest the same pattern of overconfident, hallucination-prone answers persists.
The underlying cause appears straightforward: the newer models are more confident, even when they shouldn't be. Compared to earlier models, they are more inclined to answer a question than to admit uncertainty, which shows up in their lower non-response rates (the gray bars). As a result, they produce more answers overall, both right and wrong; that is why accuracy and hallucination rates rise at the same time. The problem is that these forced answers are more likely to be wrong than right, so the rise in hallucinations ends up outpacing any gain in accuracy. The trends are even more pronounced on the PersonQA dataset. (You can switch the figure above to this dataset by selecting it from the dropdown menu at the top right corner.)
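To make the arithmetic concrete, here is a minimal sketch with invented numbers (not OpenAI's published figures) of how suppressing refusals can raise accuracy and hallucination at the same time:

```python
# Toy illustration with made-up numbers, not OpenAI's published figures.
# An "old" model refuses often; a "new" model is tuned to always answer.
# If the forced answers are right less than half the time, hallucination
# rises faster than accuracy.

def rates(correct, wrong, refused):
    total = correct + wrong + refused
    return {
        "accuracy": correct / total,
        "hallucination": wrong / total,
        "non_response": refused / total,
    }

# Old model: answers 60 of 100 questions, refuses 40.
old = rates(correct=40, wrong=20, refused=40)

# New model: the 40 former refusals become forced answers,
# and only 30% of those turn out to be correct.
new = rates(correct=40 + 12, wrong=20 + 28, refused=0)

print(old)  # accuracy 0.40, hallucination 0.20, non_response 0.40
print(new)  # accuracy 0.52, hallucination 0.48, non_response 0.00
# Accuracy gains 12 points, but hallucination gains 28 points.
```

Any forced answer that is correct less than half the time adds more to the hallucination rate than to the accuracy rate, which is exactly the pattern in the benchmark results.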
A similar trend of rising overconfidence and hallucination shows up across OpenAI's recent releases. So is this simply a side effect of the new reasoning-focused training?
Probably not. The tendency to hallucinate isn't limited to OpenAI's reasoning-focused models. Even GPT-4o, which is not designed for complex reasoning, shows the same behaviour: fewer refusals to answer, but more hallucinations. This points to a deeper issue that likely stems not from the reasoning mechanism itself but from something earlier in the training pipeline. We can narrow the scope further: whatever changed, GPT-4.5 avoided it, since that model manages to reduce the non-response rate while also lowering hallucinations, a sign of genuine confidence rather than forced guessing. So what exactly did OpenAI change in its latest training or alignment process? And why? These key questions remain unanswered, though it's likely some researchers at OpenAI are already trying to figure that out.
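For contrast, the same toy arithmetic (again with invented numbers) shows what the GPT-4.5 pattern implies: lowering non-response and hallucination together is only possible if the model also converts formerly wrong answers into correct ones.

```python
# Toy illustration with made-up numbers, not OpenAI's published figures.
# "Genuine confidence": the model answers more often AND is right more often.

def rates(correct, wrong, refused):
    total = correct + wrong + refused
    return {
        "accuracy": correct / total,
        "hallucination": wrong / total,
        "non_response": refused / total,
    }

# Baseline model on 100 questions.
base = rates(correct=40, wrong=20, refused=40)

# A genuinely improved model: it attempts 20 of the former refusals
# and gets 18 right, and it also fixes 10 previously wrong answers.
better = rates(correct=40 + 18 + 10, wrong=20 + 2 - 10, refused=20)

print(base)    # accuracy 0.40, hallucination 0.20, non_response 0.40
print(better)  # accuracy 0.68, hallucination 0.12, non_response 0.20
# Non-response and hallucination both fall; only real capability
# gains, not forced guessing, can produce this pattern.
```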
OpenAI’s recent overconfidence issue is a reminder of how delicate the trade-offs in LLM development can be. Making a model “more helpful” can easily tip into making it “more wrong.” Progress doesn’t always come in a straight line—and sometimes, unintended consequences sneak in through the very improvements we aim to make.