Why Do OpenAI’s Flagship Models Keep Making Things Up?
Explore why OpenAI's most advanced models are increasingly producing false or fabricated information.
Troubling Hallucinations in OpenAI’s New AI Models
OpenAI, a leading player in artificial intelligence, recently unveiled two new models, o3 and o4-mini. Despite being among the company's most advanced models to date, both exhibit a significant issue: they tend to "hallucinate".
Understanding AI Hallucinations
In AI terms, "hallucinating" means producing incorrect or misleading output. The phenomenon is not new and has been observed in most existing AI models. What stands out is that o3 and o4-mini hallucinate more frequently than OpenAI's previous models.
Hallucinations can be harmless, as when a chatbot asked to write a poem using only words starting with 'b' slips in the word "tree". But they can also pose real risks, as when an AI recommends bread to a person with gluten intolerance.
Higher Hallucination Rates in o3 and o4-mini Models
OpenAI's own technical report reveals that both models performed poorly on the company's hallucination evaluations. Specifically, the o3 model hallucinated in response to 33% of queries, roughly double the rate of OpenAI's previous models.
This raises concerns about whether such a high rate of hallucinations could be problematic in the future, particularly for businesses considering significant investments in these models.
A Challenge for the Future of AI Models
It is worth noting that these models are still new, and their hallucination rates may improve significantly as testing continues. If the trend persists in OpenAI's future models, however, it could make it considerably harder to convince potential clients of their effectiveness and reliability.