A recent study has uncovered significant accuracy problems with Google's AI Overviews, the summaries that appear at the top of search results. These AI-generated answers are reportedly producing millions of incorrect responses each hour, casting doubt on the reliability of AI-powered search information.

AI Startup's Findings on Google's AI Overviews

Research conducted by the AI startup Oumi analyzed 8,652 answers generated by Google's Gemini AI models, covering both Gemini 2 and the more advanced Gemini 3. While the reported accuracy rates of 85% for Gemini 2 and 91% for Gemini 3 might seem high, they become problematic when scaled to Google's massive search volume.

Extrapolated to Google's projected volume of more than 5 trillion searches in 2026, these error rates could translate to hundreds of thousands of mistakes every minute. The observed inaccuracies ranged from simple factual errors to more complex misinformation.
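To see how modest-sounding accuracy figures compound at that scale, consider a back-of-envelope calculation, sketched below in Python. It assumes, purely for illustration, that every projected search triggers an AI Overview, which overstates real-world exposure but shows the order of magnitude involved.

```python
# Back-of-envelope extrapolation of AI Overview error rates to Google's
# projected 2026 search volume. Illustrative only: assumes every search
# triggers an AI Overview, which overstates real-world exposure.

SEARCHES_PER_YEAR = 5_000_000_000_000  # projected 2026 volume (5 trillion)
MINUTES_PER_YEAR = 365 * 24 * 60       # 525,600

searches_per_minute = SEARCHES_PER_YEAR / MINUTES_PER_YEAR  # ~9.5 million

for model, accuracy in [("Gemini 2", 0.85), ("Gemini 3", 0.91)]:
    error_rate = 1 - accuracy
    errors_per_minute = searches_per_minute * error_rate
    print(f"{model}: ~{errors_per_minute:,.0f} potential errors per minute")
```

Even at the higher Gemini 3 accuracy, this works out to roughly 850,000 potential errors per minute under these assumptions, consistent with the study's "hundreds of thousands of mistakes every minute" framing.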

Examples of AI Overview Inaccuracies

Specific errors highlighted in the study include incorrect dates for Bob Marley's home becoming a museum, wrong death years for former MLB pitcher Dick Drago, and false claims about cellist Yo-Yo Ma's induction into the Classical Music Hall of Fame, which actually took place in 2007.

Concerns for Publishers and Information Integrity

The findings intensify the ongoing conflict between traditional news publishers and Google. Since the introduction of AI Overviews in 2024, organic search links have been pushed lower on the results page, reducing publisher visibility. Publishers accuse Google of using their content to train AI without adequate credit or compensation.

Oumi's research also noted that AI Overviews frequently rely on questionable sources like Facebook pages, blogs, and Wikipedia entries, presenting them as authoritative. This practice heightens the risk of spreading misinformation and makes the system susceptible to manipulation.

System Vulnerability to Manipulation

An example demonstrated how easily the system could be influenced. A blog post written by a BBC podcast host, who presented himself as a top tech journalist, was incorporated into Google's AI answers within a day. The resulting summary falsely stated that the journalist was known for his competitive eating prowess.

Health Information Accuracy Under Scrutiny

The study also flagged significant inaccuracies in health-related summaries, posing potential risks to users. One concerning instance involved incorrect information about liver function tests, an error experts deemed dangerous because of its potential consequences for patients.

When users searched for normal liver blood test ranges, AI Overviews provided numbers without crucial context. The summaries failed to consider vital factors like a patient's nationality, sex, ethnicity, or age, all of which can affect normal test result ranges.
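To make the missing-context problem concrete, the toy lookup below keys a reference range on patient attributes instead of returning a single universal number. The bounds shown are illustrative placeholders, not clinical values; real reference intervals are set by individual laboratories and depend on exactly the factors the study cites.

```python
# Toy illustration of why a context-free lab number misleads: a correct
# answer must be keyed on patient attributes. Bounds are placeholders
# for illustration only, not clinical guidance.

REFERENCE_RANGES: dict[tuple[str, str], tuple[float, float]] = {
    # (test, sex) -> (low, high), hypothetical bounds in U/L
    ("ALT", "female"): (7.0, 33.0),
    ("ALT", "male"): (7.0, 41.0),
}

def normal_range(test: str, sex: str) -> tuple[float, float]:
    """Return (low, high) bounds for a test, given patient context."""
    return REFERENCE_RANGES[(test, sex)]

print(normal_range("ALT", "female"))  # (7.0, 33.0)
```

A summary that reports one range for everyone silently discards the context this lookup requires.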

Google's Response to the Study

Google has contested the study's conclusions, with a spokesperson stating, "This study has serious holes." The company representative added that the research "doesn't reflect what people are actually searching on Google."

Methodology and Key Findings

Oumi's analysis, conducted from October to February, employed SimpleQA, a benchmark developed by OpenAI that serves as a standard method for assessing the factual accuracy of AI models. A key finding was the increase in ungrounded answers, cases where the provided links do not support the AI summary's content. This metric rose from 37% in Gemini 2 to 51% in Gemini 3, indicating a potential trade-off between answer volume and information reliability.
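As a rough sketch of how such metrics are tallied, the Python below scores a batch of graded answers for both accuracy and an ungrounded-answer rate. The record structure and sample data are hypothetical, not Oumi's actual pipeline or the official SimpleQA grader, which judges model answers against reference facts.

```python
# Minimal sketch of tallying SimpleQA-style accuracy alongside an
# "ungrounded answer" rate. The data model here is hypothetical; the
# real benchmark uses graded comparisons against reference answers.

from dataclasses import dataclass

@dataclass
class GradedAnswer:
    correct: bool   # did the summary state the right fact?
    grounded: bool  # do the cited links actually support the summary?

def summarize(results: list[GradedAnswer]) -> dict[str, float]:
    n = len(results)
    return {
        "accuracy": sum(r.correct for r in results) / n,
        "ungrounded_rate": sum(not r.grounded for r in results) / n,
    }

# Toy data illustrating the study's trade-off: answers can be factually
# correct while the links shown alongside them fail to support the claim.
sample = [
    GradedAnswer(correct=True, grounded=True),
    GradedAnswer(correct=True, grounded=False),
    GradedAnswer(correct=False, grounded=False),
    GradedAnswer(correct=True, grounded=False),
]
print(summarize(sample))  # {'accuracy': 0.75, 'ungrounded_rate': 0.75}
```

Measured this way, a model can post a higher accuracy score while its ungrounded rate climbs, which is exactly the pattern reported between Gemini 2 and Gemini 3.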