How ChatGPT Search Suffers From Inconsistency, As Its AI Remains 'Unpredictable' And 'Inaccurate'


From its inception, Google moved swiftly, quickly becoming the largest internet search engine in the world.

It remains the go-to tool for finding virtually anything the World Wide Web has to offer. Google has left all its competitors in the dust, surpassing them by vast margins. Google Search is king, and nothing comes close.

But when OpenAI introduced ChatGPT, it sparked a competition like never before. Soon, tech companies large and small were engaged in an arms race toward AI supremacy.

Google was scared. ChatGPT triggered a "code red" at the company, because Google knew how a generative tool with natural-language abilities could change the way people look for information online.

Then, OpenAI introduced ChatGPT Search, its own take on an internet search engine, powered by its AI models.

For Google, this was its nightmare that turned into reality.

And that’s when people realized: a Large Language Model used as a search engine is still far from matching the reliability of Google-level results.

According to tests by Columbia’s Tow Center for Digital Journalism researchers, OpenAI’s ChatGPT search tool has significant issues with providing accurate information.

Launched for subscribers in October, OpenAI touted the tool as a source for "fast, timely answers with links to relevant web sources."

However, the researchers found that ChatGPT Search struggles to accurately identify quotes from articles, even from publishers that had agreements to share data with OpenAI.

The researchers tested the tool by asking ChatGPT to identify the source of "two hundred quotes from twenty publications."

Forty of those quotes came from publishers that had blocked OpenAI’s search crawler from accessing their sites.


Despite this, the chatbot confidently provided incorrect information, rarely acknowledging uncertainty about the details it presented.

Here, the researchers noted that despite its confident tone, the chatbot behind OpenAI's AI-powered search engine gave "partially or entirely incorrect responses" to the majority of their requests.

OpenAI does allow publishers to decide whether their content should be included in ChatGPT Search's results, by specifying their preferences in a robots.txt file on their websites.
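As a rough sketch of how that opt-out works: robots.txt is a plain-text file at a site's root that names a crawler's user agent and the paths it may not fetch. OpenAI has documented distinct user agents for its crawlers (the names below reflect OpenAI's published documentation, but publishers should verify them against the current docs), so a site could, for example, block search crawling while the rest of its rules stay untouched:

```text
# Hypothetical robots.txt excerpt — user-agent names per OpenAI's crawler docs

User-agent: OAI-SearchBot   # crawler used for ChatGPT Search
Disallow: /                 # opt this site out of search results

User-agent: GPTBot          # crawler used to gather training data
Disallow: /                 # opt out of training as well
```

Note that robots.txt is a voluntary convention: it expresses a preference that well-behaved crawlers follow, rather than a technical barrier.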

But regardless of whether OpenAI obeys the rules given to its crawlers, the nature of Large Language Models means ChatGPT retains its tendency to hallucinate.

Even when it lacks the information needed to answer users, ChatGPT Search will invent or otherwise misrepresent information, and make everything sound convincing.

And even when publishers do permit ChatGPT Search to crawl their websites, the LLM still does not cite them correctly every time.

When prompted with the same query multiple times, ChatGPT might answer correctly on one occasion and incorrectly on another.


The researchers said:

"What we found was not promising for news publishers. Though OpenAI emphasizes its ability to provide users “timely answers with links to relevant web sources,” the company makes no explicit commitment to ensuring the accuracy of those citations. This is a notable omission for publishers who expect their content to be referenced and represented faithfully."

The issue stems from the fact that traditional search engines, like Google and Bing, give a visual indication when they have located sources to answer users' queries.

If they cannot, they will show a message to inform users that there are no results.

LLMs like ChatGPT, however, are not known to give "no" as an answer.

ChatGPT will do whatever it can to answer a user's query, as long as OpenAI's policy allows it, even when there is no answer to that query.

Instead of saying nothing, the LLM will resort to making false assertions.

This lack of transparency about its confidence in an answer can make it difficult for users to assess the validity of a claim and understand which parts of an answer they can or cannot trust.


As a result, ChatGPT’s false confidence can mislead users, and also cause reputational damage to publishers.

The tests suggest that ChatGPT shows a "great deal of variability in the accuracy" and that its output "doesn't neatly match up with publishers' crawler status or affiliation with OpenAI."

"Enabling crawler access does not guarantee a publisher's visibility in the OpenAI search engines either," the researchers continued.

If OpenAI is serious about sustaining a good-faith collaboration with news publishers, it would do well to ensure that its search product represents and cites their content accurately and consistently, and clearly states when the answer to a user's query cannot be accessed.

Published: 
04/12/2024