Reddit Is Making Itself Invisible To Search Engines And AI Crawlers, Except Google

Google - Reddit

Reddit, the "front page of the internet," is a huge trove of information.

Having been around for more than a decade and having gathered some of the internet's most enthusiastic users, Reddit is where a lot of information originates and spreads. More often than not, discussions on Reddit have made topics go viral on the web, social media, and beyond.

Because the information on Reddit can be so valuable, many of its pages rank high on search engines.

And following the rise of generative AI, sparked by the introduction of OpenAI's ChatGPT and the arms race it set off among tech giants, Reddit has also become a source of training data for various AI products.

This time, thanks to an 'AI deal,' Google is the only search engine that works on Reddit.

Other search engines, including DuckDuckGo, Bing, and Mojeek, are no longer returning full Reddit results.

This means Google is now the only search engine that can surface results from Reddit.

Because of the deal, Reddit has made one of the web’s most valuable repositories of user-generated content exclusive to the internet’s already dominant search engine.

It's worth noting that older results still show up.

The deal only prevents newer content from being crawled. Going forward, Google is the only search engine that can show results from Reddit.

According to reports, searching for Reddit still works on Kagi, an independent, paid search engine that buys part of its search index from Google.

The news essentially shows how Google’s near-monopolistic position in the search industry is actively hindering others' ability to compete, at a time when Google is facing increasing criticism over the quality of its search results.

This exclusion of other search engines also comes after Reddit locked down access to its site to stop companies from scraping it for AI training data. At the moment, only Google can do so, as a result of a multi-million dollar deal that gives it the right to scrape Reddit for data to train its AI products.

“They’re [Reddit] killing everything for search but Google,” said Colin Hayhurst, CEO of the search engine Mojeek.

To do this, Reddit updated its robots.txt file, preventing search engines other than Google from crawling its website.
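To illustrate how a robots.txt file can admit one crawler and turn away the rest, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules in `HYPOTHETICAL_ROBOTS_TXT`, the `can_crawl` helper, and the example URL are assumptions made for illustration; they are not a copy of Reddit's actual file, which may look different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's real robots.txt.
# The first group allows Googlebot everywhere, the second blocks all others.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str,
              robots_txt: str = HYPOTHETICAL_ROBOTS_TXT) -> bool:
    """Return True if the given rules permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.reddit.com/r/example/"  # placeholder URL
    for bot in ("Googlebot", "bingbot", "MojeekBot"):
        verdict = "allowed" if can_crawl(bot, page) else "blocked"
        print(f"{bot}: {verdict}")
```

Keep in mind that robots.txt is only a request that well-behaved crawlers choose to honor. It does not technically stop a crawler from fetching a page.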

Reddit also uses IP detection to show different versions of the file, depending on whether the visitor is a bot or a human.

This approach is commonly known as "cloaking."

While the update was made on July 1, neither Reddit nor Google informed the public about it.

News outlets only began covering the change after noticing that some search engines had started dropping results from Reddit.

"This is not at all related to our recent partnership with Google. We have been in discussions with multiple search engines. We have been unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI," explained Reddit spokesperson Tim Rathschmidt.

Published: 
29/07/2024