Someone Created An 'Infinite Maze' To Trap Unsuspecting AI Content Scrapers In Oblivion


AI is only as powerful as the data it’s trained on. The more it learns, the better it performs.

Because of this, AI has become a vital tool in fields like astronomy, space exploration, physics, biology, medicine, environmental science, climate research, mathematics, and beyond. However, its growing presence is making many people uneasy.

One major concern is the rise of AI-powered web crawlers—bots deployed to scour the internet, scraping and consuming vast amounts of online content to fuel AI models. These digital harvesters roam relentlessly, gathering data without pause.

What started with OpenAI’s ChatGPT has triggered an arms race as others have followed, with multiple companies competing for AI dominance. Their justification? More data equals better tools.

But not everyone is on board with this. Many website owners and content creators are pushing back, unwilling to have their work freely harvested to train AI systems without consent.

One of them is a programmer who wants to help end this AI quest by creating and releasing an open-source "tar pit" that indefinitely traps AI training web crawlers in an infinite, randomly generated series of webpages.

The program is called 'Nepenthes,' after the genus of carnivorous tropical pitcher plants that trap and consume their prey. It can be deployed by website owners who wish to protect their content from these AI scrapers.

According to its creator, who goes by the name Aaron B:

"It's also sort of an art work, just me unleashing sheer unadulterated rage at how things are going."

"I was just sick and tired of how the internet is evolving into a money extraction panopticon, how the world as a whole is slipping into fascism and oligarchs are calling all the shots—and it's gotten bad enough we can't boycott or vote our way out, we have to start causing real pain to those above for any change to occur."

With Nepenthes, webmasters and site owners can shield their content from AI scrapers—or even lure them into a trap, forcing them to waste time and drain the computing resources of the companies behind them.

According to its website, the tool can be deployed in two ways.

When deployed defensively, Nepenthes ensures that human visitors can still access the website in its entirety, with all features and functionality intact. Meanwhile, AI crawlers are met with an endless maze, keeping them occupied but gaining nothing of value.

When it detects AI crawlers, Nepenthes will "flood out valid URLs within your site's domain name, making it unlikely the crawler will access real content."

And if deployed offensively, Nepenthes will ignore its list of known crawlers' IP addresses, and have the bots "suck down as much bullshit as they have diskspace for, and choke on it."

The idea is to trap these crawlers in an infinite maze.

"They are still consuming resources, spinning around doing nothing helpful, unless they find a way to detect that they are stuck in this loop," Aaron B added.


Aaron B said:

"It's less like flypaper and more an infinite maze holding a minotaur, except the crawler is the minotaur that cannot get out. The typical web crawler doesn't appear to have a lot of logic."

"It downloads a URL, and if it sees links to other URLs, it downloads those too. Nepenthes generates random links that always point back to itself—the crawler downloads those new links. Nepenthes happily just returns more and more lists of links pointing back to itself."

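The loop Aaron B describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Nepenthes' actual code: the `/maze` prefix, the function names, and the choice to seed the random generator from the URL are all assumptions made for the example.

```python
import hashlib
import random
from collections import deque

MAZE_PREFIX = "/maze"  # hypothetical URL prefix for the trap pages


def maze_links(path: str, n_links: int = 5) -> list[str]:
    """Return pseudo-random links for a page, all pointing back into the maze."""
    # Seed from the path so the same URL always produces the same page.
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [f"{MAZE_PREFIX}/{rng.getrandbits(48):012x}" for _ in range(n_links)]


def naive_crawl(start: str, budget: int) -> int:
    """Follow every link with no trap detection; return the number of pages fetched."""
    queue = deque([start])
    seen = {start}
    fetched = 0
    while queue and fetched < budget:
        url = queue.popleft()
        fetched += 1
        # "Downloading" a page here just means asking the maze for its links.
        for link in maze_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched
```

Because every page yields new links inside the same prefix, the crawler's queue never empties. In this sketch `naive_crawl` stops only when its own `budget` runs out, which stands in for the time and computing resources a real crawler would burn.
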
Tech companies behind many popular AI tools have said that webmasters and site owners can use a robots.txt file to block the web crawlers that gather data to train Large Language Models.

With it, they can ask specific bots not to crawl a webpage or their entire website.

However, different companies use different bots, and the names of those bots often change.

Making things worse, some companies do not obey the rules set in the robots.txt file, or find ways to get around them.
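
To make the robots.txt mechanism concrete, here is a small sketch using Python's standard `urllib.robotparser` module. The rules and the `example.com` URLs are made up for illustration. `GPTBot` is the user-agent name OpenAI publishes for its crawler, while `SomeOtherBot` is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block one named bot everywhere,
# and keep all other bots out of /private/ only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(allowed("GPTBot", "https://example.com/articles/1"))
print(allowed("SomeOtherBot", "https://example.com/articles/1"))
print(allowed("SomeOtherBot", "https://example.com/private/x"))
```

The check runs on the crawler's side. A bot that never consults the file, or that shows up under a name the rules do not list, is not stopped by the first rule, which is the weakness described above.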

Nepenthes tries to end this by putting these pesky pests into a doom loop.

As for the claims that bots can skip over these traps, Aaron B seems confident.

"I've several million lines of access log that says even Google Almighty didn’t graduate," Aaron B said.

Published: 30/01/2025