How OpenAI's 'DALL·E 2' Invented Its Own Language That Nobody Can Understand


Language is the key to delivering information in a form others can understand. It can take many forms, and AIs apparently know that too.

Humans have long used language to explain things. Across speech, sign and writing, civilizations have known many languages, which are commonly categorized as living, extinct, ancient, historic or constructed.

Because language is a structured system, researchers realized that computers can learn it, and have built text-recognition and text-generation software, translators and so forth.

But later, researchers were baffled when they realized that computers can also create a new language on their own.

What makes it even more eerie is that no one other than the computer itself can understand that language.

Now, that rare occurrence has happened again.

OpenAI is among the pioneers in the AI field.

The research laboratory previously created GPT-3, an AI capable of producing text rich with context, nuance and even humor.

This successor to GPT-2 is more than 100 times larger, and it changed how people perceive AI, driving much of the hype around text-generation systems.

Besides these two and some other products, OpenAI also has what it calls 'DALL·E', an AI that is meant to be the 'GPT' for images.

The updated version is called 'DALL·E 2', which generates images at a much higher resolution and with lower latency than the original system.

DALL·E 2, which can generate realistic or artistic images from user-entered text descriptions, has set a new benchmark for how machines should interpret text.

DALL·E 2 represents a milestone in machine learning, with OpenAI’s site saying that the program "learned the relationship between images and the text used to describe them."

And now, a researcher has discovered that the AI behind DALL·E 2 is apparently smarter than that.

The system showed a strange behavior: it appears to write in its own language, using seemingly random arrangements of letters, and the researcher doesn't know why.

Giannis Daras, a computer science Ph.D. student at the University of Texas at Austin, published a Twitter thread detailing DALL·E 2’s unexplained new language.

He said that he found this when he asked DALL·E 2 to create an image of "farmers talking about vegetables."

When DALL·E 2 returned the image, the farmers’ speech read “vicootes.”

Curious about this unknown word, Daras fed it back to DALL·E 2, and the system returned pictures of vegetables.

"We then feed the words: ‘Apoploe vesrreaitars’ and we get birds," Daras wrote on Twitter.

In other words, the strings in question appear to be words from some unknown AI vocabulary.

"It seems that the farmers are talking about birds, messing with their vegetables!" he added.

Daras and a co-author have written a paper on DALL·E 2’s "hidden vocabulary."

They acknowledge that telling DALL·E 2 to generate images of words, for example with the command "an image of the word airplane," normally results in DALL·E 2 returning "gibberish text."

But when that text is fed back into DALL·E 2, that gibberish text will result in images of airplanes.
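The round trip described above can be sketched as a small consistency check: generate several images from one candidate string and see whether a single concept dominates. This is only an illustration under stated assumptions. `toy_generate` is a hypothetical stand-in that returns made-up labels instead of images, the string "xqzt lorp" and all label counts are invented for the example, and nothing here calls DALL·E 2 or reproduces measurements from the paper.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A generator takes a prompt and a sample count and returns one label per image.
Generator = Callable[[str, int], List[str]]


def dominant_concept(generate: Generator, prompt: str, n: int = 10) -> Tuple[str, float]:
    """Return the most frequent label and the share of samples carrying it."""
    labels = generate(prompt, n)
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def toy_generate(prompt: str, n: int) -> List[str]:
    """Hypothetical stand-in for an image model plus a labeling step.

    The canned patterns are invented for illustration, not observed results.
    """
    canned = {
        "Apoploe vesrreaitars": ["bird", "bird", "bird", "insect"],
        "xqzt lorp": ["car", "tree", "dog", "sky"],  # made-up noise string
    }
    pattern = canned.get(prompt, ["unknown"])
    return [pattern[i % len(pattern)] for i in range(n)]


label, share = dominant_concept(toy_generate, "Apoploe vesrreaitars", n=8)
noise_label, noise_share = dominant_concept(toy_generate, "xqzt lorp", n=8)
print(label, share)
print(noise_label, noise_share)
```

A string that behaves like a hidden word should give a high share for one concept, while random noise should spread across many. In practice, the threshold for "high" and the labeling step would both need careful choices.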

Put another way, the gibberish text that no one understands says something about the way DALL·E 2 talks to itself and thinks.

After Daras revealed his findings, many were skeptical.

Some researchers suggested that what Daras found was merely random noise.

But if Daras is correct, he has not only found yet another instance of AI inventing its own language, but also uncovered security implications for the DALL·E 2 text-to-image generator.

"The first security issue relates to using these gibberish prompts as backdoor adversarial attacks or ways to circumvent filters," he wrote in his paper. "Currently, Natural Language Processing systems filter text prompts that violate the policy rules and gibberish prompts may be used to bypass these filters."

"More importantly, absurd prompts that consistently generate images challenge our confidence in these big generative models."

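The filter concern in the quotes above can be illustrated with a toy keyword filter. This is a minimal sketch under loud assumptions: `BLOCKED_WORDS` and `passes_filter` are hypothetical names, "birds" merely stands in for a disallowed concept, and real prompt filters are not public and are likely far more sophisticated than a word list.

```python
import re

# Hypothetical policy list; "bird"/"birds" stand in for a disallowed concept.
BLOCKED_WORDS = {"bird", "birds"}


def passes_filter(prompt: str) -> bool:
    """Return True if no blocked word appears in the prompt (toy keyword check)."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return not any(word in BLOCKED_WORDS for word in words)


# A plain-language prompt is caught by the word list...
print(passes_filter("a photo of birds"))
# ...while the gibberish string Daras reported contains no listed word.
print(passes_filter("Apoploe vesrreaitars"))
```

If the gibberish string reliably produces the same concept as the blocked word, a filter that only inspects the prompt text would let it through, which is the bypass the paper warns about.
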
Before this, the most notable AI to create its own language was Facebook's chatbot AI.

At that time, researchers used machine learning algorithms to improve the chatbots by having them work through an imaginary negotiation scenario.

In that scenario, each agent had distinct hidden preferences. Over the course of the interactions, the bots naturally adopted many common negotiation tactics found in humans, and also developed their own language.

Published: 06/06/2022