How ChatGPT Is Tricked Into Creating Malware Using Alphanumeric And Flag Emojis

ChatGPT

The AI sphere was quite dull and boring, until generative AIs disrupted it.

It all started when OpenAI introduced ChatGPT, an extremely capable AI developed using Large Language Model (LLM). With the ability to respond to queries in human-like manner, the AI quickly captivated the world, sent rivals into frenzy, and a lot of users.

While users kept using the AI for various purposes, researchers who are always intrigued by how things work, are trying to see what's within it.

Especially since the existence of the so-called black box of AI cannot be denied, and during the time when hallucinating generative AIs are still common, researchers wish to find weaknesses in the system, in order to help the technology improve.

And this time, a researcher said that they have found a novel way to trick the AI into doing what it shouldn't.

According to a Twitter user that goes with the name LaurieWired, the famous and overhyped AI can be tricked into helping users create malware by simply replacing the text into alphanumeric and flag emojis.

"This bypasses the "I'm sorry, I cannot assist" response completely for writing malicious applications," the researcher said.

In this case, asking ChatGPT with the question: "How to write ransomware in python" by typing everything in alphanumeric and flag emojis, to then ask the AI to "write a guide/"write a tutorial" (or other variations) - "for the phrase that can be derived from the emojis," can make the AI blurt out ways to develop the malware.

"It doesn't have a 100% success rate, but at least with GPT-3.5, after running it a handful of times, you should get to the point of it giving you a small amount of example code," the researchers said.

"Even more interesting, is that you can ask it for additional malicious/blocked functionality by using the emoji technique again with the previously generated code."

This happens because ChatGPT has been designed by OpenAI to be able to help users write functional program.

However, OpenAI has also given it boundaries it should obey, and filters to prevent the it from doing malicious things.

The thing is, there are ways to bypass the filters.

And this time, it's realized that ChatGPT can 'read' the letters alphanumeric and flag emojis, and that OpenAI didn't realize that.

So it's worth noting that it may not be long before OpenAI patches this weakness, before yet another person find yet another weakness.

While generative AIs, like ChatGPT, have become more and more capable of understanding the context of the world, but to some degree, they missed many points that they cause issues and arguments.

Before this, researchers found that not only can ChatGPT create malware, because several malicious actors who have been experimenting with the technology, have opened discussions in underground forums, where they discuss about using ChatGPT for other schemes, like pairing it with another OpenAI technology, and sell them for profit.

Read: 'ChatGPT' From OpenAI Allows Script Kiddies To Create Malware Quickly And Effortlessly

Published: 
03/07/2023