Researchers Create Hate Speech Dataset For AIs To Learn From, Drawn From Toxic Places On Reddit And Gab

Training AI takes a lot of data. And that data needs to come from somewhere.

It's certainly true that the internet holds a lot of data. But once the data has to be specific, the usable sources narrow to the point of scarcity, especially given AI's hunger for labeled information.

This is why researchers from UC Santa Barbara and Intel opted to look for it in places that others rarely think to consider:

Reddit, the self-styled "front page of the internet," and the alt-right website Gab.

The two websites are known to host trolls and haters, and according to the team's research paper, all the researchers needed to do was grab Gab's posts, along with Reddit posts from the usual subreddits:

"To retrieve high-quality conversational data that would likely include hate speech, we referenced the list of the whiniest most low-key toxic subreddits… r/DankMemes, r/Imgoingtohellforthis, r/KotakuInAction, r/MensRights, r/MetaCanada, r/MGTOW, r/PussyPass, r/PussyPassDenied, r/The_Donald, and r/TumblrInAction."
An illustration of hate speech in a conversation between two users, and the interventions collected for the datasets

After collecting more than 33,000 comments from Gab and more than 22,000 comments from Reddit, the researchers set out to start "countering online hate speech" through "the use of Natural Language Processing (NLP) techniques."

But the researchers learned something before that: while both sources are equally reprehensible, they go about their bigotry in different ways.

The Gab dataset and the Reddit dataset share similar popular hate keywords, but the distributions of those words are very different. Because all the statistics indicate that the data collected from these two sources have very different characteristics, the challenges of doing detection or generative intervention tasks on each dataset should also be different.

This explains why it is difficult for social media websites to intervene against hate speech in real time.

In many cases the same people hold accounts on several platforms, yet the way they deliver their posts can differ from one platform to another. AI systems have often failed here because they struggle to detect these patterns. And with online companies lacking the human resources to do the job, hate speech will continue to flow and spread.

The distributions of the top 10 keywords in the hate speech collected from Reddit and Gab
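A comparison like the one in the figure can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function names, the neutral placeholder keywords (`alpha`, `beta`, `gamma`), the toy posts, and the use of total variation distance as a summary number are all assumptions made for this example.

```python
import re
from collections import Counter


def keyword_distribution(posts, keywords):
    """Return each keyword's share of all keyword hits found in `posts`."""
    counts = Counter()
    for post in posts:
        # Lowercase and split into alphabetic tokens (a deliberately simple tokenizer).
        for token in re.findall(r"[a-z]+", post.lower()):
            if token in keywords:
                counts[token] += 1
    total = sum(counts.values())
    return {k: (counts[k] / total if total else 0.0) for k in sorted(keywords)}


def total_variation(p, q):
    """Half the sum of absolute differences: 0 means identical, 1 means disjoint."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


# Placeholder keywords and toy posts (hypothetical, not taken from the dataset).
keywords = {"alpha", "beta", "gamma"}
gab_posts = ["alpha alpha beta", "alpha gamma"]
reddit_posts = ["beta gamma gamma", "gamma alpha"]

gab_dist = keyword_distribution(gab_posts, keywords)
reddit_dist = keyword_distribution(reddit_posts, keywords)
distance = total_variation(gab_dist, reddit_dist)
```

Both toy corpora use the same three keywords, yet the shares differ, which is the kind of gap the researchers describe between Gab and Reddit.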

For these reasons, the researchers decided to try a different strategy: automating the intervention.

They took the data they gathered from Gab and Reddit and sent it to Amazon Mechanical Turk workers to label. Once the instances of hate speech were identified, they asked the workers to come up with phrases that AIs could use to deter users from posting similar hate speech in the future.

After receiving the data from the Amazon Mechanical Turk workers, the researchers ran this new dataset and its database of interventions through various machine learning and NLP systems to create a prototype of an online hate speech intervention AI.

While the research is still at an early stage, the results are astounding.

The AI should, in theory, be able to detect hate speech in real time and immediately send the poster a message explaining why they shouldn't post things that are considered hate speech.

In other words, the concept shows that the AI could extract the context of hate speech as intended, and go beyond the usual keyword detection methods used by modern content filters.
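For contrast, here is a minimal sketch of the kind of keyword detection a simple content filter might use. It is not the researchers' system; the function name `keyword_filter`, the placeholder term `badword`, and the sample sentences are hypothetical and chosen only to show where keyword matching falls short.

```python
import re


def keyword_filter(text, blocklist):
    """Flag `text` if any of its tokens appears in `blocklist`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in blocklist for token in tokens)


# Placeholder blocklist (hypothetical term, not a real list).
blocklist = {"badword"}

# A direct hit is caught.
direct_hit = keyword_filter("You are a badword", blocklist)

# A hostile message with no listed keyword slips through.
missed = keyword_filter("People like you should just disappear", blocklist)

# A message that only discusses the term is flagged anyway.
false_alarm = keyword_filter("Calling someone a badword is not okay", blocklist)
```

The filter looks at words in isolation, so it misses hostility phrased without a listed term and flags a reply that objects to the term. A context-aware model is meant to close that gap.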

Unfortunately, according to the researchers, the system is not ready for prime time yet. Like most early AI projects, it could take a long time to perfect. It also needs much larger datasets and a lot of development and tweaking before it's good enough to be deployed in the real world of the internet, where hate thrives.

Published: 26/09/2019