Google Releases Huge Data Set Of Deepfake Videos: Contributing To Its Detection Research

Deepfake videos are plaguing the web, from celebrities' faces on porn stars' bodies to revenge porn and politicians saying what they're not supposed to say.

The many varieties of deepfake videos, made possible by the many tools that improved on the original software, have been used to harm people mentally and even financially.

There should be a way to end this, and Google is trying to do its part.

"While many are likely intended to be humorous, others could be harmful to individuals and society," said Google in a blog post.

As the web's search giant and a tech company known for its many products, Google takes these issues seriously.

This is why researchers at the company want to help others develop algorithms to curb those videos, by releasing a large, free data set of deepfake videos, available for download on GitHub.

"We are committed to developing AI best practices to mitigate the potential for harm and abuse," said Google.

Google compiled this database with the help of Jigsaw, its own tech incubator, and the FaceForensics Benchmark, a program developed by the Technical University of Munich and the University Federico II of Naples that helps researchers create techniques to detect artificially made videos.

"To make this dataset, over the past year we worked with paid and consenting actors to record hundreds of videos. Using publicly available deepfake generation methods, we then created thousands of deepfakes from these videos. The resulting videos, real and fake, comprise our contribution, which we created to directly support deepfake detection efforts."

A sample of videos from Google’s contribution to the FaceForensics benchmark. To generate these, actors were selected randomly to create the deepfakes. (Credit: Google)

The initial release of this data set includes 3,000 manipulated videos from 28 paid and consenting actors in various scenes.

Google also included a fourth manipulation method, which performs face manipulation using GANs and Neural Textures.

The company is not taking chances, because "the field is moving quickly."

"We'll add to this dataset as deepfake technology evolves over time, and we’ll continue to work with partners in this space. We firmly believe in supporting a thriving research community around mitigating potential harms from misuses of synthetic media, and today's release of our deepfake dataset in the FaceForensics benchmark is an important step in that direction."

Deep learning has given rise to technologies that would have been thought impossible only a handful of years ago. Modern generative models are just one example of these, capable of synthesizing hyperrealistic images, speech, music, and even video.

Deepfakes, in this context, are produced by deep generative models that can manipulate video and audio clips, often for malicious purposes.

Google is playing its part to help others, hoping that people will in turn do theirs and help the community.

Published: 25/09/2019
