Facebook Uses Billions Of Instagram Photos To Train Its AI, Without Prior Notification

When something is free, something is usually at work behind the scenes. There is a catch.

Facebook's social network is a free service. Anyone can register and use it, and the benefit is immediate and obvious. So what does the company get in return?

Facebook and its properties allow users to limit what other users can see. But there is no telling what Facebook can do with the things that are uploaded to its servers. In short, everything the company holds about its users serves its financial interests. Users don't know exactly how the company uses this massive trove of data.

Here, Facebook is giving a glimpse: it uses photos that users have uploaded to train its AI.

An artificial intelligence experiment of unprecedented scale has been disclosed by Facebook. This shows how users' social lives provide valuable data for training machine-learning algorithms.

It’s a priceless resource that could help Facebook compete with Google, Amazon, and other tech giants with their own AI ambitions.

During the company’s annual F8 developer conference, Facebook said that its researchers used 3.5 billion public Instagram photos and 17,000 hashtags to train algorithms to understand images by themselves. The resulting system set a record on a test that challenges computer software to assign photos to 1,000 categories.

The entire AI project took 22 days and required the power of 330 graphics processing units, or GPUs.

The result of the training, according to Facebook's AI and machine-learning director Srinivas Narayanan, is an AI that can recognize the content of images with 85.4 percent accuracy after learning from 1 billion images, compared with 79.2 percent for Google’s system.

One of the biggest problems Facebook faces here is not having enough properly labeled photos to train computers to understand what is in them.

To address this, the AI goes through a pre-training sequence, for which the researchers first focused on developing systems for finding relevant hashtags. This meant the team had to discover which hashtags were synonymous, while also making the AI learn to prioritize more specific hashtags over more general ones. This led to what the researchers called the "large-scale hashtag prediction model."

This method is meant to reduce the excessive "noise" in the hashtags, so that the data works well as training material.
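As a rough illustration of the two clean-up ideas described above (merging synonymous hashtags, and preferring specific hashtags over general ones), here is a minimal Python sketch. It is not Facebook's actual pipeline: the synonym table, the hashtag hierarchy, and the function names are all hypothetical and exist only to show the idea on a toy example.

```python
# Hypothetical synonym table: each variant maps to one canonical hashtag.
SYNONYMS = {
    "#doggo": "#dog",
    "#pup": "#dog",
    "#goldenretriever": "#golden_retriever",
}

# Hypothetical hierarchy: a specific hashtag maps to its more general parent.
PARENT = {
    "#golden_retriever": "#dog",
    "#dog": "#animal",
}


def canonicalize(tag):
    """Lowercase a hashtag and replace it with its canonical synonym, if any."""
    tag = tag.lower()
    return SYNONYMS.get(tag, tag)


def ancestors(tag):
    """Return every more general hashtag above `tag` in the toy hierarchy."""
    seen = set()
    while tag in PARENT and PARENT[tag] not in seen:
        tag = PARENT[tag]
        seen.add(tag)
    return seen


def clean_hashtags(raw_tags):
    """Merge synonyms, then drop any hashtag that is a more general
    version of another hashtag on the same photo."""
    canonical = []
    for raw in raw_tags:
        tag = canonicalize(raw)
        if tag not in canonical:  # keep the first occurrence only
            canonical.append(tag)

    general = set()
    for tag in canonical:
        general |= ancestors(tag)

    return [tag for tag in canonical if tag not in general]


labels = clean_hashtags(["#Doggo", "#dog", "#GoldenRetriever", "#animal", "#sunset"])
print(labels)
```

In this toy example, "#Doggo" and "#dog" collapse into one label, and both "#dog" and "#animal" are dropped because the more specific "#golden_retriever" is present on the same photo. Unrelated hashtags such as "#sunset" pass through unchanged.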

Instagram privacy

While Facebook can definitely train AI better with that huge amount of data, there are some privacy implications here.

First of all, Facebook said that it only uses data that is public (not from private accounts). But when users post a photo on Instagram, do they know that they're contributing to Facebook's AI program? Secondly, while this AI is trained for a broad task (rather than to predict anything specific), there is no telling what Facebook will be capable of developing as it evolves this AI.

Neither Instagram nor Facebook users were warned about this practice before the company mined their photos.

And this unexpected gathering of Instagram photos comes only weeks after the biggest data scandal in Facebook's history, involving political consultancy Cambridge Analytica.

Published: 02/05/2018