Meet Project Adam, Microsoft's Artificial Brain for Deep Learning

As the tech world advances, artificial intelligence (AI) is becoming a common phrase. Building on the work of academic researchers, the biggest names in the tech industry - including Google, IBM, Apple and Facebook - are embracing a powerful new form of AI known as "deep learning".

Google is widely assumed to be out in front: the search giant employs a researcher at the heart of the deep learning movement, the University of Toronto's Geoff Hinton, and it has discussed the real-world progress of its new AI technologies, including the way deep learning has changed voice search on Android smartphones. Beyond its powerful search engine, Google's technology is known for its accuracy in speech recognition and computer vision.

However, the software giant Microsoft is out to prove that its new deep learning system, called "Project Adam", can perform visual recognition tasks faster than any other system currently available.

Begun in early 2013, Project Adam is an initiative by a team of Microsoft researchers and engineers that aims to demonstrate that large-scale, commodity distributed systems can train huge deep neural networks effectively. The leads behind the project are Karthik Kalyanaraman, Trishul Chilimbi, Johnson Apacible and Yutaka Suzue.

Microsoft Research, the company's research division, says that Project Adam enhances the learning ability of computers. With Project Adam, computer systems can 'absorb' data in a way that lends weight to the idea that AI can actually learn.

The company first demonstrated Project Adam at its Faculty Summit in Redmond on July 14th, 2014 (http://www.eyerys.com/articles/timeline/microsofts-project-adam). As part of the demonstration, the company brought in dogs of different breeds and showed how the technology can distinguish between them in real time. According to Microsoft, Project Adam is twice as fast as previous systems at recognizing images, while using 30 times fewer machines. As proof, the researchers created the world's best photograph classifier, using 14 million images from ImageNet, an image database divided into 22,000 categories.

"Adam is an exploration on how you build the biggest brain," said Peter Lee, the Head of Microsoft Research.

This test deals with a database of 22,000 categories of images, and before Project Adam, only a handful of artificial intelligence models were able to handle such a massive amount of input. One of them was Google Brain, a system that provides AI calculations to services across Google. And on this benchmark test, Project Adam beats Google.

Project Adam at Work

Project Adam's 'brain' isn't actually new. It works much like other deep learning systems: the AI runs across an array of computer servers and mimics the way a human brain works by creating neural networks - systems that behave, roughly, like the networks of neurons in the brain. In this case, Project Adam runs on Microsoft's Azure cloud computing service.

To mimic a human brain, these neural networks require a large number of servers. What makes Project Adam different is that its AI uses a technique called asynchronous processing.

As computing systems get more complex and handle more data, it gets more difficult - and takes longer - for the AI to manage all of it and to get the various parts to trade information with each other. Asynchronous processing can ease this problem: it is about splitting a system into parts that can pretty much run independently of each other, before sharing their calculations and merging them into a whole.
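To make that split-run-merge idea concrete, here is a minimal Python sketch - an illustration only, not Project Adam's code. A toy workload is divided into independent chunks, each worker computes its share on its own, and the partial results are merged only at the end; the workload and the partial_sum helper are hypothetical.

    # A minimal sketch of the split-run-merge idea behind asynchronous processing.
    # The workload and helper names here are illustrative, not from Project Adam.
    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(chunk):
        """Each worker computes its share independently, with no coordination."""
        return sum(x * x for x in chunk)

    def parallel_sum_of_squares(data, workers=4):
        # Split the data into roughly equal, independent chunks.
        size = (len(data) + workers - 1) // workers
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        # Each chunk runs in its own process; results are merged only at the end.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(partial_sum, chunks))

    if __name__ == "__main__":
        print(parallel_sum_of_squares(list(range(1_000_000))))

The key point is that no worker waits on any other while it computes; coordination happens only once, at the merge step.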

An everyday example of asynchronous communication is a telephone conversation: both parties can talk whenever they like. If the communication were synchronous, each party would be required to wait a specified interval before speaking.

The difficulty with asynchronous communication is that the receiver must have a way to distinguish between valid data and noise. In computer communications, this is usually accomplished through a start bit (a bit added before the data) and a stop bit (a bit added at the end of the data). For this reason, asynchronous communication is sometimes called start-stop transmission.
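As a rough sketch of that start-stop framing - using the common convention of a 0 start bit and a 1 stop bit, nothing specific to Project Adam - the snippet below wraps each byte so a receiver can tell where valid data begins and ends; the frame_byte and unframe helpers are hypothetical.

    # Illustrative start-stop framing: wrap each byte with a start bit (0) and a
    # stop bit (1), the convention the text above calls start-stop transmission.
    def frame_byte(byte: int) -> list[int]:
        data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
        return [0] + data_bits + [1]                     # start bit + data + stop bit

    def unframe(bits: list[int]) -> int:
        assert bits[0] == 0 and bits[-1] == 1, "framing error: bad start/stop bit"
        return sum(bit << i for i, bit in enumerate(bits[1:9]))

    framed = frame_byte(ord("A"))
    assert unframe(framed) == ord("A")
    print(framed)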

The trouble is that asynchronous techniques have worked well in smartphones and single computers, where calculations are spread across different chips, but running them across many servers has not yet proven successful. Other tech companies have toyed with asynchronous systems for years; Microsoft is building on that work by using a technique called "HOGWILD!"

HOGWILD! was initially designed to make processors work more independently. Because different chips can write to the same memory location, they can overwrite each other's work. In most systems this causes data collisions, but in some situations it can actually work well and enhance processing: in small computing systems the chance of a collision is relatively low, so skipping the coordination can significantly speed things up. Microsoft took this idea and applied the asynchronous method of HOGWILD! to Project Adam's entire system.

Since the neural network is anything but a small computing system, the risk of data collisions is significantly higher. Even so, implementing HOGWILD! works because the collisions tend to produce the same result that would have been reached if the system had carefully avoided any collisions.

For example, suppose one update adds a "1" to a preexisting value of "5" and another adds a "4". One machine might apply the "1" first and the "4" afterwards; another might apply the "4" before the "1". Rather than carefully controlling which update lands first, the system just lets each machine update the value whenever it can. Whichever goes first, the end result is still "10".
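Below is a minimal sketch of that lock-free, HOGWILD!-style idea - a toy illustration mirroring the "5 plus 1 plus 4" example above, not Microsoft's implementation. Two threads add their contributions to a shared parameter with no locking at all.

    # Toy HOGWILD!-style update (an illustration, not Microsoft's code): several
    # threads write to one shared parameter with no locks at all.
    import threading
    import numpy as np

    params = np.array([5.0])   # the preexisting value "5" from the example above
    deltas = [1.0, 4.0]        # one worker adds 1, another adds 4

    def worker(delta):
        # No lock: each thread updates the shared value whenever it can.
        params[0] += delta

    threads = [threading.Thread(target=worker, args=(d,)) for d in deltas]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Because the additions commute, the result is 10.0 whichever thread went first.
    # A true collision could occasionally drop an update, but HOGWILD!'s insight is
    # that such collisions are rare and the computation still comes out right.
    print(params[0])

The design choice is to accept the small risk of a collision in exchange for never making any machine wait on another.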

To make that happen, Project Adam optimizes the way its machines handle data and tunes the communications between them more effectively. The system uses traditional processors rather than the graphics processing units (GPUs) originally designed for graphics work. Neural networks operate on massive amounts of data - far more than a standard central processing unit (CPU) can handle - which is why the work gets spread across many machines. Another option is to run things on GPUs, which can crunch the data more quickly. However, if the AI model doesn't fit entirely on one GPU or on a single server running several GPUs, the system can stall: the communications systems in data centers aren't fast enough to keep up with GPUs' ability to handle information, creating data gridlocks. That's why, some experts say, GPUs aren't ideal right now for scaling up very large neural nets. Chilimbi, who helped design the vast array of hardware and software that powers Microsoft's Bing search engine, is one of them.