Background

Baidu's WARP-CTC, Artificial Intelligence That Rivals People at Speech Recognition, Is Open-Sourced

Artificial Intelligence (AI) is regarded as the future of technology. Modeled on how the human brain works, AI teaches computers to think and respond like humans, giving them abilities they never had before.

Google has open-sourced TensorFlow, and Facebook followed by open-sourcing Big Sur. China's Baidu, which is leading AI in Asia, is joining the two with the same strategy.

Baidu is China's leading internet search company. With its resources and influence, it has been investing heavily in a popular and powerful machine-learning technology called deep learning. By releasing some of its code to the public, Baidu is making part of its AI software open source.

Recently, the company has been working on a speech-recognition system called Deep Speech 2. The system, which was primarily developed by a team in California, is especially significant in that it relies entirely on machine learning for transcription. Whereas older voice-recognition systems include many handcrafted components to aid audio processing and transcription, Baidu's system learns to recognize words from scratch simply by listening to thousands of hours of transcribed audio.

The result: Deep Speech 2 is able to recognize speech in both English and Mandarin. In some cases, the system is even better at speech recognition than humans.

"Historically, people viewed Chinese and English as two vastly different languages, and so there was a need to design very different features," said Andrew Ng, a former Stanford Professor and Google researcher, and now Chief Scientist for Baidu. "The learning algorithms are now so general that you can just learn."

As for the AI system, he added: "For short phrases, out of context, we seem to be surpassing human levels of recognition."

The deep learning algorithm is useful to Baidu because it offers users better ways to access its services across many platforms. On mobile especially, many users in China already prefer voice input to traditional typing for short messages or web searches. Because typing Chinese characters on a smartphone can be complex, Baidu's feature has helped many users with a task that was once tricky to master.

The system and its AI are used in many of Baidu's services, including its Duer digital assistant.

Baidu's code for the system is called WARP-CTC. It is essentially an implementation of the CTC algorithm for CPUs and Nvidia GPUs, optimized for deep learning so that it runs very quickly on the latest computer chips. The tool can plug into existing machine learning frameworks to significantly speed up AI development, running up to 400 times faster than previous implementations. This is the technology powering Baidu's Deep Speech 2.

While developing Deep Speech 2, Baidu also created a new hardware architecture for deep learning that runs seven times faster than the previous version.

WARP-CTC is built upon a fundamental AI technique called connectionist temporal classification (CTC). Because existing CTC implementations were slow, the team at Baidu parallelized the CTC algorithm for increased speed and functionality. The release of WARP-CTC aims to make end-to-end deep learning easier and faster, so researchers can make quicker progress. The released software includes a simple C interface and bindings for Torch, a scientific computing framework. This should allow users to easily incorporate WARP-CTC into existing deep learning projects.
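To give a rough idea of what a CTC function computes, here is a minimal sketch in plain Python. It is not WARP-CTC's code or interface; the function name `ctc_neg_log_likelihood` and its parameters are made up for illustration. It scores how likely a target label sequence is, given per-time-step symbol probabilities, by summing over every alignment that collapses to that sequence (repeats merged, blanks dropped).

```python
import math

def ctc_neg_log_likelihood(probs, labels, blank=0):
    """Illustrative CTC loss (hypothetical helper, not WARP-CTC's API).

    probs  -- list of T rows; probs[t][k] is the probability of symbol k at step t
    labels -- target label sequence (list of ints), without blanks
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: total probability of all partial alignments ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or on the trailing blank
    likelihood = alpha[size - 1] + (alpha[size - 2] if size > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)
```

This sketch multiplies raw probabilities and loops sequentially, which is fine for short toy inputs. Production implementations typically work in log space for numerical stability and parallelize the computation across time steps and batch items, which is the kind of speed-up the article describes.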

With WARP-CTC, Baidu's Silicon Valley AI Lab (SVAIL) is offering open-source code to the machine learning community for the first time. SVAIL was founded in 2014 when Andrew Ng and Adam Coates joined Baidu as Chief Scientist of Baidu and Director of SVAIL, respectively. The lab's research is built on a combination of deep learning, large datasets and high performance computing.

Baidu's goal is to benefit public research with its AI, and it expects to release more AI tools in the future.

The code for WARP-CTC is available on GitHub.
