Nvidia Unveils 'Maxine', An AI-Powered Platform For Video-Conferencing Apps


Nvidia is a multinational company popular among gamers and professionals. It has unveiled a tool that should prove helpful to video-conferencing apps.

Called 'Nvidia Maxine', it is an AI-powered platform that the company said can cut the bandwidth consumed by video calls by a factor of 10.

It does this by slashing the bandwidth requirements of the H.264 video compression standard, using AI to analyze the "key facial points" of each person rather than streaming every pixel.

Maxine can also use AI to perform gaze correction, super-resolution, noise cancellation, and even face relighting.

"These capabilities are fully accelerated on NVIDIA GPUs to run in real time video streaming applications in the cloud," wrote Nvidia on Maxine's web page.

Breaking the technology down, Maxine can identify key facial points of each person inside a video call.

Maxine can then apply these points to a still image to reanimate the person's face on the other side of the call, using generative adversarial networks (GANs).
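To give a rough sense of why sending key points is cheaper than sending pixels, here is a minimal Python sketch of how small a per-frame keypoint message could be. It is not Nvidia's wire format: the function names, the 16-bit quantization, and the 68-point count used in the usage note are all assumptions made for illustration.

```python
import struct

def encode_keypoints(points):
    """Pack (x, y) pairs, each normalized to [0, 1], as 16-bit unsigned ints.

    Hypothetical layout: a 2-byte point count followed by 4 bytes per point.
    """
    payload = struct.pack("<H", len(points))
    for x, y in points:
        payload += struct.pack("<HH", round(x * 65535), round(y * 65535))
    return payload

def decode_keypoints(payload):
    """Inverse of encode_keypoints; returns a list of (x, y) floats."""
    (count,) = struct.unpack_from("<H", payload, 0)
    points = []
    for i in range(count):
        xq, yq = struct.unpack_from("<HH", payload, 2 + 4 * i)
        points.append((xq / 65535, yq / 65535))
    return points

def bandwidth_bps(payload_bytes, fps):
    """Bits per second needed to send one payload of this size per frame."""
    return payload_bytes * 8 * fps
```

Under these assumptions, 68 points (a count common in open-source face-landmark models, not a figure from Nvidia) would occupy 2 + 4 × 68 = 274 bytes per frame, or 65,760 bits per second at 30 frames per second, before the receiver's GAN reconstructs the face from the still image.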

By understanding the key points, Maxine essentially has the face data of the subject.

This lets it align the face, for example by rotating it so that participants appear to be facing each other during a call even when they aren't.
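As a much-simplified stand-in for this kind of alignment, the sketch below rotates a set of 2D key points so that the line between the eyes becomes horizontal. Maxine's own alignment is described as GAN-based and is not documented here; the function name and the choice of the eye midpoint as the pivot are assumptions for illustration only.

```python
import math

def level_eyes(points, left_eye, right_eye):
    """Rotate 2D points about the eye midpoint so the eye line is horizontal.

    A plain in-plane rotation, used here only to illustrate the idea of
    aligning a face from its key points.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = math.atan2(ry - ly, rx - lx)      # current tilt of the eye line
    cx, cy = (lx + rx) / 2, (ly + ry) / 2     # pivot: midpoint between the eyes
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    aligned = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        aligned.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return aligned
```

Calling `level_eyes(points, left_eye, right_eye)` with the two eye positions included in `points` returns the same points with both eyes at equal height and all distances between points preserved.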

It can also correct gaze to help simulate eye contact, even when the person's eyes aren't aligned to the camera.

"Developers can also add features that allow call participants to choose their own avatars that are realistically animated in real time by their voice and emotional tone," Nvidia added.

Its main selling point, however, is AI-powered super-resolution and artifact reduction, which can convert lower-resolution video to higher resolution in real time while at the same time reducing bandwidth.

On top of that, Maxine-based applications run in the cloud, meaning they won't consume significant resources on users' devices. As a result, low-powered devices should be able to use apps powered by Maxine once they become available.

By understanding the key facial points of a person, Maxine can align the face when needed. (Credit: Nvidia)

Ian Buck, Vice President and General Manager of Accelerated Computing at Nvidia, said that:

"Video conferencing is now a part of everyday life, helping millions of people work, learn and play, and even see the doctor."

"Nvidia Maxine integrates our most advanced video, audio, and conversational AI capabilities to bring breakthrough efficiency and new capabilities to the platforms that are keeping us all connected."

Open to developers, Maxine can also be used to enhance virtual assistants, translation, closed captioning, transcription, and animated avatars in video-conferencing apps.

Computer vision developers, software partners, startups, and computer manufacturers creating audio and video apps and services can apply for early access to the Maxine platform.

According to Nvidia, Maxine-based applications can use NVIDIA Jarvis, a fully accelerated conversational AI framework with models that are already optimized for real-time performance. Using Jarvis, developers can integrate virtual assistants that take notes, set action items, and answer questions in human-like voices.

The announcement follows the surge in demand for video calls caused by the COVID-19 pandemic, which has forced many people to remain indoors.

Nvidia said that more than 30 million web meetings are taking place every single day, and that video conferencing has increased tenfold since the beginning of 2020.

Maxine should help cut that bandwidth, which in turn can reduce costs for providers and give consumers a smoother experience.

Published: 
06/10/2020