Amazon Demonstrates How Its Alexa Can Mimic The Voice Of Anyone, Dead Or Alive

Amazon Alexa, human-like empathy

When trained properly on correct and useful data, AIs can do a lot of things.

And sometimes the things artificial intelligence agents can do are creepy, if not astonishing. This time, tech giant Amazon has demonstrated how its Alexa digital assistant can mimic the voice of anyone, dead or alive.

At Amazon’s annual re:MARS conference in Las Vegas, U.S., the company showed off how Alexa is able to read a bedtime story to a child using the voice of his dead grandmother.

In the demo, a child asks Alexa, "Can grandma finish reading me ‘The Wizard of Oz’?" Alexa responds, "Okay," in its typical feminine, robotic voice.

But after that, Alexa changes its voice to that of the child's grandmother and starts reading L. Frank Baum's tale.

Here, Amazon leans on the emotional trauma and grief of the ongoing pandemic to generate interest.

"In this companionship role, human attributes of empathy and affect are key for building trust," said Amazon’s head Alexa AI scientist Rohit Prasad. "These attributes have become even more important in these times of the ongoing pandemic, when so many of us have lost someone we love. While AI can't eliminate that pain of loss, it can definitely make their memories last."

Prasad added that the feature "enables lasting personal relationships."

The Amazon executive said that the feature is meant to highlight Alexa’s "human attributes" which have become more important "in these times of the ongoing pandemic when so many of us have lost someone we love."

While nothing can eliminate "that pain" of losing a relative, AI can help carry on the memories of the deceased.

At the time of the demonstration, Amazon didn't provide any technical details or a timeline for the feature.

Prasad only said Amazon is "working on" the Alexa capability, and that the demo was more of a proof-of-concept.

Amazon was only trying to highlight the underlying voice technologies it has developed.

"This required invention where we had to learn to produce a high-quality voice with less than a minute of recording versus hours of recording in a studio," he said. "The way we made it happen is by framing the problem as a voice-conversion task and not a speech-generation task."

“We are unquestionably living in the golden era of AI, where our dreams and science fiction are becoming a reality,” he said.

Amazon has previously developed voice-synthesizing technology to let Alexa mimic the voices of celebrities, including Shaquille O’Neal and Melissa McCarthy.

But that feature required an individual to record dozens of hours of audio.

This time, Amazon has developed the technology so that it can replicate a voice using less than one minute of recorded speech, which the company’s engineers were able to do “by framing the problem as a voice-conversion task, and not a speech-generation task.”

While this shows how far researchers have advanced AI, it also raises concerns about deepfake AIs, which have earned a controversial reputation.

Deepfaked audio recordings have long been used to scam people and carry out other nefarious activities.

And for those who are concerned, releasing this feature to the public would make creating deepfake audio as easy as speaking.

But putting those concerns aside, this is certainly an advancement in AI technology.

"Unlike deepfakes, if you’re transparent about what it’s being used for, there is a clear decision maker and the customer is in control of their data and what they want it to be used for, I think this is the right set of steps," Prasad explained. "This was not about ‘dead grandma.’ The grandma is alive in this one, just to be very clear about it."

Published: 24/06/2022