Background

OpenAI Delays 'Voice Mode,' But Leaked An Advanced GPT-4o-Powered Voice Demo

When something seems too good to be true, approach it with a healthy dose of skepticism.

OpenAI, the AI research company behind ChatGPT, has pushed back the launch of its 'Voice Mode,' a standout upcoming feature that made waves when it was shown off at the company's product update event.

"We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch," the company said.

In other words, the company cannot release the feature in a limited alpha on the promised timeline, and has pushed the plan back to July.

"We’re improving the model’s ability to detect and refuse certain content," the company explained.

"We’re also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses."

Voice Mode was one of the most compelling announcements at the OpenAI event, unveiled alongside the release of GPT-4o.

It allowed users to interact with the chatbot via voice and engage in a natural-sounding conversation.

ChatGPT’s advanced Voice Mode can understand and respond with emotions and non-verbal cues, moving interactions much closer to real-time, natural conversations with AI.

The demonstration also invited comparisons to the science fiction film Her, which features a virtual companion voiced by actress Scarlett Johansson. Johansson later threatened OpenAI with legal action over a similar-sounding voice, prompting the company to remove that voice from its library.

Soon after OpenAI announced the delay of Voice Mode, the news drew harsh criticism from users and the AI community.

Many critics were quick to point out OpenAI's history of overpromising and underdelivering, comparing its track record unfavorably to its competitors'.

"Be like Anthropic," tweeted an AI enthusiast. "They don’t demo and create hype only to go silent for 3-4 months."

And with OpenAI making GPT-4o available to free users, even with limitations, the delay led many users to question the value of their ChatGPT Plus subscriptions.

Read: OpenAI Makes Pretty Much All Of ChatGPT Free, Making Paying Less Important

But not all is lost.

A number of ChatGPT users reported that they received a sneak peek of what is possible with the next-generation voice assistant.

Reddit user RozziTheCreator was one of the people who got an early taste of the GPT-4o-powered voice assistant.

He found that the AI could create a compelling story, complete with matching sound effects such as thunder and footsteps.

At first, Rozzi thought the access was a mistake, and OpenAI later confirmed in a statement that some users had indeed been given temporary access to the model by accident.

"It just suddenly came up, it did look the same the only difference was the voice," RozziTheCreator said.

RozziTheCreator made the discovery while asking the chatbot a question: "Boom I discovered the change."

The access only lasted a few minutes and, according to RozziTheCreator, "it was very buggy," so there wasn't time to explore much, but he managed to record a snippet of the story.

"It started going insane repeating and replying to things I didn't say," explained RozziTheCreator, before things returned back to normal.

Published: 
29/06/2024