Background

OpenAI Introduces The 'GPT-4o', Its 'Flagship' AI That Goes Beyond GPT-4

AI was once dull, boring, and barely made ripples outside its own industry.

But things changed after OpenAI introduced ChatGPT. Tech companies large and small joined an arms race, either partnering with OpenAI or building rivals to its generative AI.

In this fast-moving field, the competition is fierce and the stakes are high.

To keep distinguishing itself from rivals, OpenAI has to move fast, and this time the company has announced the successor to its powerful GPT-4 multimodal large language model.

The company calls it the 'GPT-4o'.

In a website post, the company described it as its "new flagship model that can reason across audio, vision, and text in real time."

In another website post, OpenAI said that:

"In line with our mission, we are focused on advancing AI technology and ensuring it is accessible and beneficial to everyone. Today we are introducing our newest model, GPT-4o, and will be rolling out more intelligence and advanced tools to ChatGPT for free."

"GPT-4o is our newest flagship model that provides GPT-4-level intelligence but is much faster and improves on its capabilities across text, voice, and vision."

The idea is to bring GPT-4-class chat to the OpenAI app for everyone, with no subscription needed.

For starters, according to OpenAI's post, GPT-4o already shows astounding abilities in understanding and discussing the images users share.

For example, users can take a photo of a menu in a different language and have GPT-4o translate it, explain the food's history and significance, and offer recommendations.

In other words, GPT-4o is able to "see" the world around it, understanding context better than before.

In the future, OpenAI plans to improve the model to allow more natural, real-time voice conversation, as well as the ability to converse with ChatGPT via real-time video.

For example, users could show ChatGPT a live sports game and ask it to explain the rules.

OpenAI also plans to launch a new 'Voice Mode'.

This is where GPT-4o is most intriguing.

The model sounds far more human in both tone and intonation.

What's more, its real-time capability means that users don't have to wait for it to finish speaking before they jump in and cut it off. The speech synthesis can even harmonize voices, and it supports what people could consider “normal” conversational interactions, translations, and more.

By comparison, the earlier Voice Mode let users talk to ChatGPT with average latencies of 2.8 seconds with GPT-3.5 and 5.4 seconds with GPT-4.

To make advanced AI more accessible and useful worldwide, GPT-4o's language capabilities are also improved across both quality and speed.

ChatGPT also now supports more than 50 languages across sign-up and login, user settings, and more.

"We are beginning to roll out GPT-4o to ChatGPT Plus and Team users, with availability for Enterprise users coming soon," said OpenAI.

Free users can use the GPT-4o model with all of its knowledge and power through their free ChatGPT account, but with a limit imposed.

According to OpenAI, the GPT-4o model supports web browsing, Memory, and also the GPT Store.

“There will be a limit on the number of messages that free users can send with GPT-4o depending on usage and demand,” OpenAI said. “When the limit is reached, ChatGPT will automatically switch to GPT-3.5 so users can continue their conversations.”
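The fallback OpenAI describes can be pictured with a small sketch. This is only an illustration of the stated behavior, not OpenAI's implementation: the class name, the model identifiers, and the limit of two messages are all hypothetical, and OpenAI says the real limit varies with usage and demand.

```python
from dataclasses import dataclass

# Hypothetical identifiers, chosen for illustration only.
PRIMARY_MODEL = "gpt-4o"
FALLBACK_MODEL = "gpt-3.5"


@dataclass
class FreeUserSession:
    message_limit: int      # assumed cap on GPT-4o messages; the real value is not public
    messages_sent: int = 0  # GPT-4o messages used so far

    def pick_model(self) -> str:
        """Return the model that would handle the next message."""
        if self.messages_sent < self.message_limit:
            self.messages_sent += 1
            return PRIMARY_MODEL
        # Limit reached: the conversation continues on the fallback model.
        return FALLBACK_MODEL


session = FreeUserSession(message_limit=2)
models = [session.pick_model() for _ in range(3)]
print(models)  # ['gpt-4o', 'gpt-4o', 'gpt-3.5']
```

In this sketch the third message falls through to GPT-3.5, so the user can keep chatting once the cap is hit.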

"GPT-4o reasons across voice, text, and vision. And with these incredible efficiencies, it also allows us to bring the GPT-4o intelligence to our free users. This is something we have been trying to do for many, many months," said chief technology officer Mira Murati at the OpenAI Spring Update event.

"It’ll be free for all users, and paid users will continue to have up to five times the capacity limits of free users," she said.

Murati added that GPT-4o is twice as fast as GPT-4 Turbo, and half the cost.

OpenAI CEO Sam Altman said: "So far, GPT-4 class models have only been available to people who pay a monthly subscription. This is important to our mission; we want to put great AI tools in the hands of everyone."

Before this, another notable variant of GPT-4 was GPT-4 Turbo, which supports a 128K context window and has fresher knowledge than GPT-4.

Published: 
14/05/2024