OpenAI finally offers the first glimpse of the rumored 'Strawberry' AI model.
After launching ChatGPT and its accompanying models, which sparked an arms race among tech companies to develop the best Large Language Models, OpenAI is once again distancing itself from the competition.
Following the massive improvements over GPT-4 that it introduced with GPT-4o, OpenAI has officially launched the highly anticipated 'OpenAI o1' model.
Welcome to the world, OpenAI o1 https://t.co/PeZ7SyNTwQ
— ChatGPT (@ChatGPTapp) September 12, 2024
Released alongside the smaller, more affordable o1-mini, the o1 model takes a significant step toward OpenAI's goal of creating human-like AI.
In a post on its website, OpenAI said that o1 is the first in a series of advanced "reasoning" models designed to handle complex queries more efficiently than its predecessors.
We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond.
These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. https://t.co/peKzzKX1bu
— OpenAI (@OpenAI) September 12, 2024
The major difference between o1 and GPT-4o is o1's superior ability to handle complex tasks like coding and multistep math problems.
For example, in another post, OpenAI said that o1 excelled on the AP math test and solved 83% of problems on a qualifying exam for the International Mathematical Olympiad, compared with GPT-4o’s 13%. In online programming competitions such as Codeforces, o1 placed in the 89th percentile, and OpenAI expects future updates to rival PhD students in subjects like physics and chemistry.
According to Jerry Tworek, OpenAI’s research lead, this is possible because o1's training differs from that of previous models.
While earlier GPT versions focused on mimicking patterns from training data, o1 uses reinforcement learning, which teaches the model to solve problems by rewarding or penalizing outcomes.
This approach enables the model to process questions using a step-by-step “chain of thought” similar to how humans solve problems.
The result is a more accurate model.
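To make the idea of learning from rewarded and penalized outcomes concrete, here is a minimal Python sketch of a classic toy problem, the multi-armed bandit. It is an illustration of reinforcement learning in general, not a description of how OpenAI trained o1; the function name `train_bandit`, the success probabilities, and the epsilon-greedy strategy are all assumptions chosen for the example.

```python
import random

def train_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn which action is best purely from +1 / -1 feedback (toy example)."""
    rng = random.Random(seed)
    n_actions = len(success_probs)
    values = [0.0] * n_actions  # running estimate of each action's average reward
    counts = [0] * n_actions    # how many times each action has been tried

    for _ in range(steps):
        # Occasionally explore a random action; otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # Reward a good outcome (+1) and penalize a bad one (-1).
        reward = 1.0 if rng.random() < success_probs[action] else -1.0

        # Incremental mean: nudge the estimate toward the observed reward.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values

# Hypothetical environment: three actions with different chances of success.
values = train_bandit([0.2, 0.5, 0.9])
best_action = max(range(len(values)), key=lambda a: values[a])
print("estimated values:", [round(v, 2) for v in values])
print("preferred action:", best_action)
```

The learner is never told which action is correct. It only sees whether each attempt was rewarded or penalized, and its preferences drift toward the action that succeeds most often.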
In a demonstration, o1 solved a complex age puzzle, providing a step-by-step breakdown of its thought process.
What’s notable is how human-like the model’s reasoning seemed, with phrases like "I’m curious about" and "I’m thinking through" giving the impression of real-time thinking.
While OpenAI clarifies that the model doesn’t actually "think," the interface is designed to make its problem-solving process feel more relatable.
OpenAI o1 codes a video game from a prompt. pic.twitter.com/aBEcehP0j8
— OpenAI (@OpenAI) September 12, 2024
OpenAI o1 answers a famously tricky question for large language models. pic.twitter.com/5ZlQIOBWEd
— OpenAI (@OpenAI) September 12, 2024
OpenAI o1 translates a corrupted sentence. pic.twitter.com/E37e4SOuq4
— OpenAI (@OpenAI) September 12, 2024
OpenAI o1 thinks before it answers and can produce a long internal chain-of-thought before responding to the user.
o1 ranks in the 89th percentile on competitive programming questions, places among the top 500 students in the US in a qualifier for the USA Math Olympiad, and…
— OpenAI (@OpenAI) September 12, 2024
o1 improves on its predecessors, but it still has limits.
First, according to Tworek, the model still struggles with hallucinations.
Second, it doesn’t perform as well on factual knowledge and lacks capabilities such as web browsing and processing files and images.
Third, because it 'thinks' more, it is slower to respond.
Finally, it is pricier than GPT-4o.
Because of these drawbacks, OpenAI said that o1 is not a successor to GPT-4o.
"OpenAI o1 isn’t a successor to gpt-4o. Don’t just drop it in—you might even want to use gpt-4o in tandem with o1’s reasoning capabilities," the company said.
OpenAI o1 isn’t a successor to gpt-4o. Don’t just drop it in—you might even want to use gpt-4o in tandem with o1’s reasoning capabilities.
Learn how to add reasoning to your product: https://t.co/cphpkNBPkB.
After this short beta, we’ll increase rate limits and expand access to…
— OpenAI Developers (@OpenAIDevs) September 12, 2024
Co-founder and CEO Sam Altman also acknowledged that the model has flaws.
He described o1 as a series of OpenAI's most capable and aligned models yet, but added that it is still flawed and limited, and that "it still seems more impressive on first use than it does after you spend more time with it."
here is o1, a series of our most capable and aligned models yet: https://t.co/yzZGNN8HvD
o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it. pic.twitter.com/Qs1HoSDOz1
— Sam Altman (@sama) September 12, 2024
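One way to read the advice about using gpt-4o in tandem with o1 is as a routing decision: send only the hard reasoning work to the slower, pricier model and everything else to gpt-4o. The Python sketch below is a hypothetical illustration of that idea. The `Task` fields and the `pick_model` rule are assumptions made for this example, not an OpenAI API or recommendation; only the model names come from the article.

```python
from dataclasses import dataclass

# Model names as mentioned in the article; the routing rule below is hypothetical.
REASONING_MODEL = "o1-preview"
GENERAL_MODEL = "gpt-4o"

@dataclass
class Task:
    prompt: str
    needs_multistep_reasoning: bool = False  # e.g. hard math or coding problems
    needs_browsing_or_files: bool = False    # capabilities the article says o1 lacks
    latency_sensitive: bool = False          # o1 is slower to respond

def pick_model(task: Task) -> str:
    """Hypothetical router: reserve the reasoning model for work that needs it."""
    if task.needs_browsing_or_files or task.latency_sensitive:
        return GENERAL_MODEL
    if task.needs_multistep_reasoning:
        return REASONING_MODEL
    return GENERAL_MODEL

tasks = [
    Task("Solve this multistep math problem", needs_multistep_reasoning=True),
    Task("Summarize the attached file", needs_browsing_or_files=True),
    Task("Reply to a quick chat message", latency_sensitive=True),
]
for task in tasks:
    print(f"{pick_model(task)}: {task.prompt}")
```

The chosen model name would then be passed to whatever client code actually sends the request. Keeping the decision in one small function makes it easy to adjust as the models' capabilities, speed, and pricing change.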
Regardless, OpenAI sees o1 as the beginning of a new era of AI models, one with reasoning capabilities that represent a major shift from the traditional approach.
The name o1 reflects this fresh start, and OpenAI’s chief research officer, Bob McGrew, hopes this naming convention marks a new direction in their branding efforts.
For AI researchers, enhancing reasoning capabilities is a crucial step toward achieving human-level intelligence.
While o1 is still relatively slow and costly, it represents a pivotal advancement in AI’s journey toward autonomous agents capable of making decisions and taking actions independently.
"We’ve been focused on reasoning for months because we believe it’s the key to solving the hardest problems on the path to human-like intelligence," McGrew said, noting that this breakthrough is essential for OpenAI's future ambitions.
Initially, OpenAI is releasing o1-preview and o1-mini for ChatGPT Plus and Team users, with Enterprise and Edu users getting access a week later.
The company has plans to roll out o1-mini to free ChatGPT users.
There has been a lot of enthusiasm to try OpenAI o1-preview and o1-mini, and some users hit their rate limits quickly.
We reset weekly rate limits for all Plus and Team users so that you can keep experimenting with o1.
— OpenAI (@OpenAI) September 13, 2024