'Gemini 2.0 Flash Thinking Experimental' Is A 'Reasoning' AI From Google That Can Explain Itself

Google Gemini 2.0 Flash Thinking Experimental

On the internet, it's business as usual. For Google, however, things have been anything but.

After OpenAI launched ChatGPT, a fierce competition was ignited across the tech world. It didn’t take long before companies of all sizes began racing one another to quickly establish their dominance in the AI space.

Google, the company that thrives on the web and beyond, didn't anticipate the sudden boom of generative AI.

The tech giant quickly recognized the transformative potential of a Large Language Model (LLM)-powered chatbot on human-computer interaction, prompting what was described as a “code red.”

In response, Google rapidly developed its own chatbot, initially known as Bard, which was later rebranded as Gemini. The AI was deeply integrated across Google’s core products to maintain its competitive edge.

With OpenAI pushing the boundaries by releasing AI capable of advanced reasoning, Google is determined not to fall behind.

Google calls this AI the 'Gemini 2.0 Flash Thinking Experimental'.

This version builds on Gemini 2.0 Flash, the speed-optimized variant of Gemini 2.0, the model Google says is designed for the "agentic era" of AI, and it is equipped with the ability to reason.

Similar to OpenAI's o1, which is able to achieve "deeper thinking" on problems fed into it, the experimental model from Google is able to incorporate feedback loops of self-checking mechanisms.

When a traditional Gemini model is asked a question, it answers almost immediately. Gemini 2.0 Flash Thinking Experimental instead takes its time, reconsidering its answer again and again before responding.

As a result, a process that is usually near-instantaneous stretches to several seconds, or even minutes.

This is because reasoning requires considerably more computing time at inference.

Google DeepMind's chief scientist, Jeff Dean, says that the model receives extra computing power, writing on X, "we see promising results when we increase inference time computation!" The model works by pausing to consider multiple related prompts before providing what it determines to be the most accurate answer.
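The "pause, consider multiple candidates, pick the best" loop described above can be sketched generically. Google has not published its actual mechanism, so the candidate generator and verifier below are hypothetical stand-ins that only illustrate the shape of best-of-N inference-time computation:

```python
# Minimal sketch of inference-time "self-checking": spend extra compute
# generating several candidate answers, score each with a verifier, and
# return the highest-scoring one. Both helper functions are stand-ins,
# not Google's implementation.

def generate_candidates(question: str, n: int = 3) -> list[str]:
    # Stand-in: a real model would sample n diverse reasoning paths.
    return [f"candidate {i} for: {question}" for i in range(n)]

def verify(question: str, answer: str) -> float:
    # Stand-in: a real verifier would check the answer's reasoning.
    # Here we simply prefer later (more "refined") candidates.
    return float(answer.split()[1])

def answer_with_reflection(question: str, n: int = 3) -> str:
    candidates = generate_candidates(question, n)
    # Extra inference-time compute goes into scoring every candidate
    # before committing to one, instead of emitting the first answer.
    return max(candidates, key=lambda a: verify(question, a))
```

The key trade-off this sketch makes visible: latency and compute grow with the number of candidates considered, which is exactly why responses take seconds or minutes instead of being instantaneous.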

By refraining from answering users' questions as fast as possible, AIs with reasoning capabilities can perform well on some benchmarks.

Unlike OpenAI’s o1 model, which hides its raw chain of thought, Google’s Gemini 2.0 Flash Thinking lets users see how it tackles problems by exposing its step-by-step reasoning.

In a demo featured by Google, the AI solves a complex physics problem by breaking it into smaller steps, showcasing methodical problem-solving that delivers solid, reliable results.

In another demo, the model shows off its reasoning skills by combining visual and text data to solve a problem. It highlights how well the model can process and blend information taken from different sources.

By exposing its reasoning, the model lets users follow its logic and catch mistakes.

This unique approach helps it outshine the standard Gemini 2.0 Flash on tougher challenges.

However, because the process can be lengthy and the answers exhaustively thorough, reasoning-capable AIs may not be suitable for everybody.

While they perform well, questions remain about their actual usefulness and accuracy. Also, the high computing costs needed to run reasoning models have created some rumblings about their long-term viability.

Not to mention that Gemini 2.0 Flash Thinking Experimental, like OpenAI's o1, can sometimes overlook minor details that turn out to be essential to solving a problem.

It's also worth noting that reports suggest that AI companies have turned to reasoning models as traditional scaling methods at training time have been showing diminishing returns.

Regardless, Logan Kilpatrick, who leads product for AI Studio, called Gemini 2.0 Flash Thinking Experimental “the first step in [Google’s] reasoning journey.”

The model supports just 32,000 tokens of input (about 50-60 pages' worth of text) and can generate up to 8,000 tokens per response. In a side panel on Google AI Studio, the company claims it is best for "multimodal understanding, reasoning" and "coding."
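The 50-60 page figure can be sanity-checked with common rules of thumb. The conversion ratios below (roughly 0.75 English words per token, roughly 400 words per page of prose) are general assumptions, not numbers from Google:

```python
# Rough sanity check of the tokens-to-pages conversion.
# Assumptions (not from Google): ~0.75 English words per token,
# ~400 words on a typical page of prose.

WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 400

def tokens_to_pages(tokens: int) -> float:
    """Estimate how many pages of prose a token budget covers."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

input_pages = tokens_to_pages(32_000)   # the 32,000-token input window
output_pages = tokens_to_pages(8_000)   # the 8,000-token response limit
```

Under these assumptions the 32,000-token window works out to about 60 pages, consistent with the article's estimate, while the 8,000-token output limit covers roughly 15 pages.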

Published: 
11/12/2024