
In business, the only way to go is forward, which means constant innovation is required.
The AI world was quite dull before OpenAI introduced ChatGPT. Since then, companies big and small have armed themselves to compete in the lucrative market, knowing that demand is extremely high.
This time, a day after OpenAI impressed with GPT-4o, a startlingly improved version of its GPT-4 large language model, Google showed off an equally stunning vision for how AI can improve the products that billions of people use every day.
At its annual Google I/O developer conference in 2024, the very event where the company tries to push beyond its core advertising business with new devices and AI-powered tools, Google unveiled 'Project Astra'.
Developed by Google’s DeepMind AI lab, Project Astra is meant to let AI assistants help with users’ everyday lives by using phone cameras to interpret information about the real world.
For example, the AI can identify objects and even find misplaced items.
Google also hinted at how it would work on augmented reality glasses.
During the I/O conference, Google CEO Sundar Pichai mentioned the term "AI" more than 100 times.
Google showed its high expectations for AI, and how it wants the technology to power products that become a bigger part of users’ lives, whether by sharing information, interacting with others, finding objects around the house, making schedules, shopping or using an Android device.
Google essentially wants its AI to be part of everything users do.
From being a search engine company, Google is becoming an AI-first company.
To show how serious it is, Pichai kicked off the event by highlighting various new features powered by its latest AI model, Gemini 1.5 Pro.
One new feature, called Ask Photos, allows users to search photos for deeper insights, such as asking when your daughter learned to swim or recalling your license plate number, by looking through saved pictures.
He also showed how users can ask Gemini 1.5 Pro to summarize recent emails by analyzing attachments, pulling out key points and listing action items.
Meanwhile, Google executives took turns demonstrating other capabilities, such as how the latest model could “read” a textbook and turn it into a kind of AI lecture featuring natural-sounding teachers that answer questions.
Project Astra is meant to extend Gemini's ability to take in different kinds of input, such as text, voice or images, a capability often referred to as being "multimodal."
Just one day before, OpenAI, considered one of tech's leaders in AI, unveiled GPT-4o, which it says will make ChatGPT a lot smarter and easier to use.
According to the company, GPT-4o matches GPT-4's level of knowledge but is twice as fast and half the price.
With it, users can engage in real-time spoken conversations and interact using text and “vision.”
Project Astra is a direct response to OpenAI's efforts; in the words of Google's blog post, it is an "advanced seeing and talking responsive agent."
A Google executive also demoed a virtual “teammate” that can help users stay on top of to-do lists, organize data and manage workflows.
Gemini, like other AI tools such as ChatGPT, is trained on vast troves of online data. Experts have long warned about the shortcomings around AI tools, such as the potential for inaccuracies, biases and the spreading of misinformation. Still, many companies are forging ahead on AI tools or partnerships.
As Google grows its AI footprint, and as those worries grow with it, the company said it is partnering with experts and institutions to test and improve its AI.