Google Introduces 'Veo', Its 'Most Capable Generative Video Model', As Well As 'Imagen 3'

AI is only as good as the data it has been trained on, and Google shows that its AI is exceptionally good.

In the arms race that followed the rise of generative AI, Google announced its new AI media creation engines. The first is 'Veo', which can produce "high-quality" 1080p videos; the second is Imagen 3, the successor to its Imagen 2 text-to-image framework.

Neither of them is revolutionary, because they don't introduce anything new.

But in its war against OpenAI's Sora and DALL·E 3, Google is introducing the two AI tools as more than just rivals.

Announced at its annual Google I/O developer conference, the company wants Veo and Imagen 3 to simply go a step beyond what was previously imagined.

Explore our latest updates for generative video, images and music:
Veo, our new video-generation model
Imagen 3, our highest-quality image generation model
New music generated with our Music AI Sandbox
Learn more → https://t.co/6U60q3Zuah #GoogleIO pic.twitter.com/sYGf5M0LFF
— Google (@Google) May 15, 2024

First off, Veo, according to Google in a blog post, has "an advanced understanding of natural language and visual semantics" to create whatever video users have in mind.

What's more, the AI-generated videos can last "beyond a minute."

Introducing Veo: our most capable generative video model.

It can create high-quality, 1080p clips that can go beyond 60 seconds.

From photorealism to surrealism and animation, it can tackle a range of cinematic styles. #GoogleIO pic.twitter.com/6zEuYRAHpH
— Google DeepMind (@GoogleDeepMind) May 14, 2024

Veo is also capable of understanding cinematic and visual techniques, like the concept of a timelapse.

"We're exploring features like storyboarding and generating longer scenes," DeepMind CEO Demis Hassabis said onstage.

"Veo gives you unprecedented creative control."

Veo is meant to live within VideoFX, Google's experimental video-creation tool, and some features will arrive in YouTube Shorts and other Google products in the future, Google said.

To mark them as AI-generated, all videos created with Veo will carry SynthID, an imperceptible digital watermark developed by Google.

"Over the past year, we've made incredible progress in enhancing the quality of our generative media technologies," said Google vice president of product management Eli Collins and senior research director Doug Eck in a blog post.

"We've been working closely with the creative community to explore how generative AI can best support the creative process, and to make sure our AI tools are as useful as possible at each stage," the pair wrote.

In all, Veo competes directly against OpenAI's video generator Sora.

Next up is Imagen 3.

Google DeepMind said that the AI is the company's "highest quality" text-to-image model, with an "incredible level of detail" for "photorealistic, lifelike images" and fewer artifacts.

We’re introducing Imagen 3: our highest quality text-to-image generation model yet.

It produces visuals with incredible detail, realistic lighting and fewer distracting artifacts.

From quick sketches to very high-res imagery, here’s a look at what it can create. #GoogleIO pic.twitter.com/XMrQYGeSiO
— Google DeepMind (@GoogleDeepMind) May 14, 2024

Published:

17/05/2024
