Google Introduces 'Veo', Its 'Most Capable Generative Video Model', As Well As 'Imagen 3'


AI is only as good as the data it has been trained on, and Google shows that its AI is exceptionally good.

In the arms race that followed the rise of generative AI, Google announced its new AI media creation engines. The first is 'Veo', which can produce "high-quality" 1080p videos; the second is 'Imagen 3', the successor to its Imagen 2 text-to-image framework.

Neither of them is revolutionary, because neither introduces anything fundamentally new.

But in its war against OpenAI's Sora and DALL·E 3, Google is introducing the two AI tools as more than just rivals.

With the announcement, made at its annual Google I/O developer conference, the company wants Veo and Imagen 3 to go a step beyond what was previously imagined.

First off, Veo, according to Google in a blog post, has "an advanced understanding of natural language and visual semantics" to create whatever video users have in mind.

What's more, the AI-generated videos can last "beyond a minute."

Veo is also capable of understanding cinematic and visual techniques, like the concept of a timelapse.

"We're exploring features like storyboarding and generating longer scenes," DeepMind CEO Demis Hassabis said onstage.

"Veo gives you unprecedented creative control."

Veo is meant to live within VideoFX, an experimental tool from Google Labs, and some features will arrive in YouTube Shorts and other Google products in the future, Google said.

To mark them as AI-generated, all videos created with Veo will carry SynthID, an imperceptible digital watermark developed by Google.

"Over the past year, we've made incredible progress in enhancing the quality of our generative media technologies," said Google vice president of product management Eli Collins and senior research director Doug Eck in a blog post.

"We've been working closely with the creative community to explore how generative AI can best support the creative process, and to make sure our AI tools are as useful as possible at each stage," the pair wrote.

In all, Veo competes directly against OpenAI's video generator Sora.

Next up is Imagen 3.

Google DeepMind said that Imagen 3 is the company's "highest quality" text-to-image model, with an "incredible level of detail" for "photorealistic, lifelike images" and fewer artifacts.

Published: 17/05/2024