In the ever-changing AI landscape, the market tends to favor those with the best offering.
Generative AI is no exception: ever since OpenAI's introduction of ChatGPT set off the hype around the field, users have been on the lookout for increasingly powerful models.
This time, the increasingly ubiquitous Gemini doesn't disappoint.
Google has unveiled a significant advancement in AI-driven content creation by integrating its cutting-edge video generation model, Veo 2, into the Gemini platform.
In a blog post, Google said that this fusion empowers users to transform text prompts into high-quality, eight-second video clips, marking a new era in digital storytelling.
The journey began in December 2023 with the introduction of Gemini, which Google described as its most capable and general AI model. It was designed to be multimodal, seamlessly understanding and operating across text, code, audio, image, and video.
Gemini was built from the ground up to be flexible and efficient, capable of running on everything from data centers to mobile devices.
In May 2024, Google announced Veo, a video generation model capable of producing 1080p videos more than a minute long.
By December 2024, Veo 2 was released, offering enhanced realism and fidelity, improved understanding of real-world physics, and advanced motion capabilities.
Now, with the integration of Veo 2 into Gemini, users can generate eight-second videos at 720p resolution directly from text prompts.
These videos are delivered as MP4 files in a 16:9 landscape format and come watermarked with SynthID to indicate AI generation. Users can also share their creations directly to platforms like TikTok and YouTube.
The convergence of Gemini and Veo 2 signifies a leap forward in AI-powered content creation, offering users an intuitive and powerful toolset to bring their ideas to life through text and video.
At this time, Google’s applications of Veo 2 are fairly basic. But the CEO of Google DeepMind, Demis Hassabis, said that the company plans to eventually combine its Gemini AI models with its Veo models to improve the former’s understanding of the physical world.
Along with the Gemini and Veo 2 integration news, Google is also making Whisk Animate — a tool that lets users transform an image into an eight-second video with Veo 2 — available to Google One AI Premium subscribers.
This builds upon Google’s existing Whisk tool, which lets users create AI-generated mashups of images. Whisk Animate is available to subscribers globally through Google Labs.
The Gemini and Veo 2 integration is available to Gemini Advanced subscribers, who can select Veo 2 from the model drop-down menu in the Gemini app on both web and mobile platforms.
Google said there is a limit on how many videos users can create per month, and that Google Workspace business and education plans aren’t supported at the moment.
Meanwhile, creators and artists are increasingly worried that tools like Veo 2 and OpenAI's Sora threaten entire creative industries.
With video generators, creating a unique video is only a prompt away. Not everyone likes this.