Background

The New Era Of Video, And How Google's Veo 3 Can Now Bring Images To Life

AI is moving at a breakneck pace, and Google is determined not only to remain relevant but also to stay ahead of the competition.

The arrival of OpenAI's ChatGPT in late 2022 was widely seen as a pivotal moment for Google, reportedly leading the company to issue a "code red" alarm.

The chatbot's ability to provide direct, conversational answers was perceived as a significant threat to Google's core search engine business model, which relies heavily on directing users to ad-supported links.

In response, Google introduced Gemini, its answer to ChatGPT and other emerging rivals.

As tech companies continue to pursue dominance, the technology behind large language models (LLMs) continues to improve. Now, alongside advancements in hardware and training datasets, AI can create videos from just a prompt.

To counter OpenAI's highly anticipated Sora and other competitors, Google launched its own video generation model, Veo, which was later succeeded by Veo 2.

However, neither of these could compare to Veo 3.

This model has gained massive popularity thanks to the realism of its results and its ability to generate sound along with video. Now, Google is going a step further by enabling Veo 3 to create videos from user-uploaded images as well.

In a blog post, Google said:

"We launched our state-of-the-art video generation model Veo 3 in May — and last week, we expanded access to Google AI Pro subscribers in over 150 countries. Now, with a new photo-to-video capability in Gemini, you can now transform your favorite photos into dynamic eight-second video clips with sound."

The process for creating a video from an image is designed to be straightforward.

Users can access the "Videos" tool directly within the Gemini prompt box, where they can upload a photo.

From there, they simply provide a description of the scene and any desired audio, and the AI will process the request, transforming the still image into a dynamic video.

Once the video is complete, it can be shared or downloaded from the platform, though all videos are capped at eight seconds in length.

Google has positioned its new image-to-video tool as a powerful creative outlet for subscribers, allowing them to bring drawings, nature scenes, and everyday objects to life.

With this capability, forgotten images in a user's photo gallery can be transformed into dynamic and engaging videos.

The new feature is expected to dramatically increase video creation. In just its first few weeks, the core Veo 3 model has already generated over 40 million videos.

This new photo-to-video capability should contribute significantly to that number, attracting creative professionals and casual users alike.

The feature is currently rolling out to Google AI Pro and Ultra subscribers in select countries and is accessible through the main Gemini platform at gemini.google.com.

The same underlying technology is also integrated into Flow, Google's AI filmmaking tool.

It is worth noting that all videos generated with the Veo 3 model will include a visible "Veo" watermark and an invisible SynthID digital watermark, which Google uses to identify AI-generated content.

Published: 13/07/2025