Background

Google's 'Gemini Omni' Joins Seedance And HappyHorse In The Race For Ultra-Realistic AI Video


The LLM war, which has since evolved into a race for increasingly realistic video, has reached a new stage.

What started as companies racing to build better chatbots and text generators has quietly turned into a contest over who can produce video clips that hold up under close inspection. Text coherence, synchronized speech, natural hand movements, and accurate details in technical scenes have all become the new benchmarks.

A fresh example surfaced this week in a short clip shared on X. It shows a professor at a chalkboard deriving one trigonometric identity from another, starting from sin squared x plus cos squared x equals one and arriving at one plus tan squared x equals sec squared x.
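
For readers who want to check the chalkboard against the math, the derivation can be written out as below. This is the standard textbook route, dividing every term by cos squared x. Whether the generated professor followed exactly these intermediate steps is an assumption, since the clip is only described here.

```latex
\begin{align*}
\sin^2 x + \cos^2 x &= 1 \\
\frac{\sin^2 x}{\cos^2 x} + \frac{\cos^2 x}{\cos^2 x} &= \frac{1}{\cos^2 x}
  \qquad (\cos x \neq 0) \\
\tan^2 x + 1 &= \sec^2 x
\end{align*}
```

Each line has to stay consistent with the one before it, which is why a proof like this is a demanding test of whether on-screen text drifts between frames.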

The professor speaks clearly while writing each step, and the equations remain legible throughout.

The video was generated directly in the Gemini app using a straightforward prompt about a professor explaining a proof on a traditional chalkboard.

It comes from Google's unreleased 'Gemini Omni' video model, which appeared briefly for some users on May 11 before reportedly being pulled back.

Independent confirmations from tech observers and the original posters established that the output is genuine, not fabricated or edited after the fact.

Minor artifacts remain visible, such as slight mismatches in writing speed or chalk marks that appear instantly, but these are consistent with current limitations across the field rather than signs of fakery.

Another clip that went viral depicts two men eating spaghetti bolognese and showcases voice quality that is better than that of the Veo models by a wide margin. It even adds light background music that would fit right in with an upscale dining experience.

ByteDance's Seedance 2 went viral for its realism, and Alibaba's HappyHorse 1.0 has now sat comfortably at the top of the rankings for more than a month. Both come from China, and this Google tool is arguably on par with them, if not ahead.

The chalkboard clip stands out because it handles dense technical content without the usual drift in formulas or speech that has plagued earlier models.

For now the model remains unavailable to the public, yet its brief appearance suggests the gap between leading video generators is narrowing faster than expected. Educators, content creators, and anyone tracking synthetic media will likely watch the next round of releases closely, as demonstrations like this one move the conversation from what is possible in principle to what is already appearing in everyday demos.

Published: 12/05/2026