Technology has evolved rapidly—and among its most transformative breakthroughs is artificial intelligence.
The technology is unlocking creative possibilities that were once confined to expensive equipment and complex setups. Take the world of animation, for instance. In films and video games, capturing an actor’s facial expressions used to require a full motion capture rig. This often involved wearing a headset outfitted with specialized cameras and tiny facial markers—small dots meticulously placed on key points of the face.
These markers allowed computers to track facial movements with precision, mapping each subtle twitch, blink, or smile onto a 3D character.
This meticulous frame-by-frame process is known as facial motion capture, or more precisely, facial performance capture.
But now, AI is revolutionizing this process. What once demanded headgear, sensors, and studio-grade setups can now be achieved with nothing more than a standard video. No markers. No cameras mounted on helmets. Just powerful algorithms interpreting expressions in real time.
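To make the idea of markerless tracking concrete, here is a minimal sketch using the open-source MediaPipe library as an illustrative stand-in (Runway has not disclosed its own pipeline); it recovers facial landmarks from an ordinary video file with no markers or headgear:

```python
import cv2
import mediapipe as mp

# Markerless face tracking from plain video using MediaPipe Face Mesh.
# This is an open-source illustration of the general technique,
# not Runway's actual pipeline.
face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=False,   # treat input as a video stream
    max_num_faces=1,
    refine_landmarks=True,     # finer detail around eyes and lips
)

cap = cv2.VideoCapture("performance.mp4")  # any ordinary video file
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV reads frames as BGR
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        # Normalized (x, y, z) points that could drive a 3D character rig
        landmarks = results.multi_face_landmarks[0].landmark
        print(f"tracked {len(landmarks)} facial landmarks this frame")

cap.release()
face_mesh.close()
```

Each frame yields a dense set of landmark coordinates, which is the raw material an animation system can map onto a character, replacing the physical dots of a traditional capture rig.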
It's not just convenience—it's liberation. And the future of animation is about to get a lot more expressive, accessible, and hands-free.
Runway AI, an American company headquartered in New York City, specializes in generative artificial intelligence research and technologies, creating products and models for generating videos, images, and other multimedia content.
Now, the company has announced 'Act-Two.'
Introducing Act-Two, our next-generation motion capture model with major improvements in generation quality and support for head, face, body and hand tracking. Act-Two only requires a driving performance video and reference character.
Available now to all our Enterprise… pic.twitter.com/wnLU46yORg — Runway (@runwayml) July 15, 2025
As the successor to the extremely capable Act-One, Act-Two introduces major improvements in generation quality and support for head, face, and body tracking.
And it finally supports hand tracking.
"Act-Two only requires a driving performance video and reference character," Runway says.
Act-Two is a major improvement in fidelity, consistency and motion over Act-One. It is capable of animating any character with a single driving performance. Including head, facial expressions, upper body, hands and background.
2/4 pic.twitter.com/l4uG2ffXlM — Runway (@runwayml) July 15, 2025
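Runway has not published Act-Two's API details in this announcement, but conceptually the interface reduces to those two inputs. The snippet below is a purely hypothetical sketch (the endpoint, field names, and response shape are all invented for illustration) of what submitting a driving video plus a character reference to such a service could look like:

```python
import requests

# Purely hypothetical endpoint and schema -- NOT Runway's documented API.
# It only illustrates the announced two-input workflow:
# a driving performance video plus a reference character.
API_URL = "https://api.example.com/v1/character-performance"  # placeholder

payload = {
    "driving_video_url": "https://example.com/my_performance.mp4",   # actor video
    "character_reference_url": "https://example.com/character.png",  # target character
}

resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": "Bearer <YOUR_API_KEY>"},  # placeholder credential
    timeout=30,
)
resp.raise_for_status()
job = resp.json()
print("job id:", job.get("id"), "status:", job.get("status"))
```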
Late 2022 marked a seismic shift when OpenAI released ChatGPT.
In just days, its natural-sounding conversational prowess captured global attention, sparking a rush of interest across industries and prompting other tech giants like Google, Microsoft, Meta, Anthropic (Claude), and xAI (Grok) to roll out competing large language models (LLMs).
Runway stands apart through its focus on multimodal AI, building models that understand and generate visual, motion, and video elements.
Act‑One, the platform’s first animation engine, translated facial expressions and head movements into animated characters—capturing micro-expressions, lip sync, and upper‑body animation with surprising polish.
But it deliberately skipped hand-tracking and full-body motion.
With Act-Two, Runway extends its model to support full-body animation.
Key improvements include:
- Hand tracking: the AI is no longer limited to facial cues; fingers, hand gestures, and body motion are now captured too (see the sketch after this list).
- Enhanced motion fidelity: all existing animation channels (face, head, body) are more precise and expressive.
- A streamlined workflow that delivers a full-body character performance.
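For a sense of what combined face, body, and hand tracking from a single video involves, here is a minimal open-source sketch using MediaPipe Holistic as an illustrative stand-in (Runway's own model and tooling are not public):

```python
import cv2
import mediapipe as mp

# Combined face, pose, and hand tracking from one ordinary video,
# using the open-source MediaPipe Holistic model. An illustrative
# stand-in, not Runway's Act-Two implementation.
holistic = mp.solutions.holistic.Holistic(
    static_image_mode=False,  # video stream mode
    model_complexity=1,
)

cap = cv2.VideoCapture("full_body_performance.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # Each channel is available per frame, mirroring the channels
    # Act-Two advertises: face, body (pose), and both hands.
    has_face = results.face_landmarks is not None
    has_pose = results.pose_landmarks is not None
    has_hands = (results.left_hand_landmarks is not None
                 or results.right_hand_landmarks is not None)
    print(f"face={has_face} pose={has_pose} hands={has_hands}")

cap.release()
holistic.close()
```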
Act-Two can translate your performance to a wide variety of characters in a diverse set of environments, styles and art directions without compromising performance fidelity.
3/4 pic.twitter.com/W8O1cXD7l3 — Runway (@runwayml) July 15, 2025
With this kind of tool, game and animation studios no longer need motion capture suits or intricate rigs to bring characters to life—just a single camera is enough. With Runway’s Act-Two, full character performances can now be captured and translated into animation effortlessly, making high-end results far more accessible.
For content creators and influencers, this means a new level of expressive freedom. Finger-pointing, natural hand gestures, and full-body language can now be animated natively, without the need for any special equipment. It opens doors for more dynamic storytelling, more lifelike avatars, and more immersive content across platforms.
In industries like film, advertising, and education—where character-driven visuals matter—Act-Two offers a game-changing leap forward. AI-powered motion capture is no longer a futuristic dream; it’s becoming a practical tool for everyday creators.
The big picture? While tech giants like Google and Meta dominate the landscape with their own AI models, Runway is quietly staking its claim in a different domain: video-native, multimodal AI. Act-Two doesn’t just refine face animation—it brings creators closer to full-performance capture with nothing but a video input.
Act-Two is available now to all our Enterprise customers and Creative Partners. We will be opening access to everyone in the coming days.
4/4 pic.twitter.com/XZq5QdKjD7 — Runway (@runwayml) July 15, 2025