With 'Emu Video' And 'Emu Edit', Meta Introduces Its Own AI-Powered Media Editors


Computers cannot dream because they don't have senses to understand their surroundings. But with AI, this is changing.

By feeding AI models vast amounts of data gathered from curated datasets and the internet itself, researchers have made them increasingly capable, with a seemingly better understanding of how the real world works. And with generative AI dominating the hype, Meta doesn't want to be left behind.

This time, the tech titan is announcing two new AI-powered media tools for Facebook and Instagram, one for videos and one for images.

The features are built on Emu, the AI software at the heart of Meta’s AI offerings, which according to the company in a blog post, "underpins many of our generative AI experiences."

First is 'Emu Video'.

It leverages the Emu model to generate videos from text prompts as well as from still images.

Meta had previously built an AI video generator called Make-A-Video, and Emu Video is a significant improvement over it.

"Our state-of-the-art approach is simple to implement and uses just two diffusion models to generate 512×512 four-second long videos at 16 frames per second," explained Meta.

"In human evaluations, our video generations are strongly preferred compared to prior work—in fact, this model was preferred over Make-A-Video by 96% of respondents based on quality and by 85% of respondents based on faithfulness to the text prompt."

Second is 'Emu Edit'.

This AI tool allows users to alter images based on text inputs.

This is similar to what Adobe Photoshop's Generative Fill can do, but what differentiates Emu Edit is that users don't have to manually select the element they want to change.

According to Meta, all users have to do is describe what they want changed, and the AI will understand the request and comply.

For example, a user can simply write "remove the person," and without selecting anything, the AI will remove the person from the image automatically.

"Emu Edit is capable of free-form editing through instructions, encompassing tasks such as local and global editing, removing and adding a background, color and geometry transformations, detection and segmentation, and more," said Meta.

"Our key insight is that incorporating computer vision tasks as instructions to image generation models offers unprecedented control in image generation and editing."

Meta said that it trained Emu Edit on "10 million synthesized samples, each including an input image, a description of the task to be performed, and the targeted output image," and believes it to be the largest dataset of its kind.
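Based on that description, each training record pairs an input image with a task instruction and a target output. A hypothetical record might look like the following (field names and filenames are illustrative assumptions; Meta has not published a schema):

```python
# Hypothetical structure of one Emu Edit training sample.
# Field names and filenames are illustrative, not Meta's schema.
sample = {
    "input_image": "beach_photo.png",          # the image before editing
    "instruction": "remove the person",        # description of the task
    "target_image": "beach_photo_edited.png",  # the desired output image
}
```

Training on millions of such (input, instruction, output) triples is what lets the model map free-form text requests directly to pixel-level edits.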

"Current methods often lean towards either over-modifying or under-performing on various editing tasks," added Meta.

"We argue that the primary objective shouldn’t just be about producing a ‘believable’ image. Instead, the model should focus on precisely altering only the pixels relevant to the edit request."

While the world is kept busy and fascinated by how generative AI products blur the line between what's real and what's fake, the advancements that open up new possibilities also raise concerns about potential misuse.

With generative AI, it's extremely easy for anyone to create realistic content depicting things that never existed.

This makes it challenging to distinguish what's genuine from what's artificially generated. And the fakery is no longer limited to text: with AI tools like Emu Video and Emu Edit, it extends to videos and images as well.

Generative AI is, in short, a double-edged sword, offering incredible possibilities while also posing risks such as deepfakes and misinformation.

Meta didn't say exactly when the two AIs will be released, saying only that the work is "purely fundamental research" for the moment, though the "potential use cases are clearly evident."