By Tracking Moving Objects, This AI Can Remove Them From Videos, Effortlessly

People can simply use software to manually create and edit images. But that requires some knowledge and expertise. AI can take on those burdens and automate the task.

For example, AI can automatically change the background of any image, enhance photos in ways Adobe Photoshop can't, generate images of food that doesn't exist, turn 2D images into 3D scenes, and much more.

The same goes for videos. People can edit videos flawlessly, just like Hollywood does, but the effort involved can be daunting.

When a video contains objects that move relative to the viewer, conventional object-removal techniques require specialized video editing skills and software, and the process can be very time-consuming.

This time, again, AI comes to the rescue.

Called simply 'Video-Object-Removal', the AI can remove moving objects from videos, effortlessly.


It allows users to highlight an object in the footage by just drawing a box over it, and the software does the rest of the heavy work.

In one example, the algorithm removes a person crossing the street.

It does this by tracking the moving object inside the box and removing its visual information. It then performs “inpainting”, a technique that uses inference to reconstruct lost or corrupted parts of an image, to fill in the “hole” left by our departed pedestrian.
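To make the mask-then-fill idea concrete, here is a minimal sketch in plain Python. It is not the project's code: the real tool relies on neural networks for both steps, whereas this toy version assumes the box for each frame is supplied by hand (standing in for the tracker) and fills the hole by repeatedly averaging neighbouring pixels (standing in for learned inpainting). The names `box_mask`, `inpaint`, and `remove_object` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]        # grayscale image: rows of pixel values
Mask = List[List[bool]]          # True marks a pixel that must be filled in
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def box_mask(height: int, width: int, box: Box) -> Mask:
    """Mark every pixel inside the box as part of the hole."""
    top, left, bottom, right = box
    return [[top <= r < bottom and left <= c < right for c in range(width)]
            for r in range(height)]


def inpaint(frame: Frame, mask: Mask, iterations: int = 200) -> Frame:
    """Fill masked pixels by diffusing values in from the surrounding pixels.

    A toy stand-in for learned inpainting: each pass replaces every hole
    pixel with the average of its four neighbours.
    """
    h, w = len(frame), len(frame[0])
    known = [frame[r][c] for r in range(h) for c in range(w) if not mask[r][c]]
    if not known:  # the mask covers the whole frame, nothing to infer from
        return [row[:] for row in frame]
    start = sum(known) / len(known)  # initial guess for the hole
    out = [[start if mask[r][c] else frame[r][c] for c in range(w)]
           for r in range(h)]
    for _ in range(iterations):
        nxt = [row[:] for row in out]
        for r in range(h):
            for c in range(w):
                if mask[r][c]:
                    nbrs = [out[rr][cc]
                            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                            if 0 <= rr < h and 0 <= cc < w]
                    nxt[r][c] = sum(nbrs) / len(nbrs)
        out = nxt
    return out


def remove_object(frames: List[Frame], boxes: List[Box]) -> List[Frame]:
    """Mask the boxed region in each frame, then fill the hole."""
    cleaned = []
    for frame, box in zip(frames, boxes):
        mask = box_mask(len(frame), len(frame[0]), box)
        cleaned.append(inpaint(frame, mask))
    return cleaned
```

On a flat background this simple averaging restores the scene exactly, but on textured footage it produces a blurry patch, which is one reason practical systems use learned models that also draw on neighbouring frames.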

While the AI is capable of removing the person effortlessly, it does leave some traces behind.

Where the person previously walked, the reconstructed footage shows some bent lines as the camera moves with the now-gone subject. In other cases, removing an object from a video results in smudges.


The software was built by a developer who goes by the pseudonym 'zllrunning'.

According to the developer, the algorithm relies on two models:

  1. SiamMask: a simple multi-task learning approach that can be used to address both visual object tracking and semi-supervised video object segmentation.

    A trained SiamMask can produce object segmentation masks and rotated bounding boxes at 55 frames per second, relying solely on an initial bounding box. The system established a new way to track moving objects in real time.

  2. Deep Video Inpainting: designed to fill spatiotemporal holes in a video with plausible content. The framework synthesizes unknown regions in videos using an image-based encoder-decoder model, producing more semantically correct and smoother images.

    In other words, it's used to fill in the region left behind once the object is masked out of the footage.

Video-Object-Removal isn't actually the first of its kind.

Previously, in May 2019, researchers from Nanyang Technological University in Singapore teased an algorithm that performs the exact same function. Tech giant Adobe also showcased a similar function in its video editing app After Effects using its Sensei AI.

But unlike After Effects, zllrunning's Video-Object-Removal is open source on GitHub.

This means developers can sift through the code and modify it to fit their needs.