'CariGANs' AI From Microsoft Artistically Draws Caricatures Of People's Faces

AI can do a lot of things, and working with images, from recognition to manipulation, is one of its most common uses.

To name a few examples, researchers have created an AI that can draw something original from text commands alone, an AI that turns doodles into proper pictures, even artistic ones, and an AI that paints like the famous Vincent van Gogh.

This time, researchers have another thing in mind, and that is to create caricatures using AI.

Kaidi Cao from Tsinghua University partnered with Microsoft AI researchers Jing Liao and Lu Yuan to create a caricature-drawing system that consists of a pair of generative adversarial networks (GANs), called CariGANs.

The first network, CariGeoGAN, determines the geometry of a face in a photograph and maps it to a caricature model. The second network, CariStyGAN, picks up where the first leaves off and performs the "style transfer," applying the artistic look to the geometry map.
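
To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the researchers' code: the real CariGeoGAN and CariStyGAN are trained neural networks, while the functions below are hypothetical stand-ins. The geometry stage is assumed, purely for illustration, to push facial landmarks away from an average face, and the style stage simply tags the result with a style name.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def exaggerate_geometry(landmarks: List[Point],
                        mean_face: List[Point],
                        strength: float = 1.5) -> List[Point]:
    """Stage 1 stand-in (hypothetical): move each landmark away from an
    average face. The real network learns this mapping; a fixed scaling
    factor is only an assumption made for this sketch."""
    return [
        (mx + strength * (x - mx), my + strength * (y - my))
        for (x, y), (mx, my) in zip(landmarks, mean_face)
    ]


def apply_style(landmarks: List[Point], style_name: str) -> Dict[str, object]:
    """Stage 2 stand-in (hypothetical): attach a style label to the
    exaggerated geometry instead of rendering an actual image."""
    return {"landmarks": landmarks, "style": style_name}


def caricature_pipeline(landmarks: List[Point],
                        mean_face: List[Point],
                        style_name: str) -> Dict[str, object]:
    """Run geometry first, then style, mirroring the order described above."""
    warped = exaggerate_geometry(landmarks, mean_face)
    return apply_style(warped, style_name)


# Toy data: three landmarks; only the last one differs from the average face.
face = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
mean = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
result = caricature_pipeline(face, mean, "sketch")
print(result)
```

In this toy run, only the landmark that deviates from the average face moves, which is the intuition behind exaggeration: what is distinctive about a face gets amplified, and the styling step then works on the already exaggerated shape.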

The overall pipeline of the method

The two CariGANs were trained on tens of thousands of hand-drawn images covering diverse genders, races, ages, expressions, poses, and so on, so that they can turn relatively boring photographs into delightful caricatures.

Previous caricature-generation methods learned from examples, relying on paired photo-to-caricature images. Artists had to paint a corresponding caricature for each photo to make the dataset suitable for supervised learning.

Here, the researchers' goal was to have the AI learn photo-to-caricature translation from unpaired photos and caricatures, where no pairing exists between the two domains.
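
The difference between paired and unpaired data is easiest to see in how a training batch is assembled. The Python sketch below is only an illustration with made-up file names; the function names `paired_batch` and `unpaired_batch` are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def paired_batch(pairs: Sequence[Tuple[str, str]],
                 batch_size: int,
                 rng: random.Random) -> List[Tuple[str, str]]:
    """Supervised setting: every photo comes with the caricature that an
    artist drew for that exact photo."""
    return rng.sample(pairs, batch_size)


def unpaired_batch(photos: Sequence[str],
                   caricatures: Sequence[str],
                   batch_size: int,
                   rng: random.Random) -> Tuple[List[str], List[str]]:
    """Unpaired setting: photos and caricatures are drawn independently,
    so item i in one list has no relation to item i in the other."""
    return rng.sample(photos, batch_size), rng.sample(caricatures, batch_size)


rng = random.Random(0)

# Made-up identifiers; the two collections do not even need the same size.
photos = [f"photo_{i}" for i in range(6)]
caricatures = [f"caricature_{j}" for j in range(4)]

photo_batch, caricature_batch = unpaired_batch(photos, caricatures, 3, rng)
print(photo_batch, caricature_batch)
```

Because the two collections are sampled separately, nobody has to draw a matching caricature for each photo, which is what makes the unpaired setup cheaper to collect but harder to learn from.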

But here is the problem:

"Since photo domain and caricature domain may be obviously different in both geometry shape and texture appearance. We cannot directly learn the mapping from X to Y by other existing image-to-image translation networks," said the researchers in their paper titled CariGANs: Unpaired Photo-to-Caricature Translation (https://ai.stanford.edu/~kaidicao/carigan.pdf).

Results are generated with a random style code (first four) or a given reference (last two). Top row shows the two reference images

To evaluate the results, the researchers conducted two studies.

The first was to check that CariGANs produce results that retain the identity of the portrait subject. The goal is for the caricature to capture the subject's essence, but in an exaggerated form. According to the researchers, respondents indicated that the CariGANs caricatures compared favorably to those hand-drawn by artists.

The second study assessed how the AI-made drawings compared overall with human-drawn pieces. This too appears to be a success.

According to the researchers:

"Note that ours is ranked better than the hand-drawn one 22.95% of the times, which means our results sometime can fool users into thinking it is the real hand-drawn caricature. Although it is still far from an ideal fooling rate (i.e., 50%), our work has made a big step approaching caricatures drawn by artists, compared to other methods."

A video caricature example. The upper row shows the input video frames and the bottom row shows the generated caricature results
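
The "fooling rate" quoted by the researchers is simple arithmetic: the share of comparisons in which viewers ranked the generated caricature above the hand-drawn one. If the two were truly indistinguishable, viewers would pick either one about half the time, hence the 50% ideal. The Python sketch below shows the calculation; the vote counts are hypothetical and were chosen only to reproduce the reported 22.95%, since the article does not give the raw numbers.

```python
def fooling_rate(times_ranked_above_hand_drawn: int,
                 total_comparisons: int) -> float:
    """Fraction of comparisons in which the generated caricature was
    ranked better than the hand-drawn one."""
    if total_comparisons <= 0:
        raise ValueError("total_comparisons must be positive")
    return times_ranked_above_hand_drawn / total_comparisons


IDEAL_RATE = 0.5        # viewers cannot tell the two apart
REPORTED_RATE = 0.2295  # the 22.95% figure quoted by the researchers

# Hypothetical counts, not from the paper: 459 wins out of 2,000 comparisons.
example_rate = fooling_rate(459, 2000)

# Distance still to cover before reaching the ideal fooling rate.
gap_to_ideal = IDEAL_RATE - REPORTED_RATE

print(f"example rate: {example_rate:.2%}")
print(f"gap to ideal: {gap_to_ideal:.2%}")
```

Seen this way, the reported figure means the generated caricature beat the artist's version in a little under one comparison out of four.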

What makes CariGANs even more interesting is that they can also parse frames from video and create caricatures from them. This means the pair can generate a drawing from a single frame that is consistent with the ones generated from other frames.

As seen in the image above, the caricatures were made from a video of U.S. President Donald Trump speaking.

This research can be useful for animators, too, as CariGANs can also reverse-engineer a caricature to determine what the person in the cartoon really looks like. The researchers said, "We believe it might be useful for face recognition in caricatures."

In conclusion, the researchers said that "Our method advances the existing methods a bit in terms of visual quality and preserving identity." The CariGANs are also better at simulating hand-drawn caricatures to some extent. "Moreover, our approach supports flexible controls for user to change results in both shape exaggeration and appearance style."

But the method does have some limitations.

For example, the researchers noted that geometric exaggeration with CariGANs is more noticeable in the face shape than in other facial features, and that some small geometric exaggerations on ears, hair, wrinkles, and so on cannot be covered.

Published: 21/11/2018