Using A GAN, Nvidia Created AI Capable Of Turning Doodles Into Realistic Imagery

Artificial intelligence can be used for a lot of things. But one drawback of machines is that they have no imagination to start with. Nvidia is trying to prove that wrong.

Using an AI model known as a generative adversarial network (GAN), the chipmaker Nvidia created 'GauGAN', which is essentially a "smart paint brush."

Here, users can sketch a simple outline of a scene, much like doodling, and the tool turns it into a realistic image.

The idea is that developers and designers could use the technology to render new virtual environments for video games and movies, or for training self-driving cars.

This tool could help “everyone from architects and urban planners to landscape designers and game developers” in the future, as “It’s much easier to brainstorm designs with simple sketches, and this technology is able to convert sketches into highly realistic images,” said Bryan Catanzaro, VP of applied deep learning research at Nvidia.

While the results aren't really photorealistic, they are impressive.

GauGAN was named after the post-Impressionist artist Paul Gauguin. And this AI is like an artist.

GauGAN has three tools: a paint bucket, a pen and a pencil.

At the bottom of the screen, there is a series of objects users can select. Using the cloud object, for example, users can draw a line with the pencil, and the software will produce photorealistic clouds.

Draw a circle and fill it with the paint bucket, and the software will render summer clouds.

Users can use the input tools to draw the shape of a tree and the software will generate an image of a tree.

Draw a straight line, and it will automatically create a tree trunk.

And drawing a bulb-shaped sketch on top of that trunk will turn it into a full tree with leaves.

The AI does this by associating each color with a specific object, such as brown for "rock" and light blue for "sky."

Once users add a paint stroke in a specific color, the deep-learning model, which Nvidia trained on a million images from Flickr, automatically fills in the texture and creates the details.
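As a rough illustration of the color-to-object idea, the sketch below turns a tiny painted canvas into a grid of object labels, which is the kind of input an image-generating model could then fill with texture. This is not Nvidia's code: the RGB values, the `PALETTE` table and the `canvas_to_labels` function are all hypothetical, and the neural network itself is left out.

```python
# Hypothetical palette. The article only says brown means "rock" and
# light blue means "sky"; the exact RGB values are illustrative guesses.
ROCK = (139, 69, 19)
SKY = (135, 206, 235)

PALETTE = {
    ROCK: "rock",
    SKY: "sky",
}


def canvas_to_labels(canvas, palette=PALETTE, unknown="unlabeled"):
    """Convert a 2D grid of RGB tuples into a 2D grid of label names.

    Any color that is not in the palette is marked with `unknown`.
    """
    return [[palette.get(pixel, unknown) for pixel in row] for row in canvas]


# A 2x3 "doodle": sky on top, a rock in the lower right.
canvas = [
    [SKY, SKY, SKY],
    [SKY, ROCK, ROCK],
]

labels = canvas_to_labels(canvas)
for row in labels:
    print(row)
```

In a real system, a label grid like this would be handed to the trained model, which decides what each labeled region should actually look like.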

The tool also comes with different filters for changing the time of day, from sunrise to sunset, or the style of painting, from photorealistic to Impressionist.

"It’s like a coloring book picture that describes where a tree is, where the sun is, where the sky is," Catanzaro said.

"And then the neural network is able to fill in all of the detail and texture, and the reflections, shadows and colors, based on what it has learned about real images."

This software isn’t really groundbreaking.

Researchers have shown off similar tools in the past, including one from Google that is capable of turning doodles into clip art. But GauGAN from Nvidia takes that capability to the next level. The software generates AI landscapes instantly, and it’s intuitive.

For example, when a user draws a tree and then a pool of water underneath it, the AI understands that and automatically adds the tree’s reflection to the pool.

Nvidia said that the software can synthesize hundreds of thousands of objects and their relation to other objects in the real world.

But again, just like any technology, GauGAN is not perfect.

First of all, the technology can’t just paint in any texture users think of. For example, generating fake grass and water is relatively easy for GANs because the visual patterns involved are unstructured. But generating images of buildings and furniture can be tricky. The results are less convincing and much less realistic.

This is because those objects have a logic and structure to them that humans are sensitive to. GANs can manage to overcome this sort of challenge, as in Nvidia's previous project, which is capable of creating AI-generated faces with relative ease. But it takes a lot more effort.

Second, the boundaries or gaps between objects are not aligned perfectly.

This is because neural networks tend to have trouble with the objects they were trained on, and with understanding exactly what they are trained to do.

And last, the way Nvidia demoed the AI shows how easily people can create fakes using AI. It also raises important questions about the potential these kinds of algorithms have to spread disinformation and undermine the truth in the future.

Catanzaro agrees, noting that the issue is bigger than one project or one company.

"We care about this a lot because we want to make the world a better place," he said, adding that this is a trust issue rather than a technology issue, and one that people, as a society, must deal with.

And this is one of the reasons why GauGAN, which was initially made to specialize in nature scenes, is not available for public use.

Published:

20/03/2019