This '3D MoMa' AI From Nvidia Can Turn 2D Images Into 3D, Efficiently


Artificial Intelligence allows computers to operate beyond what their programming explicitly tells them to do. And with each new iteration, AIs advance far enough to astonish even their own creators.

And this time, at the Conference on Computer Vision and Pattern Recognition (CVPR) in New Orleans, Nvidia unveiled a tool for designers to create digital assets. Called '3D MoMa', the tool uses an AI-driven inverse rendering pipeline to create 3D objects from 2D images.
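To give a sense of the idea, here is a minimal, hypothetical sketch of inverse rendering: treat the renderer as a differentiable function of unknown scene parameters, then recover those parameters by gradient descent against reference images. The toy Lambertian shader below is purely illustrative and stands in for Nvidia's far more sophisticated pipeline.

```python
# Toy inverse rendering: recover an unknown surface color (albedo) from
# "photographs" by backpropagating through a differentiable renderer.
# This is a conceptual sketch, not Nvidia's 3D MoMa implementation.
import torch

def render(albedo, light_dir, normals):
    """Toy differentiable renderer: Lambertian shading per pixel."""
    # normals: (N, 3), light_dir: (3,), albedo: (3,)
    intensity = (normals @ light_dir).clamp(min=0.0).unsqueeze(-1)  # (N, 1)
    return albedo * intensity                                        # (N, 3)

# Ground-truth scene we pretend to photograph.
normals = torch.nn.functional.normalize(torch.randn(256, 3), dim=-1)
true_albedo = torch.tensor([0.8, 0.5, 0.2])
light_dir = torch.nn.functional.normalize(torch.tensor([0.0, 0.0, 1.0]), dim=0)
target = render(true_albedo, light_dir, normals)

# Unknown parameter we want to recover from the "photos".
albedo = torch.rand(3, requires_grad=True)
opt = torch.optim.Adam([albedo], lr=0.05)

for step in range(500):
    opt.zero_grad()
    loss = ((render(albedo, light_dir, normals) - target) ** 2).mean()
    loss.backward()   # gradients flow backward through the renderer
    opt.step()

print(albedo.detach())  # converges toward true_albedo
```

3D MoMa applies the same principle at scale, jointly optimizing geometry, materials, and lighting instead of a single color.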

Before this, Nvidia had already shown that it could create 3D scenes from 2D images, using what it calls Instant NeRF.

But unlike Instant NeRF, 3D MoMa, described in an accompanying research paper, takes the AI to a whole new level by producing triangle mesh models.

What this means is that the generated 3D objects are immediately ready for import into graphics engines.

As Nvidia's Isha Salian puts it in a blog post, this method "could empower architects, designers, concept artists and game developers to quickly import an object into a graphics engine to start working with it, modifying scale, changing the material or experimenting with different lighting effects."
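Because the output is a standard triangle mesh, it can be handled with any off-the-shelf 3D library. As a hypothetical example using the Python trimesh package (the file name "trumpet.obj" is a placeholder, not an actual Nvidia asset):

```python
# Hypothetical hand-off: load a generated triangle mesh, apply routine
# edits, and export it in a format a graphics engine can ingest.
import trimesh

mesh = trimesh.load("trumpet.obj")         # placeholder file name
print(mesh.vertices.shape, mesh.faces.shape)

mesh.apply_scale(2.0)                      # modify scale, as Salian describes
mesh.apply_translation(-mesh.centroid)     # recenter at the origin
mesh.export("trumpet_edited.glb")          # ready for a game or DCC engine
```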

Celebrating New Orleans as the birthplace of jazz, Nvidia pays tribute by using 3D MoMa to model jazz instruments.

The team started by collecting 100 images each of five instruments commonly featured in jazz ensembles: trumpet, trombone, saxophone, drums, and clarinet.

The whole process takes about an hour per object on a single Nvidia Tensor Core GPU.

That time was required for the GPU to generate the three key components of the reconstruction: a 3D mesh model, materials, and lighting.

Using the mesh, developers can modify the generated objects to suit their creative needs; 2D textured materials are laid over the 3D meshes like a skin; and an estimate of the scene's lighting allows creators to adjust how the objects are lit later on.
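As a rough sketch, that decomposition might be bundled like this; the names and fields below are illustrative, not 3D MoMa's actual data model:

```python
# Illustrative container for the three recovered components: geometry,
# spatially varying materials, and an editable lighting estimate.
from dataclasses import dataclass
import numpy as np

@dataclass
class ReconstructedObject:
    vertices: np.ndarray       # (V, 3) triangle-mesh vertex positions
    faces: np.ndarray          # (F, 3) vertex indices per triangle
    uvs: np.ndarray            # (V, 2) texture coordinates
    albedo_map: np.ndarray     # (H, W, 3) 2D texture laid over the mesh
    roughness_map: np.ndarray  # (H, W) spatially varying material property
    env_map: np.ndarray        # (He, We, 3) HDR environment lighting
```

Because each part is stored separately, any one of them can be swapped or edited without touching the others, which is exactly what the team demonstrates next.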

The team brought the three components into Nvidia Omniverse, the company's 3D simulation platform, and modified their attributes on the fly.

For example, the team converted the trumpet model from the original plastic to gold, marble, wood, and cork, demonstrating 3D MoMa’s ability to swap the material of a generated shape. The team then placed the edited instruments into a Cornell box to test for rendering quality.

Ultimately, the experiment showed that the virtual instruments react to light the same way they would in the real world.

For example, brass reflects light much better than a matte material, which absorbs it.
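A toy Blinn-Phong calculation illustrates that observation; the reflectance coefficients and exponents below are made up for illustration, not measured values:

```python
# Compare a shiny metal against a matte surface using a simple
# Blinn-Phong specular lobe (not Nvidia's actual shading model).
import numpy as np

angles_deg = np.array([0, 5, 15, 30])     # angle between normal and half-vector
n_dot_h = np.cos(np.radians(angles_deg))

# Brass: high specular reflectance concentrated in a tight highlight.
brass = 0.95 * n_dot_h ** 200
# Matte: most incoming light is absorbed; the rest scatters broadly.
matte = 0.30 * n_dot_h ** 2

for a, b, m in zip(angles_deg, brass, matte):
    print(f"{a:>2} deg  brass={b:.3f}  matte={m:.3f}")
```

At the center of the highlight the brass returns roughly three times as much light as the matte surface, then falls off sharply, producing the bright, mirror-like glints the renders show.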

3D MoMa leverages differentiable marching tetrahedra to directly optimize the topology of a triangle mesh. The output representation is a triangle mesh with spatially varying 2D textures and a high dynamic range environment map. (Credit: Nvidia)
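The classic, non-differentiable ancestor of that technique, marching tetrahedra, is easy to sketch: inside each tetrahedron of a grid, mesh vertices are placed where a signed distance field (SDF) crosses zero, using linear interpolation that stays differentiable with respect to the SDF values. Below is a minimal single-tetrahedron example; real implementations enumerate all sign cases and run batched on the GPU.

```python
# Core marching-tetrahedra step: find where an SDF crosses zero along
# the edges of one tetrahedron. Illustrative sketch only.
import numpy as np

def zero_crossings(verts: np.ndarray, sdf: np.ndarray) -> list:
    """Interpolate surface points on the edges of one tetrahedron.

    verts: (4, 3) tetrahedron corner positions
    sdf:   (4,) signed distance values at the corners (negative = inside)
    """
    points = []
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    for i, j in edges:
        if sdf[i] * sdf[j] < 0:               # sign change: surface crosses here
            t = sdf[i] / (sdf[i] - sdf[j])    # linear interpolation weight
            points.append(verts[i] + t * (verts[j] - verts[i]))
    return points

# One tetrahedron straddling the surface.
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
sdf = np.array([-0.2, 0.3, 0.4, 0.5])
print(zero_crossings(verts, sdf))  # 3 crossing points -> one surface triangle
```

Because the crossing positions are smooth functions of the SDF values, gradients from a rendering loss can flow back into the field, which is what lets 3D MoMa optimize mesh topology directly.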

"These new objects, generated through inverse rendering, can be used as building blocks for a complex animated scene - showcased in the video’s finale as a virtual jazz band," said Salian.

Inverse rendering "has long been a holy grail unifying computer vision and computer graphics," said David Luebke, vice president of graphics research at Nvidia.

"By formulating every piece of the inverse rendering problem as a GPU-accelerated differentiable component, the NVIDIA 3D MoMa rendering pipeline uses the machinery of modern AI and the raw computational horsepower of NVIDIA GPUs to quickly produce 3D objects that creators can import, edit, and extend without limitation in existing tools."

Advances in AI like this have been hard to achieve. But further progress is only a matter of time.

This is because advancing AI requires computing power. It's only when researchers are able to gather quality data, build more powerful computers, and design more efficient algorithms that AIs can become better.

Published: 23/06/2022