OpenAI Introduces A Text-To-3D-Model Generator It Calls The 'Shap·E' AI


AI refers to intelligence demonstrated by machines, as opposed to the natural intelligence displayed by living things.

OpenAI is among the pioneers in the AI field. Having created what is effectively a 'GPT' for images, the company also has a number of other AI products, each with remarkable features.

And this time, it's going a step further.

From DALL·E to DALL·E 2, and then Point·E, OpenAI is introducing yet another product with the letter "E" as its suffix.

The company calls it the 'Shap·E'.

In a research paper, researchers at OpenAI explain:

"We present Shap-E, a conditional generative model for 3D assets."

"When trained on a large dataset of paired 3D and text data, our resulting models are capable of generating complex and diverse 3D assets in a matter of seconds."

Shap·E is essentially a step up from OpenAI's point cloud-based Point·E, upgrading the technology to produce more realistic textures and lighting.

Point·E, its predecessor, used diffusion models to create 3D point clouds: collections of individual points in 3D space that represent the shape of an object. These points are not connected, nor do they carry explicit information about an object's surface or structure, so a separate model converted the point clouds into meshes.
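To picture what a point cloud is, the toy below (an illustration, not Point·E's actual data format) stores a shape as a plain list of (x, y, z) samples. Because the points are independent of one another, only simple aggregate properties like the centroid can be read off directly:

```python
# A point cloud is an unordered set of (x, y, z) samples on a surface;
# the points carry no connectivity or surface information.
# Toy cloud: a handful of corners of a unit cube (illustrative values only).
cloud = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
    (1.0, 1.0, 1.0),
]

def centroid(points):
    """Average position of the cloud -- one of the few properties that
    can be computed without knowing how the points connect."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

print(centroid(cloud))  # (0.4, 0.4, 0.4)
```

Turning such a cloud into a mesh is the hard part, which is why Shap·E's move away from point clouds matters.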

OpenAI taught Shap·E to generate 3D models in two stages: first it trained an encoder to convert 3D objects into mathematical representations called implicit functions, and then it trained a diffusion model to generate new representations of that kind.

The diffusion model is conditioned on the text prompt, which keeps the generated 3D assets consistent with the description, while the sampling process adds diversity to the output.
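The idea of an "implicit function" can be sketched with a hand-rolled toy (not OpenAI's actual representation): instead of listing points, a shape is stored as a function that answers, for any location in space, whether that location is inside the object. The function's parameters then play the role of the compact representation an encoder would produce:

```python
import math

def make_sphere(cx, cy, cz, r):
    """Return an implicit occupancy function for a sphere.
    The tuple (cx, cy, cz, r) plays the role of the 'parameters'
    an encoder would produce -- the whole shape is just 4 numbers."""
    def inside(x, y, z):
        return math.dist((x, y, z), (cx, cy, cz)) <= r
    return inside

sphere = make_sphere(0.0, 0.0, 0.0, 1.0)
print(sphere(0.0, 0.0, 0.0))  # True  (the center is inside)
print(sphere(2.0, 0.0, 0.0))  # False (outside the radius)
```

Shap·E's representations are far richer than four numbers, of course, but the principle is the same: a diffusion model that generates function parameters is generating whole shapes.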

The result is an AI that can handle more realistic textures and lighting effects.

This is because Shap·E can directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields (NeRFs).

This means Shap·E can produce high-quality 3D assets with fine-grained textures and complex shapes, unlike previous models that only output point clouds or voxels.

What's more, it can handle a wide range of textual prompts, from simple descriptions to complex queries.

In addition, Shap·E is also faster than Point·E.


OpenAI has showcased several of Shap·E's results: 3D renderings based on text prompts like a bowl of food, a penguin, a voxelized dog, a campfire, and an avocado-shaped chair, among others.

Shap·E creates its results in just a matter of seconds.

However, it's worth noting that despite OpenAI's claim that Shap·E is better than Point·E, and even though the company has released model weights, inference code, and samples, the results are still far from the real deal.

Take, for example, the prompt "an airplane that looks like a banana": the result conveys the idea, but only roughly.

But of course, Shap·E is still in its early stages, meaning it has a long way to go before it becomes a proper text-to-3D-model generator that is useful without human intervention.

OpenAI has made Shap·E an open-source project on GitHub.

Published: 
19/05/2023