Breaking Ground in 3D Modeling: Unveiling 3D-GPT

Dec. 20, 2023. 3 min. read.

Researchers from the Australian National University, the University of Oxford, and the Beijing Academy of Artificial Intelligence have jointly developed 3D-GPT, a framework for instruction-driven 3D modeling.

The framework leverages large language models (LLMs) to dissect procedural 3D modeling tasks into manageable segments and appoints the appropriate agent for each task.

The paper begins by highlighting the growing use of generative AI systems in fields such as medicine, news, politics, and social interaction, where they create content across many formats. As these technologies are integrated into more applications, concerns about public safety arise. Evaluating the potential risks of generative AI systems is therefore becoming a priority for AI developers, policymakers, regulators, and civil society.

To address this issue, the researchers introduce 3D-GPT, which treats LLMs as problem solvers: the model breaks a procedural 3D modeling task into smaller subtasks and assigns each one to the agent best suited to it.

The 3D-GPT framework integrates three core agents: the task dispatch agent, the conceptualization agent, and the modeling agent. They work together toward two main objectives. First, they enrich the initial scene description into a detailed form and adapt that text dynamically as further instructions arrive. Second, they drive procedural generation by extracting parameter values from the enriched text, so that 3D software can create the assets.
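The flow between the three agents can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions: the names SceneState and run_instruction, and the three stub agents, are hypothetical and do not come from the paper. Each agent is modelled as a plain function so the example runs without an LLM or 3D software.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SceneState:
    """Accumulated scene description plus the parameters chosen so far."""
    description: str = ""
    parameters: Dict[str, dict] = field(default_factory=dict)


def run_instruction(
    instruction: str,
    state: SceneState,
    dispatch: Callable[[str], List[str]],
    conceptualize: Callable[[str, List[str]], str],
    model: Callable[[str, List[str]], Dict[str, dict]],
) -> SceneState:
    """Pass one instruction through the three agents in order."""
    functions = dispatch(instruction)                 # which functions are needed
    enriched = conceptualize(instruction, functions)  # detailed appearance text
    params = model(enriched, functions)               # parameter values per function
    state.description = f"{state.description} {enriched}".strip()
    state.parameters.update(params)
    return state


# Stub agents standing in for LLM calls (assumed behaviour, illustration only).
def stub_dispatch(instruction: str) -> List[str]:
    return ["add_snow_layer"] if "winter" in instruction else []


def stub_conceptualize(instruction: str, functions: List[str]) -> str:
    return "A thin, even layer of fresh snow covers the ground."


def stub_model(enriched: str, functions: List[str]) -> Dict[str, dict]:
    return {name: {"thickness": 0.2} for name in functions}


state = run_instruction(
    "translate the scene into a winter setting",
    SceneState(description="A forest clearing."),
    stub_dispatch,
    stub_conceptualize,
    stub_model,
)
print(state.description)
print(state.parameters)
```

Keeping the scene state separate from the agents is one way to support follow-up instructions: each new instruction updates the same description and parameter set instead of starting over.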

The task dispatch agent identifies the functions required for each instruction. For instance, given an instruction such as “translate the scene into a winter setting”, it pinpoints functions like add_snow_layer() and update_trees(). This selection step lets the conceptualization and modeling agents coordinate efficiently, because each works only on the functions that matter for the instruction. From a safety perspective, the task dispatch agent ensures that only appropriate and safe functions are selected for execution, thereby mitigating potential risks associated with the deployment of generative AI systems.
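A minimal sketch of the dispatch step follows. It assumes a small function library with one-line descriptions and an LLM that replies with a list of function names; the library contents, the prompt wording, and the helper names build_dispatch_prompt and parse_dispatch_reply are hypothetical. The LLM call itself is left out, so the example only shows how a prompt might be built and how a reply might be filtered against the known library.

```python
import re
from typing import Dict, List

# Hypothetical function library: name -> short description shown to the LLM.
FUNCTION_LIBRARY: Dict[str, str] = {
    "add_snow_layer": "Cover exposed surfaces with a layer of snow.",
    "update_trees": "Change tree appearance, such as foliage and season.",
    "add_clouds": "Add clouds to the sky.",
}


def build_dispatch_prompt(instruction: str, library: Dict[str, str]) -> str:
    """Ask the model to pick function names from the library."""
    catalog = "\n".join(f"- {name}: {doc}" for name, doc in library.items())
    return (
        "Select the functions needed to carry out the instruction.\n"
        f"Available functions:\n{catalog}\n"
        f"Instruction: {instruction}\n"
        "Answer with a comma-separated list of function names."
    )


def parse_dispatch_reply(reply: str, library: Dict[str, str]) -> List[str]:
    """Keep only names that exist in the library, in order, without repeats."""
    selected: List[str] = []
    for name in re.findall(r"[A-Za-z_]\w*", reply):
        if name in library and name not in selected:
            selected.append(name)
    return selected


prompt = build_dispatch_prompt(
    "translate the scene into a winter setting", FUNCTION_LIBRARY
)
# A made-up reply, including one name that is not in the library.
reply = "add_snow_layer, update_trees(), delete_everything"
print(parse_dispatch_reply(reply, FUNCTION_LIBRARY))
```

Filtering the reply against the library means a name the model invents is never passed on to later stages.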

The conceptualization agent expands the user-provided text into detailed appearance descriptions. After the task dispatch agent selects the required functions, the user's input text and the corresponding function-specific information are sent to the conceptualization agent, which returns the augmented text. In terms of safety, the conceptualization agent helps ensure that the enriched descriptions accurately represent the user’s instructions, thereby preventing potential misinterpretations or misuse of the 3D modeling functions.
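One way to picture the request sent to the conceptualization agent is as a prompt that combines the user text with notes on the selected functions. The sketch below assumes this; the function notes, the prompt wording, and the helper name build_conceptualization_prompt are hypothetical, and no LLM is called.

```python
from typing import Dict


def build_conceptualization_prompt(user_text: str, function_notes: Dict[str, str]) -> str:
    """Combine the user text with notes on each selected function."""
    notes = "\n".join(f"- {name}: {note}" for name, note in function_notes.items())
    return (
        "Expand the scene description with concrete appearance details.\n"
        f"User text: {user_text}\n"
        "Selected functions and what they control:\n"
        f"{notes}\n"
        "For each function, describe the visual details it will need."
    )


# Hypothetical notes for the two functions named in the winter example.
selected_notes = {
    "add_snow_layer": "controls how much snow covers surfaces",
    "update_trees": "controls foliage and seasonal appearance of trees",
}
concept_prompt = build_conceptualization_prompt(
    "translate the scene into a winter setting", selected_notes
)
print(concept_prompt)
```

Restricting the notes to the functions already selected keeps the enrichment focused on details that later stages can actually use.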

The modeling agent infers the parameters for each selected function and generates a Python script that calls Blender’s API for 3D content creation and rendering. Regarding safety, the modeling agent ensures that the inferred parameters and the generated script are safe and appropriate for the selected functions. This helps avoid problems that could arise from incorrect parameter values or inappropriate function calls.
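The script-generation step could look roughly like the sketch below. It assumes the LLM returns its inferred parameters as JSON, and that each function has a known set of parameters with allowed ranges; the schema, the parameter names, and the helpers validate_params and render_script are hypothetical. The generated script imports bpy, Blender's Python module, but the procedural functions it calls are placeholders that a real setup would have to provide, so the sketch only produces and syntax-checks the script text and does not run it.

```python
import ast
import json
from typing import Dict, Tuple

# Hypothetical schema: function -> parameter -> (minimum, maximum).
PARAM_SCHEMA: Dict[str, Dict[str, Tuple[float, float]]] = {
    "add_snow_layer": {"thickness": (0.0, 1.0)},
    "update_trees": {"leaf_density": (0.0, 1.0)},
}


def validate_params(function_name: str, params: Dict[str, float]) -> Dict[str, float]:
    """Reject unknown functions or parameters and clamp values into range."""
    if function_name not in PARAM_SCHEMA:
        raise ValueError(f"unknown function: {function_name}")
    schema = PARAM_SCHEMA[function_name]
    clean: Dict[str, float] = {}
    for name, value in params.items():
        if name not in schema:
            raise ValueError(f"unknown parameter: {function_name}.{name}")
        low, high = schema[name]
        clean[name] = min(max(float(value), low), high)
    return clean


def render_script(llm_reply_json: str) -> str:
    """Turn a JSON mapping of function -> parameters into Python source text."""
    calls = json.loads(llm_reply_json)
    lines = ["import bpy  # only importable inside Blender's Python"]
    for function_name, params in calls.items():
        clean = validate_params(function_name, params)
        args = ", ".join(f"{key}={value!r}" for key, value in clean.items())
        lines.append(f"{function_name}({args})")
    script = "\n".join(lines) + "\n"
    ast.parse(script)  # syntax check only; the script is not executed here
    return script


# A made-up LLM reply with one out-of-range value.
llm_reply = '{"add_snow_layer": {"thickness": 1.7}, "update_trees": {"leaf_density": 0.1}}'
script = render_script(llm_reply)
print(script)
```

Validating against a schema before rendering is one way to catch the incorrect parameter values and inappropriate function calls mentioned above before anything reaches Blender.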

The researchers ran several experiments showing that 3D-GPT consistently generates results that align with user instructions. They also ran an ablation study to examine the contribution of each agent within the multi-agent system.

Despite its promising results, the framework has several limitations. These include limited curve control and shading design, dependence on procedural generation algorithms, and challenges in processing multi-modal instructions. Future research directions include LLM 3D fine-tuning, autonomous rule discovery, and multi-modal instruction processing.

In summary, the research paper introduces a novel framework that holds promise in enhancing human-AI communication in the context of 3D design and delivering high-quality results.

About the writer

Wendwossen Dufera

Wendwossen is a young tech enthusiast with a vision for AI and blockchain to drive growth in less developed nations, ensuring global inclusivity for the impending singularity. He is committed to bridging technological gaps and fostering equal opportunities worldwide.

3 Comments

Lean7d (3 months ago)
Amazing

Tibebe S. (7 months ago)
Wow, this is a game-changer. I can't wait until this becomes mainstream.

Samuel Birhanu (7 months ago)
Thanks for the informative article!
