New AI tool makes fast, high-quality images

2025-03-21
Researchers create a hybrid method to quickly generate realistic images, helping train self-driving cars and robots.
Credit: Tesfu Assefa

Self-driving cars learn to avoid dangers on the road by training on realistic images. The more real these images look, the better the training, so generating high-quality images quickly is key to making the cars safer. Artificial intelligence (AI) helps create these images.

One type of AI, called a diffusion model, makes very realistic images. It starts with random noise (messy, unclear pixels) and cleans it up step by step. This process takes time and uses a lot of computing power. Another type, called an autoregressive model, is faster. It predicts parts of an image one by one, like guessing puzzle pieces. The same approach powers text tools like ChatGPT, but when it is used for images the results are often blurry or contain errors.
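The two generation styles described above can be sketched as toy loops. This is only an illustration, not how real models work inside: `toy_diffusion` and `toy_autoregressive` are made-up names, the "denoiser" cheats by knowing the target image, and the next-piece rule is an arbitrary stand-in for a trained predictor.

```python
import random


def toy_diffusion(target, steps=30, seed=0):
    """Start from random noise and clean up the whole image a little at every step."""
    rng = random.Random(seed)
    image = [rng.uniform(0.0, 1.0) for _ in target]
    for step in range(steps):
        remaining = steps - step
        # A real model predicts what noise to remove. This toy simply moves
        # every pixel part of the way toward a known target (an assumption).
        image = [p + (t - p) / remaining for p, t in zip(image, target)]
    return image


def toy_autoregressive(length, levels=4):
    """Build the output one piece at a time; each piece depends on the earlier ones."""
    tokens = []
    for _ in range(length):
        # Hypothetical stand-in for a learned next-piece predictor.
        next_token = (sum(tokens) + 1) % levels
        tokens.append(next_token)
    return tokens


print(toy_diffusion([0.2, 0.8]))  # every step touches every pixel
print(toy_autoregressive(5))      # one pass, one piece per prediction
```

The diffusion loop revisits the full image at every step, which is why it is slow. The autoregressive loop makes a single pass, but each piece is limited to a small set of values, which is one reason fine detail can be lost.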

Researchers from MIT and NVIDIA built a new tool called HART, which stands for hybrid autoregressive transformer. It mixes the speed of autoregressive models with the detail of diffusion models. First, it uses an autoregressive model to quickly sketch the big parts of an image. Then, a small diffusion model fixes the tiny details. This teamwork makes HART fast and good at creating clear images. It works about nine times faster than top diffusion models and needs less computing power. It could run on a laptop or phone, and a user only needs to type a simple sentence describing the image.

The best of both worlds

The researchers describe HART in a preprint posted on arXiv. HART might help train robots to do tricky tasks or help game designers make cool scenes. Diffusion models usually take 30 or more steps to finish an image, but HART’s diffusion part only needs eight steps. This is because it just adds details after the fast model does most of the work. The researchers say it’s like painting a picture: start big, then add small touches.
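The "start big, then add small touches" idea can be sketched with a toy example. This is not HART's code: `coarse_pass`, `refine`, and `max_error` are hypothetical names, the image is just a short list of numbers, and the refinement step cheats by knowing the target, where a real diffusion model would have to learn the missing detail. The default of eight steps mirrors the figure quoted above.

```python
def coarse_pass(target, levels=4):
    """Stand-in for the fast first stage: snap each value to a few allowed levels."""
    return [round(t * (levels - 1)) / (levels - 1) for t in target]


def refine(coarse, target, steps=8):
    """Stand-in for the small second stage: close the leftover gap in a few steps."""
    image = list(coarse)
    for step in range(steps):
        remaining = steps - step
        # Assumption: the toy knows the target; a real model predicts the detail.
        image = [p + (t - p) / remaining for p, t in zip(image, target)]
    return image


def max_error(image, target):
    """Largest per-pixel difference between an image and the target."""
    return max(abs(p - t) for p, t in zip(image, target))


target = [0.10, 0.42, 0.77, 0.95]
coarse = coarse_pass(target)    # big shapes, quickly
final = refine(coarse, target)  # small touches, only a few steps

print("error after coarse pass:", max_error(coarse, target))
print("error after refinement:", max_error(final, target))
```

Because the second stage starts from a rough image instead of pure noise, it has only a small gap to close, which is why a handful of steps can be enough.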

The team faced challenges combining the two models. In early attempts, mistakes piled up. They fixed this by using the diffusion model only at the end, as a final step that fills in the details the first model missed. With this design, HART matches or beats bigger models while using less computing power. In the future, it could work with vision-language tools or even make videos and sounds.
