
TinyZero emulates DeepSeek for $30 on a specific task

Jan. 31, 2025.

AI researchers claim that they reproduced the core abilities of DeepSeek R1-Zero with the open-source model TinyZero for just $30.

About the Writer

Giulio Prisco

Giulio Prisco is Senior Editor at Mindplex. He is a science and technology writer mainly interested in fundamental science and space, cybernetics and AI, IT, VR, bio/nano, crypto technologies.

Artificial intelligence (AI) researchers at UC Berkeley claim that they reproduced the core abilities of DeepSeek R1-Zero for just $30, Tom’s Hardware reports. If the claim holds up, it suggests that reasoning behavior can be trained into small models on a very modest budget, at least for narrow tasks.

Research leader Jiayi Pan posted an X thread about this. “You can experience the Ahah moment yourself for < $30,” he said. He wrote on his own website: “We release TinyZero, the first open reproduction of reasoning models. Through RL, the 3B base LM develops self-verification and search abilities all on its own.”

The researchers used reinforcement learning to teach the model to verify its answers and search for better ones. They started with a base language model, a prompt, and a reward system. They tested the approach on the Countdown game, in which players combine a given set of numbers using basic arithmetic to reach a target number.
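To make the reward idea concrete, here is a minimal Python sketch of a rule-based reward for Countdown. It is an illustration only, not TinyZero's actual code: the function name `countdown_reward`, the 0/1 scoring, and the rule that each given number must be used exactly once are all assumptions made for this example.

```python
import ast
import operator
from collections import Counter

# Binary operators allowed in a Countdown answer (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed arithmetic expression, recording every number it uses."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 for a valid, correct answer, otherwise 0.0."""
    used = []
    try:
        tree = ast.parse(expression, mode="eval")
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0  # malformed or illegal answers earn nothing
    if Counter(used) != Counter(numbers):
        return 0.0  # assumed rule: use each given number exactly once
    return 1.0 if abs(value - target) < 1e-9 else 0.0
```

For example, `countdown_reward("(25 - 5) * 3", [3, 5, 25], 60)` returns 1.0, while `countdown_reward("25 * 3", [3, 5, 25], 75)` returns 0.0 because the 5 is never used. A reward like this needs no human labels, since the game itself defines what counts as correct.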

The model began with wrong guesses but learned to revise and search for the right answer. For example, it would suggest an answer, check if it was correct, and adjust until it found the solution.
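The propose, check, and revise pattern described above can be mimicked by an ordinary program. The Python sketch below is a plain exhaustive search, not a description of how the trained model works internally; the function name `search_countdown` and the restriction to left-to-right expressions are assumptions chosen to keep the example short.

```python
from itertools import permutations, product
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def search_countdown(numbers, target):
    """Try candidate expressions one by one.

    Returns (expression, attempts) on success or (None, attempts) on failure.
    Only left-to-right groupings are tried, so some solvable puzzles are missed.
    """
    attempts = 0
    for order in permutations(numbers):
        for symbols in product(OPS, repeat=len(numbers) - 1):
            attempts += 1
            # Propose: build one candidate expression from this ordering.
            value = order[0]
            expr = str(order[0])
            try:
                for sym, n in zip(symbols, order[1:]):
                    value = OPS[sym](value, n)
                    expr = f"({expr} {sym} {n})"
            except ZeroDivisionError:
                continue  # illegal candidate, move on to the next one
            # Check: accept the candidate only if it hits the target.
            if abs(value - target) < 1e-9:
                return expr, attempts
            # Revise: otherwise fall through and try the next candidate.
    return None, attempts
```

Calling `search_countdown([3, 5, 25], 60)` returns an expression that evaluates to 60 along with the number of candidates tried. The point of the comparison is that the model was not given such a loop; according to the researchers, similar behavior emerged from reinforcement learning.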

Impressive cost reduction for a specific task

The researchers tested different model sizes, starting with one with 500 million parameters and scaling up to 7 billion.

“We run Qwen-2.5-Base 0.5B, 1.5B, 3B to 7B. 0.5B guess a solution and stop,” said Pan. “From 1.5B, the model start learning to search, to self-verify and to revise its solutions, enabling them to achieve much higher scores.” Qwen is a family of AI models developed by Alibaba, which the company has recently updated.

The code for the model, called TinyZero, is available on GitHub.

Impressively, the experiment cost them only $30. For comparison, OpenAI’s o1 API costs $15 per million input tokens, and DeepSeek-R1 costs $0.55 per million input tokens. Pan’s project makes AI research more accessible due to its low cost.
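As a rough sense of scale, the quoted prices can be compared with simple arithmetic. The Python snippet below uses only the figures given in this article; the variable and function names are made up for the example, and the comparison is indicative only, since a one-off training cost and a per-token usage price measure different things.

```python
# Prices in US dollars per million input tokens, as quoted in the article.
OPENAI_PRICE = 15.00
DEEPSEEK_R1_PRICE = 0.55
BUDGET = 30.00  # reported cost of the TinyZero experiment


def million_tokens_for(budget, price_per_million):
    """How many million input tokens a budget buys at a given price."""
    return budget / price_per_million


price_ratio = OPENAI_PRICE / DEEPSEEK_R1_PRICE
print(f"Price ratio: {price_ratio:.1f}x")  # prints "Price ratio: 27.3x"
print(f"$30 at $15.00/M: {million_tokens_for(BUDGET, OPENAI_PRICE):.1f}M tokens")  # 2.0M
print(f"$30 at $0.55/M: {million_tokens_for(BUDGET, DEEPSEEK_R1_PRICE):.1f}M tokens")  # 54.5M
```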

“We hope this project helps to demystify the emerging RL scaling research and make it more accessible!” concluded Pan. “One caveat, of course, is that it’s validated only in the Countdown task but not the general reasoning domain.”
