Cheap open AI model matches OpenAI and DeepSeek in some tests

2025-02-07
AI researchers started with a free Qwen model and used distillation to make a new AI model for less than $50 in cloud computing costs.

AI researchers from Stanford and the University of Washington have made a new artificial intelligence (AI) model called s1 for less than $50 in cloud computing costs, TechCrunch reports. One of the researchers told TechCrunch that the necessary compute could be rented today for about $20.

This follows in the wake of related developments such as TinyZero and Sky-T1.

The s1 model matches big names like OpenAI's o1 and DeepSeek's R1 in math and coding tests. The researchers describe s1 in a paper titled "s1: Simple test-time scaling," published on arXiv. Test-time scaling means giving a model extra compute while it is answering, after training is finished, rather than spending more compute on training itself. The researchers have also shared s1 on GitHub, including all the training data and code.

The researchers started with a basic model and used distillation to teach it reasoning. Distillation means transferring reasoning skills from an existing AI by training a new model on its answers. Here, the teacher was Google's Gemini 2.0 Flash Thinking Experimental.

The researchers focused on making s1 good at reasoning and giving it extra time to think before answering.

They trained s1 with supervised fine-tuning (SFT), in which a model is trained on curated example answers to imitate. This method is much cheaper than the large-scale reinforcement learning DeepSeek used for R1. Google lets people use Gemini 2.0 for free, but with limits, and its terms don't allow using it to build competing AI services.
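The distillation-plus-SFT recipe can be sketched in a few lines. This is a toy illustration, not the researchers' actual pipeline (which is on GitHub): the `teacher_answer` function and the dataset shape are assumptions standing in for real API calls to a teacher model like Gemini 2.0 Flash Thinking.

```python
def teacher_answer(question: str) -> dict:
    """Stand-in for querying a teacher model; a real pipeline would
    call the teacher's API and record its reasoning trace here."""
    return {
        "question": question,
        "reasoning": f"Step-by-step reasoning for: {question}",
        "answer": "(teacher's final answer)",
    }

def build_sft_dataset(questions: list[str]) -> list[dict]:
    """Collect (question, reasoning trace, answer) triples from the
    teacher; these become the supervised fine-tuning examples."""
    return [teacher_answer(q) for q in questions]

# s1 used only ~1,000 curated questions; two here for illustration.
dataset = build_sft_dataset(["What is 6 * 7?", "Is 97 prime?"])

# A student model would then be fine-tuned to reproduce the teacher's
# reasoning and answer for each question, e.g. with a standard SFT trainer.
for example in dataset:
    print(example["question"])
```

The key point is that the expensive part (producing good reasoning traces) is outsourced to the teacher, so the student's training stays small and cheap.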

The s1 model started from a free model by Qwen, an Alibaba-owned lab. The researchers used only 1,000 questions for training, which took less than 30 minutes on high-powered GPUs. They improved s1's accuracy by appending the word "wait" whenever the model tried to stop reasoning, prompting it to keep thinking and check its work before answering.
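The "wait" trick can be sketched as follows. This is a toy simulation under stated assumptions: `generate` is a placeholder for one decoding pass of a reasoning model, and `[END_THINKING]` is a made-up stop marker standing in for the model's real end-of-thinking token.

```python
def generate(prompt: str) -> str:
    """Stand-in for one decoding pass; a real implementation would
    sample tokens until the model emits its end-of-thinking marker."""
    return prompt + " ...some reasoning... [END_THINKING]"

def generate_with_wait(prompt: str, min_rounds: int = 3) -> str:
    """When the model tries to stop, replace the stop marker with
    'Wait,' and decode again, forcing extra rounds of reasoning."""
    text = generate(prompt)
    for _ in range(min_rounds - 1):
        # Suppress the stop marker and nudge the model to reconsider.
        text = text.replace("[END_THINKING]", "Wait,")
        text = generate(text)
    return text

out = generate_with_wait("Q: Is 91 prime?", min_rounds=3)
print(out.count("Wait,"))  # prints 2: the model was forced to continue twice
```

The effect is to spend more compute at answer time on the same trained model, which is what "test-time scaling" refers to.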

But a distilled copy is not a superior model

The s1 project shows that small teams can innovate in AI without huge funds, but it also raises questions about how unique AI models really are. Big AI companies worry about their outputs being harvested for model distillation. However, while distillation is a cost-effective way to copy an existing AI's abilities, it doesn't produce genuinely new, superior models.

Big companies plan to spend billions on AI to make entirely new and superior models.

#AutomatedReasoning

#Deeplearning




Mindplex


© 2025 MindPlex. All rights reserved