Cheap open AI matches OpenAI and DeepSeek in some tests
Feb. 07, 2025.
AI researchers started with a free Qwen model and used distillation to make a new AI model for less than $50 in cloud computing costs.
AI researchers from Stanford and the University of Washington made a new artificial intelligence (AI) model called s1 for less than $50 in cloud computing costs, TechCrunch reports. One of the researchers told TechCrunch that, today, one could rent the necessary compute for about $20.
This follows in the wake of related developments such as TinyZero and Sky-T1.
The s1 model performs comparably to big names like OpenAI’s o1 and DeepSeek’s R1 on math and coding benchmarks. The researchers describe s1 in a paper titled “s1: Simple test-time scaling,” published on arXiv. Test-time scaling means spending extra compute while the model is answering, rather than during training. The researchers have also shared s1 on GitHub, including all of the training data and code.
The researchers started with a basic model and used distillation to teach it reasoning. Distillation means training a new model on another model’s answers so that it inherits that model’s reasoning skills. Here, the teacher was Google’s Gemini 2.0 Flash Thinking Experimental.
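The data-collection side of distillation can be sketched in a few lines: ask the teacher model for a reasoning trace and answer, then package each pair as a fine-tuning example for the student. This is a minimal illustration only; `ask_teacher` is a hypothetical stand-in for a real API call, not the researchers’ actual pipeline.

```python
# Minimal sketch of assembling a distillation dataset: the student model is
# later fine-tuned to reproduce the teacher's reasoning trace and answer.

def ask_teacher(question: str) -> dict:
    """Hypothetical stand-in for querying a teacher model (e.g. Gemini 2.0
    Flash Thinking Experimental). Returns a reasoning trace plus an answer."""
    return {
        "reasoning": f"Step-by-step reasoning about: {question}",
        "answer": "42",
    }

def build_distillation_set(questions: list[str]) -> list[dict]:
    """Turn teacher outputs into (prompt, target) pairs; the student is
    trained via supervised fine-tuning to reproduce the target verbatim."""
    examples = []
    for q in questions:
        out = ask_teacher(q)
        examples.append({
            "prompt": q,
            "target": out["reasoning"] + "\nFinal answer: " + out["answer"],
        })
    return examples

dataset = build_distillation_set(["What is 6 * 7?"])
print(dataset[0]["target"])
```

With only 1,000 such pairs, the fine-tuning step itself stays small, which is what kept the reported compute bill under $50.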
The researchers focused on making s1 good at reasoning and giving it extra time to think before answering.
They trained s1 with supervised fine-tuning (SFT), where you show the model examples to imitate. This method is far cheaper than the large-scale reinforcement learning DeepSeek used for R1. Google lets people use Gemini 2.0 for free, but with limits, and its terms don’t allow using it to build competing AI services.
s1 started from a free model by Qwen, an Alibaba-owned lab. The researchers used only 1,000 questions for training, which took less than 30 minutes on high-powered GPUs. They further improved s1’s accuracy by appending “Wait” when the model tried to stop reasoning, nudging it to think longer before answering.
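The “wait” trick can be sketched as a simple decoding loop: whenever the model tries to emit its end-of-thinking marker before a minimum budget is met, the marker is suppressed and “Wait” is appended instead, so generation continues. Everything here is a toy illustration; `fake_generate` and the `</think>` marker are assumptions standing in for a real model’s decoding step and stop token.

```python
# Toy sketch of budget forcing: suppress the end-of-thinking marker and
# append "Wait" to extend the model's reasoning before it answers.

END_OF_THINKING = "</think>"

def fake_generate(context: str) -> str:
    """Hypothetical stand-in for one model continuation: emits one
    reasoning step, then tries to stop; a 'Wait' in context buys one
    more step, mimicking a model that resumes thinking when nudged."""
    budget = 2 + context.count("Wait")
    steps = context.count("step")
    if steps < budget:
        return f" reasoning step {steps + 1}."
    return END_OF_THINKING

def generate_with_budget(prompt: str, min_waits: int) -> str:
    """Force at least `min_waits` extra rounds of thinking before the
    model is allowed to close its reasoning."""
    context, waits = prompt, 0
    while True:
        chunk = fake_generate(context)
        if chunk == END_OF_THINKING and waits < min_waits:
            context += " Wait,"  # suppress the stop; ask for more thought
            waits += 1
        elif chunk == END_OF_THINKING:
            return context + END_OF_THINKING
        else:
            context += chunk

print(generate_with_budget("Q: 6*7?", min_waits=1))
```

The design point is that nothing about the model changes at inference time; only the decoding loop is modified, which is why this kind of test-time scaling is so cheap to apply.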
But a distilled copy is not a superior model
The s1 project shows that small teams can innovate in AI without huge funds, but it also raises questions about how defensible AI models really are. Big AI companies worry about their models’ outputs being harvested for distillation. However, while distillation is a cost-effective way to copy existing AI abilities, it doesn’t produce genuinely new, superior models.
Big companies plan to spend billions on AI to make entirely new and superior models.
Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.