OpenAI’s new model o3 smashes AGI benchmarks, is one of the 200 best coders in the world
Dec. 23, 2024.
On 20 December, OpenAI announced o3, their latest model, which they claim demonstrates aspects of AGI (Artificial General Intelligence).
ARC is an AGI benchmark developed in 2019 to be a hard test that requires human-like reasoning. OpenAI’s o3 has just achieved a score of 75.7%. For comparison, no model in the first year of the test surpassed five per cent.
o3 also achieves a rating of 2727 on Codeforces, the competitive programming platform. This would make it the 175th-best competitive coder in the entire world.
o3 succeeds OpenAI’s o1 model, released in September. Both models are trained with reinforcement learning to work through a series of logical reasoning steps before answering. This makes them a lot slower and more computationally intensive, but improves their performance on tasks like mathematics and science.
o3 has still not been made available to the public, so these evaluations were conducted behind closed doors. It’s also worth remembering that all tests are limited, and AGI, when it comes, will prove itself in the din of daily life, not in any test.
Another limitation of o3 is its cost: the graph at the top of ARC Prize’s announcement shows that it scores about 3× as high on the test as o1, for a 10× to 1,000× increase in cost.
Still, it has gotten people excited about progress towards AGI again.