OpenAI's new model o3 smashes AGI benchmarks, is one of the 200 best coders in the world

2024-12-23

On 20 December, OpenAI announced o3, their latest model, which they claim demonstrates aspects of AGI (Artificial General Intelligence).

ARC is an AGI benchmark developed in 2019 to be a hard test that requires human-like reasoning. OpenAI's o3 has just achieved a score of 75.7%. For comparison, no model in the first year of the test surpassed five per cent.

o3 achieves a Codeforces rating of 2727, which would make it the 175th-best competitive coder in the world.

o3 succeeds OpenAI's o1 model, released in September. Both models are trained with reinforcement learning to work through a series of logical reasoning steps before answering. This makes them a lot slower and more computationally intensive, but improves their performance on tasks like mathematics and science.

o3 has not yet been made available to the public, so these evaluations were conducted behind closed doors. It's also worth remembering that all tests are limited, and AGI, when it comes, will prove itself in the din of daily life, not in any test.

Another limitation of o3 is its cost: the graph at the top of ARC Prize's announcement shows that it scores about 3× as high on the test as o1, for a 10-1000× increase in cost.

Still, it has gotten people excited about progress towards AGI again.

#o3



© 2025 MindPlex. All rights reserved