New OpenAI models enhance coding and comprehension

2025-04-16
1 min read.
Three new OpenAI models boost coding, instruction following, and long-context understanding for developers building efficient AI systems.
New OpenAI models enhance coding and comprehension
Credit: Tesfu Assefa

OpenAI has released three new AI models, GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. The models are now available through an API. According to OpenAI, the new models surpass older models in coding and understanding instructions.

These models handle up to 1 million tokens, allowing them to process vast amounts of information. They also understand long texts better, with updated knowledge from June 2024.

GPT-4.1 shines in coding, scoring 54.6% on a test called SWE-bench Verified, which checks real-world software skills. This is much higher than older models. It also excels in following instructions, scoring 38.3% on a benchmark named MultiChallenge, which tests multi-step tasks. For long videos or texts, GPT-4.1 scores 72.0% on Video-MME, a test for understanding extended content without subtitles.

Real-world applications and efficiency

Developers helped shape these models for practical use. GPT-4.1 mini outperforms some larger models while cutting response time by nearly half and costs by 83%. GPT-4.1 nano, the fastest and cheapest, handles tasks like sorting or auto-filling text efficiently, scoring 80.1% on MMLU, a knowledge test. These models also power agents, systems that work independently, like analyzing documents or coding software.

Only available via the API, GPT-4.1’s improvements appear in the latest ChatGPT version. An older model, GPT-4.5 Preview, will stop working on July 14, 2025, as GPT-4.1 offers better results at lower costs. OpenAI expects developers to find GPT-4.1 more reliable for coding tasks. It also creates better web apps, with 80% of human testers preferring its designs.

In real-world tests, companies like Windsurf saw 60% better coding results, while Qodo noted improved code reviews. Blue J improved tax research accuracy by 53%, and Thomson Reuters boosted document review by 17%. These models handle large datasets, like financial records, with 50% better accuracy. With faster response times and lower costs, these models could help developers build smarter, more efficient systems.

#AIApplications

#Learning

#NeuralNetworks



Related Articles


Comments on this article

Before posting or replying to a comment, please review it carefully to avoid any errors. Reason: you are not able to edit or delete your comment on Mindplex, because every interaction is tied to our reputation system. Thanks!

Mindplex

Mindplex is an AI company, a decentralized media platform, a global brain experiment, and a community dedicated to the rapidly unfolding future. Our platform empowers our community to share and discuss futurist content while showcasing AI and blockchain tools that enhance the media experience. Join us and shape the future of digital media!

ABOUT US

FAQ

CONTACT

Editors

© 2025 MindPlex. All rights reserved