Content Team

OpenAI Releases New AI Models Capable of Deeper Reasoning

OpenAI's latest models are aiming to achieve human-like reasoning.


OpenAI has recently introduced its latest AI model, o1, previously known under the code name "Strawberry." This release represents a shift in how artificial intelligence processes complex problems, bringing more refined reasoning capabilities to the forefront. Unlike previous models, which relied primarily on pattern recognition, o1 is designed to evaluate its steps before reaching a conclusion, offering more explanation in its responses.


What Makes o1 Different?


o1 is not just a continuation of the existing GPT models. It is the first in a new series with its own naming and numbering system. While models like GPT-4o excel at generating responses based on learned patterns, o1 takes a different approach by considering multiple solutions before arriving at an answer. This allows the model to tackle more intricate tasks, such as advanced math, coding, and scientific problems, with a higher degree of accuracy.


This new reasoning model was built using a unique optimization algorithm, tailored specifically for problem-solving. OpenAI reports that o1 performs much better on complex tasks than its predecessors. For example, on a qualifying exam for the International Mathematics Olympiad, o1 solved 83% of the problems, compared to GPT-4o's 13% success rate.


Practical Applications and Access


o1 is already being used in fields that demand precision and advanced problem-solving, such as healthcare, physics, and software development. Researchers can use it to annotate large datasets, while developers can leverage its multistep reasoning abilities to build complex code workflows. There's even a lighter version, o1-mini, aimed specifically at handling code generation. However, o1 isn’t designed for every task. It lacks some of the broader functionality of earlier models, such as browsing the web or handling files and images.


Currently, o1 is available in a preview mode for ChatGPT Plus and Team users, with plans to roll out to educational and enterprise customers soon. OpenAI is limiting access to 30 messages per week for o1-preview and 50 for o1-mini, reflecting the model’s focus on high-level tasks rather than day-to-day interactions. Despite these limitations, o1 offers a clear advantage for users needing its more advanced capabilities.


The Trade-offs and Costs


While o1 brings substantial benefits in terms of reasoning and problem-solving, it also comes with some drawbacks. For one, it tends to be slower than GPT-4o, as the model takes more time to "think through" each query. Additionally, o1 is significantly more expensive for developers. API access to o1-preview costs $15 per million input tokens and $60 per million output tokens, which is much higher than GPT-4o's pricing. These factors make it less accessible for general use, positioning o1 as a specialized tool for specific, complex problems.
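To make the pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the per-million-token rates quoted above. The names `Pricing`, `O1_PRICING`, and `estimate_cost_usd` are hypothetical and exist only for this illustration, as are the token counts in the example. None of it is part of any OpenAI library, and actual billing may differ.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in US dollars (hypothetical helper type)."""
    input_usd_per_million: float
    output_usd_per_million: float


# Rates quoted in the article for o1 API access.
O1_PRICING = Pricing(input_usd_per_million=15.0, output_usd_per_million=60.0)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      pricing: Pricing = O1_PRICING) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * pricing.input_usd_per_million
             + output_tokens * pricing.output_usd_per_million)
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Assumed example: a 2,000-token prompt that yields 10,000 output tokens.
    cost = estimate_cost_usd(input_tokens=2_000, output_tokens=10_000)
    print(f"Estimated cost: ${cost:.2f}")
```

With these assumed token counts, the output side dominates the bill ($0.60 of the $0.63 total), which is why long answers to hard problems are where the higher price is felt most.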


Safety and Guardrails


One of the key advancements in o1 is its improved adherence to safety guidelines. OpenAI has emphasized that the model was designed to be more resistant to generating harmful content. In testing, o1 demonstrated better performance in sticking to safety protocols compared to earlier models. OpenAI has rated the model as "medium risk" on its internal preparedness scale, indicating that while it introduces more advanced capabilities, these do not come with added risks beyond what current AI models already pose.


Figure: Jailbreak Evaluations (source: OpenAI)

What’s Next for OpenAI?


The release of o1 is part of a broader strategy at OpenAI to develop AI models with enhanced reasoning abilities. OpenAI views reasoning as essential for the future of AI, particularly as the company works toward creating autonomous systems that can handle complex decision-making. However, this isn’t the end of the GPT series. OpenAI has also confirmed that it is working on an even larger model within the GPT series, which will aim to surpass the capabilities of GPT-4o.
