On Tuesday, March 25, 2025, Google unveiled Gemini 2.5, a new family of AI reasoning models that pause to "think" before answering. The family launches with "Gemini 2.5 Pro Experimental," which Google claims is its most advanced AI model to date.
According to a report from TechCrunch, Gemini 2.5 Pro will be available on Google’s developer platform, Google AI Studio, and will be offered to subscribers of the Gemini Advanced AI package. In the future, Google says all new AI models will feature integrated reasoning capabilities.
The Race for Reasoning AI Models
Since OpenAI launched the first AI model with reasoning capabilities in September 2024, tech companies have been racing to match or surpass the abilities of this model. Today, companies like Anthropic, DeepSeek, Google, and xAI have developed reasoning AI models that use additional computational power and time to verify information and think through problems before providing an answer.
Reasoning techniques have also helped AI models achieve new milestones in fields such as mathematics and programming. Many in the tech world believe reasoning models will become a key component of autonomous AI agents, systems capable of performing tasks with minimal human intervention. However, because they spend additional computing power and time on each answer, these models are also more expensive to run.
Gemini 2.5: Google’s Boldest Attempt Yet
Google had previously experimented with reasoning models, releasing a "thinking" version of Gemini in December 2024. However, Gemini 2.5 represents Google's most serious attempt yet to surpass OpenAI's "o" series of models.
Google claims that Gemini 2.5 Pro outperforms previous leading AI models in several areas. The company says Gemini 2.5 is specifically designed to excel at creating visually appealing web applications and agentic coding applications.
Performance and Benchmarking Results
Google states that Gemini 2.5 Pro scored 68.6% on the Aider Polyglot evaluation, which measures code editing, surpassing the top models from OpenAI, Anthropic, and the Chinese AI lab DeepSeek. On another software development benchmark, SWE-bench, the results were mixed: Gemini 2.5 Pro scored 63.8%, outperforming OpenAI's o3-mini and DeepSeek's R1 but falling short of Anthropic's Claude 3.7 Sonnet, which scored 70.3%.
On Humanity's Last Exam, a multimodal test consisting of thousands of crowdsourced questions covering math, humanities, and natural sciences, Google claims that Gemini 2.5 Pro scored 18.8%, performing better than most competing leading models.
Key Features of Gemini 2.5 Pro
Google also announced that Gemini 2.5 Pro ships with a context window of up to one million tokens, allowing the model to process approximately 750,000 words in one go. That is longer than the entire "Lord of the Rings" series. Google says Gemini 2.5 Pro will soon support twice that input length, up to two million tokens.
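To make the token figures concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a flat ratio of 0.75 words per token, which is simply what the article's numbers imply (750,000 words per one million tokens). Real tokenizers vary by language and content, and the function names are hypothetical, not part of any Google API.

```python
# Assumption: ~0.75 words per token, derived only from the article's
# figures (750,000 words / 1,000,000 tokens). Actual ratios vary.
WORDS_PER_TOKEN = 750_000 / 1_000_000


def approx_words(tokens: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def fits_in_context(word_count: int, context_tokens: int) -> bool:
    """Roughly check whether a text of `word_count` words fits in the window."""
    return word_count <= approx_words(context_tokens)


if __name__ == "__main__":
    print(approx_words(1_000_000))  # 750000 (current window)
    print(approx_words(2_000_000))  # 1500000 (announced future window)
```

Under this assumption, doubling the window to two million tokens would raise the rough capacity to about 1.5 million words.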