New DeepSeek-R1 vs OpenAI o3 vs Gemini 2.5 Pro. Which is Better?

The DeepSeek R1 model has received a minor version upgrade; the current version is DeepSeek-R1-0528. Its overall performance is now approaching that of leading models such as o3 and Gemini.

May 30, 2025

You are reading the 98th edition of The Responsible AI Digest by SoRAI (School of Responsible AI). Subscribe today for regular updates!

At the School of Responsible AI (SoRAI), we empower individuals and organizations to become AI-literate through comprehensive, practical, and engaging programs. For individuals, we offer specialized training such as AI Governance certifications (AIGP, RAI) and an immersive AI Literacy Specialization. This specialization teaches AI using a scientific framework structured around four levels of cognitive skills. Our first course is now live and focuses on the foundational cognitive skills of Remembering and Understanding. Want to learn more? Explore all courses: [Link] Write to us for customized enterprise training: [Link]

🔦 Today's Spotlight

The DeepSeek R1 model has received a minor version upgrade; the current version is DeepSeek-R1-0528. Its overall performance is now approaching that of leading models such as o3 and Gemini 2.5 Pro.

On the performance front, DeepSeek-R1-0528's accuracy on the AIME 2025 test rose from 70% in the previous version to 87.5% in the current one. The gain stems from deeper thinking during the reasoning process: on the AIME test set, the previous model used an average of 12K tokens per question, whereas the new version averages 23K tokens per question. Beyond its improved reasoning, this version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.
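
To put those AIME figures in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the four numbers quoted above; the derived quantities (relative gain, error reduction, token ratio) are our own arithmetic, not figures published by DeepSeek.

```python
# Figures quoted above for the AIME 2025 test.
old_accuracy, new_accuracy = 70.0, 87.5   # percent correct
old_tokens, new_tokens = 12_000, 23_000   # average tokens per question

# Absolute gain in percentage points, and the same gain relative to the old score.
point_gain = new_accuracy - old_accuracy
relative_gain = point_gain / old_accuracy * 100

# Share of previously missed questions that the new version now gets right.
error_reduction = (1 - (100 - new_accuracy) / (100 - old_accuracy)) * 100

# How much longer the new version "thinks" per question.
token_ratio = new_tokens / old_tokens

print(f"Accuracy gain: {point_gain:.1f} points ({relative_gain:.0f}% relative)")
print(f"Error rate cut by about {error_reduction:.1f}%")
print(f"Tokens per question: {token_ratio:.2f}x the previous version")
```

In other words, the higher accuracy comes with roughly twice the reasoning tokens per question, which matters for latency and for any per-token billing.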

Architecturally, DeepSeek-R1-0528 employs a large-scale Mixture-of-Experts (MoE) design with 671 billion parameters, of which 37 billion are active per token. It supports 128K-token contexts, and its weights are open-sourced. OpenAI's o3 uses a proprietary dense GPT model emphasizing advanced reasoning and multimodal data integration, though its exact parameter count and training specifics remain undisclosed. Gemini 2.5 Pro is Google's highly scalable multimodal transformer, trained on large multimodal corpora and supporting very long contexts (up to 1 million input tokens).
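
For readers new to Mixture-of-Experts, the sketch below illustrates the general idea behind "37 billion active per token": a router scores the experts for each token and only the top few are run. This is a generic top-k gating toy, not DeepSeek's actual router; the expert count (8), the value of k (2), and the function names are assumptions chosen for illustration. Only the 671B and 37B figures come from the text above.

```python
import math
import random

TOTAL_PARAMS_B = 671   # total parameters, in billions (from the article)
ACTIVE_PARAMS_B = 37   # parameters active per token, in billions (from the article)

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in top)
    return {i: probs[i] / kept for i in top}


# Toy setup: 8 experts, 2 active per token (illustrative sizes, not DeepSeek's).
random.seed(0)
logits = [random.gauss(0, 1) for _ in range(8)]
weights = route_top_k(logits, k=2)

print(f"Active share of parameters per token: {active_fraction:.1%}")
print(f"Experts chosen for this token: {sorted(weights)}")
```

The practical upshot is that per-token compute scales with the active parameters rather than the full 671 billion, although all the weights still have to be stored.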

In terms of use cases, all three excel at conversational AI and advanced code generation. DeepSeek-R1-0528 particularly appeals to developers with its function calling and JSON output features, while OpenAI’s o3 and Google's Gemini support sophisticated conversational and multimodal functionality. Retrieval-Augmented Generation (RAG) capabilities and enterprise integration are strong across all three: Gemini and OpenAI o3 are deeply embedded in enterprise ecosystems via Vertex AI and ChatGPT Enterprise respectively, whereas DeepSeek appeals through cost-effectiveness and open-source flexibility.
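
To make "function calling and JSON output" concrete, here is a minimal sketch of the application side of that loop: the model returns a structured JSON request, and your code parses it and runs the matching function. No real API is called here. The `get_weather` tool, the `dispatch` helper, and the `"name"`/`"arguments"` field names are all hypothetical stand-ins, and each vendor's actual wire format differs.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry of functions the model is allowed to request.
TOOLS = {"get_weather": get_weather}

# Hand-written stand-in for a model's structured reply. The field names
# ("name", "arguments") are an assumption, not a specific vendor's format.
model_output = '{"name": "get_weather", "arguments": {"city": "Mumbai"}}'


def dispatch(raw: str) -> str:
    """Parse a JSON tool call and run the matching registered function."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


print(dispatch(model_output))
```

Reliable JSON output is what makes this pattern workable: if the model's reply is not valid JSON, `json.loads` fails before any tool runs.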

On API pricing, DeepSeek is the most economical option: it is significantly cheaper than OpenAI’s premium-priced o3 and the moderately priced Gemini 2.5 Pro (currently in preview). Independent benchmarks show how competitive these models are, with DeepSeek-R1-0528 and Gemini excelling across diverse academic and reasoning tasks and closely trailing or matching OpenAI o3’s performance in most areas.

Overall, each model provides state-of-the-art capabilities suitable for various applications, from enterprise integrations to complex multilingual and multimodal tasks, marking significant progress in AI technology.

🚀 AI Breakthroughs

Anthropic Expands Claude with Web Search for Free Users

• Anthropic has extended web search to all Claude users on its free plan, adding the real-time information retrieval and interactivity that AI assistants need to stay competitive.

• The move follows the release of the Claude 4 models, noted for strong coding benchmark performance, and aims to strengthen Claude's position in the competitive AI assistant market.

Today's highlights:

🔦 Today's Spotlight

🚀 AI Breakthroughs

Anthropic Expands Claude with Web Search for Free Users

Anthropic Rolls Out New Voice Mode for Claude Chatbot on Mobile Devices

OpenAI Updates Operator AI Agent with Advanced o3 Model for Better Performance

UAE First to Provide Free ChatGPT Plus in Global AI Partnership

SpAItial's $13M Seed Round Fuels Race to Build Interactive 3D AI Worlds

Odyssey Debuts Interactive AI Model Allowing Real-Time Immersion with Streaming Videos

Mistral Releases Agents API to Enhance AI Problem-Solving and Contextual Abilities

Mistral Launches Enterprise Document AI with Industry-Leading OCR for Complex Data Processing

Codestral Embed Debuts: Outperforms Leading Code Embedding Models with Versatile Applications

Capgemini, Mistral AI, and SAP Collaborate to Enhance AI Deployment in Regulated Industries

Microsoft Releases Magentic-UI Open-Source Tool for Automating Complex Web-Based Tasks

Veteran C++ Developer Admits AI Cracked Four-Year-Old Bug Eluding Experts

Google Photos Celebrates 10 Years With New Features and AI-Driven Tools

Amazon Licenses New York Times Content for AI Training and Alexa Integration

Opera Unveils Opera Neon: A First-of-Its-Kind AI Agentic Browser

⚖️ AI Ethics

UK Deploys AI to Fortify Arctic Against Russian Threats, Enhancing Security

Sergey Brin Claims AI Models Respond Better to Threats Than Politeness

Cursor Users Face Frustration as Overactive Fraud Detection Blocks Access and Confuses Developers

Anthropic AI Models Display Risky Behaviors; Blackmail and Deception Cause Concerns

Microsoft's AI Model Aurora Predicts Weather with High Accuracy and Speed

Cityflo Deploys AI-Driven Safety System Across Mumbai, Hyderabad, and Delhi Fleet

Canada Establishes First AI Ministry, Appoints Evan Solomon as Inaugural Minister

German Court Permits Meta to Use Facebook, Instagram Data for AI Training

OpenAI's o3 Model Resists Shutdown, Raising Concerns Over AI Model Compliance

🎓 AI Academia

Evaluating AI Cyber Skills: Crowdsourced Elicitation Boosts Offensive Capabilities Recognition

Updated DeepSeek-R1 Enhances Reasoning in AI Models with Reinforcement Learning Approach

OpenAI Updates Operator with o3 Model, Enhancing Task Performance and Safety

AI Agent Governance Field Guide Highlights Society's Unpreparedness for Autonomous Systems

AI Validation Proposed as Key to Enhance Trust and Safety in Critical Domains

Large Language Models More Prone to Deontological Reasoning in Ethical Dilemmas

Impact of Multilingual Divide on Global AI Safety and Language Equity

NLP for Social Good: Addressing Global Challenges with Responsible Language Technology Deployment
