DeepSeek R1 0528
DeepSeek R1 0528 is DeepSeek's open-source reasoning model, released May 28, 2025. It scores 79.8% Pass@1 on AIME 2024 and 97.3% on MATH-500. Weights ship under the MIT License for commercial use.
import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-r1',
  prompt: 'Why is the sky blue?',
})
What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
DeepSeek R1 0528 generates verbose reasoning traces before final answers. Budget output tokens generously and account for variable response length when estimating costs.
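Because the reasoning trace is billed as output tokens, the visible answer is only part of what you pay for. A minimal sketch of budgeting for this, using a placeholder per-token price (not DeepSeek's actual rate; substitute your provider's published pricing):

```typescript
// Rough output-cost estimator for a reasoning model. The price below is
// a placeholder, not DeepSeek's actual rate.
const PRICE_PER_OUTPUT_TOKEN_USD = 2.19 / 1_000_000 // hypothetical

// Reasoning models emit the chain-of-thought as output tokens, so the
// billable output is the visible answer plus the reasoning trace.
function estimateOutputCost(answerTokens: number, reasoningTokens: number): number {
  return (answerTokens + reasoningTokens) * PRICE_PER_OUTPUT_TOKEN_USD
}

// A 500-token answer preceded by a 4,000-token trace costs 9x the
// answer alone, at any per-token price.
const withTrace = estimateOutputCost(500, 4000)
const answerOnly = estimateOutputCost(500, 0)
console.log((withTrace / answerOnly).toFixed(1)) // 9.0
```

The multiplier is independent of the price you plug in, which is why output-token budgets for reasoning models should be set from observed trace lengths, not answer lengths.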
When to Use DeepSeek R1 0528
Best For
Competitive mathematics:
Formal proof construction and quantitative reasoning where AIME 2024 and MATH-500 benchmark results match your task
Code generation and debugging:
Algorithm design where RL-derived problem-solving patterns produce self-correcting chains before final output
Complex analytical reasoning:
Multi-step reasoning in finance, science, and engineering where showing work and self-verification build trust
Consider Alternatives When
Conversation or summarization:
Extended reasoning traces add unnecessary output token cost for content generation workloads
Hybrid thinking modes:
DeepSeek-V3.1 or later supports both thinking and non-thinking modes through the same endpoint
Strict latency requirements:
Variable response times from long reasoning chains are not acceptable when latency is a hard constraint
Pure creative writing:
Structured reasoning adds no quality benefit for open-ended generation tasks
Conclusion
DeepSeek R1 0528 matches closed-source models on published benchmarks while shipping weights under the MIT License. For math, code, and formal reasoning workloads, it fits teams that need open weights.
FAQ
How was DeepSeek R1 trained?
DeepSeek applied reinforcement learning directly to the base model, bypassing the conventional step of training on human-written reasoning traces. Reasoning patterns like self-verification and reflection emerged from RL exploration rather than curated data.
How does DeepSeek R1 0528 perform on math benchmarks?
79.8% Pass@1 on AIME 2024, on par with OpenAI o1 at release. On MATH-500 it scores 97.3%.
Can DeepSeek R1 0528 be used commercially?
Yes. The MIT License permits commercial use. Many proprietary reasoning models impose stricter restrictions.
What are the context window and architecture?
A context window of 160K tokens. The architecture is Mixture-of-Experts (MoE) with 671B total parameters, activating 37B per forward pass.
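To put the MoE numbers in perspective, only a small fraction of the parameters participate in any single forward pass, which is what keeps per-token compute far below what a dense 671B model would need. A quick back-of-the-envelope check:

```typescript
// MoE routing: each token only touches the activated experts.
const totalParams = 671e9  // total parameters held in memory
const activeParams = 37e9  // parameters activated per forward pass

// Fraction of the network doing work for any one token.
const activeFraction = activeParams / totalParams
console.log((activeFraction * 100).toFixed(1) + '%') // 5.5%
```

So per-token compute resembles a ~37B dense model, while memory requirements are set by the full 671B parameter count.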
How does DeepSeek R1 0528 differ from DeepSeek-V3?
DeepSeek R1 0528 specializes in deep reasoning with extended chain-of-thought. DeepSeek-V3 and later variants are general-purpose models that balance reasoning with faster, lower-cost completions and suit mixed-workload deployments better.
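One way to act on this split is a small routing helper that sends reasoning-heavy tasks to R1 0528 and everything else to a general-purpose variant. The task categories and the `deepseek/deepseek-v3` slug below are illustrative assumptions, not part of the gateway's API:

```typescript
// Hypothetical router: pick a model slug by workload type.
type Task = 'math' | 'code' | 'analysis' | 'chat' | 'summarize'

function pickModel(task: Task): string {
  // Deep multi-step reasoning -> R1 0528; content/chat workloads -> V3.
  const reasoningTasks: Task[] = ['math', 'code', 'analysis']
  return reasoningTasks.includes(task)
    ? 'deepseek/deepseek-r1' // slug used in the snippet above
    : 'deepseek/deepseek-v3' // illustrative slug for a general-purpose model
}

console.log(pickModel('math')) // deepseek/deepseek-r1
console.log(pickModel('chat')) // deepseek/deepseek-v3
```

Routing at the application layer this way keeps reasoning-token costs confined to the tasks that benefit from them.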
Is the reasoning trace visible in responses?
Yes. The chain-of-thought trace appears in the response. This helps with debugging and with applications that display the model's reasoning to end users.
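Some deployments return the trace inline, wrapped in `<think>` tags ahead of the answer, rather than as a separate response field; treat that as an assumption about your provider. Under that assumption, a small parser can separate the trace from the final answer:

```typescript
// Split an inline <think>...</think> trace from the visible answer.
// Assumes at most one think block, at the start of the response.
function splitReasoning(response: string): { reasoning: string; answer: string } {
  const match = response.match(/^<think>([\s\S]*?)<\/think>\s*/)
  if (!match) return { reasoning: '', answer: response.trim() }
  return {
    reasoning: match[1].trim(),
    answer: response.slice(match[0].length).trim(),
  }
}

const raw = '<think>Rayleigh scattering favors short wavelengths.</think>The sky is blue because...'
const { reasoning, answer } = splitReasoning(raw)
console.log(reasoning) // Rayleigh scattering favors short wavelengths.
console.log(answer)    // The sky is blue because...
```

This lets an application log or display the trace separately while showing users only the final answer.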