MiMo V2 Pro

MiMo V2 Pro is the Pro variant in Xiaomi's MiMo v2 family with over 1T total parameters and 42B active, built for math, code, and multi-step reasoning within a context window of 1M tokens. It uses a hybrid attention architecture for long-context processing.

Reasoning, Tool Use

index.ts

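import { streamText } from 'ai'

// Start a streaming request routed through AI Gateway by model ID.
const result = streamText({
  model: 'xiaomi/mimo-v2-pro',
  prompt: 'Why is the sky blue?',
})

// Print the response text as chunks arrive.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}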


Playground

Try out MiMo V2 Pro by Xiaomi. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
xiaomi	1.6s	49 tps	$1.00/M	$3.00/M	$0.2/M	—	03/18/2026

Prices are per million tokens.

Legal: Terms • Privacy
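
As a rough illustration of how the listed rates turn into a per-request cost, the sketch below multiplies token counts by the per-million prices from the provider row. The `Usage` shape and its field names are hypothetical, and it assumes cached prompt tokens are billed at the cache read rate instead of the input rate; check the actual usage fields and billing rules before relying on it.

```typescript
// Rates copied from the provider row, in USD per million tokens.
const RATES_USD_PER_MILLION = {
  input: 1.0,
  output: 3.0,
  cacheRead: 0.2,
} as const

// Hypothetical usage record; field names are illustrative only.
interface Usage {
  inputTokens: number // prompt tokens billed at the input rate
  cachedInputTokens: number // prompt tokens assumed to be billed at the cache read rate
  outputTokens: number // generated tokens
}

// Estimate the cost of one request in USD.
function estimateCostUSD(usage: Usage): number {
  const weighted =
    usage.inputTokens * RATES_USD_PER_MILLION.input +
    usage.cachedInputTokens * RATES_USD_PER_MILLION.cacheRead +
    usage.outputTokens * RATES_USD_PER_MILLION.output
  return weighted / 1_000_000
}

const exampleCost = estimateCostUSD({
  inputTokens: 200_000,
  cachedInputTokens: 50_000,
  outputTokens: 4_000,
})
console.log(exampleCost.toFixed(4)) // prints "0.2220"
```

In this example the 200K uncached input tokens account for most of the cost ($0.20 of roughly $0.22), which is typical of long-context requests with short answers.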

More models by Xiaomi

Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
1.1M	4.2s	101 tps	$0.14/M	$0.28/M	$0.0/M	—	04/22/2026
1.1M	3.4s	93 tps	$0.43/M	$0.87/M	$0.0/M	—	04/22/2026
262K	1.1s	143 tps	$0.10/M	$0.30/M	$0.02/M	—	12/17/2025

About MiMo V2 Pro

MiMo V2 Pro is the Pro variant in Xiaomi's MiMo v2 family, released March 18, 2026. Compared to Flash, Pro trades some throughput for more depth on harder tasks.

Like Flash, MiMo V2 Pro uses a Mixture-of-Experts (MoE) architecture, so each forward pass activates a subset of total parameters and keeps per-token compute in check. The model supports a context window of 1M tokens and uses the same hybrid sliding window attention pattern as the rest of the MiMo v2 line, which cuts KV-cache storage versus full attention.
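
To see why a hybrid of full and sliding window attention layers cuts KV-cache storage, the sketch below compares cache size for a hybrid layout against an all-full-attention layout at the same sequence length. Every number in the configuration (layer counts, window size, head count, head dimension, precision) is a made-up placeholder, not a published MiMo V2 Pro value; only the shape of the calculation is the point.

```typescript
// Hypothetical attention layout; none of these values come from the model card.
interface AttentionConfig {
  fullLayers: number // layers that cache every token
  slidingLayers: number // layers that cache only the most recent window
  windowSize: number // tokens kept by each sliding window layer
  kvHeads: number
  headDim: number
  bytesPerValue: number // 2 for 16-bit values
}

// KV-cache size in bytes: keys and values (factor of 2) for every cached token in every layer.
function kvCacheBytes(cfg: AttentionConfig, seqLen: number): number {
  const bytesPerTokenPerLayer = 2 * cfg.kvHeads * cfg.headDim * cfg.bytesPerValue
  const fullTokens = cfg.fullLayers * seqLen
  const slidingTokens = cfg.slidingLayers * Math.min(seqLen, cfg.windowSize)
  return bytesPerTokenPerLayer * (fullTokens + slidingTokens)
}

const hybridLayout: AttentionConfig = {
  fullLayers: 8,
  slidingLayers: 40,
  windowSize: 4096,
  kvHeads: 8,
  headDim: 128,
  bytesPerValue: 2,
}
// Same model depth, but every layer uses full attention.
const fullLayout: AttentionConfig = { ...hybridLayout, fullLayers: 48, slidingLayers: 0 }

const sequenceLength = 1_000_000
const cacheRatio = kvCacheBytes(hybridLayout, sequenceLength) / kvCacheBytes(fullLayout, sequenceLength)
console.log(cacheRatio.toFixed(3)) // prints "0.170" for these placeholder values
```

With these placeholder values the hybrid layout needs about 17% of the cache that full attention would at 1M tokens, because the sliding window layers stop growing once the sequence exceeds the window.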

MiMo V2 Pro fits jobs where reasoning depth matters more than peak tokens per second. It handles multi-step math, code generation, and analytical work. It is available through the xiaomi provider via AI Gateway.

What To Consider When Choosing a Provider

Configuration: MiMo V2 Pro sits at the Pro end of the MiMo v2 lineup. It costs more per token than Flash, but it fits when accuracy matters more than price. Use AI Gateway's cost tracking and model fallback to route easy work to Flash and harder work to Pro.
Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
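
One way to apply the Pro/Flash split described above is to decide the model order on the client before making the request. The sketch below is a minimal, hypothetical heuristic: the `Task` shape, the length threshold, and the Flash slug `xiaomi/mimo-v2-flash` are assumptions for illustration and should be checked against the model list. It does not use any AI Gateway routing API.

```typescript
const PRO_MODEL = 'xiaomi/mimo-v2-pro'
const FLASH_MODEL = 'xiaomi/mimo-v2-flash' // assumed slug, verify before use

// Hypothetical task description supplied by the caller.
interface Task {
  prompt: string
  needsDeepReasoning: boolean // e.g. multi-step math or multi-file refactors
}

// Arbitrary threshold: treat very long prompts as hard work.
const LONG_PROMPT_CHARS = 20_000

// Returns model IDs in preference order: the first is the primary, the second the fallback.
function modelPreference(task: Task): string[] {
  const isHard = task.needsDeepReasoning || task.prompt.length > LONG_PROMPT_CHARS
  return isHard ? [PRO_MODEL, FLASH_MODEL] : [FLASH_MODEL, PRO_MODEL]
}

const easyTaskOrder = modelPreference({ prompt: 'Summarize this sentence.', needsDeepReasoning: false })
const hardTaskOrder = modelPreference({ prompt: 'Prove the lemma step by step.', needsDeepReasoning: true })
console.log(easyTaskOrder[0], hardTaskOrder[0])
```

The first entry can be passed as the `model` value in a `streamText` call, with the second tried if the first request fails.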

When to Use MiMo V2 Pro

Best For

Multi-step reasoning: Hard problems where accuracy matters more than raw throughput
Code generation: Architecture design, debugging, and multi-file refactors
Math and proofs: Problems that require long logical chains where intermediate reasoning steps matter
Long-context work: A window of 1M tokens fits big documents or repos
Pro-tier MoE: Over 1T total parameters with 42B active per forward pass
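
The sizing figures in this list can be turned into two quick checks: what share of the parameters is active per forward pass, and whether a document is likely to fit in the context window. The sketch below treats 1T as a lower bound on total parameters (the page says "over 1T") and uses a rough 4-characters-per-token heuristic, which is an assumption and varies by tokenizer and language; the reserved output budget is also an arbitrary placeholder.

```typescript
const TOTAL_PARAMS_LOWER_BOUND = 1e12 // "over 1T", so the true total is larger
const ACTIVE_PARAMS = 42e9
const CONTEXT_WINDOW_TOKENS = 1_000_000
const CHARS_PER_TOKEN = 4 // rough heuristic, not the model's tokenizer

// Upper bound on the active share, since the real total exceeds 1T.
const maxActiveFraction = ACTIVE_PARAMS / TOTAL_PARAMS_LOWER_BOUND
console.log((maxActiveFraction * 100).toFixed(1) + '%') // prints "4.2%"

// Estimate whether a text plus a reserved output budget fits in the window.
function fitsInContext(text: string, reservedOutputTokens: number = 8_000): boolean {
  const estimatedTokens = Math.ceil(text.length / CHARS_PER_TOKEN)
  return estimatedTokens + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS
}

console.log(fitsInContext('a'.repeat(400_000))) // prints true (about 100K estimated tokens)
```

For real workloads, count tokens with the provider's tokenizer or read the usage figures from a response; the character heuristic is only a first filter.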

Consider Alternatives When

Speed-first simple tasks: The Flash variant fits better when cost and throughput drive the choice
Multimodal input required: MiMo V2 Pro is text-in, text-out only
Simple classification or extraction jobs: A smaller, cheaper model handles these at lower cost

Conclusion

MiMo V2 Pro is the Pro pick in Xiaomi's MiMo v2 lineup. Use it for multi-step math, code generation, and analytical work. Pair it with Flash through AI Gateway routing to balance cost and quality.

Frequently Asked Questions

How does MiMo V2 Pro differ from MiMo v2 Flash?
It's the Pro tier. MiMo V2 Pro targets harder reasoning, math, and code than Flash, at a higher per-token cost and somewhat lower throughput.

What architecture does MiMo V2 Pro use?
A Mixture-of-Experts (MoE) setup: each forward pass activates a subset of parameters, which keeps inference cost manageable while the full parameter count holds broader knowledge.

What's the context window for MiMo V2 Pro?
1M tokens. Hybrid sliding window attention reduces KV-cache use so long-context runs stay practical.

How do I authenticate requests to MiMo V2 Pro through AI Gateway?
Authenticate to AI Gateway with an API key or OIDC token; you do not need to manage provider credentials directly. Use xiaomi/mimo-v2-pro as the model ID in API calls. AI Gateway handles routing and retries through the xiaomi provider.

What does MiMo V2 Pro cost?
See the pricing section on this page for today's rates. AI Gateway exposes each provider's pricing for MiMo V2 Pro.

Can I route between MiMo V2 Pro and the Flash variant automatically?
Yes. AI Gateway supports fallback and routing. You can send hard requests to MiMo V2 Pro and route simpler tasks to Flash to control cost.

What tasks is MiMo V2 Pro best suited for?
Multi-step reasoning, code generation, math, and long-context analysis. For short or simple jobs, Flash is usually cheaper.

Is MiMo V2 Pro available under an open-source license?
Yes. The MiMo v2 line is under the MIT license, which allows commercial use, modification, and redistribution.
