
Gemini 3 Pro Preview

google/gemini-3-pro-preview

Gemini 3 Pro Preview is the flagship reasoning model in the Gemini 3 generation for demanding agentic and analytical tasks, with improvements in multi-step function calling, complex image reasoning, long-document analysis, and instruction following over Gemini 2.5 Pro.

File Input · Tool Use · Reasoning · Vision (Image) · Web Search · Tiered Cost · Implicit Caching
index.ts

  import { streamText } from 'ai'

  const result = streamText({
    model: 'google/gemini-3-pro-preview',
    prompt: 'Why is the sky blue?',
  })

  for await (const text of result.textStream) {
    process.stdout.write(text)
  }

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
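For example, a direct request needs only the gateway credential. This sketch assumes the gateway exposes an OpenAI-compatible endpoint at ai-gateway.vercel.sh/v1 and that the key lives in an AI_GATEWAY_API_KEY environment variable; confirm both against the AI Gateway docs:

```shell
# Call the model through the gateway directly. No Google credentials are
# involved; the gateway key alone authenticates the request.
curl https://ai-gateway.vercel.sh/v1/chat/completions \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}]
  }'
```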

Gemini 3 Pro Preview is a reasoning model: enable includeThoughts via providerOptions.google.thinkingConfig to surface the model's reasoning trace, which is particularly useful when auditing complex multi-step outputs.
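Concretely, that might look like the sketch below. The thinkingConfig shape follows the @ai-sdk/google provider options, and the stream-part type names ('reasoning-delta', 'text-delta') differ between AI SDK major versions, so verify both against your SDK's documentation:

```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-3-pro-preview',
  prompt: 'Plan a zero-downtime database migration and justify each step.',
  providerOptions: {
    // Assumed provider-option shape; check the @ai-sdk/google docs.
    google: { thinkingConfig: { includeThoughts: true } },
  },
})

// Reasoning arrives as separate parts on fullStream. Part type names
// ('reasoning' vs 'reasoning-delta') depend on your AI SDK version.
for await (const part of result.fullStream) {
  if (part.type === 'reasoning-delta') {
    console.error('[thinking]', part.text) // reasoning trace, kept off stdout
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.text) // the final answer
  }
}
```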

When to Use Gemini 3 Pro Preview

Best For

  • Multi-step agentic workflows:

    Sequential function calls that require reliable planning and execution

  • Deep document analysis:

    Combining long text with embedded charts, diagrams, and images

  • Instruction-following tasks:

    Precision and completeness are critical to downstream correctness

  • Reasoning-intensive applications:

    Surfacing the model's thought process aids auditability

  • Complex technical research:

    Tasks requiring synthesis across disparate sources and formats

Consider Alternatives When

  • Latency and cost are primary:

    Per-token cost and response speed dominate (consider google/gemini-3-flash for pro-grade quality at flash speed)

  • Updated agentic quality needed:

    For the latest improvements on software engineering tasks (consider google/gemini-3.1-pro-preview)

  • Native image generation output:

    Your workflow requires image output (consider google/gemini-3-pro-image)

  • High-volume straightforward tasks:

    Extraction or translation at scale (consider google/gemini-3.1-flash-lite-preview)

Conclusion

Gemini 3 Pro Preview targets tasks where getting every step right matters more than getting the answer quickly. That means complex agentic pipelines, technical document analysis, and multimodal reasoning that spans images and long text. For teams building the highest-stakes AI features, this is the Gemini 3 model designed for reasoning depth rather than maximum throughput.

FAQ

What does Gemini 3 Pro Preview improve over Gemini 2.5 Pro?

Four specific improvements: multi-step function calling, planning, reasoning over complex images and long documents, and instruction following. These directly address the reliability gaps that affect agentic workflows at scale.

How do I surface the model's reasoning trace?

Set includeThoughts to true under providerOptions.google.thinkingConfig in the AI SDK. With streamText, the model emits reasoning tokens alongside the generated response.

Is Gemini 3 Pro Preview a good fit for latency-sensitive applications?

It can be, but it is a reasoning model with higher latency than the Flash tier. For interactive applications where sub-second responses are required, google/gemini-3-flash provides pro-grade reasoning at significantly lower latency.

Can it analyze documents that mix long text with charts and images?

Yes. The model handles long documents with embedded charts, diagrams, and images. Improved reasoning over complex images and long documents is one of its headline capabilities over Gemini 2.5 Pro.

How does Gemini 3 Pro Preview differ from Gemini 3.1 Pro?

Gemini 3.1 Pro introduces additional quality improvements for software engineering and agentic tasks, enhanced usability for finance and spreadsheet applications, and more efficient thinking that reduces token consumption. Gemini 3 Pro Preview was the initial release; 3.1 Pro builds on that foundation.

Do I need my own Google API key to use this model through AI Gateway?

No. AI Gateway manages all underlying provider credentials. You authenticate once using a Vercel API key or OIDC token.

What does improved multi-step function calling mean in practice?

The model more reliably executes sequences of tool calls: choosing the right tool, interpreting its output, deciding whether to call another tool, and knowing when the task is complete. This reduces the need for human intervention to correct routing errors mid-workflow.
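That loop can be sketched with the AI SDK's tool-calling support. Everything below the model id is illustrative: getWeather and its stubbed result are hypothetical, and the step-limit API (stopWhen/stepCountIs here, maxSteps in older AI SDK versions) varies by SDK version:

```typescript
import { generateText, tool, stepCountIs } from 'ai'
import { z } from 'zod'

const { text, steps } = await generateText({
  model: 'google/gemini-3-pro-preview',
  tools: {
    // Hypothetical tool: in a real app, execute would call a weather API.
    getWeather: tool({
      description: 'Get the current weather for a city',
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, tempC: 21 }), // stubbed result
    }),
  },
  stopWhen: stepCountIs(5), // allow up to five model/tool round trips
  prompt: 'Compare the current weather in Paris and Tokyo.',
})

// `steps` records each tool call and result, which is useful for auditing
// whether the model planned, executed, and terminated the loop correctly.
console.log(steps.length, text)
```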

Does the model accept image inputs alongside text?

Yes. You can pass image inputs alongside text prompts to enable cross-modal analysis within a single request.
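In the AI SDK message format, an image is just another content part next to the text. A minimal sketch (the chart URL is a placeholder, not a real asset):

```typescript
// A single user message mixing a text part and an image part, in the
// AI SDK's multimodal message shape.
const messages = [
  {
    role: 'user' as const,
    content: [
      { type: 'text' as const, text: 'Summarize the trend in this chart.' },
      {
        type: 'image' as const,
        image: new URL('https://example.com/q3-revenue.png'), // placeholder
      },
    ],
  },
]

// Passed alongside the model in a single request, e.g.:
// const { text } = await generateText({ model: 'google/gemini-3-pro-preview', messages })
console.log(messages[0].content.map((part) => part.type).join(','))
```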