
GLM 5.1

zai/glm-5.1

GLM 5.1 advances Z.ai's GLM-5 generation with a focus on long-horizon autonomous coding. It can work independently on a single task for over eight hours, planning, executing, and iterating until it delivers engineering-grade results.

Reasoning · Tool Use · Implicit Caching
index.ts
import { streamText } from 'ai'

// Route the request through AI Gateway using the model identifier.
const result = streamText({
  model: 'zai/glm-5.1',
  prompt: 'Why is the sky blue?',
})

// Consume the streamed response as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure it, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token, so you do not need to manage provider credentials directly.

GLM 5.1 excels when given a well-defined task with clear acceptance criteria. Provide a detailed specification, relevant file paths, and expected behavior so the model can plan its autonomous execution effectively.
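One way to assemble such a specification is to join the goal, file paths, and acceptance criteria into a single prompt. A minimal sketch; the `buildTaskPrompt` helper and its field names are illustrative and not part of any Gateway or SDK API:

```typescript
// Illustrative helper: bundle a task spec into one prompt string.
// None of these names come from the AI Gateway or AI SDK APIs.
interface TaskSpec {
  goal: string
  files: string[]
  acceptanceCriteria: string[]
}

function buildTaskPrompt(spec: TaskSpec): string {
  return [
    `Goal: ${spec.goal}`,
    `Relevant files:\n${spec.files.map((f) => `- ${f}`).join('\n')}`,
    `Acceptance criteria:\n${spec.acceptanceCriteria
      .map((c) => `- ${c}`)
      .join('\n')}`,
  ].join('\n\n')
}

const prompt = buildTaskPrompt({
  goal: 'Migrate the date utilities from moment to date-fns',
  files: ['src/utils/dates.ts', 'src/components/Calendar.tsx'],
  acceptanceCriteria: [
    'All existing unit tests pass',
    'No remaining moment imports',
  ],
})
```

The resulting string can be passed as the `prompt` in the snippet above; the point is simply that every acceptance criterion the model should iterate against appears explicitly in the task text.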

Tasks that run for hours consume tokens in proportion to their length. Monitor usage through AI Gateway's observability tools and set budget limits before starting extended runs.
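A budget limit can also be enforced client-side between steps of a long run. A sketch of a simple token-budget guard; this is an illustrative pattern, not a Gateway feature, and the cap is an assumed number:

```typescript
// Illustrative client-side guard: stop a long run once cumulative
// token usage crosses a budget. Not an AI Gateway API.
class TokenBudget {
  private used = 0
  constructor(private readonly limit: number) {}

  // Record usage from one step; returns false once the budget is spent.
  record(tokens: number): boolean {
    this.used += tokens
    return this.used <= this.limit
  }

  get remaining(): number {
    return Math.max(0, this.limit - this.used)
  }
}

const budget = new TokenBudget(500_000) // assumed per-run cap
budget.record(120_000) // usage reported after one step
console.log(budget.remaining) // 380000
```

Checking `record()` after each step gives a natural point to abort an extended run before it overshoots its budget.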

Despite autonomous self-correction, review the final output before merging into production. Treat GLM 5.1 as a thorough junior engineer who still needs a code review.

When to Use GLM 5.1

Best For

  • Large-scale refactors:

    Dozens of files where sustained context and iterative testing matter

  • End-to-end feature implementation:

    Spec to tested, working code with minimal human checkpoints

  • Codebase migrations:

    Hours of methodical file-by-file changes

  • Complex bug investigations:

    The model autonomously traces root causes across a large codebase

  • Autonomous coding agents:

    A model capable of multi-hour independent operation

Consider Alternatives When

  • Short-horizon tasks:

GLM-5 or GLM-5-Turbo handle coding tasks that complete in minutes, at lower cost

  • Vision or multimodal input:

    GLM-5V-Turbo combines coding with screenshot and GUI understanding

  • Interactive pair programming:

    GLM-4.7-Flash provides fast responses for real-time back-and-forth workflows

  • Budget-constrained workloads:

    GLM-5-Turbo offers GLM-5-class capability at reduced per-token cost for shorter tasks
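The guidance above can be condensed into a small routing helper. A sketch; only zai/glm-5.1 is confirmed on this page, and the other identifiers are assumptions about the same naming scheme:

```typescript
// Illustrative router mirroring this page's model-selection guidance.
// Only 'zai/glm-5.1' is confirmed here; the other identifiers are
// assumed to follow Z.ai's naming scheme.
type Workload = 'long-horizon' | 'short-task' | 'vision' | 'interactive'

function pickModel(workload: Workload): string {
  switch (workload) {
    case 'long-horizon':
      return 'zai/glm-5.1' // multi-hour autonomous coding
    case 'short-task':
      return 'zai/glm-5-turbo' // minutes-long tasks at lower cost
    case 'vision':
      return 'zai/glm-5v-turbo' // screenshot and GUI understanding
    case 'interactive':
      return 'zai/glm-4.7-flash' // fast real-time pair programming
  }
}
```

Encoding the choice this way keeps the long-horizon model reserved for the runs that actually need hours of sustained context.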

Conclusion

GLM 5.1 targets the gap between short-burst coding assistance and fully autonomous software engineering. For tasks that take hours of sustained, methodical work, it delivers complete results where shorter-context models would lose coherence or require repeated human intervention.

FAQ

How long can GLM 5.1 work autonomously?

Over eight hours of continuous autonomous operation. It plans, executes, tests, and iterates on its own output throughout that window.

How does GLM 5.1 differ from GLM-5?

GLM-5 introduced multiple thinking modes and agentic workflows for general-purpose reasoning. GLM 5.1 builds on that foundation with a specific focus on long-horizon coding tasks, sustaining autonomous operation for hours rather than minutes.

What is GLM 5.1's context window?

202.8K tokens.

What does GLM 5.1 cost?

Check the pricing panel on this page for today's numbers. AI Gateway tracks rates across every provider that serves GLM 5.1.

How do I access GLM 5.1?

Use the zai/glm-5.1 model identifier with your AI Gateway API key. No separate Z.ai account is needed. BYOK is also supported.