Publish Time
2026/4/8
Model Series
GLM
Context Window
200,000
Max Output Length
128,000
Input Price
¥6 / 1M tokens
Output Price
¥24 / 1M tokens

GLM-5.1 is Zhipu's latest flagship model, with greatly strengthened coding ability and markedly improved long-horizon task performance. It can work continuously and autonomously for up to 8 hours in a single task, completing the full loop from planning through execution to iterative refinement and delivering engineering-grade results. In overall capability and coding, GLM-5.1 performs on par with Claude Opus 4.6, and it shows stronger sustained-work ability in long-horizon autonomous execution, complex engineering optimization, and real-world development scenarios, making it an ideal foundation for building autonomous agents and long-horizon coding agents.
Zhinao API routes each request to the best-fit provider and automatically fails over to the provider with the highest availability.
TTFT
9.63 s
Throughput
17.07 tps
Uptime
99.00%
Provider Model
huaweicloud/z-ai/glm-5.1
Supported Parameters
Recent Uptime
Reasoning
Toggleable
Supported Response Formats
Request Log Collection
ZDR Supported
Distillable
Yes
Total Context
200,000
Max Output
128,000
Input Price
¥6 / 1M tokens
Output Price
¥24 / 1M tokens
TTFT
8.98 s
Throughput
25.70 tps
Uptime
86.00%
Provider Model
bigmodel/z-ai/glm-5.1
Supported Parameters
Recent Uptime
Reasoning
Toggleable
Supported Response Formats
Request Log Collection
ZDR Supported
Distillable
Yes
Total Context
200,000
Max Output
128,000
Input Price
¥6 / 1M tokens
Output Price
¥24 / 1M tokens
[Chart: provider comparison across Zhinao API — 28.87 tok/s throughput, 7.69 s latency]
[Chart: uptime for z-ai/glm-5.1 across all providers]
Zhinao API normalizes requests and responses across providers for you
import OpenAI from "openai";

// Zhinao exposes an OpenAI-compatible endpoint, so the official SDK works as-is.
const client = new OpenAI({
  baseURL: "https://api.360.cn/v1",
  apiKey: process.env.ZHINAO_API_KEY,
});

const response = await client.chat.completions.create({
  model: "z-ai/glm-5.1",
  messages: [
    { role: "user", content: "Hello, how are you?" },
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

console.log(response.choices[0].message.content);
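Given the multi-second TTFT reported above, streaming is often preferable so tokens render as they arrive instead of after the full generation completes. The sketch below uses the OpenAI SDK's standard `stream: true` option for OpenAI-compatible endpoints; whether Zhinao's gateway streams for every upstream provider is an assumption.

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.360.cn/v1",
  apiKey: process.env.ZHINAO_API_KEY,
});

// Request a streamed completion; the SDK returns an async iterable of chunks.
const stream = await client.chat.completions.create({
  model: "z-ai/glm-5.1",
  messages: [{ role: "user", content: "Hello, how are you?" }],
  stream: true,
});

// Print each delta as it arrives; fields may be absent on some chunks.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```

With streaming, the first visible output arrives at TTFT rather than after the entire response is generated, which matters most for the long autonomous runs GLM-5.1 targets.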