Publish time: 2026/2/12
Model Series: GLM
Input type: —
Output type: —
Context Window: 128,000
Max Output Length: 2,048
Input Price: ¥4 / 1M tokens
Output Price: ¥18 / 1M tokens

GLM-5 is a new-generation large model built for Coding and Agent scenarios. It reaches open-source SOTA on complex systems engineering and long-horizon tasks, with a real-world coding experience approaching Claude Opus level. Built on a new 744B-parameter base model, asynchronous reinforcement learning, and sparse attention, it delivers a full upgrade from "writing code" to "writing engineering."
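At the listed rates, per-request cost is simple arithmetic. A minimal sketch, using the prices and limits from the table above (the function name and token counts are illustrative, not part of the API):

```javascript
// Per-request cost at the listed GLM-5 rates: ¥4 per 1M input tokens
// and ¥18 per 1M output tokens (from the pricing table above).
const INPUT_PRICE_PER_M = 4;
const OUTPUT_PRICE_PER_M = 18;

function estimateCostCNY(inputTokens, outputTokens) {
  return (
    (inputTokens / 1_000_000) * INPUT_PRICE_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M
  );
}

// A maximal request: full 128,000-token context in, 2,048-token max output.
console.log(estimateCostCNY(128_000, 2_048).toFixed(4)); // → "0.5489"
```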
Zhinao API routes requests to the best-fit provider and automatically fails over to the one with highest availability.
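The failover behavior above happens server-side; from the client's perspective it amounts to "try providers in order, first success wins." A minimal client-side sketch of that pattern — the provider names and request function are illustrative stand-ins, not real Zhinao API surface:

```javascript
// Try each provider in order and return the first successful response.
// Zhinao API performs this routing server-side; this mock just shows
// the failover pattern itself.
async function callWithFailover(providers, makeRequest) {
  let lastError;
  for (const provider of providers) {
    try {
      return await makeRequest(provider); // first success wins
    } catch (err) {
      lastError = err; // provider failed — fall through to the next one
    }
  }
  throw lastError; // every provider failed
}

// Mock usage: the first provider is down, the second answers.
callWithFailover(["provider-a", "provider-b"], async (provider) => {
  if (provider === "provider-a") throw new Error("unavailable");
  return `answered by ${provider}`;
}).then((result) => console.log(result)); // → "answered by provider-b"
```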
TTFT: 5.14 s
Throughput: 18.90 tps
Uptime: 97.00%
Provider Model: huaweicloud/z-ai/glm-5
Supported Parameters: —
Recent Uptime: —
Reasoning: Toggleable
Supported Response Formats: —
Request Log Collection: —
ZDR Supported: —
Distillable: Yes
Total Context: 128,000
Max Output: 2,048
Input Price: ¥4 / 1M tokens
Output Price: ¥18 / 1M tokens
[Chart: compare different providers across Zhinao API — throughput 27.99 tok/s, latency 3.13 s]
[Chart: uptime for z-ai/glm-5 across all providers]
Zhinao API normalizes requests and responses across providers for you
// Point the OpenAI SDK at the Zhinao API endpoint.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.360.cn/v1",
  apiKey: process.env.ZHINAO_API_KEY,
});

// Standard chat completion request against GLM-5.
const response = await client.chat.completions.create({
  model: "z-ai/glm-5",
  messages: [
    { role: "user", content: "Hello, how are you?" },
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

console.log(response.choices[0].message.content);