Publish time: 2026/4/24
Model Series: DeepSeek
Context Window: 1,000,000
Max Output Length: 384,000
Input Price: ¥1 / 1M tokens
Output Price: ¥2 / 1M tokens

DeepSeek-V4 offers an ultra-long context of one million tokens and leads the domestic and open-source fields in agent capability, world knowledge, and reasoning performance. Compared with DeepSeek-V4-Pro, DeepSeek-V4-Flash falls slightly short in world knowledge but delivers comparable reasoning ability; with fewer parameters and smaller activations, V4-Flash provides a faster and more economical API service.
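At the listed rates (¥1 per 1M input tokens, ¥2 per 1M output tokens), per-call cost is simple arithmetic. A minimal sketch; the function name and constants are illustrative, not part of any API:

```typescript
// Estimate the cost in CNY of one call at the DeepSeek-V4-Flash rates
// listed above: ¥1 / 1M input tokens, ¥2 / 1M output tokens.
const INPUT_PRICE_PER_M = 1.0;  // ¥ per 1,000,000 input tokens
const OUTPUT_PRICE_PER_M = 2.0; // ¥ per 1,000,000 output tokens

function estimateCostCNY(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_PRICE_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_M
  );
}

// e.g. a 900k-token context with a 50k-token answer:
// 0.9 × ¥1 + 0.05 × ¥2 = ¥1.00
console.log(estimateCostCNY(900_000, 50_000));
```

Even a request that fills the full 1,000,000-token context costs ¥1 on the input side at these rates.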
Zhinao API routes requests to the best-fit provider and automatically fails over to the one with highest availability.
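The routing behaviour described above can be sketched as a pure selection function. The `Provider` shape and ordering rule here are illustrative assumptions; the real router is internal to Zhinao API:

```typescript
// Illustrative failover ordering: prefer the highest recent uptime,
// breaking ties by the lowest time-to-first-token (TTFT).
interface Provider {
  name: string;
  uptime: number;       // percent, e.g. 100.00
  ttftSeconds: number;  // time to first token
}

function failoverOrder(providers: Provider[]): string[] {
  return [...providers]
    .sort((a, b) => b.uptime - a.uptime || a.ttftSeconds - b.ttftSeconds)
    .map((p) => p.name);
}

// Using the stats from the provider tables below:
const order = failoverOrder([
  { name: "deepseek", uptime: 100.0, ttftSeconds: 0.86 },
  { name: "huaweicloud", uptime: 93.0, ttftSeconds: 17.87 },
  { name: "st", uptime: 100.0, ttftSeconds: 1.06 },
]);
console.log(order); // deepseek first, huaweicloud last
```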
Provider Model: deepseek/deepseek/deepseek-v4-flash
TTFT: 0.86 s
Throughput: 52.30 tps
Uptime: 100.00%
Reasoning: Toggleable
Distillable: Yes
Total Context: 1,000,000
Max Output: 384,000
Input Price: ¥1 / 1M tokens
Output Price: ¥2 / 1M tokens
Provider Model: huaweicloud/deepseek/deepseek-v4-flash
TTFT: 17.87 s
Throughput: 22.10 tps
Uptime: 93.00%
Reasoning: Toggleable
Distillable: Yes
Total Context: 1,000,000
Max Output: 384,000
Input Price: ¥1 / 1M tokens
Output Price: ¥2 / 1M tokens
Provider Model: st/deepseek/deepseek-v4-flash
TTFT: 1.06 s
Throughput: 10.62 tps
Uptime: 100.00%
Reasoning: Toggleable
Distillable: Yes
Total Context: 1,000,000
Max Output: 384,000
Input Price: ¥0.8 / 1M tokens
Output Price: ¥1.6 / 1M tokens
Compare different providers across Zhinao API
Across all providers: 30.55 tok/s throughput, 1.11 s latency
Uptime for deepseek/deepseek-v4-flash across all providers
Zhinao API normalizes requests and responses across providers for you
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.360.cn/v1",
  apiKey: process.env.ZHINAO_API_KEY,
});

const response = await client.chat.completions.create({
  model: "deepseek/deepseek-v4-flash",
  messages: [
    { role: "user", content: "Hello, how are you?" }
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

console.log(response.choices[0].message.content);
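Even with automatic failover on the gateway side, a client may still want its own retry on transient errors. A minimal retry wrapper, sketched here as a generic helper; the function name and backoff values are illustrative, not part of the SDK:

```typescript
// Retry an async call a few times with linear backoff.
// Generic over the call, so it can wrap client.chat.completions.create.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait a little longer before each subsequent attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (i + 1)));
    }
  }
  throw lastError;
}
```

Usage with the example above would look like `const response = await withRetry(() => client.chat.completions.create({ /* ... */ }));`, leaving the request body unchanged.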