qwen3-32b

Context Window

128,000

Max Output Length

31,000

Input Price

¥2 / 1M tokens

Output Price

¥20 / 1M tokens
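The listed prices can be turned into a per-request estimate with simple arithmetic. The sketch below assumes billing is strictly linear in token counts with no rounding or minimum charge, which this page does not confirm; `estimateCostCny` is a hypothetical helper name, not part of any SDK.

```javascript
// Listed prices in CNY per 1,000,000 tokens (taken from this page).
const INPUT_PRICE_PER_MILLION = 2;
const OUTPUT_PRICE_PER_MILLION = 20;

// Hypothetical helper: estimated cost in CNY for one request,
// assuming purely linear per-token billing.
function estimateCostCny(inputTokens, outputTokens) {
  const total =
    inputTokens * INPUT_PRICE_PER_MILLION +
    outputTokens * OUTPUT_PRICE_PER_MILLION;
  return total / 1_000_000;
}

// 1,000 prompt tokens and 500 generated tokens.
console.log(estimateCostCny(1000, 500)); // 0.012
```

Because output tokens cost ten times as much as input tokens, long generations dominate the bill.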

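Effectively fuses thinking and non-thinking modes, which can be switched within a conversation. Its reasoning ability significantly exceeds that of QwQ, and its general capabilities significantly exceed those of Qwen2.5-32B-Instruct, reaching state-of-the-art level among models of the same scale.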

Providers for qwen3-32b

Zhinao API routes each request to the best-fit provider and automatically fails over to the provider with the highest availability.

Qiniu Cloud (七牛云)

Domestic (mainland China)

TTFT

No data

Throughput

21.42 tok/s

Uptime

100.00%

Provider Model

qiniu/qwen3-32b

Supported Parameters

Reasoning

Recent Uptime

May 24, 11 PM: 100.00%

Supported Response Formats

OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, Google VertexAI

Request Log Collection

Distillable

Total Context

65,536

Max Output

31,000

Input Price

¥2 / 1M tokens

Output Price

¥20 / 1M tokens

Tongyi Qianwen (通义千问)

Domestic (mainland China)

TTFT

0.64s

Throughput

24.29 tok/s

Uptime

No data

Provider Model

alibaba/qwen3-32b

Supported Parameters

Reasoning

Recent Uptime

May 24, 2 PM

Supported Response Formats

OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, Google VertexAI

Request Log Collection

Distillable

Total Context

65,536

Max Output

31,000

Input Price

¥2 / 1M tokens

Output Price

¥20 / 1M tokens
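The routing note at the top of this section says requests fail over to the provider with the highest availability. The actual routing logic is not documented here, so the following is only an illustrative sketch: `failoverOrder` and `callWithFailover` are hypothetical helpers, and the ranking rule (highest uptime first, unknown uptime last, ties broken by throughput) is an assumption. The figures are copied from the two provider cards above.

```javascript
// Figures copied from the provider cards on this page; null means "No data".
const providers = [
  { id: "qiniu/qwen3-32b", uptime: 100.0, throughputTps: 21.42, ttftSeconds: null },
  { id: "alibaba/qwen3-32b", uptime: null, throughputTps: 24.29, ttftSeconds: 0.64 },
];

// Hypothetical ranking: highest uptime first, unknown uptime last,
// ties broken by higher throughput. Does not mutate the input list.
function failoverOrder(list) {
  const uptimeOf = (p) => (p.uptime === null ? -1 : p.uptime);
  return [...list]
    .sort((a, b) => uptimeOf(b) - uptimeOf(a) || b.throughputTps - a.throughputTps)
    .map((p) => p.id);
}

// Hypothetical helper: try each provider in ranked order until one call succeeds.
// `send` is a caller-supplied async function that takes a provider model ID.
async function callWithFailover(list, send) {
  let lastError;
  for (const id of failoverOrder(list)) {
    try {
      return await send(id);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next provider
    }
  }
  throw lastError ?? new Error("no providers available");
}

console.log(failoverOrder(providers));
```

With the data shown on this page, the provider with a recorded uptime is tried first and the one with no uptime data serves as the fallback.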

Performance for qwen3-32b

Compare different providers across Zhinao API

Throughput

24.80 tok/s

TTFT

No data

Uptime for qwen3-32b

Uptime for qwen3-32b across all providers

Sample code and API for qwen3-32b

Zhinao API normalizes requests and responses across providers for you.

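// Uses the OpenAI Node.js SDK pointed at the Zhinao endpoint.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.360.cn/v1",
  apiKey: process.env.ZHINAO_API_KEY, // read the key from the environment
});

// Top-level await requires an ES module (for example, a .mjs file).
const response = await client.chat.completions.create({
  model: "qwen3-32b",
  messages: [
    { role: "user", content: "Hello, how are you?" }
  ],
  temperature: 0.7,
  max_tokens: 1000, // cap on generated tokens; the listing allows up to 31,000
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);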
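The listing gives a maximum output length of 31,000 tokens, and both provider cards list a total context of 65,536 tokens. A request builder can keep `max_tokens` within those limits before the call is made. This is a minimal sketch under two assumptions not stated on this page: that prompt and output tokens share the provider's total context, and that the caller supplies its own prompt-token estimate. `buildChatRequest` is a hypothetical helper.

```javascript
const MAX_OUTPUT_TOKENS = 31000; // "Max Output" from the listing
const PROVIDER_TOTAL_CONTEXT = 65536; // "Total Context" on both provider cards

// Hypothetical helper: build Chat Completions parameters with max_tokens
// clamped to the listed limits. estimatedPromptTokens comes from the caller.
function buildChatRequest(messages, requestedMaxTokens, estimatedPromptTokens) {
  // Assumption: prompt and output share the provider's total context.
  const remaining = PROVIDER_TOTAL_CONTEXT - estimatedPromptTokens;
  if (remaining <= 0) {
    throw new RangeError("prompt does not fit in the provider's total context");
  }
  const maxTokens = Math.max(
    1,
    Math.min(requestedMaxTokens, MAX_OUTPUT_TOKENS, remaining)
  );
  return { model: "qwen3-32b", messages, max_tokens: maxTokens };
}

const params = buildChatRequest(
  [{ role: "user", content: "Hello, how are you?" }],
  50000, // more than the listed maximum, so it is clamped
  1000
);
console.log(params.max_tokens); // 31000
```

The returned object can be passed to `client.chat.completions.create` from the sample above.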
