Create a response - Qwen AI Platform

POST

/compatible-mode/v1/responses

import os
from openai import OpenAI

client = OpenAI(
  # If the environment variable is not set, replace with: api_key="sk-xxx"
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
  model="qwen3.7-plus",
  input="What can you do?"
)

# Get the model's reply
print(response.output_text)

{
  "created_at": 1771165900,
  "id": "f75c28fb-4064-48ed-90da-4d2cc4362xxx",
  "model": "qwen3.7-plus",
  "object": "response",
  "output": [
    {
      "content": [
        {
          "annotations": [],
          "text": "Hello! I am Qwen3.5, a large language model developed by Alibaba Cloud with knowledge up to 2026, designed to assist you with complex reasoning, creative tasks, and multilingual conversations.",
          "type": "output_text"
        }
      ],
      "id": "msg_89ad23e6-f128-4d4c-b7a1-a786e7880xxx",
      "role": "assistant",
      "status": "completed",
      "type": "message"
    }
  ],
  "parallel_tool_calls": false,
  "status": "completed",
  "tool_choice": "auto",
  "tools": [],
  "usage": {
    "input_tokens": 57,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 44,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 101,
    "x_details": [
      {
        "input_tokens": 57,
        "output_tokens": 44,
        "total_tokens": 101,
        "x_billing_type": "response_api"
      }
    ]
  }
}

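The legacy URL path /api/v2/apps/protocols/compatible-mode/v1/responses will soon stop being maintained. Migrate to the new path /compatible-mode/v1/responses as soon as possible.

Compatibility with OpenAI

This API is compatible with OpenAI but differs in parameters, features, and behavior. A request processes only the parameters listed in this document; OpenAI parameters not mentioned here are ignored. Main differences:

Unsupported parameters: some parameters are not supported, for example background (only synchronous calls are supported).
Extended parameters: additional parameters beyond the OpenAI specification are supported, for example enable_thinking.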

Authentication

string

header

Required

Qwen AI Platform API key. For details, see Get an API Key.

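Header parameters

enum<string>

Controls session caching in multi-turn conversations (must be used together with previous_response_id). When enabled, the server automatically caches the conversation context, which reduces latency and cost.

enable: enables session caching. Cache creation is billed at 125% of the standard input price; cache hits are billed at 10%. The cache is valid for 5 minutes (reset on each hit). Creating a cache requires at least 1024 tokens.
disable: disables session caching. Falls back to implicit caching if the model supports it.

Supported models: qwen3.8-max-preview (Token Plan only), qwen3.7-max, qwen3.7-max-2026-06-08, qwen3.7-max-2026-05-20, qwen3-max, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.5-plus, qwen3.5-flash, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash.

Passing the header through an SDK: Python uses default_headers; Node.js uses defaultHeaders.

Allowed values: enable, disable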
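
The billing rule above is simple arithmetic, so a small worked example may help. The following is a minimal sketch, assuming that every token written to the cache is billed at the creation rate and every token served from it at the hit rate; the function names and the price value are hypothetical and are not part of the API.

```python
# Billing multipliers quoted in the header description above.
CACHE_CREATE_RATE_PCT = 125  # cache creation: 125% of the standard input price
CACHE_HIT_RATE_PCT = 10      # cache hit: 10% of the standard input price
MIN_CACHE_TOKENS = 1024      # minimum number of tokens needed to create a cache


def can_create_cache(prompt_tokens: int) -> bool:
    """Return True if the prompt is long enough for a cache to be created."""
    return prompt_tokens >= MIN_CACHE_TOKENS


def session_cache_input_cost(created_tokens: int, hit_tokens: int,
                             price_per_token: float) -> float:
    """Estimate the input cost of one request with session caching enabled.

    created_tokens:  input tokens newly written to the cache
    hit_tokens:      input tokens served from the cache
    price_per_token: hypothetical standard input price, in any currency unit
    """
    if created_tokens < 0 or hit_tokens < 0:
        raise ValueError("token counts must be non-negative")
    weighted = (created_tokens * CACHE_CREATE_RATE_PCT
                + hit_tokens * CACHE_HIT_RATE_PCT)
    return weighted * price_per_token / 100
```

With a hypothetical price of 1.0 per token, writing 2000 tokens to the cache costs 2500.0, while reading the same 2000 tokens back on the next turn costs 200.0.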

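Request body

application/json

string

Required

Model name. Supported models include qwen3.8-max-preview (Token Plan only), qwen3.7-max, qwen3.7-max-2026-06-08, qwen3.7-max-2026-05-20, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3-max, qwen3-max-2026-01-23, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-04-20, qwen3.5-plus-2026-02-15, qwen3.7-flash, qwen3.7-flash-2026-07-15, qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.6-35b-a3b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash, qwen3.5-ocr, qwen-plus-character, qwen-flash-character.

string

Required

Input to the model. Accepts a plain text string or an array of messages in conversation order.

string

System instructions inserted at the beginning of the context. When previous_response_id is used, the instructions specified in the previous turn do not carry over to the current context.

string

Unique ID of the previous response, valid for 7 days. This parameter enables multi-turn conversations: the server automatically retrieves the previous turn's input and output and passes them in as context. If both a message array and previous_response_id are provided, the new messages in input are appended after the historical context. Cannot be used together with conversation. For usage examples, see the multi-turn conversation guide.

string

The conversation that the current response belongs to. The conversation history is automatically passed to the current request as context, and the current request's input and output are automatically added to the conversation once the response completes. Cannot be used together with previous_response_id.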
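
The two context parameters described above are mutually exclusive, which is easy to get wrong when request arguments are assembled dynamically. Below is a minimal sketch of a client-side guard; `build_request` is a hypothetical helper, not part of any SDK, and it only assembles keyword arguments.

```python
from typing import Any, Optional, Union


def build_request(model: str,
                  input: Union[str, list],
                  instructions: Optional[str] = None,
                  previous_response_id: Optional[str] = None,
                  conversation: Optional[str] = None) -> dict[str, Any]:
    """Assemble request arguments, rejecting conflicting context parameters."""
    if previous_response_id is not None and conversation is not None:
        raise ValueError(
            "previous_response_id and conversation cannot be used together")

    request: dict[str, Any] = {"model": model, "input": input}
    # instructions do not carry over between turns, so pass them every time.
    if instructions is not None:
        request["instructions"] = instructions
    if previous_response_id is not None:
        request["previous_response_id"] = previous_response_id
    if conversation is not None:
        request["conversation"] = conversation
    return request
```

The result can be unpacked into the call from the example at the top of this page, for instance `client.responses.create(**build_request("qwen3.7-plus", "And then?", previous_response_id=response.id))`.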

boolean

Default: false

Whether to enable streaming output. When set to true, the model's response data is returned to the client in real time as a stream.
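
This page does not list the streaming event types. As a rough sketch, assuming the stream follows the OpenAI Responses convention in which text arrives in events of type `response.output_text.delta` carrying a `delta` string, incremental text could be collected as follows. The stand-in events are fabricated for illustration only.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream_text(events: Iterable) -> str:
    """Concatenate text deltas from a stream of response events.

    Assumes OpenAI-style events: objects with a `type` attribute and, for
    text deltas, a `delta` attribute. Other event types are skipped.
    """
    parts = []
    for event in events:
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


# Stand-in events for illustration; a real stream would come from
# client.responses.create(..., stream=True).
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hel"),
    SimpleNamespace(type="response.output_text.delta", delta="lo"),
    SimpleNamespace(type="response.completed"),
]
streamed_text = collect_stream_text(fake_events)
```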

object[]

List of tools the model can use. Supported tool types: web_search, code_interpreter, web_extractor, web_search_image, image_search, file_search, mcp, function.

Built-in tools use the {"type": "<tool_name>"} format. For example: {"type": "web_search"}.

MCP tools use the following format:

{
    "type": "mcp",
    "server_protocol": "sse",
    "server_label": "amap-maps",
    "server_description": "AMAP MCP Server...",
    "server_url": "https://dashscope.aliyuncs.com/api/v1/mcps/amap-maps/sse",
    "headers": {
        "Authorization": "Bearer $DASHSCOPE_API_KEY"
    }
}

Function tools use the following format:

[{
  "type": "function",
  "name": "get_weather",
  "description": "Get weather information for a specified city",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The name of the city"
      }
    },
    "required": ["city"]
  }
}]
For usage examples, see the [function calling guide](/developer-guides/tool-calling/function-calling) and the [web search guide](/developer-guides/tool-calling/web-search).
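
The formats above can be combined into one tools list. Here is a minimal sketch of two hypothetical helpers that build the built-in and function entries; it assumes that the bare {"type": ...} form is sufficient for each built-in tool, which may not hold for tools that need extra configuration.

```python
from typing import Any

# Tool types listed above that use the bare {"type": "<tool_name>"} form.
BUILTIN_TOOL_TYPES = {
    "web_search", "code_interpreter", "web_extractor",
    "web_search_image", "image_search", "file_search",
}


def builtin_tool(name: str) -> dict[str, str]:
    """Build a built-in tool entry, rejecting unknown tool names."""
    if name not in BUILTIN_TOOL_TYPES:
        raise ValueError(f"unknown built-in tool: {name!r}")
    return {"type": name}


def function_tool(name: str, description: str,
                  parameters: dict[str, Any]) -> dict[str, Any]:
    """Build a function tool entry from a JSON Schema for its parameters."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": parameters,
    }


tools = [
    builtin_tool("web_search"),
    function_tool(
        "get_weather",
        "Get weather information for a specified city",
        {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "The name of the city"},
            },
            "required": ["city"],
        },
    ),
]
```

The resulting list can be passed as the tools argument of the request.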

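Child properties

string

Required

Tool type. Valid values: web_search, code_interpreter, web_extractor, web_search_image, image_search, file_search, mcp, function.

enum<string>

Controls how the model selects and calls tools. Supports a string format and an object format.

String format:

auto: the model decides on its own whether to call a tool.
none: prevents the model from calling any tool.
required: forces the model to call a tool. Available only when the tools list contains exactly one tool.

Object format: restricts the set of tools the model may use; the model can only select and call tools from the predefined list.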
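
The constraint on required is easy to overlook, so it can be checked before a request is sent. This is a minimal sketch covering only the string format; `check_tool_choice` is a hypothetical helper.

```python
STRING_TOOL_CHOICES = ("auto", "none", "required")


def check_tool_choice(tool_choice: str, tools: list) -> str:
    """Validate a string-format tool_choice against the tools list."""
    if tool_choice not in STRING_TOOL_CHOICES:
        raise ValueError(f"unsupported tool_choice: {tool_choice!r}")
    # "required" is available only when exactly one tool is listed.
    if tool_choice == "required" and len(tools) != 1:
        raise ValueError(
            "tool_choice='required' needs exactly one tool in the tools list")
    return tool_choice
```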

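number

Sampling temperature, which controls the diversity of the generated text. A higher temperature produces more diverse text; a lower temperature produces more deterministic text. Value range: [0, 2). Both temperature and top_p control the diversity of the generated text, so setting only one of them is recommended.

number

Probability threshold for nucleus sampling, which controls the diversity of the generated text. A higher top_p produces more diverse text; a lower top_p produces more deterministic text. Value range: (0, 1.0]. Both temperature and top_p control the diversity of the generated text, so setting only one of them is recommended.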
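
The two ranges differ in which endpoints are included, so a small client-side check can catch off-by-boundary values. The following is a minimal sketch; `check_sampling` is a hypothetical helper, and the warning only reflects the recommendation above rather than a hard rule.

```python
import warnings
from typing import Optional


def check_sampling(temperature: Optional[float] = None,
                   top_p: Optional[float] = None) -> dict:
    """Validate sampling parameters and return only those that were set."""
    params = {}
    if temperature is not None:
        # Range [0, 2): 0 is allowed, 2 is not.
        if not (0 <= temperature < 2):
            raise ValueError("temperature must be in [0, 2)")
        params["temperature"] = temperature
    if top_p is not None:
        # Range (0, 1.0]: 0 is not allowed, 1.0 is.
        if not (0 < top_p <= 1.0):
            raise ValueError("top_p must be in (0, 1.0]")
        params["top_p"] = top_p
    if len(params) == 2:
        warnings.warn("setting only one of temperature and top_p is recommended")
    return params
```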

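boolean

Whether to enable thinking mode. When set to true, the model thinks before replying, and the thinking content is returned in output items of type reasoning. Reasoning tokens are counted in output_tokens_details.reasoning_tokens and billed at the reasoning token price. When thinking mode is enabled, enabling built-in tools as well is recommended for the best model performance on complex tasks.

This is not a standard OpenAI parameter. With the Python SDK, pass it through extra_body={"enable_thinking": True}; with the Node.js SDK and curl, use enable_thinking: true directly as a top-level parameter. Using reasoning.effort instead is recommended, because enable_thinking will no longer be supported in the future.

object

Configuration for thinking mode.

Child properties

enum<string>

Reasoning effort level. The default is xhigh. Seven increasing levels are supported: none, minimal, low, medium, high, xhigh, max. Lowering this value speeds up responses and reduces reasoning token consumption.

Allowed values: none, minimal, low, medium, high, xhigh, max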
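
Because the effort levels form an ordered scale, they can be modeled as a tuple. Below is a minimal sketch that validates a level and steps down one level to trade reasoning depth for speed; both helper names are hypothetical.

```python
# Effort levels in increasing order, as listed above.
REASONING_EFFORTS = ("none", "minimal", "low", "medium", "high", "xhigh", "max")
DEFAULT_EFFORT = "xhigh"


def reasoning_param(effort: str = DEFAULT_EFFORT) -> dict:
    """Build the reasoning object for a request body."""
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"unsupported effort level: {effort!r}")
    return {"effort": effort}


def lower_effort(effort: str) -> str:
    """Return the next lower effort level, or the same level if already lowest."""
    index = REASONING_EFFORTS.index(effort)
    return REASONING_EFFORTS[max(index - 1, 0)]
```

For example, `reasoning_param(lower_effort("xhigh"))` yields the object for the high level, which could be supplied as the reasoning field of the request.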

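Response

200-application/json

string

Unique ID of this response, valid for 7 days. Pass this ID in the previous_response_id parameter to continue a multi-turn conversation.

number

Unix timestamp of this request, in seconds.

enum<string>

Object type. The value is response.

Allowed values: response

enum<string>

Status of the response generation.

Allowed values: completed, failed, in_progress, cancelled, queued, incomplete

string

ID of the model used to generate this response.

object[]

Array of output items generated by the model. The types and order of the elements depend on the model's response.

Child properties

enum<string>

Type of the output item.

message: contains the final reply generated by the model.
reasoning: returned when thinking mode is enabled (enable_thinking: true). Reasoning tokens are counted in output_tokens_details.reasoning_tokens and billed at the reasoning token price.
function_call: returned when a user-defined function tool is used. The caller must handle the function call and return its result.
web_search_call: returned when the web_search tool is used.
code_interpreter_call: returned when the code_interpreter tool is used.
web_extractor_call: returned when the web_extractor tool is used; this tool must be used together with the web_search tool.
web_search_image_call: returned when the web_search_image tool is used; contains the list of images found.
image_search_call: returned when the image_search tool is used; contains the list of similar images.
mcp_call: returned when the mcp tool is used; contains the result of the MCP service call.
file_search_call: returned when the file_search tool is used; contains the search queries and the knowledge base retrieval results.

Allowed values: message, reasoning, function_call, web_search_call, code_interpreter_call, web_extractor_call, web_search_image_call, image_search_call, mcp_call, file_search_call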
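
Since the output array can mix several item types, code that reads the reply usually filters by type first. Here is a minimal sketch that works on the response as a plain dict, using only field names visible in the sample response at the top of this page; the helper names are hypothetical.

```python
def output_types(response: dict) -> list:
    """List the type of every output item, in order."""
    return [item.get("type") for item in response.get("output", [])]


def message_text(response: dict) -> str:
    """Join the text of all output_text blocks found in message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning items, tool calls, and so on
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Trimmed-down response in the shape shown in the sample above.
sample_response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {"type": "output_text", "annotations": [], "text": "Hello!"},
            ],
        }
    ]
}
reply = message_text(sample_response)
```

When using the Python SDK, response.output_text from the first example serves the same purpose for plain text replies.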

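string

Unique identifier of the output item. Output items of every type include this field.

enum<string>

Message role. The value is assistant. Present only when type is message.

Allowed values: assistant

enum<string>

Generation status of the output item.

Allowed values: completed, in_progress

string

Tool or function name. Present when type is function_call, web_search_image_call, image_search_call, or mcp_call. For web_search_image_call and image_search_call, the value is fixed to web_search_image and image_search, respectively. For mcp_call, the value is the name of the specific function called in the MCP service, for example amap-maps-maps_geo.

string

Arguments of the tool call, as a JSON string. Present when type is function_call, web_search_image_call, image_search_call, or mcp_call. Parse it before use, for example with JSON.parse().

Argument contents for each tool type:

web_search_image_call: {"queries": ["search term 1", "search term 2"]}, where queries is the list of search terms generated automatically by the model.
image_search_call: {"img_idx": 0, "bbox": [0, 0, 1000, 1000]}, where img_idx is the index of the input image (starting from 0) and bbox holds the bounding box coordinates of the search region, [x1, y1, x2, y2], each in the range 0-1000.
function_call: the arguments object generated from the schema of the user-defined function's parameters.
mcp_call: the arguments object of the function called in the MCP service.

string

Unique identifier of the function call. Present only when type is function_call. When returning the result of the function call, this ID must be used to associate the request with the response.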
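
Handling a function_call item involves parsing the JSON-string arguments, running the local function, and returning the result under the same call ID. The sketch below assumes the item fields are named name, arguments, and call_id, and that results are sent back as a function_call_output item, following the OpenAI Responses convention; neither assumption is confirmed on this page. `run_function_call` and the stub `get_weather` are hypothetical.

```python
import json
from typing import Any, Callable


def get_weather(city: str) -> dict:
    """Stub implementation for illustration; returns canned data."""
    return {"city": city, "condition": "sunny"}


def run_function_call(item: dict, registry: dict) -> dict:
    """Execute one function_call item and build the item that returns its result."""
    func: Callable[..., Any] = registry[item["name"]]
    # Arguments arrive as a JSON string and must be parsed before use.
    arguments = json.loads(item["arguments"])
    result = func(**arguments)
    return {
        "type": "function_call_output",  # assumed item type (OpenAI convention)
        "call_id": item["call_id"],      # ties the result to the original call
        "output": json.dumps(result, ensure_ascii=False),
    }


call_item = {
    "type": "function_call",
    "name": "get_weather",
    "arguments": "{\"city\": \"Hangzhou\"}",
    "call_id": "call_123",  # made-up ID for illustration
}
result_item = run_function_call(call_item, {"get_weather": get_weather})
```

The returned item would then be included in the input array of a follow-up request so that the model can use the function result.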

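object[]

Array of message content. Present only when type is message.

Child properties

enum<string>

Content type. The value is output_text.

Allowed values: output_text

string

Text generated by the model.

object[]

Array of text annotations. Usually an empty array.

object[]

Array of reasoning summaries. Present only when type is reasoning. Each element contains a type field (value summary_text) and a text field (the summary text).

Child properties

enum<string>

Allowed values: summary_text

string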

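object

Search action information. Present only when type is web_search_call.

Child properties

string

Search query keywords.

enum<string>

Search type. The value is search.

Allowed values: search

object[]

List of search sources. Each element contains a type field and a url field.

Child properties

string

Code generated and executed by the model. Present only when type is code_interpreter_call.

object[]

Array of code execution outputs. Present only when type is code_interpreter_call. Each element contains a type field (value logs) and a logs field (the code execution logs).

Child properties

enum<string>

Allowed values: logs

string

Identifier of the code interpreter container. Present only when type is code_interpreter_call. Used to associate multiple code executions within the same session.

string

Description of the extraction goal, stating which information to extract from the web page. Present only when type is web_extractor_call.

string

Output of the tool call, as a string.

When type is web_extractor_call, it is a summary of the extracted web page content.
When type is web_search_image_call or image_search_call, it is a JSON string containing an array of image search results. Each element contains a title field (image title), a url field (image URL), and an index field (sequence number).
When type is mcp_call, it is the JSON string result returned by the MCP service.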

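string[]

List of URLs of the web pages that were fetched. Present only when type is web_extractor_call.

string

MCP service label. Present only when type is mcp_call; identifies the MCP service used for this call.

string[]

List of queries used for knowledge base retrieval. Present only when type is file_search_call. The array elements are search query strings generated by the model.

object[]

Array of knowledge base retrieval results. Present only when type is file_search_call.

Child properties

string

File ID of the matched document.

string

File name of the matched document.

number

Relevance score of the match, from 0 to 1. A higher value indicates stronger relevance.

string

Snippet of the matched document content.

boolean

Whether parallel tool calls are enabled.

string

Echo of the tool_choice parameter from the request. Valid values are auto, none, and required.

object[]

Echo of the full tools parameter from the request. The structure is the same as the tools parameter in the request body.

object | null

Error object returned when the model fails to generate a response. This field is null on success.

Child properties

string

Error code.

string

Human-readable error message.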

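object

Token consumption information for this request.

Child properties

integer

Number of input tokens.

integer

Number of tokens output by the model.

integer

Total number of tokens consumed, that is, the sum of input_tokens and output_tokens.

object

Breakdown of input tokens.

Child properties

integer

Number of tokens that hit the cache.

object

Breakdown of output tokens.

Child properties

integer

Number of tokens consumed during thinking.

object[]

Token details broken down by billing type.

Child properties

integer

Number of input tokens.

integer

Number of tokens output by the model.

integer

Total number of tokens consumed.

string

The value is response_api.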

object

Tool usage statistics. If built-in tools were used, this field contains the number of calls for each tool. Example: {"web_search": {"count": 1}}
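
The usage fields described above lend themselves to a quick consistency check when logging or auditing costs. This is a minimal sketch that works on the usage object as a plain dict, using the field names and numbers from the sample response at the top of this page; `summarize_usage` is a hypothetical helper.

```python
def summarize_usage(usage: dict) -> dict:
    """Check the token totals and derive a cache hit ratio from a usage object."""
    input_tokens = usage["input_tokens"]
    output_tokens = usage["output_tokens"]
    cached = usage.get("input_tokens_details", {}).get("cached_tokens", 0)
    reasoning = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    return {
        # total_tokens is documented as input_tokens + output_tokens
        "total_matches": usage["total_tokens"] == input_tokens + output_tokens,
        "cache_hit_ratio": cached / input_tokens if input_tokens else 0.0,
        "reasoning_tokens": reasoning,
    }


# Numbers taken from the sample response at the top of this page.
sample_usage = {
    "input_tokens": 57,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 44,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 101,
}
usage_summary = summarize_usage(sample_usage)
```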