Qwen-TTS

Speech synthesis

Qwen-TTS API reference

POST
/api/v1/services/aigc/multimodal-generation/generation
# Install the latest version of the DashScope SDK before running this example
import os
import dashscope

dashscope.base_http_api_url = 'https://dashscope.aliyuncs.com/api/v1'

text = "Let me recommend a T-shirt to everyone. This one is really super nice. The color is very elegant, and it's also a perfect item to match. Everyone can buy it without hesitation. It's truly beautiful and very forgiving on the figure. No matter what body type you have, it will look great. I recommend everyone to place an order."
# SpeechSynthesizer interface usage: dashscope.audio.qwen_tts.SpeechSynthesizer.call(...)
response = dashscope.MultiModalConversation.call(
  # To use instruction control, replace the model with qwen3-tts-instruct-flash
  model="qwen3-tts-flash",
  # If the environment variable is not configured, replace the following line with your API key: api_key="sk-xxx"
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  text=text,
  voice="Cherry",
  # To use instruction control, uncomment the following lines and replace the model with qwen3-tts-instruct-flash
  # instructions='Fast speech rate, with a clear rising intonation, suitable for introducing fashion products.',
  # optimize_instructions=True
)
print(response)
{
  "status_code": 200,
  "request_id": "5c63c65c-cad8-4bf4-959d-xxxxxxxxxxxx",
  "code": "",
  "message": "",
  "output": {
    "text": null,
    "choices": null,
    "finish_reason": "stop",
    "audio": {
      "url": "https://example.oss.aliyuncs.com/audio-result.wav?Expires=1766113409&OSSAccessKeyId=LTAIxxxx&Signature=xxxx",
      "data": "",
      "id": "audio_5c63c65c-cad8-4bf4-959d-xxxxxxxxxxxx",
      "expires_at": 1766113409
    }
  },
  "usage": {
    "input_tokens": 0,
    "output_tokens": 0,
    "total_tokens": 1121,
    "characters": 195,
    "input_tokens_details": {
      "text_tokens": 76
    },
    "output_tokens_details": {
      "audio_tokens": 1045,
      "text_tokens": 0
    }
  }
}
The DashScope Python SDK example uses MultiModalConversation rather than SpeechSynthesizer; the two interfaces are called the same way and take the same parameters.
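A response shaped like the sample above can be validated and unpacked with a small helper. The following is a minimal sketch, not part of the SDK: it assumes the response has already been converted to a plain dict with the fields shown in the sample (`status_code`, `request_id`, `code`, `message`, `output.audio.url`, `output.audio.expires_at`), and that `expires_at` is a Unix timestamp in seconds (an assumption based on the `Expires` value in the sample URL). The name `extract_audio_url` is hypothetical.

```python
import time
from typing import Optional


def extract_audio_url(response: dict, now: Optional[float] = None) -> str:
    """Return the audio URL from a response dict, or raise if it is unusable."""
    # A non-200 status carries the error details in `code` and `message`.
    if response.get("status_code") != 200:
        raise RuntimeError(
            f"request {response.get('request_id')} failed: "
            f"{response.get('code')} {response.get('message')}"
        )
    audio = (response.get("output") or {}).get("audio") or {}
    url = audio.get("url")
    if not url:
        raise ValueError("response contains no audio URL")
    # Assumption: expires_at is a Unix timestamp in seconds.
    expires_at = audio.get("expires_at")
    current = time.time() if now is None else now
    if expires_at is not None and current >= expires_at:
        raise ValueError("audio URL has expired")
    return url
```

Once a valid URL is returned, the audio file can be saved with a standard downloader such as `urllib.request.urlretrieve(url, "output.wav")`. The `now` parameter exists only to make the expiry check testable.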

Authentication

string
header
required

Qwen Cloud API key. See "Get an API key".

Header parameters

enum<string>

Set to enable to stream the output over HTTP. The Python SDK uses the stream parameter instead of this setting; the Java SDK uses the streamCall interface instead.

enable
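When streaming is enabled over plain HTTP, the client has to split the response body into events itself. The sketch below assumes the stream follows the standard server-sent events format, with each event carrying one JSON payload on its `data:` lines; the payload shape is not described in this section, so the helper only parses JSON and makes no claim about its fields. The name `iter_sse_json` is hypothetical.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per server-sent event."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # Drop the field name and one optional leading space.
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
        # Other fields (id:, event:, comments) are ignored in this sketch.
    if data_parts:
        # Lenient choice: also flush an event not followed by a blank line.
        yield json.loads("\n".join(data_parts))
```

The function accepts any iterable of text lines, so it can be fed from a decoded HTTP response body or from a list of strings when testing.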

Request body

application/json
string
required

Model name.

object
required

Input parameters for speech synthesis.
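For callers that do not use an SDK, the request can be assembled with the Python standard library. This is a sketch under stated assumptions rather than a definitive client: the host comes from the base URL in the Python sample and the path from the endpoint above, while two details are assumptions not stated in this section, namely that the API key is sent as `Authorization: Bearer <key>` and that `text` and `voice` are nested under an `input` object. The name `build_tts_request` is hypothetical.

```python
import json
import urllib.request

ENDPOINT = (
    "https://dashscope.aliyuncs.com"
    "/api/v1/services/aigc/multimodal-generation/generation"
)


def build_tts_request(text: str, voice: str, api_key: str,
                      model: str = "qwen3-tts-flash") -> urllib.request.Request:
    """Build (but do not send) a POST request for speech synthesis."""
    # Assumption: text and voice are nested under an "input" object.
    body = {"model": model, "input": {"text": text, "voice": voice}}
    headers = {
        # Assumption: the API key uses the Bearer token scheme.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes it possible to inspect the headers and body before any network call is made.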

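Response

200 - application/json
status_code
integer

HTTP status code. Examples: 200 (success), 400 (client error), 401 (unauthorized), 404 (not found), 500 (server error).

200
request_id
string

Unique ID of this request. Use it to locate and troubleshoot problems.

5c63c65c-cad8-4bf4-959d-xxxxxxxxxxxx
code
string

Error code, returned when the request fails. See the error code reference.

message
string

Error message, returned when the request fails. See the error code reference.

output
object

The model's output.

usage
object

Token or character consumption. Qwen-TTS returns token consumption; Qwen3-TTS-Flash returns character consumption.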
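Because the consumption unit depends on the model, code that logs usage may want to normalize it into a single (unit, amount) pair. The sketch below assumes the field names from the sample response (`characters`, `total_tokens`). Its rule of preferring a non-zero character count over the token count is an illustrative assumption, not documented billing logic, and the name `summarize_usage` is hypothetical.

```python
from typing import Tuple


def summarize_usage(usage: dict) -> Tuple[str, int]:
    """Return (unit, amount) for a usage dict from the response."""
    # Assumption: a non-zero character count means character-based usage.
    characters = usage.get("characters") or 0
    if characters > 0:
        return ("characters", characters)
    # Otherwise fall back to the total token count, defaulting to 0.
    return ("tokens", usage.get("total_tokens") or 0)
```

For the sample response above, which reports both 195 characters and 1121 total tokens, this helper would report the character count.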