
Text-to-Speech Configuration

This document describes how to configure the TTSConfig parameter of the StartAIConversation API.
You can use TRTC’s built-in TTS or bring your own (BYO) third-party TTS service.

TRTC TTS Configuration

If you choose TRTC Real-Time TTS for your conversational AI scenarios, follow the guide below for rapid integration. For service activation and billing rules, refer to Billing instructions.
{
"TTSType": "flow", // [Required] Fixed value.
"VoiceId": "v-female-R2s4N9qJ", // [Required] Premium or cloned voice ID. See the voice library below for available IDs.
"Model": "flow_01_turbo", // [Required] The current default TTS model version.
"Speed": 1.0, // [Optional] Speech rate. Range: [0.5, 2.0]. Default: 1.0. Higher values indicate faster speech.
"Volume": 1.0, // [Optional] Volume level. Range: [0, 10]. Default: 1.0. Higher values indicate louder volume.
"Pitch": 0, // [Optional] Pitch adjustment. Range: [-12, 12]. Default: 0 (original tone). Higher values result in a higher pitch.
"Language": "zh" // [Optional] Recommended. Language ID. Supports "zh" (Chinese), "en" (English), and "yue" (Chinese-Cantonese). "zh" and "en" are ISO 639-1 codes; "yue" is the ISO 639-3 code for Cantonese.
}
Note:
TRTC TTS offers a library of premium voices in Chinese and English, as listed in the table below. For additional voice or language requirements, please contact us.
To integrate a third-party TTS service for your Conversational AI scenarios, please refer to BYO TTS Configurations.
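
As an illustration of the parameter ranges listed above, the following Python sketch builds a TRTC TTS configuration and rejects out-of-range values before the request is sent. The helper name `build_trtc_tts_config` is hypothetical, and serializing the result as a JSON string is an assumption about how TTSConfig is passed to StartAIConversation; check the API reference for the exact request shape.

```python
import json

# Ranges copied from the parameter comments in the sample above.
_RANGES = {"Speed": (0.5, 2.0), "Volume": (0, 10), "Pitch": (-12, 12)}


def build_trtc_tts_config(voice_id, language=None, **optional):
    """Return a TTSConfig dict for TRTC TTS, rejecting out-of-range values."""
    config = {"TTSType": "flow", "VoiceId": voice_id, "Model": "flow_01_turbo"}
    for name, value in optional.items():
        if name not in _RANGES:
            raise ValueError(f"unknown optional parameter: {name}")
        low, high = _RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        config[name] = value
    if language is not None:
        config["Language"] = language
    return config


config = build_trtc_tts_config("v-female-R2s4N9qJ", language="zh", Speed=1.2)
# Assumption: TTSConfig is sent as a JSON string, so the // comments are not included.
tts_config_json = json.dumps(config)
```

Validating locally like this only mirrors the documented ranges; the service remains the authority on which values it accepts.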

Voice Library

The voices in the TRTC premium TTS voice library are listed below. You can select and configure a voice based on your preferences.
| Voice Name | Voice ID | Language | Language ID |
|---|---|---|---|
| Commanding CEO - Male | v-male-Bk7vD3xP | Chinese | zh |
| Gentle Lady | v-female-R2s4N9qJ | Chinese | zh |
| Tsundere Girl | v-female-m1KpW7zE | Chinese | zh |
| Cutesy Girl | v-female-U8aT2yLf | Chinese | zh |
| Casual Man | v-male-s5NqE0rZ | Chinese | zh |
| Natural Man | v-male-W1tH9jVc | Chinese | zh |
| Customer Service - Xiaomei (Sweet Girl) | female-kefu-xiaomei | Chinese | zh |
| Customer Service - Xiaoxin (Soft Female) | female-kefu-xiaoxin | Chinese | zh |
| Customer Service - Xiaoyue (Cheerful Female) | female-kefu-xiaoyue | Chinese | zh |
| Customer Service - Xiaoxu (Professional Male) | male-kefu-xiaoxu | Chinese | zh |
| Articulate Narrator - Female | v-female-p9Xy7Q1L | English (US) | en |
| Analytical Presenter - Female | v-female-Z3x9LmQ2 | English (US) | en |
| Scholarly Lecturer - Male | v-male-A4b9KqP2 | English (US) | en |
| Expert Analyst - Male | v-male-r7K2pQ9L | English (US) | en |
| Calm Reviewer - Male | v-male-Q6p8ZxL3 | English (US) | en |
| Mindfulness Coach - Female | v-female-T3s8BqL9 | English (US) | en |
| Gentle Mentor - Male | v-male-P6q7LzD8 | English (US) | en |
| Reserved Broadcaster - Female | v-female-M7k2PxL9 | English (US) | en |
| Serene Voice Actress | v-female-S5n9QxJ4 | English (US) | en |
| Composed Voice Actress | v-female-T8m4WxP7 | English (US) | en |
| Resonant Reviewer - Male | v-male-D6p3KxN8 | English (US) | en |
| Empathic Host - Female | v-female-A9b3KfL2 | English (US) | en |
| Sincere Storyteller - Female | v-female-A7h2MxQ5 | English (US) | en |
| Gentle Storyteller - Male | v-male-G4n7RxM3 | English (US) | en |
| Caring Counselor | v-male-H3p9LxK7 | English (US) | en |
| Sincere Streamer - Male | v-male-R6n2MxT9 | English (US) | en |
| Confident Actress | v-female-C8k4NxL6 | English (US) | en |
| Uplifting Speaker - Male | v-male-L7m5QxP4 | English (US) | en |
| Rational Commentator - Male | v-male-N4k8TxR7 | English (US) | en |
| Intellectual Narrator - Female | v-female-B7k5WxN4 | English (US) | en |
| Elegant Narrator - Female | v-female-k3P8sL0Q | Chinese-Cantonese | yue |
| Composed Narrator - Male | v-male-L4s7PqZ9 | Chinese-Cantonese | yue |
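
Because each voice in the library is listed with a language ID, one way to avoid a mismatched Language value is to derive it from the voice ID. The sketch below copies a few entries from the voice library; `VOICE_LANGUAGE` and `with_matching_language` are hypothetical names, and it assumes you want Language to always match the voice's listed language ID, which this page recommends but does not require.

```python
# A few entries copied from the voice library: voice ID -> language ID.
VOICE_LANGUAGE = {
    "v-female-R2s4N9qJ": "zh",   # Gentle Lady
    "male-kefu-xiaoxu": "zh",    # Customer Service - Xiaoxu (Professional Male)
    "v-female-p9Xy7Q1L": "en",   # Articulate Narrator - Female
    "v-male-L4s7PqZ9": "yue",    # Composed Narrator - Male
}


def with_matching_language(config):
    """Return a copy of config whose Language matches the language listed for its VoiceId."""
    voice_id = config["VoiceId"]
    if voice_id not in VOICE_LANGUAGE:
        raise KeyError(f"voice ID not in local table: {voice_id}")
    return {**config, "Language": VOICE_LANGUAGE[voice_id]}


fixed = with_matching_language(
    {"TTSType": "flow", "VoiceId": "v-male-L4s7PqZ9", "Model": "flow_01_turbo", "Language": "zh"}
)
```

Cloned voice IDs will not appear in such a table, so a real integration would need a fallback for them.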

BYO TTS Configurations

If you choose to bring your own (BYO) third-party TTS service, you will need to prepare the corresponding TTS service account and API key. Please refer to the following sections for configuration instructions regarding different service providers.

Azure TTS

{
"TTSType": "azure", // Required. TTS type in string format.
"SubscriptionKey": "xxxxxxxx", // Required. Subscription key in string format.
"Region": "southeastasia", // Required. Subscription region in string format.
"VoiceName": "en-US-AmandaMultilingualNeural", // Required. Voice name in string format.
"Language": "en-US", // Required. Language for TTS in string format.
"Rate": 1 // Optional. Speech speed in float format. Value range: 0.5–2. Default value: 1.
}

Cartesia TTS

{
"TTSType": "cartesia", // Required. TTS type in string format.
"Model": "sonic-multilingual", // Required. TTS model name.
"APIKey": "eyxxxx", // Required. Your Cartesia API key.
"VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0" // Required. Voice ID. Visit https://play.cartesia.ai/ for details.
}

ElevenLabs TTS

{
"TTSType": "elevenlabs", // Required. String. Specifies the TTS provider type.
"Model": "eleven_turbo_v2_5", // Required. Model Type.
"APIKey": "eyxxxx", // Required. The API key used to authenticate requests.
"VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0" // Required. The voice ID. See https://elevenlabs.io/docs/api-reference/get-voices for details.
}

Tencent TTS

{
"TTSType": "tencent", // Required. TTS type in string format. Set to "tencent" for Tencent TTS.
"AppId": "Your application ID", // Required. The value is in string format.
"SecretId": "Your key ID", // Required. The value is in string format.
"SecretKey": "Your key", // Required. The value is in string format.
"VoiceType": 101001, // Required. Voice ID in integer format. Standard and premium voices are supported. Premium voices sound more natural and are priced differently from standard voices. See the TTS billing overview for details. For the complete list of voice IDs, see the TTS timbre list.
"Speed": 1.25, // Optional. Speech speed as a number. Value range: [-2, 6]. Integer values map to preset speeds: -2: 0.6 times; -1: 0.8 times; 0: 1.0 times (default value); 1: 1.2 times; 2: 1.5 times; 6: 2.5 times. For finer control, the value can have up to 2 decimal places, such as 0.5, 1.25, and 2.81. For the conversion between the parameter value and actual speech speed, see Speech Speed Conversion.
"Volume": 5, // Optional. Volume level in integer format. Value range: [0, 10], corresponding to 11 volume levels. The default value is 0, representing the normal volume.
"PrimaryLanguage": 1, // Optional. Primary language in integer format. 1: Chinese (default value); 2: English; 3: Japanese.
"FastVoiceType": "xxxx" // Optional. Parameter for fast voice cloning.
}
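
The preset Speed values listed above can be kept in a small lookup table. The following Python sketch picks the preset whose speed multiplier is closest to a requested one; `TENCENT_SPEED_PRESETS` and `nearest_speed_preset` are hypothetical names, and the sketch deliberately covers only the presets stated on this page. The exact conversion for fractional values is defined in Speech Speed Conversion and is not reproduced here.

```python
# Preset values from the Speed comment above: parameter value -> speed multiplier.
TENCENT_SPEED_PRESETS = {-2: 0.6, -1: 0.8, 0: 1.0, 1: 1.2, 2: 1.5, 6: 2.5}


def nearest_speed_preset(multiplier):
    """Return the preset Speed value whose multiplier is closest to the requested one."""
    return min(
        TENCENT_SPEED_PRESETS,
        key=lambda value: abs(TENCENT_SPEED_PRESETS[value] - multiplier),
    )


speed_value = nearest_speed_preset(1.25)  # closest listed preset is 1 (1.2 times)
```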

MiniMax TTS

{
"TTSType": "minimax", // Required. String. Specifies the TTS provider type.
"Model": "speech-02-turbo", // Required. The TTS model type.
"APIUrl": "https://api.minimax.chat/v1/t2a_v2", // Required. The API endpoint URL.
"APIKey": "eyxxxx", // Required. String. The API key used for authentication.
"GroupId": "181000000000000", // Required. The MiniMax group ID associated with your account.
"VoiceType": "female-tianmei", // Required. String. The requested voice identifier (voice_id).
"Speed": 1.2 // Optional. Float. Speech speed multiplier. Valid range: [0.5, 2.0]. Default is 1.0.
}
For details and rate limits, see the MiniMax documentation. Rate limits may cause response lag.

| API | Model | Limit type | Free account | Paid account |
|---|---|---|---|---|
| T2A V2 (Speech generation) | speech-2.6-hd, speech-2.6-turbo, speech-02-hd, speech-02-turbo, speech-01-hd, speech-01-turbo | RPM | 3 | 20 |
| T2A Pro (Speech generation) | speech-01, speech-02 | RPM | 3 | 20 |
| T2A (Speech generation) | speech-01, speech-02 | RPM | 3 | 20 |
| T2A Stream (Streaming speech generation) | speech-01 | RPM | 3 | 20 |
| T2A Stream (Streaming speech generation) | speech-01 | CONN (maximum number of parallel tasks) | 1 | 3 |

Custom TTS

{
"TTSType": "custom", // Required. The value is in string format.
"APIKey": "ApiKey", // Required. API key in string format for authentication.
"APIUrl": "http://0.0.0.0:8080/stream-audio", // Required. TTS API URL in string format.
"AudioFormat": "wav", // Optional. Expected output audio format in string format. Default value: wav. Currently, only pcm and wav are supported; other formats such as mp3 and ogg_opus are not yet supported.
"SampleRate": 16000, // Optional. Audio sampling rate in integer format. Default value: 16000 (16 kHz). Recommended value: 16000.
"AudioChannel": 1 // Optional. Number of audio channels in integer format. Valid values: 1 and 2. Default value: 1.
}
For specific protocol specifications, see Custom TTS protocol.
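
The samples on this page annotate JSON with // comments, which standard JSON parsers such as Python's `json` module reject. If you copy a sample into your own tooling, a minimal sketch like the one below can strip those comments first. `strip_line_comments` is a hypothetical helper, the sample text is abbreviated from the Custom TTS block above, and whether the API itself tolerates comments is not stated here, so sending plain JSON is the safer assumption.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


sample = """{
"TTSType": "custom", // Required. The value is in string format.
"APIUrl": "http://0.0.0.0:8080/stream-audio", // Required. TTS API URL.
"SampleRate": 16000 // Optional. Default value: 16000.
}"""
config = json.loads(strip_line_comments(sample))
```

Tracking whether the scanner is inside a string literal is what keeps the `//` in URLs such as the APIUrl value intact.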