Text-To-Speech Configuration
This article describes how to configure the TTSConfig parameter in the StartAIConversation API.
Supported Configurations
Fill in the TTS parameters with credentials from your own third-party account.
Tencent TTS
{
  "TTSType": "tencent", // String, TTS type. Currently supports "tencent" and "minimax"; support for other vendors is ongoing.
  "AppId": "Your Application ID", // String, required
  "SecretId": "Your Secret ID", // String, required
  "SecretKey": "Your Secret Key", // String, required
  "VoiceType": 101001, // Integer, required. Voice ID, covering standard and premium timbres. Premium timbres sound more realistic and are priced differently from standard timbres. See the Text To Speech billing overview for details, and the Text To Speech timbre list for the complete list of voice IDs.
  "Speed": 1.25, // Float, optional. Speech speed, range: [-2, 6]. -2 = 0.6x, -1 = 0.8x, 0 = 1.0x (default), 1 = 1.2x, 2 = 1.5x, 6 = 2.5x. For finer control, use up to 2 decimal places, such as 0.5, 1.25, or 2.81. For the conversion between parameter values and actual speech speed, see Speech Speed Conversion.
  "Volume": 5, // Integer, optional. Volume level, range: [0, 10], giving 11 levels. Default is 0, which represents normal volume.
  "PrimaryLanguage": 1, // Integer, optional. Primary language: 1 = Chinese (default), 2 = English, 3 = Japanese.
  "FastVoiceType": "xxxx" // Optional. Parameter for Voice Reproduce.
}
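The documented ranges for Speed, Volume, and PrimaryLanguage can be checked before the configuration is sent. The following is a minimal sketch, not an official client: the function name `build_tencent_tts_config` is hypothetical, and it assumes the API accepts TTSConfig as a serialized JSON string, which this section does not confirm.

```python
import json

# Ranges taken from the comments in the Tencent TTS sample above.
SPEED_RANGE = (-2.0, 6.0)
VOLUME_RANGE = (0, 10)
PRIMARY_LANGUAGES = (1, 2, 3)  # 1 = Chinese, 2 = English, 3 = Japanese


def build_tencent_tts_config(app_id, secret_id, secret_key, voice_type,
                             speed=0.0, volume=0, primary_language=1):
    """Validate the optional fields and return the config as a JSON string.

    Hypothetical helper: passing TTSConfig as a JSON string is an assumption.
    """
    if not SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]:
        raise ValueError(f"Speed must be within {SPEED_RANGE}, got {speed}")
    if not VOLUME_RANGE[0] <= volume <= VOLUME_RANGE[1]:
        raise ValueError(f"Volume must be within {VOLUME_RANGE}, got {volume}")
    if primary_language not in PRIMARY_LANGUAGES:
        raise ValueError(f"PrimaryLanguage must be one of {PRIMARY_LANGUAGES}")

    config = {
        "TTSType": "tencent",
        "AppId": app_id,
        "SecretId": secret_id,
        "SecretKey": secret_key,
        "VoiceType": voice_type,
        "Speed": round(speed, 2),  # the sample allows up to 2 decimal places
        "Volume": volume,
        "PrimaryLanguage": primary_language,
    }
    return json.dumps(config)
```

Calling `build_tencent_tts_config("Your Application ID", "Your Secret ID", "Your Secret Key", 101001, speed=1.25, volume=5)` returns a string that can be placed in the request body; out-of-range values raise `ValueError` locally instead of producing a failed API call.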
Minimax TTS
{
  "TTSType": "minimax", // String, TTS type
  "Model": "speech-01-turbo",
  "APIUrl": "https://api.minimax.chat/v1/t2a_v2",
  "APIKey": "eyxxxx",
  "GroupId": "181000000000000",
  "VoiceType": "female-tianmei",
  "Speed": 1.2
}
API | T2A V2 (Speech Generation) | T2A Pro (Speech Generation) | T2A (Speech Generation) | T2A Stream (Streaming Speech Generation) | T2A Stream (Streaming Speech Generation) |
Model | speech-01-turbo, speech-01-240228, speech-01-turbo-240228 | speech-01, speech-02 | speech-01, speech-02 | speech-01 | speech-01 |
Limit Type | RPM | RPM | RPM | RPM | CONN (Maximum Number of Parallel Running Tasks) |
Free plan | 3 | 3 | 3 | 3 | 1 |
Paid plan | 20 | 20 | 20 | 20 | 3 |
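The RPM limits in the table above can be respected on the client side by throttling requests before they are sent. The sketch below is one possible approach, assuming RPM means requests per minute counted over a 60-second sliding window; how Minimax actually measures the window is not stated here. The class name `RpmLimiter` is hypothetical.

```python
import time
from collections import deque


class RpmLimiter:
    """Sliding-window limiter: at most `rpm` calls in any 60-second window.

    The 60-second sliding window is an assumption, not a documented rule.
    """

    def __init__(self, rpm, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of accepted calls, oldest first

    def try_acquire(self):
        """Return True and record the call if a slot is free, else False."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.calls and now - self.calls[0] >= 60.0:
            self.calls.popleft()
        if len(self.calls) < self.rpm:
            self.calls.append(now)
            return True
        return False
```

For example, `RpmLimiter(3)` matches the free-plan value in the table and `RpmLimiter(20)` the paid-plan value; a caller would check `try_acquire()` before each synthesis request and wait or queue the request when it returns `False`.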
Azure TTS
{
  "TTSType": "azure", // String, required. TTS type.
  "SubscriptionKey": "xxxxxxxx", // String, required. Subscription key.
  "Region": "chinanorth3", // String, required. Region of the subscription.
  "VoiceName": "zh-CN-XiaoxiaoNeural", // String, required. Voice name.
  "Language": "zh-CN", // String, required. Language for synthesis.
  "Rate": 1 // Float, optional. Speech speed, range: 0.5 to 2. Default is 1.
}
Cartesia TTS
{
  "TTSType": "cartesia", // String, required. TTS type.
  "Model": "sonic-multilingual", // Required. Model.
  "APIKey": "eyxxxx", // Required. Your API key.
  "VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0" // Required. Voice ID; see https://play.cartesia.ai/
}
ElevenLabs TTS
{
  "TTSType": "elevenlabs", // String, required. TTS type.
  "Model": "eleven_turbo_v2_5", // Required. Model type.
  "APIKey": "eyxxxx",
  "VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0" // Voice ID; see https://elevenlabs.io/docs/api-reference/get-voices
}
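The fields marked "required" in the Azure, Cartesia, and ElevenLabs samples above can be checked locally before a request is made. This is a small sketch under that reading of the comments; the function name `missing_required_fields` is hypothetical, and the table lists only fields that the samples explicitly mark as required.

```python
# Required fields as marked in the sample comments above.
# Fields without a "required" marker (for example the ElevenLabs APIKey)
# are deliberately left out rather than guessed.
REQUIRED_FIELDS = {
    "azure": ["TTSType", "SubscriptionKey", "Region", "VoiceName", "Language"],
    "cartesia": ["TTSType", "Model", "APIKey", "VoiceId"],
    "elevenlabs": ["TTSType", "Model"],
}


def missing_required_fields(config):
    """Return the names of required fields that are absent or empty."""
    tts_type = config.get("TTSType")
    if tts_type not in REQUIRED_FIELDS:
        raise ValueError(f"No required-field list for TTSType {tts_type!r}")
    return [name for name in REQUIRED_FIELDS[tts_type]
            if config.get(name) in (None, "")]
```

An empty list means every marked field is present; otherwise the returned names can be reported to the caller before any network call is attempted.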
Custom TTS
{
  "TTSType": "custom", // String, required
  "APIKey": "ApiKey", // String, required. Used for authentication.
  "APIUrl": "http://0.0.0.0:8080/stream-audio", // String, required. TTS API URL.
  "AudioFormat": "wav", // String, optional. Expected output audio format, such as mp3, ogg_opus, pcm, or wav. Default is wav. Currently only pcm and wav are supported.
  "SampleRate": 16000, // Integer, optional. Audio sample rate. Default is 16000 (16 kHz), which is also the recommended value.
  "AudioChannel": 1 // Integer, optional. Number of audio channels: 1 or 2. Default is 1.
}
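A custom TTS service needs to return audio that matches the AudioFormat, SampleRate, and AudioChannel values it is configured with. The sketch below shows one way to wrap raw PCM samples in a WAV container with the default values (16 kHz, 1 channel) using Python's standard `wave` module. The function name `pcm_to_wav` is hypothetical, and 16-bit samples are an assumption; the request and response protocol of the custom endpoint is not specified in this section.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the WAV bytes.

    sample_width=2 (16-bit samples) is an assumption, not a documented value.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)       # matches "AudioChannel"
        wav_file.setsampwidth(sample_width)   # bytes per sample
        wav_file.setframerate(sample_rate)    # matches "SampleRate"
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

Keeping the defaults aligned with the configuration above means the service output and the declared SampleRate and AudioChannel cannot drift apart unnoticed.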