Custom TTS

Bring your own TTS engine by implementing TRTC's custom streaming protocol. This option gives you full control over the synthesis pipeline: you can use a proprietary model, an on-premises deployment, or any third-party service that is not natively supported. Choose it if you have specialized voice requirements or need to integrate an in-house TTS solution.

Usage

To use a custom TTS engine, your TTS service must implement the TRTC custom TTS streaming protocol. Then pass the following JSON in the TTSConfig field of the StartAIConversation API:

    {
      "TTSType": "custom",
      "APIKey": "<your_api_key>",
      "APIUrl": "http://0.0.0.0:8080/stream-audio",
      "AudioFormat": "wav",
      "SampleRate": 16000,
      "AudioChannel": 1
    }

For the complete TTSConfig parameter reference, see Text-to-Speech Configuration.
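
As an illustration of the configuration above, the following minimal Python sketch builds the TTSConfig value programmatically. The helper name build_custom_tts_config is hypothetical and not part of any TRTC SDK, and the sketch assumes that the TTSConfig field accepts the configuration serialized as a JSON string; check the StartAIConversation API reference before relying on that. The sketch only builds the value and does not call the API.

```python
import json


def build_custom_tts_config(api_key, api_url, audio_format="wav",
                            sample_rate=16000, audio_channel=1):
    """Build the custom TTS configuration and serialize it to JSON.

    Hypothetical helper for illustration only. The keyword defaults
    mirror the defaults listed in the parameter reference.
    """
    config = {
        "TTSType": "custom",          # fixed value for a custom engine
        "APIKey": api_key,            # key your TTS service authenticates with
        "APIUrl": api_url,            # endpoint of your TTS service
        "AudioFormat": audio_format,  # "pcm" or "wav"
        "SampleRate": sample_rate,    # in Hz
        "AudioChannel": audio_channel,
    }
    # Assumption: TTSConfig is passed as a JSON string.
    return json.dumps(config)


# Placeholder values taken from the example configuration.
tts_config = build_custom_tts_config(
    "<your_api_key>", "http://0.0.0.0:8080/stream-audio"
)
print(tts_config)
```

The resulting string would then be supplied as the TTSConfig field of the StartAIConversation request, alongside the other request parameters.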

Parameter reference

Field | Type | Required | Description
------------ | ------- | -------- | -----------
TTSType | String | Yes | Fixed value: "custom".
APIKey | String | Yes | API key for authentication with your TTS service.
APIUrl | String | Yes | Your TTS service endpoint URL.
AudioFormat | String | No | Output audio format. Currently supports: pcm, wav. Default: wav.
SampleRate | Integer | No | Audio sample rate in Hz. Default: 16000 (16 kHz). Recommended: 16000.
AudioChannel | Integer | No | Number of audio channels. 1 (mono) or 2 (stereo). Default: 1.
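
The constraints in the parameter reference can be checked before a request is sent. The sketch below is one possible client-side check, not an official validator: validate_tts_config and the constant names are hypothetical, and the rules encode only what the parameter reference states (required fields, supported formats, channel counts, defaults), plus an assumed requirement that APIUrl is an http or https URL.

```python
from urllib.parse import urlparse

REQUIRED_FIELDS = ("TTSType", "APIKey", "APIUrl")
SUPPORTED_FORMATS = {"pcm", "wav"}
DEFAULTS = {"AudioFormat": "wav", "SampleRate": 16000, "AudioChannel": 1}


def validate_tts_config(config):
    """Check a custom TTS config dict and return a copy with defaults applied.

    Hypothetical helper; raises ValueError on the first rule violated.
    """
    missing = [name for name in REQUIRED_FIELDS if not config.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if config["TTSType"] != "custom":
        raise ValueError('TTSType must be "custom"')

    # Assumption: the endpoint must be an absolute http(s) URL.
    url = urlparse(config["APIUrl"])
    if url.scheme not in ("http", "https") or not url.netloc:
        raise ValueError("APIUrl must be an absolute http(s) URL")

    # Fill in the documented defaults for omitted optional fields.
    merged = {**DEFAULTS, **config}
    if merged["AudioFormat"] not in SUPPORTED_FORMATS:
        raise ValueError("AudioFormat must be pcm or wav")
    if type(merged["SampleRate"]) is not int or merged["SampleRate"] <= 0:
        raise ValueError("SampleRate must be a positive integer")
    if type(merged["AudioChannel"]) is not int or merged["AudioChannel"] not in (1, 2):
        raise ValueError("AudioChannel must be 1 (mono) or 2 (stereo)")
    return merged


# Only the required fields are given; optional ones fall back to defaults.
checked = validate_tts_config({
    "TTSType": "custom",
    "APIKey": "<your_api_key>",
    "APIUrl": "http://0.0.0.0:8080/stream-audio",
})
print(checked)
```

Returning a merged copy rather than modifying the input keeps the caller's dictionary unchanged and makes the effective values of the optional fields explicit.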
Note: For the custom TTS protocol specification, see Customize TTS Protocol.