Custom

Bring your own TTS engine by implementing TRTC's custom TTS streaming protocol. This option gives you full control over the synthesis pipeline: you can use a proprietary model, an on-premises deployment, or any third-party service that is not natively supported. Choose it if you have specialized voice requirements or need to integrate an in-house TTS solution.
Usage
Your TTS service must implement the TRTC custom TTS streaming protocol. To use a custom TTS engine, pass the following JSON in the TTSConfig field of the StartAIConversation API:
{
  "TTSType": "custom",
  "APIKey": "<your_api_key>",
  "APIUrl": "http://0.0.0.0:8080/stream-audio",
  "AudioFormat": "wav",
  "SampleRate": 16000,
  "AudioChannel": 1
}
For the complete TTSConfig parameter reference, see the Text-to-Speech Configuration.
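As a rough illustration of the step above, the following Python sketch assembles the custom TTSConfig object and serializes it. It assumes the TTSConfig field is sent as a serialized JSON string, which should be confirmed against the StartAIConversation API reference. The helper name `build_custom_tts_config` and the endpoint `https://tts.example.com/stream-audio` are hypothetical placeholders, not part of TRTC.

```python
import json


def build_custom_tts_config(api_key, api_url, audio_format="wav",
                            sample_rate=16000, audio_channel=1):
    """Return the custom TTSConfig as a JSON string (hypothetical helper).

    The optional arguments default to the documented defaults:
    wav, 16000 Hz, 1 channel.
    """
    config = {
        "TTSType": "custom",       # fixed value for a custom engine
        "APIKey": api_key,         # key your own TTS service checks
        "APIUrl": api_url,         # endpoint of your own TTS service
        "AudioFormat": audio_format,
        "SampleRate": sample_rate,
        "AudioChannel": audio_channel,
    }
    return json.dumps(config)


if __name__ == "__main__":
    # Placeholder key and URL; replace with your own service details.
    print(build_custom_tts_config("<your_api_key>",
                                  "https://tts.example.com/stream-audio"))
```

Writing out the optional fields explicitly, even when they match the defaults, keeps the audio format your service must produce visible in one place.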
Parameter reference
Field	Type	Required	Description
`TTSType`	String	Yes	Fixed value: `"custom"`.
`APIKey`	String	Yes	API key for authentication with your TTS service.
`APIUrl`	String	Yes	Your TTS service endpoint URL.
`AudioFormat`	String	No	Output audio format. Currently supports: `pcm`, `wav`. Default: `wav`.
`SampleRate`	Integer	No	Audio sample rate. Default: 16000 (16 kHz). Recommended: 16000.
`AudioChannel`	Integer	No	Number of audio channels. `1` (mono) or `2` (stereo). Default: 1.
Note:
For the custom TTS protocol specification, see Customize TTS Protocol.
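The constraints in the parameter reference can be checked on the client before a conversation is started. The sketch below is one possible pre-flight check, not an official validator: the function name `validate_custom_tts_config` is hypothetical, and the rule that SampleRate must be a positive integer is an assumption, since the reference only states its type and default.

```python
SUPPORTED_FORMATS = {"pcm", "wav"}   # formats listed in the parameter reference
SUPPORTED_CHANNELS = {1, 2}          # 1 = mono, 2 = stereo


def validate_custom_tts_config(config):
    """Return a list of problems found in a custom TTSConfig dict.

    An empty list means the dict satisfies the documented constraints.
    Missing optional fields fall back to the documented defaults.
    """
    errors = []

    if config.get("TTSType") != "custom":
        errors.append('TTSType must be the fixed value "custom"')

    # APIKey and APIUrl are the two required string fields.
    for field in ("APIKey", "APIUrl"):
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field} is required and must be a non-empty string")

    audio_format = config.get("AudioFormat", "wav")
    if not isinstance(audio_format, str) or audio_format not in SUPPORTED_FORMATS:
        errors.append("AudioFormat must be pcm or wav")

    # Assumption: a usable sample rate is a positive integer.
    sample_rate = config.get("SampleRate", 16000)
    if type(sample_rate) is not int or sample_rate <= 0:
        errors.append("SampleRate must be a positive integer")

    channels = config.get("AudioChannel", 1)
    if type(channels) is not int or channels not in SUPPORTED_CHANNELS:
        errors.append("AudioChannel must be 1 (mono) or 2 (stereo)")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every misconfigured field at once.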
Next step: StartAIConversation API Reference
