Conversational AI Subtitle Callback

Tencent RTC Conversational AI can display real-time subtitles. Subtitles are delivered through Tencent RTC's Custom Message feature, which keeps them synchronized with the audio dialogue at millisecond-level latency.
Using the Tencent RTC SDK's Receive Custom Messages feature, you can listen for callbacks on the client to receive real-time AI subtitle data. The cmdID is fixed at 1.

Features

1. Real-time: Subtitles stay in sync with the audio dialogue, with millisecond-level delay.
2. Flexibility: The custom message format is easy to integrate and extend.

Message Format

Real-time subtitle messages use JSON format, with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| type | Number | Message type; 10000 indicates real-time subtitles |
| sender | String | Speaker's `userid` |
| receiver | Array | List of recipient `userid`s; in practice the message is broadcast to the whole room |
| payload | Object | Message payload, containing detailed subtitle information |
The payload object contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| text | String | Original text from ASR |
| start_time | String | Start time of this sentence, format: "HH:MM:SS" |
| end_time | String | End time of this sentence, format: "HH:MM:SS" |
| roundid | String | Unique ID identifying a single conversation |
| end | Boolean | If true, this represents a complete sentence |
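
Because start_time and end_time arrive as "HH:MM:SS" strings, comparing or subtracting them usually means converting them to a number first. The following is a minimal sketch of one way to do that; `clockToSeconds` and `sentenceDuration` are hypothetical helper names, not part of the Tencent RTC SDK.

```javascript
// Hypothetical helper: convert an "HH:MM:SS" string to a number of seconds.
// Returns NaN for anything that does not match the expected shape.
function clockToSeconds(clock) {
  const match = /^(\d{2,}):([0-5]\d):([0-5]\d)$/.exec(clock);
  if (!match) return NaN;
  const [, hours, minutes, seconds] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds);
}

// Hypothetical helper: length of a sentence in seconds,
// or NaN if either timestamp in the payload is malformed.
function sentenceDuration(payload) {
  return clockToSeconds(payload.end_time) - clockToSeconds(payload.start_time);
}
```

Returning NaN instead of throwing lets the caller decide whether a malformed timestamp should hide the subtitle or simply skip the timing logic.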

Sample Message

{
  "type": 10000,
  "sender": "user_a",
  "receiver": [],
  "payload": {
    "text": "Hello, nice to meet you.",
    "start_time": "00:00:01",
    "end_time": "00:00:03",
    "roundid": "conversation_123456",
    "end": true
  }
}
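
Before reading fields from a parsed message, it can help to confirm that the object has the shape described in the tables above. The check below is a sketch under that assumption; `isSubtitleMessage` and `SUBTITLE_TYPE` are illustrative names, and the set of fields checked is a choice made here, not a requirement stated by the SDK.

```javascript
// Message type for real-time subtitles, as listed in the field table.
const SUBTITLE_TYPE = 10000;

// Hypothetical shape check: returns true only when the parsed object
// looks like a real-time subtitle message with the fields used for display.
function isSubtitleMessage(msg) {
  if (msg === null || typeof msg !== "object") return false;
  if (msg.type !== SUBTITLE_TYPE || typeof msg.sender !== "string") return false;
  const p = msg.payload;
  return (
    p !== null &&
    typeof p === "object" &&
    typeof p.text === "string" &&
    typeof p.roundid === "string" &&
    typeof p.end === "boolean"
  );
}
```

Applied to the sample message, `isSubtitleMessage` returns true; a message with another `type` value or without a `payload` returns false.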

Implementation Notes

1. Message Processing: The receiver needs to correctly parse the JSON message and identify real-time subtitle messages based on the type field.
2. Time Synchronization: Use start_time and end_time to ensure the subtitles align correctly with the audio.
3. Dialogue Segmentation: Use the end field to determine if a sentence has ended. This can be used for interface updates or storing complete dialogues.
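
The dialogue segmentation note can be sketched as a small state holder that shows the latest in-progress text and stores a sentence once end is true. This assumes each intermediate message carries the full text of the sentence so far, which the description above does not state explicitly; `SubtitleTracker` and its methods are hypothetical names.

```javascript
// Hypothetical tracker: keeps the latest interim text per speaker and round,
// and moves a sentence to the completed list once payload.end is true.
class SubtitleTracker {
  constructor() {
    this.pending = new Map(); // "sender:roundid" -> latest interim text
    this.completed = [];      // finished sentences, in arrival order
  }

  // Apply one already-parsed subtitle message (type 10000).
  apply(msg) {
    const { sender, payload } = msg;
    const key = `${sender}:${payload.roundid}`;
    if (payload.end) {
      this.pending.delete(key);
      this.completed.push({ sender, roundid: payload.roundid, text: payload.text });
    } else {
      // Assumption: interim text replaces, rather than extends, the previous one.
      this.pending.set(key, payload.text);
    }
  }

  // Text to show for a sentence that is still in progress ("" if none).
  liveText(sender, roundid) {
    return this.pending.get(`${sender}:${roundid}`) ?? "";
  }
}
```

A UI would typically render `liveText` for the active speaker and append entries from `completed` to the transcript.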

Web SDK Custom Message Parsing

trtcClient.on(TRTC.EVENT.CUSTOM_MESSAGE, (event) => {
  // Decode the message bytes to a string, then parse the JSON body
  const data = new TextDecoder().decode(event.data);
  const jsonData = JSON.parse(data);
  console.log(`receive custom msg from ${event.userId} cmdId: ${event.cmdId} seq: ${event.seq} data: ${data}`);
  if (jsonData.type === 10000 && jsonData.payload.end === false) {
    // Intermediate status of subtitles
  } else if (jsonData.type === 10000 && jsonData.payload.end === true) {
    // The sentence is finished
  }
});
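
The handler above throws if a custom message is not valid JSON, and it parses messages on every cmdId. A more defensive variant might filter on the subtitle cmdId (fixed at 1, as noted earlier) and catch parse errors. The sketch below assumes only the `cmdId` and `data` fields already used in the handler; `decodeCustomMessage` is a hypothetical helper, not an SDK function.

```javascript
// Subtitle messages use a fixed cmdId of 1.
const SUBTITLE_CMD_ID = 1;

// Hypothetical helper: returns the parsed JSON body of a subtitle custom
// message, or null if the cmdId does not match or the data is not valid JSON.
function decodeCustomMessage(event) {
  if (event.cmdId !== SUBTITLE_CMD_ID) return null;
  try {
    return JSON.parse(new TextDecoder().decode(event.data));
  } catch (err) {
    return null;
  }
}
```

Inside the event callback, the result can be checked for null before inspecting `type` and `payload.end`.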