Speech to Text
This document explains how to quickly integrate AI real-time transcription (speech-to-text) and translation features on the client side using the AITranscriberManager interface in the TRTC SDK.
Solution Overview
TRTC's AI real-time transcription and translation features let you convert audio streams in a room to text instantly and translate them into multiple target languages. With the SDK's AITranscriberManager, you can start transcription tasks, receive recognition results, and manage the process directly on the client. Unlike server-side integration, using the SDK removes the need to build your own backend for cloud API calls, streamlining your development workflow.
Prerequisites
Log in to the TRTC console, activate the TRTC service, and create an RTC-Engine application.
Purchase an RTC-Engine package (Lite version or above) to unlock the speech-to-text and real-time translation features.
Note:
The speech-to-text and real-time translation features are billed based on usage. For details, see Pricing.
Integration Process
Step 1: Integrate the TRTC SDK
Add the TRTC SDK to your project, join a TRTC room, and enable local microphone audio capture and publishing. iOS, Android, Windows, macOS, and Web clients can currently start transcription and translation tasks directly. See the integration guide for your platform to import the SDK into your project.
Note:
After importing the SDK, continue with the steps below.
For Web integration, refer to Web SDK Enable Real-time Voice Transcription and Translation.
Step 2: Obtain an AITranscriberManager Instance
AITranscriberManager is the main class for managing AI transcription features. Retrieve its instance from TRTCCloud.
Java:
    import com.tencent.liteav.transcriber.AITranscriberManager;

    TRTCCloud mTRTCCloud = TRTCCloud.sharedInstance(context);
    AITranscriberManager aiTranscriberManager = mTRTCCloud.getAITranscriberManager();
Objective-C:
    TRTCCloud *trtcCloud = [TRTCCloud sharedInstance];
    AITranscriberManager *manager = [trtcCloud getAITranscriberManager];
C++:
    liteav::ITRTCCloud* trtcCloud = liteav::ITRTCCloud::getTRTCShareInstance();
    liteav::AITranscriberManager* manager = trtcCloud->getAITranscriberManager();
Step 3: Set Up Event Listeners
Set up a listener to receive transcription status updates, real-time transcription and translation messages, and error notifications for users participating in transcription within the room.
Java:
    AITranscriberManager.AITranscriberListener listener = new AITranscriberManager.AITranscriberListener() {
        @Override
        public void onRealtimeTranscriberStarted(String roomId, String transcriberRobotId) {
            // Transcription started
        }

        @Override
        public void onReceiveTranscriberMessage(String roomId, AITranscriberManager.TranscriberMessage message) {
            // Handle real-time transcription and translation messages
        }

        @Override
        public void onRealtimeTranscriberStopped(String roomId, String transcriberRobotId, int reason) {
            // Transcription stopped
        }

        @Override
        public void onRealtimeTranscriberError(String roomId, String transcriberRobotId, int error, String errorInfo) {
            // Handle real-time transcription service errors
        }
    };
    aiTranscriberManager.addListener(listener);
Objective-C:
    - (void)onRealtimeTranscriberStarted:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId {
        // Transcription started
    }

    - (void)onReceiveTranscriberMessage:(NSString *)roomId message:(TranscriberMessage *)message {
        // Handle real-time transcription and translation messages
    }

    - (void)onRealtimeTranscriberStopped:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId reason:(NSInteger)reason {
        // Transcription stopped
    }

    - (void)onRealtimeTranscriberError:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId error:(NSInteger)error errorInfo:(NSString *)errorInfo {
        // Handle real-time transcription service errors
    }

    [manager addListener:self];
C++:
    class MyTranscriberListener : public liteav::AITranscriberListener {
    public:
        void onRealtimeTranscriberStarted(const char* roomId, const char* transcriberRobotId) override {
            // Transcription started
        }

        void onReceiveTranscriberMessage(const char* roomId, const liteav::TranscriberMessage& message) override {
            // Handle real-time transcription and translation messages
        }

        void onRealtimeTranscriberStopped(const char* roomId, const char* transcriberRobotId, int reason) override {
            // Transcription stopped
        }

        void onRealtimeTranscriberError(const char* roomId, const char* transcriberRobotId, int error, const char* errorInfo) override {
            // Handle real-time transcription service errors
        }
    };

    MyTranscriberListener* listener = new MyTranscriberListener();
    manager->addListener(listener);
TranscriberMessage Details
Field Name | Type | Description |
segmentId | String | Unique ID for the message segment. Used for deduplication or sorting. |
speakerUserId | String | ID of the speaking user. |
sourceText | String | Recognized source language text (Unicode encoded). |
translationTexts | Map/List | Translated target language text. |
timestamp | long | UTC timestamp when the message was generated, in milliseconds. |
isCompleted | bool | Indicates whether transcription is complete. true: the sentence is finished (final result). false: the sentence is ongoing (intermediate result, streaming update). |
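Because intermediate results (isCompleted = false) are streaming updates, a client usually wants one caption line per segment rather than one per message. The following is a minimal sketch of that idea, not SDK code: CaptionMessage is a stand-in that mirrors only some fields from the table above, CaptionBuffer is a hypothetical helper, and it assumes that updates for the same sentence share a segmentId and that nothing should overwrite a segment once its final result has arrived.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the SDK message type (assumption): only fields listed in the table above.
final class CaptionMessage {
    final String segmentId;
    final String speakerUserId;
    final String sourceText;
    final boolean isCompleted;

    CaptionMessage(String segmentId, String speakerUserId, String sourceText, boolean isCompleted) {
        this.segmentId = segmentId;
        this.speakerUserId = speakerUserId;
        this.sourceText = sourceText;
        this.isCompleted = isCompleted;
    }
}

// Hypothetical helper: keeps the latest text per segment, ordered by first appearance.
final class CaptionBuffer {
    private final Map<String, CaptionMessage> segments = new LinkedHashMap<>();

    // Returns true if the stored captions changed.
    boolean onMessage(CaptionMessage m) {
        CaptionMessage previous = segments.get(m.segmentId);
        // Assumption: once a segment is final, duplicates and stale updates are dropped.
        if (previous != null && previous.isCompleted) {
            return false;
        }
        segments.put(m.segmentId, m); // re-putting a key keeps its original position
        return true;
    }

    // One line per segment; unfinished sentences are marked with a trailing " ...".
    String render() {
        StringBuilder sb = new StringBuilder();
        for (CaptionMessage m : segments.values()) {
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(m.speakerUserId).append(": ").append(m.sourceText);
            if (!m.isCompleted) {
                sb.append(" ...");
            }
        }
        return sb.toString();
    }
}
```

In a real listener, onReceiveTranscriberMessage would copy the relevant fields from the SDK message into the buffer and refresh the UI only when onMessage returns true.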
Step 4: Start a Transcription Task
Create a TranscriberParams object and set the transcriber robot ID, source language, target translation languages, and other parameters. Then call startRealtimeTranscriber to begin the transcription service.
Java:
    import java.util.Arrays;

    AITranscriberManager.TranscriberParams params = new AITranscriberManager.TranscriberParams();
    params.transcriberRobotId = "my_robot"; // Optional: Specify robot ID
    params.sourceLanguage = "en"; // Source language: English
    params.translationLanguages = Arrays.asList("zh", "ja"); // Optional: If not set, only transcription is performed, no translation
    params.userIdsToTranscribe = Arrays.asList("userA"); // Optional: If not set, transcribe all users in the room
    aiTranscriberManager.startRealtimeTranscriber(params);
Objective-C:
    TranscriberParams *params = [[TranscriberParams alloc] init];
    params.transcriberRobotId = @"my_robot"; // Optional: Specify robot ID
    params.sourceLanguage = @"en"; // Source language: English
    params.translationLanguages = @[@"zh", @"ja"]; // Optional: If not set, only transcription is performed, no translation
    params.userIdsToTranscribe = @[@"userA"]; // Optional: If not set, transcribe all users in the room
    [manager startRealtimeTranscriber:params];
C++:
    liteav::TranscriberParams params;
    params.transcriberRobotId = "my_robot"; // Optional: Specify robot ID
    params.sourceLanguage = "en"; // Source language: English
    const char* targetLangs[] = {"zh", "ja"};
    params.translationLanguages = targetLangs; // Optional: If not set, only transcription is performed, no translation
    params.translationLanguagesCount = 2;
    const char* transcribeUsers[] = {"userA"};
    params.userIdsToTranscribe = transcribeUsers; // Optional: If not set, transcribe all users in the room
    params.userIdsToTranscribeCount = 1;
    manager->startRealtimeTranscriber(params);
TranscriberParams Details
Parameter Field | Type | Required | Description |
transcriberRobotId | String | No | Unique ID for the transcription robot. For a single transcription task, if not specified, the SDK generates a default ID in the format transcriber_${roomid}_robot_${userid}. If you start multiple transcription tasks at the same time, you must specify the robot ID. |
sourceLanguage | String | Yes | Source language code. Specify the language type of the source audio. Use the standard language code (e.g., "en"). |
translationLanguages | List/Array | No | List of target language codes for translation. If translation is required, set the target language codes here (e.g., "zh"). |
userIdsToTranscribe | List/Array | No | List of user IDs to transcribe. If not set, audio from all users in the room will be transcribed by default. |
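When transcriberRobotId is left unset, it can help to know which ID to expect in listener callbacks such as onRealtimeTranscriberStarted. The sketch below only reproduces the default format documented in the table; the SDK generates the real value itself. TranscriberRobotIds is a hypothetical helper, and the assumption that roomid and userid are substituted verbatim is not confirmed by this document.

```java
// Hypothetical helper, not part of the TRTC SDK.
final class TranscriberRobotIds {
    private TranscriberRobotIds() {}

    // Mirrors the documented default format transcriber_${roomid}_robot_${userid}.
    // Assumption: the room ID and user ID are inserted as-is.
    static String defaultRobotId(String roomId, String userId) {
        return "transcriber_" + roomId + "_robot_" + userId;
    }

    // The ID to expect in callbacks: the explicit one if set, otherwise the documented default.
    static String effectiveRobotId(String explicitRobotId, String roomId, String userId) {
        if (explicitRobotId != null && !explicitRobotId.isEmpty()) {
            return explicitRobotId;
        }
        return defaultRobotId(roomId, userId);
    }
}
```

If you run more than one task at a time, the table requires an explicit robot ID per task, so a helper like this only applies to the single-task case.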
The SDK reports the result of this call through the listener callbacks:
If the call succeeds, you'll receive the onRealtimeTranscriberStarted callback, indicating the transcription task started successfully. You can then receive real-time transcription and translation messages through the onReceiveTranscriberMessage callback.
If the call fails, you'll receive the onRealtimeTranscriberError callback, indicating the transcription task failed to start. Take action based on the specific error code (see Server Error Codes).
Step 5: Stop the Transcription Task
When transcription is no longer needed, call stopRealtimeTranscriber to end the task and release resources. Pass the robot ID used to start the task. (If you didn't specify a robot ID at start, a default one was generated; passing an empty value stops that default robot task.)
Java:
    aiTranscriberManager.stopRealtimeTranscriber("my_robot");
Objective-C:
    [manager stopRealtimeTranscriber:@"my_robot"];
C++:
    manager->stopRealtimeTranscriber("my_robot");
The SDK reports the result of this call through the listener callbacks:
If the call succeeds, you'll receive the onRealtimeTranscriberStopped callback, indicating the transcription task stopped successfully. You will no longer receive new transcription messages.
If the call fails, you'll receive the onRealtimeTranscriberError callback, indicating the transcription task failed to stop. Take action based on the specific error code (see Server Error Codes).
Supported Language Codes
Source Language
Language Code | Language Name |
zh | Chinese |
en | English |
vi | Vietnamese |
ja | Japanese |
ko | Korean |
id | Indonesian |
th | Thai |
pt | Portuguese |
tr | Turkish |
ar | Arabic |
es | Spanish |
hi | Hindi |
fr | French |
ms | Malay |
fil | Filipino |
de | German |
it | Italian |
ru | Russian |
sv | Swedish |
da | Danish |
no | Norwegian |
Note:
Client-initiated real-time transcription currently supports 21 languages: Chinese, English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Turkish, Arabic, Spanish, Hindi, French, Malay, Filipino, German, Italian, Russian, Swedish, Danish, Norwegian. For support for additional languages, please contact us.
For Chinese and English, client-side transcription via AITranscriberManager uses the latest 16k_zh_en large model in the Standard Edition language engine by default. See Billing of Speech AI Service.
Target Translation Language
Language Code | Language Name |
zh | Chinese |
en | English |
es | Spanish |
pt | Portuguese |
fr | French |
de | German |
ru | Russian |
ar | Arabic |
ja | Japanese |
ko | Korean |
vi | Vietnamese |
ms | Malay |
id | Indonesian |
it | Italian |
th | Thai |
Note:
Real-time translation currently supports 15 languages for input and output: Chinese, English, Spanish, Portuguese, French, German, Russian, Arabic, Japanese, Korean, Vietnamese, Malay, Indonesian, Italian, Thai. If the preceding ASR transcription language is not one of these, translation cannot be enabled. For additional language support, please contact us.
AI translation results are provided for reference only and should not be considered professional advice or conclusions.
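The two tables above imply a check that can run on the client before starting a task: the source language must be one of the 21 transcription languages, and translation can only be enabled when both the source and every target are among the 15 translation languages. A minimal sketch of that check follows; TranscriberLanguages is a hypothetical helper, the code sets are copied from the tables in this document, and the SDK or service may apply its own validation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the TRTC SDK.
final class TranscriberLanguages {
    // Source language codes from the "Source Language" table.
    static final Set<String> SOURCE = new HashSet<>(Arrays.asList(
            "zh", "en", "vi", "ja", "ko", "id", "th", "pt", "tr", "ar", "es",
            "hi", "fr", "ms", "fil", "de", "it", "ru", "sv", "da", "no"));

    // Codes from the "Target Translation Language" table.
    static final Set<String> TRANSLATION = new HashSet<>(Arrays.asList(
            "zh", "en", "es", "pt", "fr", "de", "ru", "ar", "ja", "ko",
            "vi", "ms", "id", "it", "th"));

    private TranscriberLanguages() {}

    // Returns a list of problems; an empty list means the combination matches the tables.
    static List<String> validate(String source, List<String> targets) {
        List<String> problems = new ArrayList<>();
        boolean sourceOk = source != null && SOURCE.contains(source);
        if (!sourceOk) {
            problems.add("unsupported source language: " + source);
        }
        if (targets != null && !targets.isEmpty()) {
            // Translation requires the source itself to be a translation language.
            if (sourceOk && !TRANSLATION.contains(source)) {
                problems.add("translation cannot be enabled for source language: " + source);
            }
            for (String target : targets) {
                if (!TRANSLATION.contains(target)) {
                    problems.add("unsupported translation language: " + target);
                }
            }
        }
        return problems;
    }
}
```

For example, a Hindi source with a Chinese target would be reported as a problem, because "hi" is a transcription language but not a translation language.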
Server Error Codes
Error Code | Meaning | Recommended Action |
2000 | Parameter error. | Check whether the request parameters are valid. |
2002 | Task does not exist. | Can be ignored if returned when calling the stop interface. |
2026 | Transcription service (ASR/Translation service) not enabled. | Enable the relevant service in the console. |
3000 | Internal error. | Retry the operation. |
4003 | Task is exiting. | Can be ignored if returned when calling the stop interface. |
5000 | Resource overload. | Use a backoff strategy and retry. |
5001 | Concurrency limit. | Contact the product team to increase concurrency limits. |
-102009 | Host is not in the room. | Check host status and retry after confirmation. |
-102005 | Room does not exist. | Check room status and retry after confirmation. |
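The recommended actions in the table can be turned into a small dispatch routine inside onRealtimeTranscriberError. The sketch below is one possible mapping, not SDK behavior: TranscriberErrors and its Action enum are hypothetical, the duringStop flag is something the app would have to track itself, and the backoff base and cap (500 ms and 8 s) are illustrative values that the table does not prescribe.

```java
// Hypothetical helper, not part of the TRTC SDK.
final class TranscriberErrors {
    enum Action {
        IGNORE, FIX_PARAMS, ENABLE_SERVICE, RETRY, RETRY_WITH_BACKOFF,
        CONTACT_SUPPORT, CHECK_ROOM_STATE, UNKNOWN
    }

    private TranscriberErrors() {}

    // Maps an error code from the table to a coarse action.
    // duringStop: true if the error was reported after calling stopRealtimeTranscriber.
    static Action actionFor(int errorCode, boolean duringStop) {
        switch (errorCode) {
            case 2000:
                return Action.FIX_PARAMS;
            case 2002: // task does not exist
            case 4003: // task is exiting
                // The table only says these can be ignored on stop; other cases are left undecided.
                return duringStop ? Action.IGNORE : Action.UNKNOWN;
            case 2026:
                return Action.ENABLE_SERVICE;
            case 3000:
                return Action.RETRY;
            case 5000:
                return Action.RETRY_WITH_BACKOFF;
            case 5001:
                return Action.CONTACT_SUPPORT;
            case -102009: // host is not in the room
            case -102005: // room does not exist
                return Action.CHECK_ROOM_STATE;
            default:
                return Action.UNKNOWN;
        }
    }

    // Exponential backoff delay for the given retry attempt (0-based), capped at 8 seconds.
    static long backoffMillis(int attempt) {
        long base = 500L;
        long cap = 8000L;
        int shift = Math.min(Math.max(attempt, 0), 10); // clamp to avoid overflow
        return Math.min(cap, base << shift);
    }
}
```

A caller handling code 5000 could wait backoffMillis(attempt) before calling startRealtimeTranscriber again and give up after a fixed number of attempts.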