
Billing of Speech-to-Text

These billing instructions apply to two services: Speech-to-Text and AI Real-Time Translation.
Speech-to-Text: Transcribes spoken audio into text using Automatic Speech Recognition (ASR/STT). This is commonly used to generate real-time captions.
AI Real-Time Translation: Translates transcribed text into target languages to deliver real-time multilingual subtitles.

Billing Information

Speech-to-Text

This service recognizes and transcribes audio streams from specified users or all users in a TRTC room.
This capability is available only to applications subscribed to an RTC Engine Monthly Package.
For eligible packages (RTC Engine Lite and above), the service is billed on a pay-as-you-go basis once it is unlocked.
To ensure consistency and output quality, third-party STT is not supported in AI Real-Time Translation scenarios.
Billing mode: Postpaid.
Billing cycle: Daily. Specific billing details and the statement issuance time are subject to the Billing Statement.

AI Real-Time Translation

This service translates transcribed content into one or more specified target languages in real time.
Billing mode: Postpaid.
Billing cycle: Daily. Specific billing details and the statement issuance time are subject to the Billing Statement.

Pricing

The following table provides the list prices and language support details for both the Speech-to-Text and AI Real-Time Translation services:
| Service Type | Unit Price (USD/Minute) | Supported Languages |
| --- | --- | --- |
| Speech-to-Text | 0.02 | 22 languages: Chinese, Chinese (Traditional), English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Turkish, Arabic, Spanish, Hindi, French, Malay, Filipino, German, Italian, Russian, Swedish, Danish, and Norwegian. |
| AI Real-Time Translation | 0.016 | 15 languages: Chinese, English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Arabic, Spanish, French, Malay, German, Italian, and Russian. |

Metering & Usage Notes

Note:
Service duration is metered in seconds and accumulated per SDKAppID. For billing, each day's total seconds are converted to minutes, and any remaining seconds are rounded up to the next full minute.
When Speech-to-Text or AI Real-Time Translation is enabled in a TRTC room, a robot joins the room as a virtual participant and subscribes to the relevant audio/video streams. This subscription itself incurs audio and video usage duration.
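The rounding rule in the note above can be sketched as follows. This is a minimal illustration, not an official billing implementation: the function name `daily_billable_minutes` and the sample session lengths are hypothetical, and it assumes rounding is applied once to the daily total rather than to each session.

```python
def daily_billable_minutes(session_seconds):
    """Sum one SDKAppID's metered seconds for a day, then round up to whole minutes."""
    total_seconds = sum(session_seconds)
    if total_seconds < 0:
        raise ValueError("metered seconds cannot be negative")
    # Ceiling division: any leftover seconds count as one more full minute.
    return (total_seconds + 59) // 60


# Three hypothetical sessions on the same day: 125 s + 40 s + 16 s = 181 s.
print(daily_billable_minutes([125, 40, 16]))  # 4
```

Under this reading, 181 seconds is billed as 4 minutes, whereas exactly 180 seconds would be billed as 3 minutes.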

Speech-to-Text

Only the duration of audio streams actively undergoing recognition is billed.
In multi-stream scenarios, the cumulative duration of all input streams is used for billing.

AI Real-Time Translation

Billing is based on the duration of the input audio streams that are actively translated.
If a single input stream is translated into multiple target languages, billing is calculated as Input Duration × Number of Output Languages.
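As a rough sketch of the two metering rules above, the helpers below compute billable durations in seconds. The function names and the language labels `"en"` and `"ja"` are hypothetical and used only for illustration; they are not identifiers from the TRTC API.

```python
def stt_billable_seconds(stream_seconds):
    """Speech-to-Text: cumulative duration of all recognized input streams."""
    return sum(stream_seconds)


def translation_billable_seconds(input_seconds, target_languages):
    """Translation: input duration multiplied by the number of output languages."""
    return input_seconds * len(target_languages)


# Two input streams recognized for 300 s each.
print(stt_billable_seconds([300, 300]))                  # 600
# One 300 s input stream translated into two target languages.
print(translation_billable_seconds(300, ["en", "ja"]))   # 600
```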

Billing Examples

For example, suppose Users A and B are having a voice call in Chinese. Viewer C requires English subtitles, while Viewer D requires Japanese subtitles, so both the Speech-to-Text and AI Real-Time Translation services are involved. The call lasts 5 minutes, and each user's audio stream is recognized and translated into two target languages for the full duration. The corresponding charges are calculated as follows:
| Billing Type | User A | User B | Subtotal |
| --- | --- | --- | --- |
| Speech-to-Text | 5 minutes | 5 minutes | 10 minutes |
| AI Real-Time Translation | 5 minutes × 2 | 5 minutes × 2 | 20 minutes |
Speech-to-Text charges: 10 minutes of usage at a unit price of 0.02 USD/minute costs 0.02 × 10 = 0.20 USD.
AI Real-Time Translation charges: 20 minutes of usage at a unit price of 0.016 USD/minute costs 0.016 × 20 = 0.32 USD.
The total fee for this scenario is 0.20 + 0.32 = 0.52 USD.
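The arithmetic of this example can be reproduced with a short script. It is a sketch only: the unit prices are the list prices from the pricing table, while `call_cost` and its parameters are hypothetical names, and it assumes every speaker's stream is active for the whole call. `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

STT_PRICE = Decimal("0.02")           # USD per minute, from the pricing table
TRANSLATION_PRICE = Decimal("0.016")  # USD per minute, from the pricing table


def call_cost(call_minutes, speakers, target_languages):
    """Return (stt_cost, translation_cost, total) in USD for one call."""
    # Speech-to-Text: one billed stream per speaker.
    stt_minutes = call_minutes * speakers
    # Translation: each speaker's stream is billed once per output language.
    translation_minutes = call_minutes * speakers * target_languages
    stt_cost = STT_PRICE * stt_minutes
    translation_cost = TRANSLATION_PRICE * translation_minutes
    return stt_cost, translation_cost, stt_cost + translation_cost


stt, translation, total = call_cost(call_minutes=5, speakers=2, target_languages=2)
print(stt, translation, total)  # 0.20 0.320 0.520
```

The printed values match the 0.20 USD, 0.32 USD, and 0.52 USD figures above.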

Integration Guide

For integration steps, please refer to the Speech-to-Text and Translation Integration Instructions.