
Cartesia

Cartesia is purpose-built for real-time voice AI, offering ultra-low-latency streaming TTS with natural-sounding output. Its Sonic model supports multilingual synthesis and voice mixing. It is a good fit when end-to-end latency is critical, such as in interactive voice agents where every millisecond counts.

Usage

To use Cartesia as the TTS engine, pass the following JSON in the TTSConfig field of the StartAIConversation API:
// json — TTSConfig
{
  "TTSType": "cartesia",
  "Model": "sonic-3-2026-01-12",
  "APIKey": "<your_cartesia_api_key>",
  "VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0"
}
For the complete TTSConfig parameter reference, see the Text-to-Speech Configuration.
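
As a minimal sketch, the configuration above can be assembled in code rather than written by hand. The helper name build_cartesia_tts_config and the CARTESIA_API_KEY environment variable are hypothetical and not part of either API. The sketch also assumes that the TTSConfig field accepts a JSON-encoded string; check the StartAIConversation reference for the exact type it expects.

```python
import json
import os


def build_cartesia_tts_config(api_key, voice_id, model="sonic-3-2026-01-12"):
    """Return the Cartesia TTSConfig as a JSON string (hypothetical helper)."""
    # All four fields are required, so reject empty values early.
    if not api_key or not voice_id or not model:
        raise ValueError("APIKey, VoiceId, and Model are all required")
    config = {
        "TTSType": "cartesia",  # must be exactly "cartesia"
        "Model": model,
        "APIKey": api_key,
        "VoiceId": voice_id,
    }
    # Assumption: TTSConfig is passed as a JSON-encoded string.
    return json.dumps(config)


if __name__ == "__main__":
    # CARTESIA_API_KEY is an assumed environment variable name.
    key = os.environ.get("CARTESIA_API_KEY", "<your_cartesia_api_key>")
    print(build_cartesia_tts_config(key, "eda5bbff-1ff1-4886-8ef1-4e69a77640a0"))
```

Reading the key from the environment keeps it out of source control; the returned string would then be supplied as the TTSConfig value in the StartAIConversation request.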

Parameter reference

| Field | Type | Required | Description |
|---|---|---|---|
| TTSType | String | Yes | Must be "cartesia". |
| Model | String | Yes | Cartesia model name (e.g., sonic-3-2026-01-12). See Cartesia Models. |
| APIKey | String | Yes | Your Cartesia API key. Obtain from Cartesia Console. |
| VoiceId | String | Yes | Voice ID. Browse voices at Cartesia Voice Library. |
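
The rules in the parameter reference can be checked on the client before a request is sent. The following is a rough sketch, not part of any SDK: validate_cartesia_tts_config is a hypothetical name, and treating an empty string as invalid is an assumption that goes slightly beyond what the reference states.

```python
from typing import Dict, List

# The four fields listed as required strings in the parameter reference.
REQUIRED_FIELDS = ("TTSType", "Model", "APIKey", "VoiceId")


def validate_cartesia_tts_config(config: Dict[str, object]) -> List[str]:
    """Return a list of problems found; an empty list means the config passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        if value is None:
            errors.append(f"{field} is missing")
        elif not isinstance(value, str):
            errors.append(f"{field} must be a string")
        elif not value:
            # Assumption: an empty string is treated as missing.
            errors.append(f"{field} must not be empty")
    tts_type = config.get("TTSType")
    if isinstance(tts_type, str) and tts_type and tts_type != "cartesia":
        errors.append('TTSType must be "cartesia"')
    return errors


if __name__ == "__main__":
    sample = {
        "TTSType": "cartesia",
        "Model": "sonic-3-2026-01-12",
        "APIKey": "<your_cartesia_api_key>",
        "VoiceId": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0",
    }
    print(validate_cartesia_tts_config(sample))
```

This check only covers the shape of the configuration. Whether the model name, API key, and voice ID are actually accepted is decided by Cartesia when the conversation starts.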
For more details on Cartesia, see the Cartesia documentation.