Azure

Microsoft Azure Speech Services provides enterprise-grade speech recognition with support for 100+ languages and variants. It excels at customizable speech models, keyword recognition, and compliance-ready deployments. It is a strong choice if you are already in the Azure ecosystem or need broad multilingual coverage with enterprise SLAs.

Usage

To use Azure as the STT engine, pass the following JSON in the STTConfig field of the StartAIConversation API:
{
  "Language": "en",
  "VadSilenceTime": 1000,
  "CustomParam": "{\"STTType\":\"azure\",\"SubscriptionKey\":\"<your_azure_subscription_key>\",\"Region\":\"eastus\"}"
}
For the complete STTConfig parameter reference, see the STTConfig configuration guide.
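
In the example above, CustomParam is a string that itself contains JSON, so escaping it by hand is easy to get wrong. As a minimal sketch, the value could be generated rather than typed. The helper name `build_azure_stt_config` and its defaults are assumptions made for illustration and are not part of any SDK; only the field names come from this page.

```python
import json


def build_azure_stt_config(subscription_key, region, language="en", vad_silence_ms=1000):
    """Build an STTConfig dict for Azure (hypothetical helper, not an SDK call)."""
    custom_param = {
        "STTType": "azure",  # fixed value for the Azure engine
        "SubscriptionKey": subscription_key,
        "Region": region,
    }
    return {
        "Language": language,
        "VadSilenceTime": vad_silence_ms,
        # CustomParam is sent as a JSON-encoded string, not as a nested object.
        # Compact separators reproduce the spacing used in the example above.
        "CustomParam": json.dumps(custom_param, separators=(",", ":")),
    }


if __name__ == "__main__":
    stt_config = build_azure_stt_config("<your_azure_subscription_key>", "eastus")
    # Serializing the outer object adds the backslash escapes automatically.
    print(json.dumps(stt_config, indent=2))
```

Serializing the returned dict yields the escaped form shown in the example, which can then be placed in the STTConfig field of the request.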

Parameter reference

STTConfig fields

The following fields are part of STTConfig. For the full definition, see STTConfig.
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Language | String | No | Language code. See Azure STT language support. |
| VadSilenceTime | Integer | No | VAD silence duration (ms). See STTConfig. |

CustomParam fields

CustomParam is not one of the standard STTConfig fields. It is required only when using a third-party STT engine, and it carries the service provider's authentication parameters as a JSON-encoded string (note the escaped quotes in the Usage example).
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| STTType | String | Yes | Fixed value: "azure". |
| SubscriptionKey | String | Yes | Your Azure Speech resource subscription key. Obtain from Azure Portal. |
| Region | String | Yes | Azure region for your Speech resource (e.g., eastus, westeurope). See Azure Speech regions. |
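
Because CustomParam travels as a string, a missing or misspelled field is easy to overlook before the request is sent. As a rough sketch, the three required fields listed above could be checked on the client side. `check_azure_custom_param` is a hypothetical helper, not part of any SDK, and it checks only what this page states; it does not verify that the key or region is valid with Azure.

```python
import json

# Required CustomParam fields for the Azure engine, as listed on this page.
REQUIRED_FIELDS = ("STTType", "SubscriptionKey", "Region")


def check_azure_custom_param(custom_param):
    """Return a list of problems found in a CustomParam string (empty if none)."""
    try:
        fields = json.loads(custom_param)
    except json.JSONDecodeError as exc:
        return [f"CustomParam is not valid JSON: {exc.msg}"]
    if not isinstance(fields, dict):
        return ["CustomParam must decode to a JSON object"]

    problems = []
    for name in REQUIRED_FIELDS:
        value = fields.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is missing or not a non-empty string")

    # STTType has a fixed value when Azure is the engine.
    stt_type = fields.get("STTType")
    if isinstance(stt_type, str) and stt_type and stt_type != "azure":
        problems.append('STTType must be "azure"')
    return problems


if __name__ == "__main__":
    sample = '{"STTType":"azure","SubscriptionKey":"<your_azure_subscription_key>"}'
    for problem in check_azure_custom_param(sample):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a caller report every missing field at once.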
For more details on Azure Speech Services, see the Azure Speech documentation.