Speech to Text

Converts live speech to text.

This is useful for routing contacts on the basis of their spoken words and phrases.

A third-party speech-to-text provider uses live speech to create an array of text interpretations and assigns a percentage confidence value to each. The action cell stores the first three interpretations that match or exceed a confidence threshold of your choice. You can then use or manipulate any of these text interpretations for routing contacts.
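
The selection described above can be sketched in Python. This is only an illustration and not the cell's actual implementation: the Interpretation class, the select_interpretations function and the sample phrases are hypothetical, and the sketch assumes that "first three" means the first three qualifying entries in the order the provider returned them.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """Hypothetical container for one provider result."""
    text: str
    confidence: int  # percentage, 0-100


def select_interpretations(candidates, threshold, limit=3):
    """Keep the first `limit` candidates whose confidence meets the threshold.

    Assumption: candidates are processed in the order the provider returned them.
    """
    kept = [c for c in candidates if c.confidence >= threshold]
    return kept[:limit]


# Invented sample data, for illustration only.
candidates = [
    Interpretation("I want to check my balance", 91),
    Interpretation("I want to check my ballots", 64),
    Interpretation("I won't check my balance", 58),
    Interpretation("I want a cheque, my balance", 31),
]

# With a threshold of 50, the first three candidates qualify and the fourth is dropped.
selected = select_interpretations(candidates, threshold=50)
```

Raising the threshold reduces the number of interpretations kept; if none qualifies, the list is empty.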

Properties

Media List/Introduction Prompt

Use the Select Media List option to select the media list that contains the introduction prompt audio file. Then either use the Select Introduction Prompt option to select the file, or use the Use Dynamic Introduction Prompt option to supply a local string variable containing the name of the file.

Alternatively, use the Use Dynamic Media List option to supply a local string variable containing the name of the media list, and then supply a local string variable containing the name of the introduction prompt audio file in that media list.

Click the None check box if you do not want to use an introduction prompt.

Model

Only 'Standard' is currently available. This uses options supported by a third-party speech-to-text provider.

Timeout

Select the maximum duration for which the caller can speak before the audio is interpreted. This is measured from the end of the introduction prompt (if present).

Confidence Threshold (%)

Specify a percentage threshold value. The first three interpretations whose confidence matches or exceeds this value are written to the interpretation variables (see below).

Language

Select the spoken language to detect.

Interpretation 1

The string variable in which to store the first interpretation whose confidence level matches or exceeds the specified confidence threshold. The interpretation can be up to 2048 characters long (including spaces).

Confidence 1 (%)

The integer variable in which to store the confidence value assigned to interpretation 1 (as returned by the speech-to-text provider).

Interpretation 2

The string variable in which to store the second interpretation whose confidence level matches or exceeds the specified confidence threshold. The interpretation can be up to 2048 characters long (including spaces).

Confidence 2 (%)

The integer variable in which to store the confidence value assigned to interpretation 2 (as returned by the speech-to-text provider).

Interpretation 3

The string variable in which to store the third interpretation whose confidence level matches or exceeds the specified confidence threshold. The interpretation can be up to 2048 characters long (including spaces).

Confidence 3 (%)

The integer variable in which to store the confidence value assigned to interpretation 3 (as returned by the speech-to-text provider).
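
As one way to use the interpretation variables for routing, the following Python sketch matches keywords against the stored interpretations. It is only an illustration: the choose_queue function, the queue names and the keywords are hypothetical and are not part of the product, and the sketch assumes that unused interpretation variables hold empty strings.

```python
def choose_queue(interpretations, keyword_queues, default_queue="general"):
    """Return the queue mapped to the first keyword found in any interpretation.

    Interpretations are checked in order (Interpretation 1 first), so the
    highest-priority text decides the route when several contain keywords.
    """
    for text in interpretations:
        lowered = text.lower()  # case-insensitive matching
        for keyword, queue in keyword_queues.items():
            if keyword in lowered:
                return queue
    return default_queue


# Hypothetical keyword-to-queue mapping.
KEYWORD_QUEUES = {"bill": "billing", "fault": "support"}

# Values as they might be held in Interpretation 1, 2 and 3 (invented).
queue = choose_queue(["I'd like to pay my BILL", "I'd like to pay my bell", ""], KEYWORD_QUEUES)
```

Substring matching is deliberately simple here; whole-word or phrase matching may suit real call flows better.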

Exit points

Exit point	Taken
[Interpretation Found]	If the speech-to-text provider returned at least one interpretation matching or exceeding the value in the Confidence Threshold (%) parameter.
[Interpretation Not Found]	If the speech-to-text provider did NOT return an interpretation matching or exceeding the value in the Confidence Threshold (%) parameter.
[No Audio Present]	If no audio was detected within the time set in the Timeout parameter.
Error (*)	If an internal system error occurred.
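
The exit-point conditions in the table can be summarized in a short Python sketch. This is an illustration under stated assumptions rather than the cell's actual logic: the exit_point function and its parameters are hypothetical, and the order in which the conditions are checked is assumed.

```python
def exit_point(audio_detected, confidences, threshold, internal_error=False):
    """Return the name of the exit point taken for one recognition attempt.

    audio_detected: whether any audio arrived within the Timeout period.
    confidences:    confidence percentages returned by the provider.
    threshold:      the Confidence Threshold (%) value.
    """
    if internal_error:
        return "Error"
    if not audio_detected:
        return "[No Audio Present]"
    # At least one interpretation must match or exceed the threshold.
    if any(c >= threshold for c in confidences):
        return "[Interpretation Found]"
    return "[Interpretation Not Found]"


# Invented example: one of two interpretations meets a threshold of 60.
taken = exit_point(audio_detected=True, confidences=[72, 41], threshold=60)
```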