The SpeechDetector result

class UASSpeechDetectorResult

The speech detector result object.

This object is returned by the speech detector’s get_recognised_speech function.

Usage example:

result = channel.SpeechDetector.get_recognised_speech()
if result.get_recognised_words_as_string() == "yes":
    # speech detector recognised the word yes
    pass

get_recognised_words()

Return the recognition result.

The recognition result is a list of words. If there is no result, None will be returned.

Usage example:

result = channel.SpeechDetector.get_recognised_speech()
# the result will always be lower case
words = result.get_recognised_words()
if not words:
    # speech detector didn't understand the speech or nothing was said
    pass
elif words[0] == "yes":
    # speech detector recognised the word yes
    pass
elif words[0] == "no":
    # speech detector recognised the word no
    pass
else:
    # something else in the grammar was said
    pass

get_recognised_words_as_string()

Return the recognition result.

The recognition result is a string of words. If there is no result, None will be returned.

Usage example:

result = channel.SpeechDetector.get_recognised_speech()
# the result will always be lower case
words = result.get_recognised_words_as_string()
if not words:
    # speech detector didn't understand the speech or nothing was said
    pass
elif words == "yes":
    # speech detector recognised the word yes
    pass
elif words == "no":
    # speech detector recognised the word no
    pass
else:
    # something else in the grammar was said
    pass
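The branching shown in the two examples above can be folded into a small helper. The following is a minimal sketch that takes the value returned by get_recognised_words() (a list of words, or None). The function name classify_reply and its return labels are hypothetical and not part of the API.

```python
def classify_reply(words):
    """Map a recognised word list (or None) to a simple label."""
    if not words:
        # None or an empty list: nothing was understood or nothing was said
        return "nothing"
    first = words[0]
    if first in ("yes", "no"):
        # results are always lower case, so no case folding is needed
        return first
    # something else in the grammar was said
    return "other"
```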

The SpeechDetector property

Aculab Cloud provides Automatic Speech Recognition through the UAS Speech Detector. For the latest information regarding this feature please go to the online cloud documentation.

class UASCallSpeechDetector

The SpeechDetector class represents a detector that can recognise specific speech utterances. You supply a grammar definition, and the speech detector uses Automatic Speech Recognition (ASR) to try to detect utterances in a media stream that fulfil that grammar definition.

The grammar defines exactly what utterances the speech detector listens for. It is particularly important to define the grammar accurately to obtain the best detection results.

Please see the online ASR documentation for more information on setting up an ASR system, including the rules for setting up a grammar.

class Cause

Once a speech detection job has ended and the state has returned to IDLE or ERROR, the cause will be one of these.

ERROR

the speech detector has stopped due to an error.

NORMAL

the speech detector has stopped normally.

OUTOFGRAMMAR

the words that were spoken were not recognised as part of the grammar.

BARGEIN

the speech detector has stopped due to a DTMF digit being detected. Call the DTMFDetector.get_digits() function to collect the digits, if required.

ABORTED

speech detection has been aborted (probably because stop was called).

HANGUP

speech detection ended due to a call hang-up.

TIMEOUT

the speech detector stopped because a timer has expired.

NONE

speech detection is in progress or speech detection has not started.

Usage example:

cause = channel.SpeechDetector.cause()
if cause == channel.SpeechDetector.Cause.BARGEIN:
    # speech detection interrupted by DTMF, get the digits,
    # in this example we want five digits
    dtmf = channel.DTMFDetector.get_digits(count=5)
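An application will usually want to react differently to each cause. The following is a minimal sketch of one way to organise that as a lookup table. The Cause enum here is only a stand-in for channel.SpeechDetector.Cause so that the sketch runs on its own, and the action labels are hypothetical, not part of the API.

```python
from enum import Enum


class Cause(Enum):
    """Stand-in for channel.SpeechDetector.Cause, for illustration only."""
    ERROR = "ERROR"
    NORMAL = "NORMAL"
    OUTOFGRAMMAR = "OUTOFGRAMMAR"
    BARGEIN = "BARGEIN"
    ABORTED = "ABORTED"
    HANGUP = "HANGUP"
    TIMEOUT = "TIMEOUT"
    NONE = "NONE"


# Hypothetical follow-up actions; the labels are an assumption of this sketch.
NEXT_STEP = {
    Cause.NORMAL: "read_result",      # fetch the recognised words
    Cause.BARGEIN: "collect_dtmf",    # caller pressed a key instead of speaking
    Cause.OUTOFGRAMMAR: "reprompt",   # speech heard, but not in the grammar
    Cause.TIMEOUT: "reprompt",        # nothing recognised before the timer expired
    Cause.HANGUP: "end_call",
    Cause.ABORTED: "give_up",
    Cause.ERROR: "give_up",
    Cause.NONE: "keep_waiting",       # detection in progress or not started
}


def next_step(cause):
    """Return the follow-up action label for a given cause."""
    return NEXT_STEP[cause]
```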

class Grammar(formatted_grammar='')

The Grammar class holds the textual representation of a grammar for use in speech recognition. Aculab Cloud speech recognition uses the grammar defined here to determine what to listen for, so it defines what a caller may say.

The grammar can have a maximum of 2000 characters, but it should be kept simple to ensure a good recognition rate. The accuracy can be optimised by careful programming: designing the grammar to avoid confusable words, and prompting the caller in such a way as to ensure they respond in line with that grammar.

Optional argument:
  • formatted_grammar:

    A grammar string that defines what the speech recognition will listen for.

The argument formatted_grammar, if provided, must adhere to the grammar rules:

The Speech Grammar Format:

  • All grammars must end with a semicolon.

  • The most basic grammar instructs the speech detector to listen for a single word. For example, 'door;' will recognise the word door.

  • A sequence of words can be given: 'shut the door;' will recognise that sequence of words.

  • Square brackets make a word optional: '[please] shut the door;' will recognise the sequence with or without the please.

  • The pipe symbol provides alternatives: 'door | gate;' will recognise the word door or the word gate.

  • Round brackets group words and rules together: 'shut the (door | gate);' will recognise “shut the door” or “shut the gate”, whereas, without the round brackets, 'shut the door | gate;' will recognise the phrase “shut the door” or the single word “gate”.

Putting these rules together, we can make '[please] shut the (door | gate);'.
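To check what a grammar will accept before using it, it can help to list every phrase it describes. The following is a minimal sketch that expands a grammar string according to the rules listed above only (sequences, square brackets, pipes and round brackets). It is an illustration of those rules, not the parser used by Aculab Cloud, and the function name expand is hypothetical.

```python
import re

# Brackets and pipes are single tokens; anything else up to whitespace is a word.
TOKEN = re.compile(r"[\[\]()|]|[^\s\[\]()|]+")


def expand(grammar):
    """Return the set of phrases accepted by a simple grammar string."""
    text = grammar.strip()
    if not text.endswith(";"):
        raise ValueError("a grammar must end with a semicolon")
    tokens = TOKEN.findall(text[:-1])
    pos = 0

    def alternatives():
        # sequence ('|' sequence)*  -- the pipe binds loosest
        nonlocal pos
        phrases = sequence()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            phrases |= sequence()
        return phrases

    def sequence():
        # one or more items spoken one after another
        phrases = {()}
        while pos < len(tokens) and tokens[pos] not in ("|", ")", "]"):
            options = item()
            phrases = {left + right for left in phrases for right in options}
        return phrases

    def item():
        # a word, a (group) or an [optional group]
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token in ("(", "["):
            closer = ")" if token == "(" else "]"
            inner = alternatives()
            if pos >= len(tokens) or tokens[pos] != closer:
                raise ValueError("unbalanced brackets")
            pos += 1
            if token == "[":
                inner = inner | {()}  # optional: may also be skipped
            return inner
        return {(token.lower(),)}  # results are always lower case

    result = alternatives()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return {" ".join(words) for words in result}
```

For example, expand('shut the door | gate;') gives the two phrases "shut the door" and "gate", which matches the behaviour described for a grammar without round brackets.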

Capital letters are allowed but results will always be lower case. The non-alphabet characters permitted in words are: apostrophe, full stop (period), hyphen and underscore.
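A grammar string can be given a quick pre-flight check against the constraints stated in this section: the 2000 character limit, the closing semicolon, and the characters permitted in words. The following is a minimal sketch; the function name check_grammar is hypothetical, and any additional syntax described in the fuller grammar documentation is not covered and would be reported as a problem here.

```python
import re

MAX_GRAMMAR_LENGTH = 2000

# Letters plus apostrophe, full stop, hyphen and underscore.
WORD = re.compile(r"[A-Za-z'.\-_]+")


def check_grammar(text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(text) > MAX_GRAMMAR_LENGTH:
        problems.append("longer than 2000 characters")
    if not text.rstrip().endswith(";"):
        problems.append("does not end with a semicolon")
    # Split on whitespace and on the grammar symbols to leave only the words.
    for token in re.split(r"[\s\[\]()|;]+", text):
        if token and not WORD.fullmatch(token):
            problems.append("word not allowed: " + token)
    return problems
```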

For more grammar rule options and examples, please see the online ASR documentation.

Usage example:

my_grammar = channel.SpeechDetector.Grammar('yes [please] | no [thanks];')

create_from_alternatives(*alternatives)

Create a formatted grammar string from one or more alternatives, each passed as a separate argument.

Required argument:
  • alternatives:

    The alternative strings to listen for.

Usage example:

# this example will create the grammar 'yes | no | maybe;'
my_grammar = channel.SpeechDetector.Grammar()
my_grammar.create_from_alternatives('yes', 'no', 'maybe')
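Going by the comment in the example above, the resulting grammar joins the alternatives with pipe symbols and ends with a semicolon. The following is a minimal sketch of an equivalent string builder, which may be useful for logging or testing. The function name format_alternatives is hypothetical, and this is not the library's own implementation.

```python
def format_alternatives(*alternatives):
    """Join alternatives into a grammar string such as 'yes | no | maybe;'."""
    # Drop surrounding whitespace and ignore empty entries.
    cleaned = [alt.strip() for alt in alternatives if alt.strip()]
    if not cleaned:
        raise ValueError("at least one alternative is required")
    return " | ".join(cleaned) + ";"
```

Because the alternatives are separate arguments, a list held in a variable needs to be unpacked with an asterisk, for example format_alternatives(*my_list).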

create_from_predefined(predefined)

Several pre-defined grammars are available for you to use. Please check the website for more options.

Examples:
  • OneDigit

    To recognise a single digit, zero to nine.

  • TwoDigits

    To recognise two digits, each zero to nine.

  • ThreeDigits

    To recognise three digits, each zero to nine.

  • FourDigits

    To recognise four digits, each zero to nine.

  • FiveDigits

    To recognise five digits, each zero to nine.

  • OneToThirtyOne

    To recognise a single number, one to thirty-one.

  • SixteenToNinetyNine

    To recognise a single number, sixteen to ninety-nine.

  • ZeroToNinetyNine

    To recognise a single number, zero to ninety-nine.

Usage example:

my_grammar = channel.SpeechDetector.Grammar()
my_grammar.create_from_predefined('OneDigit')
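When the number of digits to collect is only known at run time, the pre-defined grammar name can be looked up from the count. The following is a minimal sketch that uses only the digit grammar names listed above; the helper name predefined_for_digits is hypothetical.

```python
# Digit grammar names taken from the list of examples above.
DIGIT_GRAMMARS = {
    1: "OneDigit",
    2: "TwoDigits",
    3: "ThreeDigits",
    4: "FourDigits",
    5: "FiveDigits",
}


def predefined_for_digits(count):
    """Return the pre-defined grammar name for a given number of digits."""
    try:
        return DIGIT_GRAMMARS[count]
    except KeyError:
        raise ValueError("no digit grammar listed for count %r" % (count,)) from None
```

The returned name can then be passed to create_from_predefined().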

class SpeechDetectorTrigger

To be used with the prime() function.

The speech detector can be primed to start listening for a grammar definition at a predefined moment.

ONPLAYSTART

The speech detector will start as soon as the next media play has begun. This allows the caller to begin speaking during the media play, thus interrupting it.

ONPLAYEND

The speech detector will start as soon as the next media play stops normally. The caller cannot interrupt the media play.

Usage example:

my_grammar = channel.SpeechDetector.Grammar('yes | no;')
my_start = channel.SpeechDetector.SpeechDetectorTrigger.ONPLAYEND
if channel.SpeechDetector.prime(my_grammar, my_start) is True:
    # speech detection primed to begin after the next play command ends.
    pass

class State

The speech detector state can be checked to determine whether the speech detector is running.

Speech detector states are:

RUNNING

the speech detector is running.

IDLE

the speech detector is not running.

PRIMED

the speech detector is primed. See the prime() function.

ERROR

an error has occurred.

Usage example:

state = channel.SpeechDetector.state()
if state == channel.SpeechDetector.State.RUNNING:
    # speech detection is in progress
    pass

cause()

This function will return a cause.

When a particular job terminates, the reason why can be requested by calling this function.

If this function is called while a job is still running, the cause will be NONE.

get_recognised_speech(seconds_timeout=10)

Return the latest speech detection result.

Returns a UASSpeechDetectorResult object.

Optional argument:
  • seconds_timeout:

    How long to wait for the speech recognition result. Default is 10 seconds.

If the speech detector is IDLE the result of the most recent speech detection is returned immediately. If the speech detector is currently PRIMED or RUNNING this method will block until a result is obtained, the detection times out, or the detector is interrupted.

The speech detector can be interrupted by the caller pressing digits on the telephone keypad. If this happens, the cause will be BARGEIN and the result will be None. If the system is designed to use speech or DTMF digits as a valid response, call the DTMFDetector.get_digits() function to retrieve the required number of digits.

This function can raise a Hangup exception.

Usage example:

# in a yes/no system, the caller might press 1 for yes instead of saying it
result = channel.SpeechDetector.get_recognised_speech()
words = result.get_recognised_words()
if not words:
    # no words, check for DTMF digits
    cause = channel.SpeechDetector.cause()
    if cause == channel.SpeechDetector.Cause.BARGEIN:
        dtmf = channel.DTMFDetector.get_digits(count=1)
        if dtmf == '1':
            # caller pressed 1 for yes
            pass
elif words[0] == "yes":
    # caller said the word yes
    pass
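The same pattern can be wrapped in a function so that it can be tried without a live call. The following is a minimal sketch: yes_no_answer() follows the example above, while fake_channel() builds a stand-in object that is not part of the API. Treating the digit 2 as "no" is an assumption of this sketch.

```python
from types import SimpleNamespace


def yes_no_answer(channel):
    """Return True for yes, False for no, None if neither was understood."""
    detector = channel.SpeechDetector
    words = detector.get_recognised_speech().get_recognised_words()
    if words:
        # speech was recognised; anything other than yes/no gives None
        return {"yes": True, "no": False}.get(words[0])
    if detector.cause() == detector.Cause.BARGEIN:
        # caller pressed a key instead: 1 for yes, 2 for no (assumed)
        digit = channel.DTMFDetector.get_digits(count=1)
        return {"1": True, "2": False}.get(digit)
    return None


def fake_channel(words, cause="NORMAL", digits=""):
    """Build a stand-in channel object, for illustration and testing only."""
    result = SimpleNamespace(get_recognised_words=lambda: words)
    speech = SimpleNamespace(
        Cause=SimpleNamespace(BARGEIN="BARGEIN"),
        get_recognised_speech=lambda: result,
        cause=lambda: cause,
    )
    dtmf = SimpleNamespace(get_digits=lambda count: digits)
    return SimpleNamespace(SpeechDetector=speech, DTMFDetector=dtmf)
```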

prime(listen_for, start_on=None, seconds_detection_timeout=30)

Prime the speech detector to start detecting speech after the next prompt has finished playing, or as soon as the next prompt starts.

Required argument:
  • listen_for:

    The grammar definition to listen for.

Optional arguments:
  • start_on:

    This defines when the speech detection should start. Default is ONPLAYEND.

  • seconds_detection_timeout:

    The number of seconds to wait for the defined speech to be detected. This timeout starts counting down as soon as the media prompt that triggers the detection has ended. Default is 30 seconds.

The argument start_on will be of type SpeechDetectorTrigger.

The argument listen_for will be of type Grammar.

The argument seconds_detection_timeout defines how long to wait for the speech detector to recognise the grammar definition in the media stream. The timer starts as soon as the media prompt has ended and is not affected by the start_on setting.

This function returns True if the speech detection was successfully primed to start.

This function can raise a Hangup exception.

Usage example:

my_grammar = channel.SpeechDetector.Grammar('yes | no;')
my_start = channel.SpeechDetector.SpeechDetectorTrigger.ONPLAYSTART
if channel.SpeechDetector.prime(my_grammar, my_start) is True:
    # speech detection primed to begin when next play starts
    pass
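Priming is normally followed by playing a prompt and then collecting the result. The following is a minimal sketch of that sequence. The function ask() and the FakeDetector class are hypothetical: the fake only records calls so that the sketch can run without a live call, and play_prompt is a callable supplied by the application that plays whatever prompt triggers the detection.

```python
from types import SimpleNamespace


def ask(channel, play_prompt, grammar_text, allow_interrupt=False):
    """Prime the detector, play a prompt, then return the recognised words."""
    detector = channel.SpeechDetector
    triggers = detector.SpeechDetectorTrigger
    start_on = triggers.ONPLAYSTART if allow_interrupt else triggers.ONPLAYEND
    if detector.prime(detector.Grammar(grammar_text), start_on) is not True:
        return None  # priming failed
    play_prompt()  # caller-supplied: plays the prompt that triggers detection
    return detector.get_recognised_speech().get_recognised_words()


class FakeDetector:
    """Stand-in detector that records calls; not part of the real API."""

    SpeechDetectorTrigger = SimpleNamespace(
        ONPLAYSTART="ONPLAYSTART", ONPLAYEND="ONPLAYEND")

    def __init__(self, words):
        self.words = words
        self.calls = []

    def Grammar(self, formatted_grammar=""):
        return formatted_grammar

    def prime(self, listen_for, start_on=None, seconds_detection_timeout=30):
        self.calls.append(("prime", listen_for, start_on))
        return True

    def get_recognised_speech(self, seconds_timeout=10):
        self.calls.append(("get_recognised_speech",))
        return SimpleNamespace(get_recognised_words=lambda: self.words)


detector = FakeDetector(["no"])
channel = SimpleNamespace(SpeechDetector=detector)
played = []
words = ask(channel, lambda: played.append("prompt"), "yes | no;", True)
```

Note that ask() returns None both when priming fails and when nothing was recognised, so an application that needs to tell these apart should also check cause().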

start(listen_for, seconds_detection_timeout=30)

Start the speech detector immediately.

If a media prompt is playing when this function is called, or if one is started afterwards, it will be interrupted if the caller starts to speak.

Required argument:
  • listen_for:

    The grammar definition to listen for.

Optional argument:
  • seconds_detection_timeout:

    The number of seconds to wait for the defined speech to be detected. Default is 30 seconds.

The argument listen_for will be of type Grammar.

The argument seconds_detection_timeout defines how long to wait for the speech detector to recognise the grammar definition in the media stream. The timer will start immediately.

This function returns True if the speech detection was successfully started.

This function can raise a Hangup exception.

Usage example:

my_grammar = channel.SpeechDetector.Grammar('yes | no;')
if channel.SpeechDetector.start(my_grammar) is True:
    # speech detection started OK
    pass

state()

This function will return the current state.

While a job is in progress, its state can be tracked by calling this function.

If this function is called when no job is in progress, the state will be IDLE.

stop()

Stop the speech detector.

This function will return True if the speech detector is successfully stopped, else it will return False.

This function can raise a Hangup exception.

Usage example:

if channel.SpeechDetector.stop() is True:
    # speech detection stopped OK
    pass