The SpeechDetector result
- class UASSpeechDetectorResult
The speech detector result object.
This object is returned by the speech detector’s get_recognised_speech function.
Usage example:
result = channel.SpeechDetector.get_recognised_speech()
if result.get_recognised_words_as_string() == "yes":
    # speech detector recognised the word yes
    pass
- get_recognised_words()
Return the recognition result as a list of words. If there is no result, None is returned.
Usage example:
result = channel.SpeechDetector.get_recognised_speech()
# the result will always be lower case
words = result.get_recognised_words()
if not words:
    # speech detector didn't understand the speech or nothing was said
    pass
elif words[0] == "yes":
    # speech detector recognised the word yes
    pass
elif words[0] == "no":
    # speech detector recognised the word no
    pass
else:
    # something else in the grammar was said
    pass
- get_recognised_words_as_string()
Return the recognition result as a single string of words. If there is no result, None is returned.
Usage example:
result = channel.SpeechDetector.get_recognised_speech()
# the result will always be lower case
words = result.get_recognised_words_as_string()
if not words:
    # speech detector didn't understand the speech or nothing was said
    pass
elif words == "yes":
    # speech detector recognised the word yes
    pass
elif words == "no":
    # speech detector recognised the word no
    pass
else:
    # something else in the grammar was said
    pass
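Because both accessors can return None, application code usually needs one place that turns the word list into a decision. The following is a minimal sketch of that idea; classify_reply is a hypothetical helper written for illustration and is not part of the UAS API. It works on the plain list that get_recognised_words() is described as returning.

```python
from typing import Optional, Sequence


def classify_reply(words: Optional[Sequence[str]]) -> str:
    """Map a recognised-words list to 'yes', 'no', 'other' or 'nothing'."""
    if not words:
        # None or an empty list: nothing was said, or it was not understood
        return "nothing"
    first = words[0]
    if first == "yes":
        return "yes"
    if first == "no":
        return "no"
    # Some other phrase permitted by the grammar
    return "other"
```

In a call flow this would be fed with result.get_recognised_words(), keeping the None check out of the main logic.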
The SpeechDetector property
Aculab Cloud provides Automatic Speech Recognition through the UAS Speech Detector. For the latest information regarding this feature please go to the online cloud documentation.
- class UASCallSpeechDetector
The SpeechDetector class represents a detector that can recognise specific speech utterances. You supply a grammar definition and the speech detector will use Automatic Speech Recognition (ASR) to try to detect utterances that fulfil the grammar definition in a media stream.
The grammar defines exactly what utterances the speech detector listens for. It is particularly important to define the grammar accurately to obtain the best detection results.
Please have a look at the online ASR documentation for more information on setting up an ASR system, including the rules for setting up a grammar.
- class Cause
Once a speech detection job has ended and the state has returned to IDLE or ERROR, the cause will be one of these.
- ERROR
the speech detector has stopped due to an error.
- NORMAL
the speech detector has stopped normally.
- OUTOFGRAMMAR
the words that were spoken were not recognised as part of the grammar.
- BARGEIN
the speech detector has stopped due to a DTMF digit being detected. Run the DTMFDetector get_digits() function to collect the digits, if required.
- ABORTED
speech detection has been aborted (probably because stop was called).
- HANGUP
speech detection ended due to a call hang-up.
- TIMEOUT
the speech detector stopped because a timer has expired.
- NONE
speech detection is in progress or speech detection has not started.
Usage example:
cause = channel.SpeechDetector.cause()
if cause == channel.SpeechDetector.Cause.BARGEIN:
    # speech detection interrupted by DTMF, get the digits,
    # in this example we want five digits
    dtmf = channel.DTMFDetector.get_digits(count=5)
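With eight possible causes, it can help to decide up front what the application does for each one. The sketch below shows one way to do that with a lookup table. The Cause enum here is a local stand-in that only mirrors the names listed above so that the example runs on its own, and the action names are a hypothetical application policy, not something defined by the UAS API.

```python
from enum import Enum, auto


class Cause(Enum):
    """Local stand-in mirroring the names of channel.SpeechDetector.Cause."""
    ERROR = auto()
    NORMAL = auto()
    OUTOFGRAMMAR = auto()
    BARGEIN = auto()
    ABORTED = auto()
    HANGUP = auto()
    TIMEOUT = auto()
    NONE = auto()


# Hypothetical application policy: what to do next for each cause.
NEXT_STEP = {
    Cause.NORMAL: "use_result",
    Cause.OUTOFGRAMMAR: "reprompt",
    Cause.TIMEOUT: "reprompt",
    Cause.BARGEIN: "collect_digits",
    Cause.HANGUP: "end_call",
    Cause.ABORTED: "give_up",
    Cause.ERROR: "give_up",
    Cause.NONE: "wait",
}


def next_step(cause: Cause) -> str:
    """Look up the follow-up action for a finished (or unfinished) job."""
    return NEXT_STEP[cause]
```

In real code the keys would be the members of channel.SpeechDetector.Cause rather than this stand-in.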
- class Grammar(formatted_grammar='')
The Grammar class holds the textual form of a grammar for use in speech recognition. Aculab Cloud speech recognition uses this grammar to determine what to listen for, so it defines what a user may say.
The grammar can have a maximum of 2000 characters, but it should be kept simple to ensure a good recognition rate. The accuracy can be optimised by careful programming: designing the grammar to avoid confusable words, and prompting the caller in such a way as to ensure they respond in line with that grammar.
- Optional argument:
- formatted_grammar:
A grammar string that defines what the speech recognition will listen for.
The argument formatted_grammar, if provided, must adhere to the grammar rules:
The Speech Grammar Format:
All grammars must end with a semicolon.
The most basic grammar instructs the speech detector to listen for a single word, for example,
door;
will recognise the word door.
A sequence of words can be given,
shut the door;
will recognise that sequence of words.
Square brackets can be used to make a word optional, so,
[please] shut the door;
will recognise the sequence with or without the please.
The pipe symbol can be used to provide alternatives,
door | gate;
will recognise the word door or the word gate.
Round brackets can be used to group words and rules together, so,
shut the (door | gate);
will recognise “shut the door” or “shut the gate”; whereas, without the round brackets,
shut the door | gate;
will recognise the phrase “shut the door” or the single word “gate”.
Putting these rules together we can make,
[please] shut the (door | gate);
Capital letters are allowed but results will always be lower case. The non-alphabet characters permitted in words are: apostrophe, full stop (period), hyphen and underscore.
For more grammar rule options and examples please see the documentation at.
Usage example:
my_grammar = channel.SpeechDetector.Grammar('yes [please] | no [thanks];')
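Some of the rules above can be checked locally before a grammar string is handed to Grammar(). The sketch below covers only three of them: the trailing semicolon, the 2000 character limit, and matching square and round brackets. check_grammar is a hypothetical helper, not part of the UAS API, and it does not reproduce whatever validation the service itself performs.

```python
MAX_GRAMMAR_CHARS = 2000  # limit stated in the text above


def check_grammar(grammar: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    text = grammar.strip()
    if not text.endswith(";"):
        problems.append("grammar must end with a semicolon")
    if len(grammar) > MAX_GRAMMAR_CHARS:
        problems.append("grammar is longer than 2000 characters")
    # Check that every [ and ( is closed by the matching bracket
    closers = {")": "(", "]": "["}
    stack = []
    for ch in text:
        if ch in "([":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                problems.append("unbalanced brackets")
                break
    else:
        # Loop finished without a mismatch; anything left open is an error
        if stack:
            problems.append("unbalanced brackets")
    return problems
```

For example, check_grammar('[please] shut the (door | gate);') returns an empty list, while a string without the final semicolon is reported.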
- create_from_alternatives(*alternatives)
Create a formatted grammar string from the alternatives supplied.
- Required argument:
- alternatives:
The alternative strings to listen for.
Usage example:
# this example will create the grammar 'yes | no | maybe;'
my_grammar = channel.SpeechDetector.Grammar()
my_grammar.create_from_alternatives('yes', 'no', 'maybe')
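The same string can be built by hand and passed to the Grammar constructor instead. The function below is a rough sketch that reproduces the 'yes | no | maybe;' form shown in the example above; alternatives_to_grammar is a hypothetical name, and how the real method treats empty or unusual input is not stated here, so the error handling is an assumption.

```python
def alternatives_to_grammar(*alternatives: str) -> str:
    """Join alternatives with the pipe symbol and terminate with a semicolon."""
    # Drop surrounding whitespace and ignore empty entries
    cleaned = [alt.strip() for alt in alternatives if alt.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty alternative is required")
    return " | ".join(cleaned) + ";"
```

The returned string follows the Speech Grammar Format, so it could be used as the formatted_grammar argument.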
- create_from_predefined(predefined)
Several pre-defined grammars are available for you to use. Please check the website for more options.
- Examples:
- OneDigit
To recognise a single digit, zero to nine.
- TwoDigits
To recognise two digits, zero to nine.
- ThreeDigits
To recognise three digits, zero to nine.
- FourDigits
To recognise four digits, zero to nine.
- FiveDigits
To recognise five digits, zero to nine.
- OneToThirtyOne
To recognise a single number, one to thirty one.
- SixteenToNinetyNine
To recognise a single number, sixteen to ninety nine.
- ZeroToNinetyNine
To recognise a single number, zero to ninety nine.
Usage example:
my_grammar = channel.SpeechDetector.Grammar()
my_grammar.create_from_predefined('OneDigit')
- class SpeechDetectorTrigger
To be used with the prime() function.
The speech detector can be primed to start listening for a grammar definition at a predefined moment.
- ONPLAYSTART
The speech detector will start as soon as the next media play has begun. This allows the caller to begin speaking during the media play, thus interrupting it.
- ONPLAYEND
The speech detector will start as soon as the next media play stops normally. The caller cannot interrupt the media play.
Usage example:
my_grammar = channel.SpeechDetector.Grammar('yes | no;')
my_start = channel.SpeechDetector.SpeechDetectorTrigger.ONPLAYEND
if channel.SpeechDetector.prime(my_grammar, my_start) is True:
    # speech detection primed to begin after the next play command ends.
    pass
- class State
The speech detector state can be checked to determine whether the speech detector is running.
Speech detector states are:
- RUNNING
the speech detector is running.
- IDLE
the speech detector is not running.
- PRIMED
the speech detector is primed. See the prime() function.
- ERROR
an error has occurred.
Usage example:
state = channel.SpeechDetector.state()
if state == channel.SpeechDetector.State.RUNNING:
    # speech detection is in progress
    pass
- cause()
This function will return a cause.
When a particular job terminates, the reason why can be requested by calling this function.
If this function is called while a job is still running, the cause will be NONE.
- get_recognised_speech(seconds_timeout=10)
Return the latest speech detection result.
Returns a UASSpeechDetectorResult object.
- Optional argument:
- seconds_timeout:
How long to wait for the speech recognition result. Default is 10 seconds.
If the speech detector is IDLE the result of the most recent speech detection is returned immediately. If the speech detector is currently PRIMED or RUNNING this method will block until a result is obtained, the detection times out, or the detector is interrupted.
The speech detector can be interrupted by the caller pressing digits on the telephone keypad. If this happens, the cause will be BARGEIN and the result will be None. If the system is designed to use speech or DTMF digits as a valid response, call the DTMFDetector.get_digits() function to retrieve the required number of digits.
This function can raise a Hangup exception.
Usage example:
# in a yes/no system, the caller might press 1 for yes instead of saying it
result = channel.SpeechDetector.get_recognised_speech()
words = result.get_recognised_words()
if not words:
    # no words, check for DTMF digits
    cause = channel.SpeechDetector.cause()
    if cause == channel.SpeechDetector.Cause.BARGEIN:
        dtmf = channel.DTMFDetector.get_digits(count=1)
        if dtmf == '1':
            # caller pressed 1 for yes
            pass
elif words[0] == "yes":
    # caller said the word yes
    pass
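The speech-or-DTMF pattern above can be wrapped in a single function so that the rest of the application only sees a yes, a no, or nothing. This is a sketch under assumptions: yes_no_reply is a hypothetical helper, the "2 for no" key mapping is invented for illustration, and the guard against a missing result object is defensive rather than documented behaviour. It uses only the calls described in this section.

```python
def yes_no_reply(channel):
    """Return True for yes, False for no, None if neither was obtained."""
    detector = channel.SpeechDetector
    result = detector.get_recognised_speech()
    # Guard against a missing result object as well as a missing word list
    words = result.get_recognised_words() if result is not None else None
    if words:
        if words[0] == "yes":
            return True
        if words[0] == "no":
            return False
        return None
    if detector.cause() == detector.Cause.BARGEIN:
        digit = channel.DTMFDetector.get_digits(count=1)
        # Assumption: the prompt told the caller to press 1 for yes, 2 for no
        return {"1": True, "2": False}.get(digit)
    return None
```

Since get_recognised_speech() can raise a Hangup exception, a caller of this helper would still need to handle that in the surrounding call flow.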
- prime(listen_for, start_on=None, seconds_detection_timeout=30)
Prime the speech detector to start detecting speech after the next prompt has finished playing, or as soon as the next prompt starts.
- Required argument:
- listen_for:
The grammar definition to listen for.
- Optional argument:
- start_on:
This defines when the speech detection should start. Default is ONPLAYEND.
- seconds_detection_timeout:
The number of seconds to wait for the defined speech to be detected. This timeout starts counting down as soon as the media prompt that triggers the detection has ended. Default is 30 seconds.
The argument start_on will be of type SpeechDetectorTrigger.
The argument listen_for will be of type Grammar.
The argument seconds_detection_timeout defines how long to wait for the speech detector to recognise the grammar definition in the media stream. The timer will start as soon as the media prompt has ended; it is not affected by the start_on setting.
This function returns True if the speech detection was successfully primed to start.
This function can raise a Hangup exception.
Usage example:
my_grammar = channel.SpeechDetector.Grammar('yes | no;')
my_start = channel.SpeechDetector.SpeechDetectorTrigger.ONPLAYSTART
if channel.SpeechDetector.prime(my_grammar, my_start) is True:
    # speech detection primed to begin when next play starts
    pass
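Priming only takes effect on the next media play, so a complete question-and-answer step has three parts: prime, play the prompt, then collect the result. The sketch below strings these together. ask is a hypothetical helper, and play_prompt is a callable supplied by the application (how the prompt is actually played is outside this section, so it is left as a parameter rather than assumed).

```python
def ask(channel, grammar_text, play_prompt, barge_in=True):
    """Prime the detector, play a prompt, then return the recognised words (or None)."""
    detector = channel.SpeechDetector
    grammar = detector.Grammar(grammar_text)
    # ONPLAYSTART lets the caller interrupt the prompt; ONPLAYEND does not
    if barge_in:
        trigger = detector.SpeechDetectorTrigger.ONPLAYSTART
    else:
        trigger = detector.SpeechDetectorTrigger.ONPLAYEND
    if detector.prime(grammar, trigger) is not True:
        return None
    play_prompt()  # hypothetical callable that plays the next media prompt
    # Blocks while the detector is PRIMED or RUNNING, as described above
    result = detector.get_recognised_speech()
    return result.get_recognised_words() if result is not None else None
```

Keeping the prompt as a callable means the same helper works whichever player the application uses, at the cost of the helper not knowing whether the play itself succeeded.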
- start(listen_for, seconds_detection_timeout=30)
Start the speech detector immediately.
If a media prompt is playing when this function is called, or if one is started afterwards, it will be interrupted if the caller starts to speak.
- Required argument:
- listen_for:
The grammar definition to listen for.
- Optional argument:
- seconds_detection_timeout:
The number of seconds to wait for the defined speech to be detected. Default is 30 seconds.
The argument listen_for will be of type Grammar.
The argument seconds_detection_timeout defines how long to wait for the speech detector to recognise the grammar definition in the media stream. The timer will start immediately.
This function returns True if the speech detection was successfully started.
This function can raise a Hangup exception.
Usage example:
my_grammar = channel.SpeechDetector.Grammar('yes | no;')
if channel.SpeechDetector.start(my_grammar) is True:
    # speech detection started OK
    pass
- state()
This function will return the current state.
When a particular job is busy, its state can be tracked by calling this function.
If this function is called when no job is in progress, the state will be IDLE.
- stop()
Stop the speech detector.
This function will return True if the speech detector is successfully stopped, else it will return False.
This function can raise a Hangup exception.
Usage example:
if channel.SpeechDetector.stop() is True:
    # speech detection stopped OK
    pass