A Tutorial Introduction to SALT

8. Speech Recognition Grammars

The way a SALT <LISTEN> object recognises speech is controlled by a recognition grammar written in a formalism called the Speech Recognition Grammar Specification (SRGS). SRGS is an XML-based markup language that specifies the word sequences that can be recognised. It also allows word occurrence probabilities to be used to affect the likelihood of different recognition outcomes. We give a basic introduction to SRGS below; full details can be found in the W3C Speech Recognition Grammar Specification (SRGS) or in the SASDK Documentation.

8.1 SRGS Elements
The main SRGS elements are <grammar>, <rule>, <item>, <one-of>, <ruleref> and <tag>. Each of these is discussed in more detail below.

<grammar> element
The grammar element is the highest level container and, through its root attribute, indicates the name of the top-level grammar rule.

<rule> element
Each grammar must be made up of one or more rules, which describe sequences or alternatives of recognised elements. One rule is identified as the root rule.

<item> element
The item element marks up a recognisable element. The element may contain a single word or a phrase. Consecutive items must be recognised in the order given, unless they are enclosed in a one-of element. Attributes allow you to specify whether the item can be repeated.

<one-of> element
The one-of element marks up alternative selections. When a number of item elements occur within a one-of element, only one of them can be chosen within a single recognised input. See the examples below.

<ruleref> element
The ruleref element allows rules to be included within one another to form a hierarchy. This allows a rule to be used more than once within one phrase, and allows rules for common components like numbers and dates to be shared. A special application of the ruleref element is to match extraneous input.

<tag> element
The tag element can be used to associate an action with the choice of a recognised element in a grammar rule. Typically this involves setting the value of some text within the XML-structured recognition result. This allows you to "parse" the input into meaningful units during recognition, rather than having a separate set of rules for parsing the recognised word string. The typical contents of a tag element look like this:
<tag> $.property={}; $.property._value="desired value"; </tag>
In this example '$' refers to the root of the XML recognition structure. The statement $.property={}; adds a new child element called 'property', while $.property._value="desired value" stores 'desired value' in the new child element. You can then access this stored text from your JavaScript once the recognition is complete. Here is an example:
<rule id="WeekdaySelection">
<one-of>
<item>Monday<tag>$.daynum={}; $.daynum._value="1";</tag></item>
<item>Wednesday<tag>$.daynum={}; $.daynum._value="3";</tag></item>
<item>Friday<tag>$.daynum={}; $.daynum._value="5";</tag></item>
</one-of>
</rule>
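The effect of the tag statements in this rule can be imitated in ordinary JavaScript, which may help when experimenting with tag contents away from the recogniser. The sketch below is only an illustration: runTag and toSml are hypothetical helpers, not part of SALT or the SASDK, and it assumes that each property simply becomes a child element holding its _value. The confidence attributes found in real results are left out.

```javascript
// Hypothetical helper: run the body of a <tag> element against a fresh '$'
// object, imitating how the tag statements build up the result structure.
function runTag(tagBody) {
  const $ = {};                      // '$' stands for the root of the result
  new Function("$", tagBody)($);     // execute the tag statements
  return $;
}

// Hypothetical helper: write the structure out in an SML-like form.
// Assumption: each property becomes a child element holding its _value.
function toSml(text, root) {
  const children = Object.keys(root)
    .map(name => "<" + name + ">" + root[name]._value + "</" + name + ">")
    .join("");
  return '<SML text="' + text + '">' + children + "</SML>";
}

const mondayResult = runTag('$.daynum={}; $.daynum._value="1";');
console.log(toSml("Monday", mondayResult));
// prints <SML text="Monday"><daynum>1</daynum></SML>
```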
The recognised output from the LISTEN object might then look like:
<SML confidence="0.850" text="Monday" utteranceConfidence="0.850">
<daynum confidence="0.850">1</daynum>
</SML>

8.2 Grammar Examples

This is an example of the use of weighted alternatives:
<grammar root="PizzaSize" xml:lang="en-US" version="1.0" xmlns="http://www.w3.org/2001/06/grammar">
<rule id="PizzaSize" scope="public">
a
<one-of>
<item weight=".5">small</item>
<item>medium</item>
<item weight="2">large</item>
</one-of>
pizza
</rule>
</grammar>
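To get a feel for what these weights express, the sketch below turns them into relative shares. It rests on assumptions: an item without a weight attribute is treated as having weight 1, and each weight is read as a simple relative bias. A real recogniser combines such biases with the acoustic evidence in its own way, so the figures are only indicative. The function name relativeBias is hypothetical.

```javascript
// Alternatives from the PizzaSize rule; an item without a weight attribute
// is treated as having weight 1.
const sizes = [
  { word: "small",  weight: 0.5 },
  { word: "medium" },               // no weight attribute in the grammar
  { word: "large",  weight: 2 }
];

// Assumption: treat each weight as a relative bias and normalise so that
// the shares sum to 1.
function relativeBias(items) {
  const weights = items.map(it => (it.weight === undefined ? 1 : it.weight));
  const total = weights.reduce((sum, w) => sum + w, 0);
  return items.map((it, i) => ({ word: it.word, share: weights[i] / total }));
}

relativeBias(sizes).forEach(s => console.log(s.word, s.share.toFixed(3)));
// prints: small 0.143, medium 0.286, large 0.571 (one per line)
```

On this reading "large" is favoured four times as strongly as "small", although all three words remain recognisable.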
This is an example of a rule being used twice within one recognised phrase using the ruleref element.
<grammar root="buyShirt" xml:lang="en-US" version="1.0" xmlns="http://www.w3.org/2001/06/grammar">
<rule id="buyShirt" scope="public">
<item>
Get me a <ruleref uri="#ruleColors" />
shirt and a <ruleref uri="#ruleColors"/>
tie</item>
</rule>
<rule id="ruleColors" scope="public">
<one-of>
<item>red</item>
<item>white</item>
<item>green</item>
</one-of>
</rule>
</grammar>
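One way to check what a grammar like this accepts is to list every phrase it can produce. The sketch below does so for the buyShirt grammar, using a hypothetical plain-data representation of the rules (the rules object and the expand function are illustrative only, not an SRGS parser).

```javascript
// The buyShirt grammar as plain data (a hypothetical representation):
// a rule is a sequence of parts, and each part is a literal string,
// a reference to another rule, or a set of alternatives.
const rules = {
  buyShirt: ["Get me a", { ref: "ruleColors" }, "shirt and a", { ref: "ruleColors" }, "tie"],
  ruleColors: [{ oneOf: ["red", "white", "green"] }]
};

// Expand a rule into every word sequence it accepts.
function expand(ruleName) {
  let results = [""];
  for (const part of rules[ruleName]) {
    let options;
    if (typeof part === "string") options = [part];
    else if (part.ref) options = expand(part.ref);   // like <ruleref>
    else options = part.oneOf;                        // like <one-of>
    const next = [];
    for (const r of results)
      for (const o of options) next.push((r + " " + o).trim());
    results = next;
  }
  return results;
}

const shirtPhrases = expand("buyShirt");
console.log(shirtPhrases.length);   // 9
console.log(shirtPhrases[0]);       // Get me a red shirt and a red tie
```

Because ruleColors is referenced twice, the two colour choices are independent, giving 3 x 3 = 9 phrases.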
8.3 Semantic Mark-up Demonstration

In this demonstration we show how to display and process the XML-formatted recognition result. We use an external grammar which associates colour names with hexadecimal values that can be used to control the background colour of a table cell. This is the grammar, stored in a file colours.grxml:
<grammar version="1.0" xml:lang="en-US"
xmlns="http://www.w3.org/2001/06/grammar" root="colours"
tag-format="semantics-ms/1.0">
<rule id="colours" scope="public">
<one-of>
<item>Black<tag> $.hex={}; $.hex._value="#000000"; </tag></item>
<item>Blue<tag> $.hex={}; $.hex._value="#0000FF"; </tag></item>
<item>Brown<tag> $.hex={}; $.hex._value="#A52A2A"; </tag></item>
<item>Gray<tag> $.hex={}; $.hex._value="#808080"; </tag></item>
<item>Green<tag> $.hex={}; $.hex._value="#008000"; </tag></item>
<item>Indigo <tag> $.hex={}; $.hex._value="#4B0082"; </tag></item>
<item>Orange<tag> $.hex={}; $.hex._value="#FFA500"; </tag></item>
<item>Pink<tag> $.hex={}; $.hex._value="#FFC0CB"; </tag></item>
<item>Purple<tag> $.hex={}; $.hex._value="#800080"; </tag></item>
<item>Red<tag> $.hex={}; $.hex._value="#FF0000"; </tag></item>
<item>Violet<tag> $.hex={}; $.hex._value="#EE82EE"; </tag></item>
<item>White<tag> $.hex={}; $.hex._value="#FFFFFF"; </tag></item>
<item>Yellow<tag> $.hex={}; $.hex._value="#FFFF00"; </tag></item>
</one-of>
</rule>
</grammar>
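Since every item in this grammar follows the same pattern, further items can be produced mechanically from a table of names and values. The sketch below is one possible way to do that; colourItem is a hypothetical helper and the three extra colours are chosen only for illustration, using their standard HTML hexadecimal values.

```javascript
// Hypothetical helper: build one <item> line in the style of colours.grxml.
function colourItem(name, hex) {
  return "<item>" + name + '<tag> $.hex={}; $.hex._value="' + hex + '"; </tag></item>';
}

// A few extra colour names with their standard HTML hexadecimal values.
const extraColours = { Cyan: "#00FFFF", Magenta: "#FF00FF", Silver: "#C0C0C0" };

// Print one grammar line per colour, ready to paste inside <one-of>.
Object.keys(extraColours).forEach(name =>
  console.log(colourItem(name, extraColours[name])));
```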
A larger version of the grammar, with more colours, is also available. This is the application code, which you can try out on your computer in a normal version and a debug version.
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
<object id="speech-add-in" CLASSID="clsid:33cbfc53-a7de-491a-90f3-0e782a7e347a">
</object>
<?import namespace="salt" implementation="#speech-add-in"/>
<!-- SALT: Recognise HTML Colour Names -->
<salt:listen id="recColour" onreco="doColour()">
<salt:grammar src="colours.grxml" />
<salt:bind targetelement="txtDebug" value="/" />
</salt:listen>
<body>
<h1><center>SALT: Recognise HTML Colours</center></h1>
<p><center><table border=1 bgcolor="white">
<tr>
<td colspan=2>
<textarea id="txtDebug" rows=5 cols=50>
SML format recognition results appear here.
</textarea>
</td>
</tr>
<tr>
<td align=center><input name="txtColour" type="text" width="10" /></td>
<td width=150 id="txtCell"> </td>
</tr>
</table></center>
<p><center><input type="button" value="Click to Speak" onclick="recColour.Start()">
<p>Click button, wait for level meter, speak colour name.
</body>
<script>
function doColour()
{
// set text field to colour name
var pRecog=document.getElementById("recColour");
var pField=document.getElementById("txtColour");
pField.value=pRecog.text;
// but set cell colour to hex value
var pCell=document.getElementById("txtCell");
var pNode = pRecog.recoresult.selectSingleNode("//hex");
if (pNode != null) pCell.style.background=pNode.text;
}
</script>
</html>
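The lookup performed in doColour() can be tried outside the browser with a rough stand-in for selectSingleNode("//hex"). The sketch below is an assumption-laden illustration: hexFromSml is a hypothetical function, the sample result is modelled on the SML shown in section 8.1 with made-up confidence values, and a regular expression is used in place of a real XML parser, which is only adequate for flat results of this kind.

```javascript
// Hypothetical stand-in for selectSingleNode("//hex"): find the text of the
// first <hex> element in an SML string. This is not an XML parser.
function hexFromSml(sml) {
  const match = /<hex\b[^>]*>([^<]*)<\/hex>/.exec(sml);
  return match ? match[1] : null;
}

// Assumed shape of a result for the word "Red"; confidence values are made up.
const sampleSml = '<SML confidence="0.850" text="Red" utteranceConfidence="0.850">' +
                  '<hex confidence="0.850">#FF0000</hex></SML>';

console.log(hexFromSml(sampleSml));               // #FF0000
console.log(hexFromSml('<SML text="hello"/>'));   // null
```

As in doColour(), the caller should check for null before using the value, since a result need not contain a hex element.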
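Particular aspects to note are: the <salt:grammar> element loads the external grammar file colours.grxml through its src attribute; the <salt:bind> element with value="/" copies the whole SML result into the txtDebug text area; and the doColour() function, called through the onreco attribute, uses selectSingleNode("//hex") to pull the hexadecimal value out of the recognition result.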
University College London - Gower Street - London - WC1E 6BT