Speech Processing by Computer

LECTURE 10

SPEECH UNDERSTANDING

Objectives

By the end of the session you should:

        be able to outline the general architecture of a contemporary large vocabulary speech recognition system

        be able to describe the function of the phonetic, word and sentence decoding stages

        be able to describe in general terms how acoustic models and language models are built from training data

        appreciate that knowledge in these systems is expressed probabilistically

        appreciate that recognition is finding the best explanation of the signal

 

Outline


 

10.  Speech Understanding

    10.1.  General Architecture
    10.2.  Phonetic Decoding
        10.2.1.  Acoustic Model
    10.3.  Word Decoding
        10.3.1.  Dictionary
    10.4.  Sentence Decoding
        10.4.1.  Language Model
    10.5.  Putting it all together
        10.5.1.  Viterbi decoding
        10.5.2.  Bayes' Theorem
    10.6.  Research Challenges
        10.6.1.  Systematic phonetic variation
        10.6.2.  Long-distance modelling
        10.6.3.  Accent variation
        10.6.4.  Recognition in noise
        10.6.5.  Adaptation

Reading

W. Holmes and M. Huckvale, Why Have HMMs Been So Successful for Automatic Speech Recognition?, in Speech, Hearing and Language - Work in Progress, University College London, 1994.

K.F. Lee, An Overview of the SPHINX Speech Recognition System, IEEE Transactions on Acoustics, Speech and Signal Processing, 38 (1991) p423.