Speech Processing by Computer
LECTURE 9
SPEECH RECOGNITION
Objectives
By the end of the session you should:
• be able to describe the basic architecture of an isolated word recognition system
• be able to list the main sources of variability that present problems for such systems and explain how they are typically overcome
• be able to describe what end-point detection does
• be able to justify the choice of acoustic parameters used for matching
• be able to explain why non-linear time alignment is necessary and how it works in general terms
Outline
9. Isolated Word Recognition
9.1. Aims
9.1.1. Word identification
9.2. Applications
9.2.1. Command and Control
9.3. Problems
9.3.1. Segmentation
9.3.1.1. Isolated words
9.3.2. Speaker variability
9.3.2.1. Speaker dependency
9.3.3. Environmental variability
9.3.3.1. Close-talking microphone
9.3.4. Discrimination
9.3.4.1. Limited vocabulary
9.4. Construction
9.4.1. End-point detection
9.4.2. Acoustic Processing
9.4.2.1. Spectral shape parameters
9.4.3. Matching
9.4.3.1. Spectrum distance
9.4.3.2. Time alignment
Reading
G. Chollet, "Automatic Speech and Speaker Recognition", in Fundamentals of Speech Synthesis and Speech Recognition, ed. E. Keller, Wiley, 1994.
J.N. Holmes, "Speech recognition by pattern matching of whole words", in Speech Synthesis and Recognition, van Nostrand, 1988.