Speech Processing by Computer

 

AIMS

 

Overview

 

            The Speech Processing by Computer course aims to give an elementary introduction to the concepts, methods and applications of speech signal processing.  The course covers the principles of digital processing, the acquisition and replay of signals, the operation of linear digital systems, spectral analysis, fundamental frequency analysis, formant analysis, and applications in speech synthesis and speech recognition.  Lecture material is tied to laboratory sessions, which involve the use of ready-made application programs.

 

Pre-requisites

 

            A background in the acoustics of speech and hearing is recommended, but no knowledge of computers or programming is required. 

 

Objectives

 

            By the end of the course, the student should have enough knowledge to appreciate the kinds of speech processing that computers can conveniently carry out.  The student should understand in principle how speech signals are acquired, represented and manipulated on digital systems.  For the main types of processing and analysis, the student should know what range of methods is available and some of their relative advantages and disadvantages.  For the two main application areas, speech synthesis and speech recognition, the student should be able to identify the main processing stages and understand the main challenges.  On completing the course, the student should be able to undertake a phonetic research project that involves the use of computer analysis.

 

Assessment

 

            Undergraduates: 2 laboratory reports, 1 two-hour written examination.

            Postgraduates: 1 laboratory report, 1 three-hour written examination.

 

Readings

 

            Readings for each week will be provided. 

            The best introduction to signals and systems theory is:

                        Rosen & Howell, Signals and Systems for Speech and Hearing, Academic Press, 1990

            The most accessible introduction to speech signal processing is:

                        Harrington & Cassidy, Techniques in Speech Acoustics, Kluwer, 1999

            The best introduction to speech synthesis and recognition is:

                        Holmes & Holmes, Speech Synthesis and Recognition, Taylor & Francis, 2001

            A good general introduction to the field is:

                        Gold & Morgan, Speech and Audio Signal Processing, Wiley, 1999

 

Syllabus

 

Week  Lecture                                                   Lab                                                       Slides

1     Digital sampling and editing of signals                   Digital recording and editing
2     Principles of digital signal processing                   Linear systems
3     Digital filtering                                         Filterbank analysis/synthesis
4     Spectral Analysis                                         DFT and LPC spectra
5     Fx and Formant Analysis                                   Fx and Formant Analysis
6     Neural Networks for Speech and Language – 1 Introduction  Neural Networks for Speech and Language – 2 Applications  Slides (PDF)
                                                                                                                          Slides (PDF)
7     Speech Synthesis: text to transcription                   Synthesis by rule
8     Speech Synthesis: transcription to sound                  Concatenative synthesis
9     Speech Recognition: isolated word recognition             Template matching
10    Speech Recognition: speech understanding                  Statistical matching