Speech Processing by Computer
LAB 7: TEXT-TO-SPEECH SYNTHESIS
In this lab session we will use three web-based synthesis systems to investigate the components of a text-to-speech (TTS) system and the challenges each component faces.
1. Open these systems in three separate browser windows.
   a. The AT&T Next-Generation TTS system at:
      http://www.research.att.com/~mjm/cgi-bin/ttsdemo
   b. The Lucent Technologies TTS system at:
      http://www.bell-labs.com/project/tts/voices-java.html
   c. The Edinburgh Festival TTS system at:
      http://www.cstr.ed.ac.uk/projects/festival/userin.html
2. Design some sentences that stress different levels of the system:
   a. Text normalisation (e.g. ambiguous abbreviations)
   b. Prosodic phrasing (e.g. garden path sentences)
   c. Intonation (e.g. locations of pitch prominences)
   d. Letter-to-sound (e.g. some odd pronunciations)
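One way to keep the test material organised is to group the sentences by the level they are meant to stress. The sketch below is only an illustration: the sentences are suggested examples, not part of the lab sheet, and the names TEST_SENTENCES and as_test_script are hypothetical.

```python
# Illustrative test sentences, grouped by the level of the TTS system
# they are meant to stress. All sentences are suggestions only.
TEST_SENTENCES = {
    "text normalisation": [
        "Dr. Jones lives at 12 St. James Dr.",   # Dr. = Doctor / Drive, St. = Saint / Street
        "He read 1984 in 1984.",                 # title vs. year
    ],
    "prosodic phrasing": [
        "The old man the boats.",                # garden path: "man" is the verb
        "The horse raced past the barn fell.",
    ],
    "intonation": [
        "I didn't say she stole the money.",     # meaning shifts with the accented word
    ],
    "letter-to-sound": [
        "The rough bough fell through the borough.",
        "The colonel joined the choir.",
    ],
}


def as_test_script(sentences_by_level):
    """Flatten the dictionary into labelled lines, one per sentence."""
    lines = []
    for level, sentences in sentences_by_level.items():
        for sentence in sentences:
            lines.append(f"[{level}] {sentence}")
    return lines
```

Each line returned by as_test_script can then be pasted, without its label, into each of the three systems so that all of them are tested on identical input.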
3. Test the different systems with your sentences.
   a. Save the audio to files on the computer and view them with WASP.
   b. Are there differences between the systems?
   c. What default assumptions do the systems make?
4. Listen to the durations, pitch and voice quality of these systems.
   a. What aspects of the speech are still in need of improvement?
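Listening judgements about duration can be backed up with a rough measurement. The following is a minimal sketch, assuming the audio was saved as uncompressed WAV files; it uses Python's standard wave module, and the function name wav_duration_seconds is hypothetical.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of sample frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()
```

Calling this function on the files saved for the same sentence from each system gives a simple comparison of overall speaking rate. It says nothing about the durations of individual segments, which still need to be inspected in WASP.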
5. Compare a couple of synthetic utterances with natural recordings.
   a. Print out spectrograms of a few words of synthetic speech and a few words of natural speech, and identify the largest differences.
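As background for reading the printouts, the sketch below shows roughly what a spectrogram computes: the signal is cut into short overlapping frames, each frame is windowed, and the magnitude of its discrete Fourier transform is taken. This is an illustration in plain Python, not a description of how WASP works; the function names, the Hamming window, and the frame and hop sizes are all assumptions chosen for the example, and the direct DFT used here is far too slow for real recordings.

```python
import cmath
import math


def hamming(n_points):
    """Hamming window of length n_points."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def magnitude_spectrogram(samples, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len/2) per frame."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def peak_frequency(spectrum, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in one spectrum."""
    k = max(range(len(spectrum)), key=spectrum.__getitem__)
    return k * sample_rate / frame_len
```

Short frames, as here, give good time resolution but poor frequency resolution (a wide-band analysis), while longer frames do the reverse. The same trade-off applies when choosing the display settings for the synthetic and natural spectrograms.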