VTDemo - Vocal Tract Acoustics Demonstrator
VTDemo is an interactive Windows PC program for demonstrating how the quality of different speech sounds can be explained by changes in the shape of the vocal tract. With VTDemo you can move the articulators in a 2D simulation of the vocal tract cavity and hear, in real time, the effect on the sound produced.
VTDemo is an implementation of the articulatory synthesizer of Shinji Maeda developed from the program VTCALCS as distributed by Satrajit Ghosh at Boston University. The VTCALCS program converts a set of seven vocal tract shape parameters into a vocal tract area function which is then used to filter a voicing signal from a modelled voice source.
VTDemo 1.4 extends VTCALCS in the following ways:
- Real-time synthesis, so that the effect of changes to the articulatory parameters can be heard as they are manipulated.
- A new voice source model based on the LF Model.
- A new GA parameter to control glottal area.
- A real-time spectral display.
- Creation/replay of animated articulations.
- Vocal tract length switch for adult male/adult female/child speaker.
VTDemo 3.6 adds these capabilities:
- Control table for editing dynamic synthesis
- NS parameter for controlling size of velo-pharyngeal port
- Animation of glottal area and velum
- Real-time formant frequency estimation
- Pre-programmed vowel and consonant examples
Download & Installation
The program is only available for Windows PCs by anonymous FTP from
- ftp://ftp.phon.ucl.ac.uk/pub/sfs/vtdemo/vtdemo140.exe VTDEMO version 1.4 (0.3MB)
- ftp://ftp.phon.ucl.ac.uk/pub/sfs/vtdemo/vtdemo360.exe VTDEMO version 3.6 (0.6MB)
To download the program, right click on the link above and choose "Save Target As". Save the file to your desktop or to a folder on your computer. Then run the file to install the program and to add an entry to your Start Programs menu. Once installed you can delete the downloaded file.
The VTDemo 1.4 menu options are as follows:
- File/Open Animation
- Open a text file describing a vocal tract animation. The format of each line in a file is as follows: a) Duration of section in ms, b) JW parameter, c) TP parameter, d) TS parameter, e) TA parameter, f) LA parameter, g) LP parameter, h) LH parameter, i) GA parameter, j) FX parameter. Each parameter value is in the range -3.0 .. 3.0, and identifies the value at the start of the section. Interpolation between sections is automatic. The last section specifies the final interpolation target and should have zero duration. Some example files are supplied with the VTDemo installation.
- File/Save Animation
- Saves an animation created or edited in the animation table to a text file.
- Synth/Play Animation
- Starts replay of the current animation from the beginning.
- Synth/Play Continuously
- Starts continuous synthesis and replay using the last known control parameters. Use the VT controls panel to manipulate the vocal tract shape.
- Synth/Play Table
- Starts replay of the animation stored in the edit table.
- Stops current animation or synthesis.
- Sets vocal tract length to typical adult male.
- Sets vocal tract length to typical adult female.
- Sets vocal tract length to typical child.
- Play an animated synthesis of a few simple vowel articulations.
- View/VT Controls
- Display the vocal tract shape parameter control panel.
- View/Animation Table
- Display the animation editing table.
- Display a spectrum during synthesis.
- Display some configuration options:
- Simulation frequency: sets the internal synthesis rate. Default: 32000.
- Decimation: sets the number of internal synthesis frames computed for each audio sample that is output. A decimation of 2 means that the output sample rate is half the simulation rate. Default: 2.
- Voice model: The Fant model is the original voice source model of VTCALCS. The LF model is an implementation of the Liljencrants-Fant model. Default: LF model.
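The animation file format described under File/Open Animation can be sketched in Python as follows. This is a minimal illustration, not VTDemo's own reader: it assumes that fields are separated by whitespace or commas and that interpolation between sections is linear, neither of which is stated above. The names `Section`, `parse_animation` and `params_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Parameter order on each line, following the File/Open Animation description.
PARAM_NAMES = ["JW", "TP", "TS", "TA", "LA", "LP", "LH", "GA", "FX"]


@dataclass
class Section:
    duration_ms: float
    params: List[float]  # values at the START of the section, in PARAM_NAMES order


def parse_animation(text: str) -> List[Section]:
    """Parse animation text into sections (assumed whitespace/comma separated)."""
    sections = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 1 + len(PARAM_NAMES):
            raise ValueError(f"line {line_no}: expected 10 fields, got {len(fields)}")
        values = [float(f) for f in fields]
        duration, params = values[0], values[1:]
        if duration < 0:
            raise ValueError(f"line {line_no}: negative duration")
        for name, value in zip(PARAM_NAMES, params):
            if not -3.0 <= value <= 3.0:
                raise ValueError(f"line {line_no}: {name}={value} outside -3.0 .. 3.0")
        sections.append(Section(duration, params))
    return sections


def params_at(sections: List[Section], t_ms: float) -> List[float]:
    """Parameter values at time t_ms, assuming linear interpolation."""
    if not sections:
        raise ValueError("animation has no sections")
    t_ms = max(t_ms, 0.0)
    start = 0.0
    for current, following in zip(sections, sections[1:]):
        end = start + current.duration_ms
        if t_ms < end:
            frac = (t_ms - start) / current.duration_ms
            return [a + frac * (b - a)
                    for a, b in zip(current.params, following.params)]
        start = end
    # At or beyond the final (zero-duration) section: hold the final target.
    return list(sections[-1].params)
```

For example, `params_at(parse_animation(text), 50.0)` returns the nine parameter values 50 ms into the animation. The last line of the file acts only as an interpolation target, which is why it should have zero duration.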
Vocal Tract Shape Control Parameters
|Parameter||Description||Range||Notes|
|JW||Jaw Height||−3 .. 3|
|TP||Tongue Position||−3 .. 3|
|TS||Tongue Shape||−3 .. 3|
|TA||Tongue Apex||−3 .. 3|
|LA||Lip Area||−3 .. 3|
|LP||Lip Protrusion||−3 .. 3|
|LH||Larynx Height||−3 .. 3|
|GA||Glottal Area||−3 .. 3||-3.0..-2.7 = Open; -2.7..-1.5 = Voiceless; -1.5..-1.0 = Breathy voice; -1.0..1.5 = Normal voice; 1.5..3.0 = Creaky voice|
|FX||Fundamental Frequency||−3 .. 3||Adult Male: 89-191 Hz; Adult Female: 161-299 Hz|
|NS (3.6 only)||Velo-pharyngeal port||0 .. 3|
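The GA and FX notes in the table can be read as simple lookups. The sketch below is illustrative only and makes two assumptions that the table does not state: each GA band includes its lower bound (so -2.7 counts as voiceless), and FX maps linearly from -3 .. 3 onto the listed frequency range. The function names `ga_quality` and `fx_to_hz` are hypothetical.

```python
# GA bands from the parameter table: (lower bound, upper bound, voice quality).
GA_BANDS = [
    (-3.0, -2.7, "open"),
    (-2.7, -1.5, "voiceless"),
    (-1.5, -1.0, "breathy voice"),
    (-1.0, 1.5, "normal voice"),
    (1.5, 3.0, "creaky voice"),
]

# FX ranges in Hz from the parameter table.
FX_RANGES_HZ = {
    "adult male": (89.0, 191.0),
    "adult female": (161.0, 299.0),
}


def ga_quality(ga: float) -> str:
    """Voice quality for a GA value (assumption: lower bounds are inclusive)."""
    if not -3.0 <= ga <= 3.0:
        raise ValueError("GA must be in the range -3.0 .. 3.0")
    for low, high, label in GA_BANDS:
        if low <= ga < high:
            return label
    return GA_BANDS[-1][2]  # ga == 3.0 falls in the top band


def fx_to_hz(fx: float, speaker: str = "adult male") -> float:
    """Fundamental frequency in Hz (assumption: linear mapping of -3 .. 3)."""
    if not -3.0 <= fx <= 3.0:
        raise ValueError("FX must be in the range -3.0 .. 3.0")
    low, high = FX_RANGES_HZ[speaker]
    return low + (fx + 3.0) / 6.0 * (high - low)
```

Under these assumptions, an FX value of 0 corresponds to the midpoint of the speaker's range. The actual mapping used by the program may differ.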
References
- Maeda, S. (1982), "A digital simulation method of the vocal-tract system", Speech Communication, 1, 199-229.
- Maeda, S. (1990), "Compensatory Articulation during Speech: Evidence from the Analysis and Synthesis of Vocal-Tract Shapes using an Articulatory Model", in W.J. Hardcastle & A. Marchal (Eds.), Speech Production and Speech Modelling, 131-149. Kluwer Academic Publishers.
- VTCALCS software download at http://www.cns.bu.edu/~speech/VTCalcs.php.
Please send suggestions for improvements and reports of program faults to email@example.com.
VTDemo is not public domain software; its intellectual property is owned by Mark Huckvale, University College London. However, VTDemo may be used and copied without charge as long as the program remains unmodified and continues to carry this copyright notice. Please contact the author for other licensing arrangements. VTDemo carries no warranty of any kind; you use it at your own risk.