Project Deliverables

ProSynth Project Overview
ProSynth Sentence Database
ProSynth2 Synthesis for Windows
ProSynth Web Demonstration
ProSynth XML processing tools
PROCSY: ProSynth Copy Synthesis

Contents

The ProSynth Project

ProSynth was a project funded by the UK Engineering and Physical Sciences Research Council from 1996 to 2001.

These pages describe the publicly-available resources developed on that project.

You are welcome to use these resources for any purpose, but please acknowledge their source.

The ProSynth investigators were:

Sarah Hawkins (University of Cambridge)
Jill House (University College London)
Mark Huckvale (University College London)
John Local (University of York)
Richard Ogden (University of York)

aided by:

Sebastian Heid (University of Cambridge)
Paul Carter (University of York)
Jana Dankovicova (University College London)
Alex Fang (University College London)
Rachel Knight (University College London & University of Cambridge)
Mark Wainwright (University of Cambridge)

The aims of the ProSynth project were described in: Ogden, R., Hawkins, S., House, J., Huckvale, M., Local, J., Carter, P., Dankovicova, J., Heid, S. (2000). ProSynth: an integrated prosodic approach to device-independent natural-sounding speech synthesis. Computer Speech and Language, 14, 177-210.

ProSynth Sentence Database

A corpus of phonetically annotated recordings made during the project which were used for the modelling of prosody.

The corpus contains 472 annotated sentences exploring rhythmic patterns and segmental content. Each file contains the speech signal, laryngograph signal, fundamental frequency, formant frequencies and aligned phonological structure. All sentences were produced by the same male speaker.

You can browse through the sentences individually with our audio browsing tool.

The whole corpus is available at this address:

https://www.phon.ucl.ac.uk/downloads/prosynth/

The corpus can also be ordered on CD-ROM from the UCL Listening Centre.

ProSynth2 Synthesis for Windows

The ProSynth2 Windows application can be used to demonstrate all-prosodic speech synthesis - synthesis from a combination of a rich hierarchical phonological structure with declarative knowledge for phonetic interpretation. With ProSynth2 you can see text being converted to phonological form and then to sound stage by stage. You can even customise ProSynth2 for your own research in speech synthesis.

The current version supports input in the following formats:

Plain text with diacritics
Phonetic transcription with diacritics
Regular English with diacritics.

The specification of prosody and phonetic interpretation can be made with the following alternative rule sets:

Klatt rules for English sentences
Prosody modelled from sentences in the ProSynth corpus

Output can be generated using the following methods:

Formant synthesis
MBROLA diphone synthesis (requires MBROLA tools and database to be installed)
Prosody manipulated natural speech (requires source data to be in Speech Filing System (SFS) format)

ProSynth2 is only available from:

https://www.phon.ucl.ac.uk/downloads/prosynth/prosynth210.exe

The application comes with help files which describe the internal operation of the system. Limited help is available from Mark Huckvale.

ProSynth Web Demonstration

The ProSynth Web demonstration is a quick and easy way to learn more about the ProSynth approach to synthesis. The Web demonstration allows you to type in text and view the phonological representation generated by ProSynth and to hear a signal generated from that representation.

The current demonstration supports input in the following formats:

Plain text with diacritics
Phonetic transcription with diacritics
Regular English with diacritics.

The specification of prosody and phonetic interpretation can be made with the following alternative rule sets:

Klatt rules for English sentences
Prosody modelled from sentences in the ProSynth corpus

Output can be generated using the following methods:

Formant synthesis
MBROLA diphone synthesis
Festival diphone synthesis

The ProSynth web demonstration is at http://www.phon.ucl.ac.uk/project/prosynth/prosynthdemo.htm.

ProSynth XML Processing Tools

In the ProSynth project we chose to encode the internal phonological representations in XML. The use of a standard text representation rather than a proprietary format for working data opens up the internal operation of our system. As can be seen in our Windows demonstration, it is easy to investigate the processing of the individual stages in synthesis by viewing the intermediate representations.
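As a rough illustration of why a plain XML representation is easy to inspect, the following Python sketch parses a small hierarchical utterance and prints its structure. The element and attribute names (UTT, FOOT, SYL, ONSET, RHYME, SEG, STRENGTH, PHON) are hypothetical and chosen only for this example; they are not taken from the actual ProSynth document type, and the sketch uses Python's standard library rather than the ProSynth tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical utterance: element and attribute names are illustrative only
# and are NOT the real ProSynth XML vocabulary.
SAMPLE = """\
<UTT>
  <FOOT>
    <SYL STRENGTH="STRONG">
      <ONSET><SEG PHON="k"/></ONSET>
      <RHYME><SEG PHON="a"/><SEG PHON="t"/></RHYME>
    </SYL>
  </FOOT>
</UTT>
"""


def outline(element, depth=0):
    """Return one indented line per element, showing its tag and attributes."""
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    lines = ["  " * depth + element.tag + (f" [{attrs}]" if attrs else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


def segments(root):
    """Collect the PHON attribute of every SEG element in document order."""
    return [seg.get("PHON") for seg in root.iter("SEG")]


root = ET.fromstring(SAMPLE)
print("\n".join(outline(root)))
print(segments(root))
```

Because each intermediate stage is ordinary text, the same kind of short script could be pointed at the output of any stage to see what that stage added or changed.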

The XML processing tools we developed on the project fall into two categories: (i) the UTT tools, which are specifically designed to operate on ProSynth data structures; and (ii) the ProXML interpreter and scripts. ProXML is a general-purpose XML processing script language, which we have developed and used for encoding knowledge about phonetic interpretation in ProSynth.

The ProSynth XML tools are based on the package LT-XML from the Language Technology Group in Edinburgh. You will need their toolkit to compile the package. In addition some tools interface with Speech Filing System (SFS) files. You will need this package to compile those parts.

You can download the XML tools and the ProXML interpreter from our site:

https://www.phon.ucl.ac.uk/downloads/prosynth/xmltools_src_yyyymmdd.tar.gz Sources (800k)
https://www.phon.ucl.ac.uk/downloads/prosynth/xmltools_bin_solaris_yyyymmdd.tar.gz Binaries for Sun Solaris (3Mb)

Limited help is available from Mark Huckvale.

PROCSY: ProSynth Copy Synthesis

The PROCSY system generates control data for the Sensimetrics HLSyn quasi-articulatory synthesizer from audio file input. It can be used to speed up the generation of stimuli for perceptual experiments. For us it is a staging post for our work in constructing a full synthesis-by-rule system from text to speech through the HLSyn system.

You can read more about PROCSY at http://kiri.ling.cam.ac.uk/procsy/documentation/.

You can download the latest version compiled to run on Windows using the Cygwin environment. Note that HLSyn is NOT included in the package; it must be purchased from Sensimetrics (see below).

Here is an example of PROCSY and HLSyn:

Input waveform
Input label file
Input transcription file
XML encoding (automatically generated by wav2xml.sh)
HLSyn control file (automatically generated by procsy.sh)
HLSyn output waveform (automatically generated by hlsyn)

You can download the entire ProSynth corpus converted to HL files from:

https://www.phon.ucl.ac.uk/downloads/prosynth/hlfiles_yyyymmdd.tar.gz

Note: We use a Unix version of HLSyn for our work. However, this version is not currently available from Sensimetrics. Converting HL files to audio therefore requires purchasing the Windows version of HLSyn (this costs $500 for educational use). To load an HL file into Windows HLSyn, first select 'File/New', then 'File/Import/HL Parameters'.

Limited help is available from Sarah Hawkins.