Many thanks to Sarah and Mark for salutary suggestions. I attach the latest
version. I hope it goes some way to answering criticisms, though it's far
from perfect. If you have time, please check for sense, and for whether I
am claiming anything we cannot realistically have some results for by next
summer!
Further suggestions for improvement also welcome, though since it's
predictably going to be a last-minute submission, I shall send it tomorrow
lunchtime at the latest, and will be in a state of neurosis till I get
confirmation of its arrival. And I'm tied up 11-1 tomorrow, so not a lot
of time.
Have to go now and if necessary rescue sister and brother-in-law from my
doorstep, where they're due to arrive imminently.
Jill
--=====================_908382469==_
Content-Type: application/rtf; charset="us-ascii"
Content-Disposition: attachment; filename="ICPhS99.rtf"
{\rtf1\ansi \deff4\deflang1033{\fonttbl{\f4\froman\fcharset0\fprq2 Times New Roman;}{\f97\froman\fcharset2\fprq2 WP MultinationalA Roman;}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;
\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\widctlpar \f4\lang2057 \snext0 Normal;}{\*\cs10 \additive Default Paragraph Font;}}{\info{\title Intonation modelling in ProSynth: an integrated prosodic approach to speech synthesis}
{\author J.House}{\operator J.House}{\creatim\yr1998\mo10\dy12\hr21\min55}{\revtim\yr1998\mo10\dy14\hr18\min15}{\printim\yr1998\mo10\dy12\hr20\min23}{\version3}{\edmins91}{\nofpages1}{\nofwords464}{\nofchars2647}{\*\company Phonetics,UCL}{\vern57443}}
\paperw11906\paperh16838 \widowctrl\ftnbj\aenddoc\formshade \fet0\sectd \linex0\headery709\footery709\colsx709\endnhere {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl2\pnucltr\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl3
\pndec\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang{\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}
{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}\pard\plain \qc\widctlpar \f4\lang2057 {
\b\fs28 Intonation modelling in ProSynth: an integrated prosodic approach to speech synthesis
\par }\pard \widctlpar
\par {\fs20 Speech synthesis in ProSynth uses a rich linguistic representation, consisting of linked syntactic and prosodic hierarchical structures implemented in XML. Structural nodes store linguistic attributes and acoustic-phonetic values derived from a
spoken database exemplifying structures of interest. Phonetic interpretation integrates information stored at all levels to generate natural-sounding, perceptually robust speech.
\par
\par Intonation modelling involves identifying relevant properties of the F0 contour and relating them correctly to the constituents in the strictly layered prosodic hierarchy. The contour itself, chosen from a phonological inventory, is
specified as an attribute of the Accent Group (AG), and determined by discoursal information stored at the top of the hierarchy in the Intonational Phrase (IP). Components of the AG are Feet, and within these are Syllables and their
constituents, Onsets and Rhymes.
\par
\par Both frequency scaling and temporal alignment of F0 contours are sensitive to prosodic structure. For example, the alignment of pitch accent peaks and valleys is constrained by proximity to upcoming IP, AG, or Foot boundaries. Further adjustments to
timing and frequency depend on properties of the syllabic constituents. Word boundaries can also have relevant timing effects; though excluded from our strictly layered prosodic hierarchy, Word information is recovered from
the syntactic hierarchy and integrated at Syllable level.
\par
\par The ProSynth database speech files are fully labelled in terms of our prosodic constituents. Acoustic-phonetic values at constituent boundaries are extracted and used as quantitative data for our predictive model.
F0 modelling uses an even richer segmentation: additional values are extracted for labels coinciding with turning points in a template shape associated with each pitch accent and defined in the AG. For example, in a falling contour (phonologically H*L)
we identify more than just H* and L; we need minimally to locate the point at which the contour reaches its peak, the point at which it begins its fall (points which do not necessarily coincide), and the point where it levels out. For synthesis,
template values are generated at AG level, then integrated with the boundary values of other constituents as they are pushed down through the structure, and a \ldblquote best fit\rdblquote contour generated. We compare our results with natural speech
and will be evaluating them perceptually. Our work is important for its use of structure to integrate F0 synthesis with temporal and segmental properties, and for the light it sheds on
the contribution of different domains (Syllable, Foot, AG) to pitch accent realisation.
\par
\par
\par Authors: \tab \tab Jill House, Jana Dankovi}{\fs20 {\field{\*\fldinst SYMBOL 133 \\f "WP MultinationalA Roman" \\s 10}{\fldrslt\f97\fs20}}}{\fs20 ov}{\fs20 {\field{\*\fldinst SYMBOL 60 \\f "WP MultinationalA Roman" \\s 10}{\fldrslt\f97\fs20}}}{\fs20
, Mark Huckvale
\par Affiliation: \tab \tab University College London, UK
\par Author to contact: \tab Jill House
\par Postal Address: \tab \tab Dept of Phonetics & Linguistics
\par \tab \tab \tab UCL
\par \tab \tab \tab Gower Street \tab
\par }\pard \fi720\li1440\widctlpar {\fs20 London WC1E 6BT, UK
\par }\pard \widctlpar {\fs20 E-mail: \tab \tab \tab jill@phonetics.ucl.ac.uk
\par Telephone:\tab \tab +44 171 419 3167
\par Fax:\tab \tab \tab +44 171 383 4108
\par No. of words in abstract: \tab 397
\par Subject area: \tab \tab Prosody
\par Preferred method of presentation: No preference
\par }
\par
\par }
--=====================_908382469==_
Content-Type: text/plain; charset="us-ascii"
--=====================_908382469==_--