UCL Phonetics and Linguistics

The effect of speaker variability on speech perception in children

A research project funded by the Wellcome Trust


Administrative Details

Grant Period: 20 September 1999 - 19 September 2002

Grant Award: £116,003

Investigators: Valerie Hazan

Research Fellow: Duncan Markham

Overview

The acoustic patterns of speech vary significantly across speakers as a function of vocal tract size, sex, age and accent. A key process in speech perception is the ability to normalise, i.e. to recognise that sounds which are acoustically different may belong to the same consonant or vowel category. It is known that adult listeners are significantly affected by speaker variability: certain speakers are significantly more intelligible than others, and lower intelligibility scores are obtained for tests using multiple speakers than for tests using single speakers.

One area which has hitherto been little explored is the way in which children are affected by speaker variability during language acquisition. This will be the focus of the proposed study. A study of speaker normalisation in children has practical applications as well as implications for models of speech development. A better understanding of what constitutes a highly intelligible speaker for a child would be extremely useful in the development of computer-based auditory-training applications such as those linked to speech and language therapy or second-language learning.

In the first phase of the work, a large database of high-quality speech and laryngographic recordings was obtained for 45 speakers (18 women, 15 men, 6 boys, 6 girls) from a homogeneous accent group. This in itself constitutes a unique resource that is now being put in the public domain. The test material, which consisted of 124 words familiar to 7-year-olds that adequately covered all frequent consonant confusions, was presented to 135 listeners (adults, 11-12 year olds and 7-8 year olds) in a low level of background noise. Intelligibility for individual speakers on identical test material varied from 81.2% to 96.4%. Younger listeners made significantly more errors, but the relative intelligibility of individual speakers was highly consistent across listener groups.

In a second study, listener ratings on a number of subjective voice dimensions (e.g. mumbly-precise) were obtained for a subset of the most and least intelligible speakers. Key descriptors correlating with intelligibility appeared to relate to articulation, voice dynamics and general quality.

Finally, measures of fundamental frequency, long-term average spectrum, word duration, CV intensity ratio and vowel space were obtained for all 45 speakers. Overall, intelligibility was significantly correlated with two measures: the total energy contained in the long-term average spectrum region between 1 and 3 kHz, and word duration. Jointly, these measures predicted around 60% of the variability in the intelligibility data. However, there was variability across talker subgroups, with no significant correlations between intelligibility and acoustic-phonetic measures obtained for child talkers. Also, the profiles of the 'best' and 'worst' talkers highlighted the considerable diversity of factors contributing to intelligibility. These results confirm the difficulty in finding reliable acoustic-phonetic correlates of talker intelligibility.
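As a rough illustration of what it means for two acoustic measures to jointly predict a share of the variability in intelligibility, the sketch below fits an ordinary least-squares model with two predictors and reports the proportion of variance explained (R squared). It is not the analysis used in the project: the function name, the variable names and all the numbers are hypothetical and were invented for this example.

```python
# Illustrative sketch only: the values below are synthetic, not data from the study.
from statistics import mean


def r_squared_two_predictors(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 by least squares and return R squared."""
    mx1, mx2, my = mean(x1), mean(x2), mean(y)
    # Centre each variable so the intercept drops out of the normal equations.
    a = [v - mx1 for v in x1]
    b = [v - mx2 for v in x2]
    c = [v - my for v in y]
    s11 = sum(u * u for u in a)
    s22 = sum(v * v for v in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * w for u, w in zip(a, c))
    s2y = sum(v * w for v, w in zip(b, c))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    # Solve the 2x2 normal equations for the two slopes.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    ss_total = sum(w * w for w in c)
    ss_resid = sum((w - b1 * u - b2 * v) ** 2 for u, v, w in zip(a, b, c))
    return 1.0 - ss_resid / ss_total


# Hypothetical per-talker measures (one entry per talker).
ltas_energy_db = [52.0, 55.5, 49.0, 58.0, 53.5, 60.0]
word_duration_ms = [410.0, 455.0, 390.0, 470.0, 400.0, 480.0]
intelligibility_pct = [86.0, 91.5, 83.0, 94.0, 89.0, 95.5]

r2 = r_squared_two_predictors(ltas_energy_db, word_duration_ms, intelligibility_pct)
print(f"Proportion of variance explained: {r2:.3f}")
```

An R squared of about 0.6 on real data would correspond to the "around 60% of the variability" reported above. The remainder would be unexplained by the two measures, which is consistent with the diversity of factors noted for the 'best' and 'worst' talkers.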

Chapters in books

Hazan, V. and Markham, D. (2002) The perception of speaker characteristics in adults and children. In A. Braun and H. R. Masthoff (eds.) Phonetics and its Applications: Festschrift for Jens-Peter Köster on the Occasion of his 60th Birthday. Stuttgart: Steiner, 118-126.

Papers in refereed journals

Markham, D. and Hazan, V. (2004) The effect of talker- and listener-related factors on intelligibility for a real-word, open-set perception test. Journal of Speech, Language, and Hearing Research, 47, 725-737.

Hazan, V. and Markham, D. (2004) Acoustic-phonetic correlates of talker intelligibility in adults and children. Journal of the Acoustical Society of America, 116, 3108-3118.

Published conference proceedings and abstracts

Markham, D. and Hazan, V. (2002) Speaker intelligibility of adults and children. Proceedings of the International Conference on Spoken Language Processing, Denver, 16-20 September 2002, 1685-1688.

Hazan, V. and Markham, D. (2002) Do adults and children find the same voices intelligible? ISCA Workshop on Temporal Integration in the Perception of Speech, P3-19.

Markham, D. and Hazan, V. (2002) Talker intelligibility: Child and adult listener performance. Journal of the Acoustical Society of America, 111, 2481.

Hazan, V. and Markham, D. (2003) Acoustic-phonetic dimensions of speaker intelligibility. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 3-9 August 2003, 1493-1496.

Working papers

Markham, D. and Hazan, V. (2002) The UCL Speaker Database. Speech, Hearing and Language: UCL Work in Progress, vol. 14, 1-17.

Research Resource

UCL Speaker Database

Related Issues

If you would like to know more about the project, please contact Valerie Hazan. 


Author: Valerie Hazan. Last Changed: 8 December 2004