The effect of speaker variability on speech perception in children
A research project funded by the Wellcome Trust
Grant Period: 20 September 1999 to 19 September 2002
Grant Award: £116,003
Investigators: Duncan Markham (Research Fellow)
The acoustic patterns of speech vary significantly across speakers as a function of vocal tract size, sex, age and accent. A key process in speech perception is the ability to normalise, i.e. to recognise that sounds which are acoustically different may belong to the same consonant or vowel category. It is known that adult listeners are significantly affected by speaker variability: certain speakers are significantly more intelligible than others, and lower intelligibility scores are obtained for tests using multiple speakers than for tests using single speakers.
One area which has hitherto been little explored is the way in which children are affected by speaker variability during language acquisition. This will be the focus of the proposed study. A study on speaker normalisation in children has practical applications beyond its implications for models of speech development. A better understanding of what constitutes a highly intelligible speaker for a child would be extremely useful in the development of computer-based auditory-training applications such as those linked to speech and language therapy or second-language learning.
In the first phase of the work, a large database of high-quality speech and laryngographic recordings was obtained for 45 speakers (18 women, 15 men, 6 boys, 6 girls) from a homogeneous accent group. This in itself constitutes a unique resource that is now being put in the public domain. The test material, which consisted of 124 words familiar to 7-year-olds that adequately covered all frequent consonant confusions, was presented to 135 listeners (adults, 11-12 year olds and 7-8 year olds) in a low level of background noise. Intelligibility for individual speakers on identical test material varied from 81.2% to 96.4%. Younger listeners made significantly more errors, but the relative intelligibility of individual speakers was highly consistent across listener groups. In a second study, listener ratings on a number of subjective voice dimensions (e.g. mumbly-precise) were obtained for a subset of the most and least intelligible speakers. Key descriptors correlating with intelligibility appeared to relate to articulation, voice dynamics and general quality. Finally, measures of fundamental frequency, long-term average spectrum, word duration, CV intensity ratio and vowel space were obtained for all 45 speakers. Overall, intelligibility was significantly correlated with two measures: the total energy contained in the long-term average spectrum region between 1 and 3 kHz, and word duration. Jointly, these measures predicted around 60% of the variability in the intelligibility data. However, there was variability across talker subgroups, with no significant correlations between intelligibility and acoustic-phonetic measures obtained for child talkers. Also, the profiles of the 'best' and 'worst' talkers highlighted the considerable diversity of factors contributing to intelligibility. These results confirm the difficulty in finding reliable acoustic-phonetic correlates of talker intelligibility.
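As a rough illustration of what it means for two measures to jointly predict a share of the variability in intelligibility, the sketch below computes the proportion of variance explained (R squared) by a two-predictor linear regression, using the standard formula based on pairwise Pearson correlations. It is not the analysis used in the project. The function names and all numeric values are hypothetical and are not taken from the project data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def joint_r_squared(x1, x2, y):
    """Variance in y explained by a linear regression on x1 and x2.

    Uses the two-predictor identity
    R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2),
    which requires that x1 and x2 are not perfectly correlated.
    """
    r1 = pearson(x1, y)
    r2 = pearson(x2, y)
    r12 = pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


if __name__ == "__main__":
    # Hypothetical values for five talkers (NOT project data):
    # energy in the 1-3 kHz region of the long-term average spectrum (dB),
    # mean word duration (ms), and intelligibility (% words correct).
    ltas_energy = [52.0, 55.5, 49.0, 58.0, 54.0]
    word_duration = [410.0, 455.0, 400.0, 470.0, 430.0]
    intelligibility = [86.0, 92.5, 83.0, 95.0, 88.5]

    print("LTAS energy alone:  ", round(pearson(ltas_energy, intelligibility) ** 2, 3))
    print("Word duration alone:", round(pearson(word_duration, intelligibility) ** 2, 3))
    print("Both jointly:       ", round(joint_r_squared(ltas_energy, word_duration, intelligibility), 3))
```

The joint value can never be lower than either single-predictor value, which is why reporting the two measures together gives the upper figure for explained variability.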
Markham, D. and Hazan, V. (2004). The effect of talker- and listener-related factors on intelligibility for a real-word, open-set perception test. Journal of Speech, Language, and Hearing Research, 47, 725-737.
Hazan, V. and Markham, D. (2004). Acoustic-phonetic correlates of talker intelligibility in adults and children. Journal of the Acoustical Society of America, 116, 3108-3118.
Markham, D. and Hazan, V. (2002). Speaker intelligibility of adults and children. Proceedings of the International Conference on Spoken Language Processing, Denver, 16-20 September 2002, 1685-1688.
Hazan, V. and Markham, D. (2002). Do adults and children find the same voices intelligible? ISCA Workshop on Temporal Integration in the Perception of Speech, P3-19.
Markham, D. and Hazan, V. (2002). Talker intelligibility: Child and adult listener performance. Journal of the Acoustical Society of America, 111, 2481.
Hazan, V. and Markham, D. (2003). Acoustic-phonetic dimensions of speaker intelligibility. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 3-9 August 2003, 1493-1496.
If you would like to know more about the project, please contact Valerie Hazan.
Author: Valerie Hazan. Last Changed: 8 December 2004