Job Opportunities in Speech, Hearing and Phonetic Sciences

PhD Studentship in Performance-based Measures of Speech Quality

Closing date for applications: 1 December 2009
Required start date: September 2010, but an earlier start is possible
Duration: 3 years, subject to satisfactory progress
Stipend: £15,000 per annum
Tuition Fees: The studentship also provides payment of tuition fees at the UK/EU rate

Overview

The goal of this PhD research project is to develop new behavioural measures of listening effort which can be used to assess the quality of distorted or noisy speech signals. Such measures would provide an objective means of studying the effect of communication channels on listening effort, and would complement existing measures such as word intelligibility rate and mean opinion score. These new measures could then be used to evaluate a range of signal processing strategies proposed to encode, clean or enhance speech signals, which have wide application in telecommunications and hearing-aid design. Further information is given in the Background Information section below.

Candidates should have a good first degree in Psychology, Speech & Hearing Sciences, Engineering, or Life Sciences. They should have experience in research methods and have demonstrated success in undertaking a personal research project. Knowledge and skills in speech and hearing science would be beneficial.

The studentship is funded by Research in Motion through a donation to Dr. Mark Huckvale. The PhD project would be jointly supervised by Dr. Huckvale and Gaston Hilkhuysen within the Research Department of Speech, Hearing and Phonetic Sciences, in the UCL Division of Psychology and Language Sciences. The student would collaborate with members of the Centre for Law-Enforcement Audio Research (www.clear-labs.com), which is a joint research centre of Imperial College London and University College London, funded by the UK Home Office.

Students will formally register for the PhD programme in September 2010. For candidates who can start earlier, we may be able to offer an internship position at the same rate as the student stipend, starting as early as January 2010.

Please note that the studentship is only available to UK and European candidates. Non-EU candidates can only be considered if they have lived in Europe for more than 3 years on a non-student visa.

Application is by a letter of application and a CV, sent to:

    Natalie Wilkins
    Speech, Hearing and Phonetic Sciences
    University College London
    Chandler House
    2 Wakefield Street
    London WC1N 1PF
    United Kingdom.
    n.wilkins@ucl.ac.uk

Shortlisted candidates will be required to complete a graduate student application form and supply references from two referees. For more details, contact Mark Huckvale (m.huckvale@ucl.ac.uk).

Background Information

Over many years the effect of noise and distortion on human perception of speech has been assessed on two scales, one of intelligibility and one of quality. Intelligibility has been measured using a performance index, usually % words correct, gathered in an articulation test (ISO, 2003); it has been widely used to assess telecommunications systems. Quality has typically been measured using listener opinions rather than listener performance, either by assigning a rating to a signal or by expressing a preference for one of a pair of signals (ITU, 1996). Quality measures are necessary in addition to intelligibility measures because intelligibility reaches a ceiling of 100% even for signals with noticeable noise or distortion.

However, the use of listener opinions to assess quality has disadvantages. The main problem is that opinions are insensitive to small changes in the signal and unreliable, since listeners may be biased and inconsistent in their judgments. We believe that the use of listener preference to assess quality makes it harder to build a scientific account of how the presence of noise or distortion in a speech signal of good intelligibility affects the listening effort required to understand its message. Thus the goal of this research is to develop measures of listening effort which are based on performance in some listening test, rather than on listener opinion. For example, we might measure the reaction times, memory recall or comprehension of listeners in some speech task. By finding reliable ways of measuring effort we would hope to re-base the concept of "speech quality" on the ease with which listeners process speech signals. This may lead to a psychological model of the impact of signal distortions on human cognitive processing of speech. For example, it is still not clear whether noisy speech is harder to understand because of auditory masking or because of increased demands within linguistic decoding (Brungart et al., 2001).

If performance-based measures of listening effort were reliable enough, they could be used to compare different signal processing strategies in terms of the additional listening effort they impose. Ultimately we would hope to be able to predict listening effort from signal properties, leading to applications that automatically choose between processing alternatives to improve the listener's experience.

We have already made a start on this research agenda through a pilot experiment looking at reaction time to speech in noise (Huckvale & Leak, 2009). There it was found that, although the noise had no influence on the error rate, presenting digits in babble or car noise increased the reaction times needed for typing the digits. Interestingly, noise suppression did not compensate for the detrimental effect of the noise on reaction times. Other authors have recently reported on similar approaches, e.g. Durin et al. (2008) and Sarampalis et al. (2009).
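As a rough illustration of the kind of performance-based summary described above, the sketch below computes an error rate and a median reaction time for each noise condition in a digit-typing task. It is not the analysis used in the pilot experiment: the trial layout, the condition names, the function name `summarise` and all of the numbers are assumptions invented for this example.

```python
from statistics import median

# Hypothetical trial records: (condition, digits presented, digits typed, reaction time in ms).
# These values are invented for illustration; they are NOT data from the pilot experiment.
TRIALS = [
    ("quiet", "482", "482", 610.0),
    ("quiet", "175", "175", 590.0),
    ("babble", "936", "936", 720.0),
    ("babble", "204", "204", 700.0),
    ("car", "551", "515", 690.0),
    ("car", "867", "867", 670.0),
]


def summarise(trials):
    """Return {condition: (error_rate, median_rt_ms)} for the given trials."""
    by_condition = {}
    for condition, presented, typed, rt_ms in trials:
        # A trial counts as correct only if the typed digits match exactly.
        by_condition.setdefault(condition, []).append((presented == typed, rt_ms))

    summary = {}
    for condition, results in by_condition.items():
        errors = sum(1 for correct, _ in results if not correct)
        # Median reaction time is taken over correct responses only (an assumption).
        correct_rts = [rt for correct, rt in results if correct]
        median_rt = median(correct_rts) if correct_rts else None
        summary[condition] = (errors / len(results), median_rt)
    return summary


if __name__ == "__main__":
    for condition, (error_rate, median_rt) in summarise(TRIALS).items():
        print(condition, error_rate, median_rt)
```

Reporting the two figures side by side is what allows a condition to show unchanged accuracy but slower responses, which is the pattern a measure of listening effort is meant to capture.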

References

  • Brungart, D. S., Simpson, B. D., Ericson, M. A., and Scott, K. R. (2001) "Informational and energetic masking effects in the perception of multiple simultaneous talkers." Journal of the Acoustical Society of America. 110(5):2527-2538.
  • Durin, V., Gros, L., and Hericher, G. (2008) "Reaction times and performances in recognition tasks to assess speech quality", Audio Engineering Society Convention, May 2008, Amsterdam.
  • Huckvale, M. and Leak, J. (2009) "Effect of noise reduction on reaction time to speech in noise", Interspeech 2009, Brighton.
  • International Organization for Standardization (2003) "Ergonomics - Assessment of speech communication", ISO 9921:2003.
  • International Telecommunication Union (1996) "Methods for subjective determination of transmission quality", ITU-T Recommendation P.800:1996.
  • Sarampalis, A., Kalluri, S., Edwards, B., and Hafter, E. (2009) "Objective measures of listening effort: Effects of background noise and noise reduction", J. Speech, Language and Hearing Research, April 2009.
