Applied Research on Voice

This page describes some current and previous research activities in the area of voice and its applications. The principal investigator is Prof. Mark Huckvale.

Prediction of Fatigue and Stress from Voice in Safety-Critical Environments (iVOICE)

The iVOICE project was a feasibility study funded by the European Space Agency under the Artes 20 programme. The project partners were UCL Speech, Hearing and Phonetic Sciences, UCL Mullard Space Sciences Laboratory Centre for Space Medicine and the Gagarin Cosmonaut Training Centre (GCTC) in Star City, Russia. It ran from January 2014 to January 2015.

The goal of iVOICE was to test whether changes in the speaking voice could serve as indicators of changes in a speaker's level of fatigue or cognitive load. We developed technology that analysed speech recorded under controlled conditions and predicted levels of fatigue or cognitive load from characteristics of the audio signal.

For fatigue, we were fortunate to obtain recordings from seven aeronautical professionals undertaking a training exercise at GCTC in which they had to stay awake for 60 hours. We showed that we could make reasonable predictions of how long each speaker had been awake from characteristics of their speech. On the simpler task of deciding only whether the speaker had slept in the past 24 hours, we obtained a prediction accuracy of 90%. The results of this experiment were reported in Baykaner et al. (2015).

For cognitive load, we performed two experiments: one looked at recordings of subjects performing the Stroop test, the other at recordings of subjects performing a demanding visual task. The first experiment was reported in Huckvale (2014).

Prediction of Speaker Age from Voice

The estimation of the age of a speaker from his or her voice has both forensic and commercial applications. Previous studies have shown that human listeners are able to estimate the age of a speaker to within 10 years on average, while recent machine age estimation systems seem to show superior performance, with average errors as low as 6 years. However, the machine studies have used highly non-uniform test sets, for which knowledge of the age distribution offers a considerable advantage to the system. In this study, we compare human and machine performance on the same test data, chosen to be uniformly distributed in age. We show that in this case human and machine accuracy are more similar, with average errors of 9.8 and 8.6 years respectively, although if panels of listeners are consulted, human accuracy can be improved to a value closer to 7.5 years. Both humans and machines have difficulty in accurately predicting the ages of older speakers.
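The advantage that a non-uniform test set gives to a system can be illustrated with a small simulation. The sketch below is not taken from the study: the two age distributions, the sample sizes and the function names are all assumptions chosen for illustration. It scores a "system" that ignores the audio entirely and always guesses the median age of the test set.

```python
import random
from statistics import mean, median


def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true age - predicted age| over all speakers."""
    return mean(abs(t - p) for t, p in zip(true_ages, predicted_ages))


def constant_guess_error(true_ages):
    """Error of a baseline that always guesses the median age of the set."""
    guess = median(true_ages)
    return mean_absolute_error(true_ages, [guess] * len(true_ages))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical test sets (assumed for illustration, not the study's data):
# one clustered around 35 years, one spread uniformly from 20 to 80 years.
clustered = [min(80.0, max(20.0, rng.gauss(35, 6))) for _ in range(1000)]
uniform = [rng.uniform(20, 80) for _ in range(1000)]

clustered_error = constant_guess_error(clustered)
uniform_error = constant_guess_error(uniform)

print(f"constant guess, clustered ages: {clustered_error:.1f} years")
print(f"constant guess, uniform ages:   {uniform_error:.1f} years")
```

Under these assumptions the constant guess scores far better on the clustered set than on the uniform one, even though it uses no information from the voice. This is why a uniformly distributed test set gives a fairer comparison between humans and machines.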

The figure below shows the predicted ages of 52 speakers made by 36 listeners. The mean absolute error of age prediction was about 10 years; that is, we can often estimate a speaker's age to within a decade just by hearing their voice.
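The mean absolute error, and the effect of pooling a panel of listeners, can be sketched as follows. Only the counts of 52 speakers and 36 listeners come from the text above. The listener model (a per-speaker bias shared by all listeners plus independent per-listener noise), the noise levels and all names are assumptions, and the simulated numbers are not the study's results.

```python
import random
from statistics import mean


def mean_absolute_error(true_ages, estimates):
    """Average of |true age - estimated age| over all speakers."""
    return mean(abs(t - e) for t, e in zip(true_ages, estimates))


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n_speakers, n_listeners = 52, 36

# Hypothetical true ages, uniform between 20 and 80 years.
true_ages = [rng.uniform(20, 80) for _ in range(n_speakers)]

# Assumed listener model: some speakers simply sound older or younger than
# they are (a bias shared by every listener), and each listener adds
# independent noise on top of that.
shared_bias = [rng.gauss(0, 9) for _ in range(n_speakers)]
estimates = [
    [age + bias + rng.gauss(0, 8) for age, bias in zip(true_ages, shared_bias)]
    for _ in range(n_listeners)
]

# Error of each listener on their own, then averaged over listeners.
individual_errors = [mean_absolute_error(true_ages, row) for row in estimates]
typical_individual_error = mean(individual_errors)

# Panel estimate: average the listeners' guesses for each speaker.
panel_estimates = [mean(column) for column in zip(*estimates)]
panel_error = mean_absolute_error(true_ages, panel_estimates)

print(f"typical single listener: {typical_individual_error:.1f} years")
print(f"panel of {n_listeners} listeners:   {panel_error:.1f} years")
```

Averaging cancels the independent noise but not the shared bias, so in this model the panel is more accurate than a typical listener without becoming perfect. By the triangle inequality, the panel error can never exceed the average of the individual errors.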

Monitoring of Psychological Well-Being in Long-Term Missions (VULCAN)

The VULCAN project is a new feasibility study funded by the European Space Agency under the Artes 20 programme. The project partners are UCL Speech, Hearing and Phonetic Sciences, UCL Mullard Space Sciences Laboratory Centre for Space Medicine and the Institute for Biomedical Problems (IBMP) in Moscow, Russia. It will run from January 2016 to January 2017.

The VULCAN project is part of a larger endeavour investigating how psychological support may be given to astronauts undertaking a long-term mission, for example a mission to Mars that might take up to two years. VULCAN builds on the outcomes of the iVOICE project, which showed how signal analysis and machine learning methods may be applied to the prediction of speaker fatigue and cognitive load from voice recordings.

At the heart of VULCAN is a new technology for Longitudinal Voice Analysis. This combines innovative signal analysis methods with statistical modelling of sequences to uncover either anomalous recordings or long-term trends in the voice. We will test the effectiveness of the technique by applying it to several thousand spoken messages recorded as part of Mars500, a simulated mission to Mars conducted by IBMP in 2010/11.
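The two aims named above, spotting anomalous recordings and uncovering long-term trends, can be illustrated with a very simple sketch. It is not the VULCAN method. It assumes each message has already been reduced to a single number (here a made-up mean pitch in Hz), and it uses a generic rolling z-score and a least-squares slope. The function names, window size and threshold are all hypothetical.

```python
from statistics import mean, stdev


def linear_trend(values):
    """Least-squares slope of the values against recording index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def flag_anomalies(values, window=10, threshold=3.0):
    """Indices of values lying more than `threshold` standard deviations
    from the mean of the `window` values immediately before them."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        spread = stdev(history)
        if spread > 0 and abs(values[i] - mean(history)) > threshold * spread:
            flagged.append(i)
    return flagged


# Made-up feature sequence (not project data): mean pitch per message,
# drifting slowly upwards with a small alternating wobble, plus one message
# that is far out of line with its neighbours.
pitch = [120 + 0.1 * i + (0.5 if i % 2 else -0.5) for i in range(40)]
pitch[25] += 15

slope = linear_trend(pitch)
anomalies = flag_anomalies(pitch)

print(f"trend: {slope:.3f} Hz per message")
print(f"anomalous messages: {anomalies}")
```

Comparing each message only with the speaker's own recent history, rather than with a population norm, is one way to cope with large differences between individual voices. A practical system would need many features per recording and a more robust sequence model.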

We also hope that Tim Peake will contribute to VULCAN during his stay at the International Space Station by making some test recordings, so that we can explore the practicalities of obtaining high-quality audio recordings in space and analyse how microgravity affects the voice.

 

