Mark Huckvale
Research Projects
- Performance-based measures of speech quality
(2010-2013)
This project seeks to design and test new methods for evaluating speech communication systems. It targets systems that operate at high levels of speech intelligibility, or that make little change to intelligibility (such as noise-reduction systems). Conventional intelligibility testing is not appropriate in these circumstances, and existing measures of speech quality are based on subjective opinion rather than on speech communication performance.
- KLAIR - a virtual infant
(2009)
The KLAIR project aims to build and develop a computational platform to assist research into
the acquisition of spoken language. The main part
of KLAIR is a sensori-motor server that supplies a client
with a virtual infant on screen that can see, hear and speak. The client can monitor the audio-visual
input to the server and can send articulatory gestures to the head, which speaks them through an articulatory synthesizer.
The client can also control the position of the head and the eyes as well as setting facial expressions.
By encapsulating the real-time
complexities of audio and video processing within a server that will run on a modern PC,
we hope that KLAIR will encourage and facilitate more experimental research into
spoken language acquisition through interaction.
- Auditory Hallucinations Project
(2009-2011)
Auditory hallucinations are an enduring problem in the treatment of serious mental illnesses such as
schizophrenia. About 30% of people with this diagnosis continue to experience hallucinations and delusions despite treatment with antipsychotic medication. This study is designed to tackle the problem created by the inaccessibility of the patients'
experience of voices to the clinician. Patients troubled by persistent distressing auditory hallucinations will be invited to
create an external representation of their dominant voice hallucination using computer technology.
Graphics software will be used to create an avatar that will give a face to the voice, while
voice morphing software will realise it in sound.
The researcher can then use text-to-speech and animation software to cause the avatar to
respond to the patient's speech, creating a dialogue in which the voice
progressively comes under the patient's control.
The principal investigator is Prof. Julian Leff from the UCL Medical School.
- Centre for Law Enforcement Audio Research (CLEAR)
(2007-2012)
The CLEAR project aims to create a centre of excellence in tools and techniques for the cleaning of poor-quality audio recordings of speech. The centre is initially funded by the U.K. Home Office for a period of five years and will be run in collaboration with the Department of Electrical and Electronic Engineering at Imperial College.
- Spoken Language Conversion with Accent Morphing
(2006-)
Spoken language conversion is the challenge of using speech synthesis to generate utterances in the voice of a given speaker but in a language unknown to that speaker. Previous approaches have applied voice conversion and voice adaptation technologies to the output of a foreign-language TTS system. This inevitably reduces the quality and intelligibility of the output, since the source speaker is not a good source of phonetic material in the new language. In contrast, our approach uses two synthesis systems: one in the source speaker's voice and one in the voice of a native speaker of the target language. Audio morphing technology is then used to correct the foreign accent of the source speaker while at the same time trying to maintain his or her identity. In this project we aim to construct a spoken language conversion system based on accent morphing and to evaluate its performance in terms of intelligibility and speaker identity.
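The core idea behind audio morphing can be illustrated with a toy sketch (this is not the project's actual algorithm): given two time-aligned magnitude-spectrum frames, one from each synthesizer, a morphing factor blends them to produce output partway between the source speaker's accent and the native target accent. The function name and frame values below are hypothetical.

```python
import numpy as np

def morph_frames(source, target, alpha):
    """Linearly interpolate two time-aligned magnitude-spectrum frames.

    alpha = 0.0 keeps the source speaker's frame unchanged;
    alpha = 1.0 replaces it with the native target-language frame;
    intermediate values blend the two spectra.
    """
    return (1.0 - alpha) * source + alpha * target

# Hypothetical 3-bin magnitude spectra for one aligned frame pair
src = np.array([1.0, 0.5, 0.25])   # frame from the source speaker's voice
tgt = np.array([0.2, 0.9, 0.35])   # frame from the native target speaker
half = morph_frames(src, tgt, 0.5) # midway blend of the two spectra
```

A real accent-morphing system would of course operate on whole utterances, with dynamic time alignment between the two synthetic renditions and more perceptually motivated spectral representations, but the per-frame interpolation above captures the basic trade-off between accent correction and speaker identity.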
- SYNFACE: Synthesised talking face derived from speech for hearing disabled users of voice channels
(2001-2004)
The main purpose of the SYNFACE project is to increase the possibilities for hard-of-hearing people to communicate by telephone. Many people use lip-reading during conversations, and this is especially important for hard-of-hearing people; however, lip-reading is clearly unavailable over the telephone. This project aims to develop a talking face controlled by the incoming telephone speech signal. The talking face will facilitate speech understanding by providing lip-reading support. This method works with any telephone and is cost-effective compared with video telephony and text telephony, which require compatible equipment at both ends.
- ProSynth: An integrated prosodic approach to device-independent, natural-sounding speech synthesis
(1997-2001)
This collaborative project between Linguistics departments in Cambridge, London and York aimed to construct a model of computational phonology that integrates and extends modern metrical approaches to phonetic interpretation and to apply this model to the generation of high-quality speech synthesis. The three focal areas of research were intonation, morphological structure and systematic segmental variation. Integrating these is a temporal model that provides a linguistic structure or 'data object' upon which phonetic interpretation is executed and which delivers control information for synthesis.