Mark Huckvale

Status: PhD, Senior Lecturer
Address: Room 320, Chandler House, UCL, Wakefield Street, London
Phone: + 44 (0) 20 7679 4087
Email: m.huckvale@ucl.ac.uk
Home page: http://www.phon.ucl.ac.uk/home/mark/
Primary Dept.: Speech, Hearing and Phonetic Sciences
Secondary Dept.: Gatsby
RAE Research group: Phonetics
Interests: Speech and language technology as a means to model human language processing and to build a conversational interface to computers.
Other: BSc Speech Communication Tutor, MSc Speech and Hearing Sciences Tutor

Research Projects

  • Performance-based measures of speech quality (2010-2013)
    This project seeks to design and test new methods for evaluating speech communication systems. The target applications are systems that operate at high levels of speech intelligibility or that change intelligibility very little (such as noise-reduction systems). Conventional intelligibility testing is not appropriate in these circumstances, and existing measures of speech quality are based on subjective opinion rather than on speech communication performance.
  • KLAIR - a virtual infant (2009)
    The KLAIR project aims to build and develop a computational platform to assist research into the acquisition of spoken language. The main part of KLAIR is a sensori-motor server that supplies a client with a virtual infant on screen that can see, hear and speak. The client can monitor the audio-visual input to the server and can send articulatory gestures to the head for it to speak through an articulatory synthesizer. The client can also control the position of the head and the eyes as well as set facial expressions. By encapsulating the real-time complexities of audio and video processing within a server that will run on a modern PC, we hope that KLAIR will encourage and facilitate more experimental research into spoken language acquisition through interaction. A minimal, illustrative sketch of how a client might talk to such a server appears after this project list.
  • Auditory Hallucinations Project (2009-2011)
    Auditory hallucinations are an enduring problem in the treatment of serious mental illness such as schizophrenia. About 30% of people with this diagnosis continue to experience hallucinations and delusions despite treatment with antipsychotic medication. This study is designed to tackle the problem created by the inaccessibility of the patients' experience of voices to the clinician. Patients troubled by persistent distressing auditory hallucinations will be invited to create an external representation of their dominant voice hallucination using computer technology. Graphics software will be used to create an avatar that will give a face to the voice, while voice morphing software will realise it in sound. The researcher can then use text-to-speech and animation software to cause the avatar to respond to the patient's speech, creating a dialogue in which the voice progressively comes under the patient's control. The principal investigator is Prof. Julian Leff from the UCL Medical School.
  • Centre for Law Enforcement Audio Research (CLEAR) (2007-2012)
    The CLEAR project aims to create a centre of excellence in tools and techniques for the cleaning of poor-quality audio recordings of speech. The centre is initially funded by the U.K. Home Office for a period of five years and will be run in collaboration with the Department of Electrical and Electronic Engineering at Imperial College.
  • Spoken Language Conversion with Accent Morphing (2006-)
    Spoken language conversion is the challenge of using synthesis systems to generate utterances in the voice of a speaker but in a language unknown to the speaker. Previous approaches have been based on voice conversion and voice adaptation technologies applied to the output of a foreign language TTS system. This inevitably reduces the quality and intelligibility of the output, since the source speaker will not be a good source of phonetic material in the new language. Our work takes a different approach, using two synthesis systems: one in the source speaker's voice, one in the voice of a native speaker of the target language. Audio morphing technology is then used to correct the foreign accent of the source speaker while trying to maintain his or her identity. In this project we aim to construct a spoken language conversion system using accent morphing and to evaluate its performance in terms of intelligibility and speaker identity. A simplified sketch of the frame-wise morphing idea appears after this project list.
  • SYNFACE: Synthesised talking face derived from speech for hearing disabled users of voice channels (2001-2004)
    The main purpose of the SYNFACE project is to increase the possibilities for hard-of-hearing people to communicate by telephone. Many people use lip-reading during conversations, and this is especially important for hard-of-hearing people; however, lip-reading is clearly not possible over an ordinary telephone. This project aims to develop a talking face controlled by the incoming telephone speech signal. The talking face will facilitate speech understanding by providing lip-reading support. This method works with any telephone and is cost-effective compared with video telephony and text telephony, which need compatible equipment at both ends.
  • ProSynth: An integrated prosodic approach to device-independent, natural-sounding speech synthesis (1997-2001)
    This collaborative project between Linguistics departments in Cambridge, London and York aimed to construct a model of computational phonology that integrates and extends modern metrical approaches to phonetic interpretation and to apply this model to the generation of high-quality speech synthesis. The three focal areas of research were intonation, morphological structure and systematic segmental variation. Integrating these is a temporal model that provides a linguistic structure or 'data object' upon which phonetic interpretation is executed and which delivers control information for synthesis.
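
The KLAIR platform described above follows a simple client-server pattern: the sensori-motor server owns the real-time audio and video, while a client monitors the infant's sensory input and sends back motor commands (gaze, expression, articulatory gestures). The fragment below is only an illustrative sketch of such a client loop; the host, port, JSON message framing, command names and gesture parameters are assumptions made for this example and are not the real KLAIR interface, which is documented with the toolkit.

    # Illustrative sketch of a KLAIR-style client loop (hypothetical protocol).
    import json
    import socket

    HOST, PORT = "localhost", 3000          # assumed server address, not KLAIR's

    def send(sock, message):
        """Send one newline-delimited JSON message (assumed framing)."""
        sock.sendall((json.dumps(message) + "\n").encode("utf-8"))

    def main():
        with socket.create_connection((HOST, PORT)) as sock:
            # Ask the server to stream the infant's audio-visual input.
            send(sock, {"cmd": "subscribe", "streams": ["audio", "video"]})
            # Control gaze and facial expression (field names are invented).
            send(sock, {"cmd": "expression", "name": "smile"})
            send(sock, {"cmd": "gaze", "azimuth": 10.0, "elevation": -5.0})
            # Send a short articulatory gesture for the head to speak through
            # its articulatory synthesizer (parameter names are invented).
            send(sock, {"cmd": "speak", "gesture": [
                {"jaw": 0.4, "tongue_body": 0.2, "dur_ms": 200},
                {"jaw": 0.1, "tongue_body": 0.6, "dur_ms": 250}]})
            # Print whatever sensory frames the server sends back.
            for line in sock.makefile("r", encoding="utf-8"):
                frame = json.loads(line)
                print("received", frame.get("type"), "frame")

    if __name__ == "__main__":
        main()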
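
For the accent morphing project, the sketch below illustrates only the frame-wise morphing idea, not the project's actual algorithm: it blends two time-aligned spectral sequences of the same sentence, one synthesized in the source speaker's voice and one by a native speaker of the target language, using a single interpolation weight. A full system would first align the two renderings (for example with dynamic time warping) and would morph pitch and timing as well as spectrum.

    # Minimal illustration of frame-wise spectral morphing (not the real system).
    import numpy as np

    def morph_frames(source_spec, native_spec, weight=0.7):
        """Interpolate two time-aligned log-magnitude spectral sequences.

        source_spec, native_spec: arrays of shape (frames, bins), assumed
        already aligned in time. weight=0.0 keeps the source speaker's
        rendering unchanged; weight=1.0 copies the native speaker's.
        """
        assert source_spec.shape == native_spec.shape
        return (1.0 - weight) * source_spec + weight * native_spec

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        frames, bins = 200, 128
        src = rng.normal(size=(frames, bins))   # stand-in for source-voice TTS frames
        nat = rng.normal(size=(frames, bins))   # stand-in for native-voice TTS frames
        print("morphed shape:", morph_frames(src, nat, weight=0.7).shape)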

Taught Courses

Selected Publications

    2013

    • Huckvale, M. A. (2013). An Introduction to Phonetic Technology. In Jones, M., Knight, R.-A. (Eds.). The Bloomsbury Companion to Phonetics. London: Bloomsbury Academic.
    • Leff, J., Williams, G., Huckvale, M. A., Arbuthnot, M., Leff, A. P. (2013). Computer-assisted therapy for medication-resistant auditory hallucinations: Proof-of-concept study. Br J Psychiatry 202, 428-433. doi:10.1192/bjp.bp.112.124883.
    • Leff, J., Williams, G., Huckvale, M., Arbuthnot, M., Leff, A. P. (2013). Avatar therapy for persecutory auditory hallucinations: What is it and how does it work? Psychosis: Psychological, Social and Integrative Approaches doi:10.1080/17522439.2013.773457.

    2012

    • Hilkhuysen, G., Gaubitch, N., Brookes, M., Huckvale, M. (2012). Effects of noise suppression on intelligibility: dependency on signal-to-noise ratios. Journal of the Acoustical Society of America 131(1), 531-539
    • Hilkhuysen, G., Gaubitch, N., Huckvale, M. (2012). Effects of Noise Suppression on Intelligibility: Experts' Opinions and Naive Normal-Hearing Listeners' Performance. J Speech Lang Hear Res doi:10.1044/1092-4388(2012/11-0286).
    • Huckvale, M. A. (2012). Data processing: analysis of speech audio signals. In Müller, N., Ball, M. J. (Eds.). Research Methods in Clinical Linguistics and Phonetics. Wiley-Blackwell.
    • Huckvale, M. A., Hilkhuysen, G. (2012). Performance-Based Measurement of Speech Quality with an Audio Proof-Reading Task. Journal of the Audio Engineering Society 60(6) [Accepted]

    2011

    • Huckvale, M. A. (2011). Recording caregiver interactions for machine acquisition of spoken language using the KLAIR virtual infant. 12th InterSpeech Conference.
    • Huckvale, M. A. (2011). The KLAIR toolkit for recording interactive dialogues with a virtual infant. 12th InterSpeech Conference.
    • Pinet, M., Iverson, P., Huckvale, M. (2011). Second-language experience and speech-in-noise recognition: Effects of talker-listener accent similarity. J Acoust Soc Am 130(3), 1653-1662. doi:10.1121/1.3613698.

    2010

    • Campbell, P., Huckvale, M., Martelli, S., Steele, J., Tracy, C. (2010). Voicebox: The Physics and Evolution of Speech. London: Gatsby Science Enhancement Programme.
    • Hilkhuysen, G., Huckvale, M. (2010). Adjusting a commercial speech enhancement system to optimize intelligibility. AES 39th Conference on Audio Forensics.
    • Hilkhuysen, G., Huckvale, M. (2010). Signal properties reducing intelligibility of speech after noise reduction. European Conference on Signal Processing.
    • Huckvale, M., Frasi, D. (2010). Measuring the effect of noise reduction on listening effort. AES 39th Conference on Audio Forensics.
    • Huckvale, M., Hilkhuysen, G., Frasi, D. (2010). Performance-based Measurement of Speech Quality with an Audio Proof-reading Task. 3rd ISCA Workshop on Perceptual Quality of Systems.
    • Naylor, P., Gaubitch, N., Sharma, D., Hilkhuysen, G., Huckvale, M. (2010). Intelligibility Estimation in Law Enforcement Speech Processing. 9th ITG conference on Speech Communication.
    • Sharma, D., Hilkhuysen, G., Gaubitch, N., Naylor, P., Brookes, M., Huckvale, M. (2010). Data driven method for non-intrusive speech intelligibility estimation. European Conference on Signal Processing.
    • Yanagisawa, K., Huckvale, M. (2010). A Phonetic Alternative to Cross-language Voice Conversion in a Text-dependent Context: Evaluation of Speaker Identity. InterSpeech 2010.

    2009

    • Huckvale, M., Howard, I., Fagel, S. (2009). KLAIR: a Virtual Infant for Spoken Language Acquisition Research. InterSpeech 2009.
    • Huckvale, M., Leak, J. (2009). Effect of Noise Reduction on Reaction Time to Speech in Noise. InterSpeech 2009.

    2008

    • Yanagisawa, K., Huckvale, M. (2008). A Phonetic Assessment of Cross-Language Voice Conversion. Interspeech 2008, Brisbane, Australia.

    2007

    • Dellwo, V., Huckvale, M., Ashby, M. (2007). How is individuality expressed in voice? An introduction to speech production & description for speaker classification. In Müller, C. (Ed.). Speaker Classification I (pp. 1-20). Berlin: Springer Verlag.
    • Huckvale, M. (2007). ACCDIST: an accent similarity metric for accent recognition and diagnosis. In Müller, C. (Ed.). Speaker Classification II (Vol. 4441, pp. 258-275). Berlin: Springer.
    • Huckvale, M. (2007). Hierarchical clustering of speakers into accents with the ACCDIST metric. International Congress of Phonetic Sciences. Saarbrücken, Germany.
    • Huckvale, M., Yanagisawa, K. (2007). Spoken Language Conversion with Accent Morphing. 6th ISCA Speech Synthesis Workshop (pp. 64-70). Bonn, Germany: University of Bonn.
    • Yanagisawa, K., Huckvale, M. (2007). Accent morphing as a technique to improve the intelligibility of foreign-accented speech. International Congress of Phonetic Sciences. Saarbrücken, Germany.

    2006

    • Huckvale, M. (2006). The new accent technologies: recognition, measurement and manipulation of accented speech. Research and Application of Digitized Chinese Teaching and Learning, Hong Kong. Beijing: Language and Culture Press.
    • Hunter, G., Huckvale, M. (2006). Cluster-based approaches to the statistical modelling of dialogue data in the British National Corpus. 2nd IEE International Conference on Intelligent Environments, 5-6 July 2006, Athens, Greece.

    2005

    • Howard, I., Huckvale, M. (2005). Training a Vocal Tract Synthesiser to imitate speech using Distal Supervised Learning. SpeCom: 10th International Conference on Speech and Computer 2005, Patras, Greece.
    • Huckvale, M., Howard, I. (2005). Teaching a vocal tract simulation to imitate stop consonants. Proc. EuroSpeech, Lisbon, Portugal.
    • Hunter, G., Huckvale, M. (2005). An evaluation of statistical language models of spoken dialogue using the British National Corpus. IEE International Workshop on Intelligent Environments, Essex University.
    • Tjalve, M., Huckvale, M. (2005). Pronunciation variation modelling using accent features. Proc. EuroSpeech, Lisbon, Portugal.

    2004

    • Howard, I., Huckvale, M. (2004). Learning to control an articulatory synthesizer through imitation of natural speech. Summer School on Cognitive and physical models of speech production, perception and perception-production interaction. Lubmin, Germany.
    • Howell, P., Huckvale, M. (2004). Facilities to assist people to research into stammered speech. Stammering Research 1, 130-242.
    • Huckvale, M. (2004). ACCDIST: a metric for comparing speakers' accents. ICSLP 2004 (pp. 29-32). Jeju, Korea.

    2003

    • Huckvale, M., Shaw, M. (2003). The intelligibility of a spelling-regular English accent. Proceedings of the 15th International Congress of Phonetic Sciences (pp. 2509-2512). Barcelona.

    2002

    • Huckvale, M. A., Fang, A. (2002). Using phonologically-constrained morphological analysis in continuous speech recognition. Computer Speech and Language 16, 165-181. doi:10.1006/csla.2001.0187.
    • Huckvale, M. (2002). Speech synthesis, speech simulation and speech science. International Conference on Spoken Language Processing, Denver, 2002 (pp. 1261-1264).
    • Hunter, G., Huckvale, M. (2002). Studies in the Statistical Modelling of Dialogue Turn Pairs in the British National Corpus. Proceedings of the 3rd WSEAS International Conference on Acoustics, Music, Speech and Language Processing (Tenerife).
    • Vazquez-Alvarez, Y., Huckvale, M. (2002). The Reliability of the ITU-P.85 Standard for the Evaluation of Text-to-Speech Systems. Proceedings of the International Conference for Speech and Language Processing (Denver) (pp. 329-332).

    2001

    • Chung, H., Huckvale, M. (2001). Linguistic factors affecting timing in Korean with application to speech synthesis. EuroSpeech.
    • Huckvale, M. (2001). The Use and Potential of Extensible Mark-Up (XML) in Speech Generation. In Keller (Ed.). Improvements in Speech Synthesis. Wiley.
    • Huckvale, M., Fang, A. (2001). Experiments in applying morphological analysis in speech recognition and their cognitive explanation. Proceedings of the Institute of Acoustics. Institute of Acoustics.
    • Huckvale, M., Hunter, G. (2001). Learning on the job: the application of machine learning within the speech decoder. Proceedings of the Institute of Acoustics (Vol. 23, pp. 71-79). St. Albans: Institute of Acoustics.

    2000

    • Ortega-Llebaria, M., Hazan, V., Huckvale, M. (2000). Automatic cue-enhancement of natural speech for improved intelligibility. Speech, Hearing and Language: work in progress 12, 42-56.
    • Fang, A. C., Huckvale, M. (2000). Out of vocabulary rate reduction through dispersion based lexicon acquisition. Literary and Linguistic Computing 15(3), 251-263
    • Fang, A. C., Huckvale, M. A. (2000). Enhanced Language Modelling with Phonologically Constrained Morphological Analysis. IEEE Conference Acoustics, Speech and Signal Processing.
    • Hawkins, S., Heid, S., House, J., Huckvale, M. (2000). Assessment of Naturalness in the ProSynth Speech Synthesis Project. IEE Workshop on Speech Synthesis. Institution of Electrical Engineers.
    • House, J., Dankovicova, J., Huckvale, M. (2000). Intonation modelling in ProSynth. Phonetics and Linguistics.
    • Ogden, R., Hawkins, S., House, J., Huckvale, M., Local, J., Carter, P., Dankovicova, J., Heid, S. (2000). ProSynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis. Computer Speech and Language 14, 177-210.

    1999

    • House, J., Dankovicova, J., Huckvale, M. (1999). Intonation modelling in ProSynth. Speech, Hearing and Language: work in progress 11, 51-61.
    • Huckvale, M. (1999). Opportunities for re-convergence of engineering and cognitive science accounts of spoken word recognition. Speech, Hearing and Language: work in progress 11, 62-75.
    • Bowerman, C., Eriksson, A., Huckvale, M., Rosner, M., Tatham, M., Wolters, M. (1999). Criteria for Evaluating Internet Tutorials in Speech Communication Sciences. EuroSpeech (pp. 2455-2458).
    • Chung, H., Kim, K., Huckvale, M. A. (1999). Consonantal and Prosodic Influences on Korean Vowel Duration. EuroSpeech 1999.
    • House, J., Dankovicova, J., Huckvale, M. (1999). Intonation modelling in ProSynth: an integrated prosodic approach to speech synthesis. Proceedings of the ICPhS (Vol. 3, pp. 2343-2346). San Francisco.
    • Huckvale, M. (1999). Representation and processing of linguistic structures for an all-prosodic synthesis system using XML. EuroSpeech.
    • Huckvale, M., Bowerman, C., Eriksson, A., Rosner, M., Tatham, M., Wolters, M. (1999). Computer-aided learning and use of the internet. In Bloothooft, G. (Ed.). The Landscape of Future Education in Speech Communication Sciences: 3: Recommendations for European education in phonetics, spoken language engineering, speech and language therapy (pp. 98-117). Utrecht, The Netherlands: OTS Publications.

    1998

    • Bloothooft, G., van Dommelen, W., Espain, C., Hazan, V., Huckvale, M., Wigforss, E. (Eds.) (1998). The landscape of future education in speech communication sciences: 2 Proposals. Utrecht: OTS.
    • Fang, A. C., House, J., Huckvale, M. (1998). Investigating the syntactic characteristics of English tone units. Proceedings of the ICSLP.
    • Hawkins, S., House, J., Huckvale, M., Local, J., Ogden, R. (1998). ProSynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis. Proceedings of the ICSLP.
    • Hazan, V., Simpson, A., Huckvale, M. (1998). Enhancement techniques to improve intelligibility of consonants in noise: speaker and listener effects. Proceedings of the ICSLP.
    • Huckvale, M. (1998). Opportunities for Re-convergence of Engineering and Cognitive Science Accounts of Spoken Word Recognition. Proceedings of the IOA Conference on Speech and Hearing.
    • Huckvale, M., Bowerman, C., Eriksson, A., Pompino-Marschall, B., Rosner, M., Tatham, M., Williams, B., Wolters, M. (1998). Computer-Aided Learning and use of the Internet. In Bloothooft, G. (Ed.). The Landscape of Future Education in Speech Communication Sciences: 2: Proposals for European education in phonetics, spoken language engineering, speech and language therapy (pp. 81-110). Utrecht, The Netherlands: OTS Publications.

    1997

    • Huckvale, M., Benoit, C., Bowerman, C., Eriksson, A., Rosner, M., Tatham, M. (1997). Computer-Aided Learning and Use of the Internet. In Bloothooft, G. (Ed.). The Landscape of Future Education in Speech Communication Sciences: 1: Analysis of European education in phonetics, spoken language engineering, speech and language therapy (pp. 94-130). Utrecht, The Netherlands: OTS Publications.
    • Huckvale, M., Benoit, C., Bowerman, C., Eriksson, A., Rosner, M., Tatham, M., Williams, B. (1997). Opportunities for Computer-Aided Instruction in Phonetics and Speech Communication Provided by the Internet. EuroSpeech 1997.

    1996

    • Fang, A., Huckvale, M. (1996). PROSICE: A Spoken English Database for Prosodic Research. In Greenbaum, S. (Ed.). Comparing English worldwide (pp. 262-279). Oxford University Press, USA.
    • Huckvale, M., Fang, A. (1996). Analysis of the prosody of read speech with the PROSICE corpus. IOA Conference on Speech and Hearing.

    1995

    • Faulkner, A., Darling, A. M., Rosen, S., Huckvale, M. (1995). Modeling cue interaction in the perception of the voiceless fricative/affricate contrast. Language and Cognitive Processes 10, 369-375.
    • Huckvale, M. (1995). Phonetic characterisation and lexical access in non-segmental speech recognition. 13th Int. Congress Phonetic Sciences.
    • Rosen, S., Darling, A. M., Faulkner, A., Huckvale, M. (1995). Cue interaction in the perception of intervocalic and syllable-initial voiceless fricative/affricate contrasts. Proc. Int. Cong. Phonetic Sciences (Vol. 2, pp. 502-505).

    1994

    • Huckvale, M. (1994). Word recognition from tiered phonological models. IOA Conference on Speech and Hearing.

    1993

    • Darling, A., Rosen, S., Huckvale, M., Faulkner, A. (1993). Phonetic classification of plosive voicing using computational modelling. Journal of the Acoustical Society of America 93, 2320-
    • Huckvale, M. (1993). The Benefits of Tiered Segmentation for the Recognition of Phonetic Properties. EuroSpeech 1993.
    • Rosen, S., Darling, A., Faulkner, A., Huckvale, M. (1993). Cue interaction in an intervocalic voiceless affricate/fricative contrast. Journal of the Acoustical Society of America 93, 2932-

    1992

    • Darling, A. M., Huckvale, M. A., Rosen, S., Faulkner, A. (1992). Phonetic classification of the plosive voicing contrast using computational modelling. Speech and Hearing; Proc. Inst. Acoust. 1992 Conference (Vol. 14).
    • Huckvale, M. (1992). A Comparison of Neural-Network and Hidden-Markov Model Approaches to the Tiered Segmentation of Speech. IOA Conference on Speech and Hearing.

    1989

    • Howard, I. S., Huckvale, M. A. (1989). 2-level recognition of isolated word using neural nets. First IEE International Conference on Artificial Neural Networks (pp. 90-94). Institution of Electrical Engineers.

Current PhD Students

    • Jin Kyu Park
    • Mark Wibrow

Completed PhD Students

    • Kayoko Yanagisawa
      Spoken Language Conversion with Accent Morphing (2010)
    • Michael Tjalve
      Accent Features and Idiodictionaries: On Improving Accuracy for Accented Speakers in ASR (2007)
    • Alex Fang
      Robust Practical Parsing of English with an Automatically Induced Large Grammar (2005)
    • Gordon Hunter
      Statistical Language Modelling of Dialogue Material in the British National Corpus (2004)
    • Hyunsong Chung
      Analysis of the Timing of Spoken Korean with Application to Speech Synthesis (2001)
    • W.J. Holmes
      Modelling segmental variability for automatic speech recognition (1997)
    • Won Choo
      Relationships between Phonetics, Perceptual and Auditory Spaces for Fricatives (1996)
    • Shinichi Tokuma
      The Perceived Quality of Vowels Showing Formant Undershoot (1996)

    (Possibly incomplete list)

