Department of Phonetics and Linguistics


the ucl speaker database

The UCL Speaker Database was primarily developed for a project on the perception of speaker variability in children and adults funded by the Wellcome Trust. The database contains recordings of a wide range of speech materials for 45 speakers of South-Eastern British English and is now being made available to other researchers investigating speaker variability and speaking styles. Full details of the database materials, speakers and recording conditions can be found here.

speakers

The database includes 45 speakers of British English with a fairly neutral accent or a mild South-Eastern English accent: 18 women (mean age: 33;11 yrs), 15 men (mean age: 30;7 yrs), 6 girls (mean age: 13;2 yrs) and 6 boys (mean age: 13;2 yrs). Ages are given as years;months. To get an impression of the range of voices in the database, click here to hear the word ‘park’ spoken by all 45 speakers (.wav file, 722 KB).

speech materials

word-level materials

  • Manchester Junior Word lists
  • UCL Markham word lists

sentence-level materials

  • semantically-unpredictable sentences

read texts

  • 'Arthur the Rat' passage
  • 'Rainbow' passage

semi-spontaneous speech

  • Description of a cartoon
  • Retelling of same cartoon from memory

recording conditions

Speech recordings were made in the anechoic chamber of the Department of Phonetics and Linguistics, UCL, using a Brüel & Kjær sound level meter. Glottal activity was measured using an electrolaryngograph. Recordings were made to DAT at a sampling rate of 44.1 kHz.

file format

Materials are included as WAV files at the original sampling rate of 44.1 kHz. The majority of the materials described above have been assembled onto a set of two DVDs, which can be made available to interested researchers at low cost (to cover production costs and postage). A complete set of materials may not be available for all speakers, owing to time constraints during recording or to technical problems. A list showing detailed availability for each type of material is contained in the Speech, Hearing and Language paper cited below.
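Researchers who receive the files may wish to confirm that each one is at the stated 44.1 kHz before analysis. The following is a minimal sketch using only the Python standard library; the function names and the idea of checking files one by one are illustrative assumptions, not part of the database distribution.

```python
import wave


def describe_wav(path):
    """Return basic header information for a WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frame_rate = wav.getframerate()
        n_frames = wav.getnframes()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": frame_rate,
            "duration_s": n_frames / frame_rate,
        }


def has_expected_rate(path, expected_hz=44100):
    """True if the file's sampling rate matches the expected value."""
    return describe_wav(path)["frame_rate_hz"] == expected_hz
```

Calling `describe_wav` on a file returns its channel count, sample width, sampling rate and duration, which is also a quick way to see whether a given file carries one or more channels.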

please note

We cannot offer further support for the materials provided. The recordings other than the UCL Markham test are provided unsegmented (one file per material type) and so will require further preparation. Owing to space limitations, materials are mostly provided without the laryngographic channel, but the original recordings can be provided on request in DAT format.
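As an illustration of the kind of preparation an unsegmented file needs, the sketch below copies a stretch of audio between two time points into a new WAV file. It assumes the segment boundaries have already been found by some other means (for example, by hand in an audio editor); the function name and any file names used with it are hypothetical.

```python
import wave


def extract_segment(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Returns the number of frames written. Boundaries are assumed to be known
    in advance; this sketch does not detect them.
    """
    if start_s < 0 or end_s <= start_s:
        raise ValueError("need 0 <= start_s < end_s")
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        start_frame = int(round(start_s * rate))
        # Clip the end point to the length of the recording.
        end_frame = min(int(round(end_s * rate)), src.getnframes())
        if start_frame >= end_frame:
            raise ValueError("segment lies outside the recording")
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)
        with wave.open(dst_path, "wb") as dst:
            # Keep the original format so the sampling rate is unchanged.
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
    return end_frame - start_frame
```

The output keeps the channel count, sample width and sampling rate of the source, so segments cut this way remain at the original 44.1 kHz.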

The use of this material in research projects should be acknowledged in publications relating to the project. Authors can cite the following paper which contains a full description of materials:

Markham, D. and Hazan, V. (2002) The UCL Speaker Database. Speech, Hearing and Language: UCL Work in Progress, vol. 14, pp. 1-17.

acknowledgment

This database was produced as part of a study funded by the Wellcome Trust (055651/Z/98/JRS/JP/JAT).

further information

If you are interested in this database, please contact Valerie Hazan for further details.