EUROM_1 European Languages Speech Database
One of the major tasks of the working group "Enabling Technology and Research" in ESPRIT project 2589 (SAM) was the production of standard European speech databases. Within the SAM project, standard recording protocols and several tools for the recording and annotation of speech databases were developed. The complete EUROM_1 database covers eight European languages (Danish, Dutch, British English, French, German, Italian, Norwegian and Swedish). The Italian database is available separately; the other seven languages are described below.
Description
Overview
- Languages:
- Danish, Dutch, English, French, German, Norwegian, Swedish
- Speakers:
- 60 speakers per language
- Protocol:
- 20 kHz, 16-bit sampling, anechoic room
- Media:
- 5 CDROMs per language
- Content: (for each language)
- Many Talker Corpus (30 women, 30 men): 100 numbers, 3 passages, 5 sentences (speech signal)
- Few Talker Corpus (5 women, 5 men): 100 numbers x 5, 15 passages, 25 sentences, C(C)VC(V) x 5 (speech + laryngographic signals)
- Very Few Talker Corpus (1 woman, 1 man): C(C)VC(V) material embedded in 5 context phrases, context words x 5 (speech + laryngographic signals)
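As a rough illustration of what the recording protocol implies for storage, the sketch below computes the raw data rate of 20 kHz, 16-bit audio and tallies the speaker counts listed above. It assumes uncompressed linear PCM and treats the laryngographic signal as a second channel at the same rate; neither assumption is stated in this description, and all names in the code are hypothetical rather than part of any EUROM_1 tool.

```python
# Figures taken from the overview above.
SAMPLE_RATE_HZ = 20_000   # 20 kHz sampling
BITS_PER_SAMPLE = 16      # 16-bit samples


def bytes_per_second(channels: int = 1) -> int:
    """Raw data rate, ASSUMING uncompressed linear PCM.

    channels=2 models speech plus a laryngographic signal stored at the
    same rate, which is an assumption, not a documented fact.
    """
    return SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * channels


def megabytes_per_minute(channels: int = 1) -> float:
    """Raw data volume per minute of recording, in units of 10^6 bytes."""
    return bytes_per_second(channels) * 60 / 1_000_000


# Speaker counts per corpus, as listed in the overview.
CORPORA = {
    "many_talker": {"women": 30, "men": 30},
    "few_talker": {"women": 5, "men": 5},
    "very_few_talker": {"women": 1, "men": 1},
}


def speakers(corpus: str) -> int:
    """Total number of speakers in the named corpus."""
    counts = CORPORA[corpus]
    return counts["women"] + counts["men"]


if __name__ == "__main__":
    print("bytes/s, speech only:", bytes_per_second(1))
    print("bytes/s, speech + laryngograph:", bytes_per_second(2))
    print("MB per minute, speech only:", megabytes_per_minute(1))
    for name in CORPORA:
        print(name, "speakers:", speakers(name))
```

Under these assumptions one channel comes to 40,000 bytes per second, or about 2.4 MB per minute of speech. The Many Talker Corpus total of 60 matches the "60 speakers per language" figure in the overview.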
Reference
D. Chan, A. Fourcin, D. Gibbon, B. Granstrom, M. Huckvale, G. Kokkinakis, K. Kvale, L. Lamel, B. Lindberg, A. Moreno, J. Mouropoulos, F. Senia, I. Trancoso, C. Veld & J. Zeiliger,
"EUROM - A Spoken Language Resource for the EU", in Eurospeech'95: Proceedings of the 4th European Conference on Speech Communication and Speech Technology, Madrid, Spain, 18-21 September 1995, Vol. 1, pp. 867-870.
Documentation
Overview
Languages
Availability
EUROM_1 CD-ROMs are available for order from the European Language Resources Association.