EUROM_1 European Languages Speech Database

One of the major tasks of the working group "Enabling Technology and Research" in ESPRIT project 2589 (SAM) was the production of standard European speech databases. Within the SAM project, standard recording protocols and several tools for recording and annotating speech databases were developed. The complete EUROM_1 database covers eight European languages (Danish, Dutch, British English, French, German, Italian, Norwegian and Swedish). The Italian database is available separately; the other seven languages are described below.



Danish, Dutch, English, French, German, Norwegian, Swedish
60 speakers per language
20 kHz, 16-bit sampling, anechoic room
5 CD-ROMs per language
Content (for each language):
Many Talker Corpus (30 women and 30 men): 100 numbers, 3 passages, 5 sentences (speech signal)

Few Talker Corpus (5 women and 5 men): 100 numbers x 5, 15 passages, 25 sentences, C(C)VC(V) x 5 (speech + laryngographic signals)

Very Few Talker Corpus (1 woman and 1 man): C(C)VC(V) material embedded in 5 context phrases, context words x 5 (speech + laryngographic signals)
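
As a rough guide to storage requirements, the recording format above (20 kHz, 16-bit) can be turned into a data rate. The following is a minimal sketch, not part of the EUROM_1 distribution: the function name `pcm_bytes` is hypothetical, and it assumes uncompressed PCM with no file header. Treating the laryngographic signal as a second channel at the same rate and resolution is also an assumption made only for illustration.

```python
# Recording format as stated in the database description.
SAMPLE_RATE_HZ = 20_000
BITS_PER_SAMPLE = 16


def pcm_bytes(duration_s: float, channels: int = 1) -> int:
    """Size in bytes of raw, headerless PCM audio of the given duration.

    Assumption: every channel uses the same sample rate and resolution.
    """
    bytes_per_sample = BITS_PER_SAMPLE // 8
    n_samples = int(round(duration_s * SAMPLE_RATE_HZ))
    return n_samples * bytes_per_sample * channels


# One minute of speech only (Many Talker Corpus style).
speech_only = pcm_bytes(60)

# One minute of speech plus a laryngographic channel (assumed 2 channels).
speech_plus_lx = pcm_bytes(60, channels=2)

print(speech_only, speech_plus_lx)
```

Under these assumptions, one minute of single-channel speech occupies 2,400,000 bytes, and the figure doubles when a second channel is stored alongside it.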


    D. Chan, A. Fourcin, D. Gibbon, B. Granstrom, M. Huckvale, G. Kokkinakis, K. Kvale, L. Lamel, B. Lindberg, A. Moreno, J. Mouropoulos, F. Senia, I. Trancoso, C. Veld & J. Zeiliger, "EUROM - A Spoken Language Resource for the EU", in Eurospeech'95. Proceedings of the 4th European Conference on Speech Communication and Speech Technology. Madrid, Spain, 18-21 September, 1995. Vol. 1, pp. 867-870.




Danish Project
Danish Technical Appendix
Dutch Project
Dutch Technical Appendix
English Project
English Technical Appendix
French Project
German Project
German Technical Appendix
Norwegian Project
Norwegian Technical Appendix
Swedish Project
Swedish Technical Appendix


EUROM_1 CD-ROMs are available for order from the European Language Resources Association.