Re: Phase 1 label files

Mark Huckvale (mark@phonetics.ucl.ac.uk)
Mon, 01 Jun 1998 17:22:56 +0100

At 12:30 19/05/98 +0100, you wrote:
>Dear Mark,
>
>I've been having a look at the label files for the database. I have a
>couple of comments about the SAMPA transcriptions and a question about the
>alignment with the waveforms.
>
>Most of the occasions where I disagree with the transcriptions as they
>stand are cases of unstressed i/I where the transcriptions generally have
>I (hI, SI, DI, etc) but where I'd be inclined to write hi, Si, Di. The
>vowels are of short duration, so I don't know whether marking quality (as
>opposed to quantity) is important. There are other occasions which may
>be of more consequence: should I mail them to you and / or alter the .lab
>files?

A key question is whether the .lab files are simply a means
of getting timing information into the prosodic structure, or whether
they are a significant source of information in their own right.

I think it is worthwhile to spend time correcting alignments, but
as far as vowel quality is concerned, this should be reflected in the
XML prosodic annotation somehow. Suggestions for how we might do this
would be welcome.

>I've started to look at the alignment of the labels with the waveforms
>too. I expect precise alignment will be important for our durational work
>here - should I just tweak the alignments and mail you the new .lab files?

Ideally we need a definitive set of .lab files for the purposes of
alignment. At the last meeting it was suggested that I assign about
80 files to each investigator for hand checking. I guess I can
set up some mechanism for assigning files to people and keeping track
of responses. In the meantime, just send me replacement .lab files
and I will keep them in a separate directory. There is a /pub/temp
directory on synth.phon.ucl.ac.uk suitable for sending me stuff by
anonymous FTP. Be sure to also mail me to say they are there.

Regards

Mark