Re: Generated lexicon for Phase2 db

Alex Chengyu Fang (alex@phonetics.ucl.ac.uk)
Wed, 11 Feb 1998 13:06:49 +0000

At 12:00 11/02/98 +0000, Richard Ogden wrote:
>at some stage, we will need to convert the phoneme strings under "IPA"
>into phonological structures of the kinds that I sent a grammar for the
>other day. I think we should talk about how this will be done.
>

Mark and I will propose alternative structures for the lexicon at the
meeting, so that the desirable properties can be accommodated.

>Would it be possible to keep our structured representations alongside
>lexical entries, so that once something is parsed it needn't be parsed
>again, but just looked up?

Yes, it would, and this will be discussed at the meeting.
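
As a rough illustration of the "parse once, then look up" idea, here is a
minimal Python sketch. It is not a proposal for the actual lexicon format:
the names LexicalEntry, Lexicon and toy_parser are hypothetical, and the toy
parser merely splits a phoneme string on spaces, standing in for a real
parser built on Richard's grammar.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class LexicalEntry:
    orthography: str
    ipa: str                 # phoneme string, as held under "IPA"
    structure: Any = None    # structured representation, once parsed
    parsed: bool = False     # True after the first successful parse


class Lexicon:
    """Keeps each parsed structure alongside its lexical entry."""

    def __init__(self, parser: Callable[[str], Any]) -> None:
        self._parser = parser
        self._entries: Dict[str, LexicalEntry] = {}
        self.parse_calls = 0  # counts how often the parser actually runs

    def add(self, orthography: str, ipa: str) -> None:
        self._entries[orthography] = LexicalEntry(orthography, ipa)

    def structure_for(self, orthography: str) -> Any:
        entry = self._entries[orthography]
        if not entry.parsed:
            # First request: parse and store the result in the entry.
            self.parse_calls += 1
            entry.structure = self._parser(entry.ipa)
            entry.parsed = True
        # Later requests are a plain look-up.
        return entry.structure


def toy_parser(ipa: str) -> tuple:
    """Placeholder (assumption): one segment per space-separated symbol."""
    return tuple(ipa.split())


lexicon = Lexicon(toy_parser)
lexicon.add("an", "& n")
first = lexicon.structure_for("an")
second = lexicon.structure_for("an")
```

The second call returns the stored structure without running the parser
again, so parse_calls stays at 1.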

>A related question, more to do with Alex's "small lexicons" point: how
>feasible would it be to have a lexicon of all things that behave in
>word-like ways? I'm thinking in particular of Germanic affixes like
><-less> and <-ness>, which seem to join to other morphemes in the same
>sort of way that proper words do; and in Mark's productions that I
>listened to this morning, they have different vowels in them (quite
>[E]-like in "less" and quite [I]-like in "ness", though both are
>transcribed as [@] in the lexicon).

I don't know where to represent what is actually said in the recording.
Maybe in the transcription of the recordings? The idea is to transfer basic
information from the lexicon to the transcription, to be modified manually
according to the recording. I'm sure Mark will have more to say about this.

>A couple of things I don't understand in the lexicon:
>
>"the" is ADV(ge). what does this mean?

This covers the otherwise unanalysable use of "the" in, for instance, "the
more the better".

>"was", "were" (etc.) have full forms in the lexicon, no weak forms(*). how
>are we going to get weak forms out? there's no labelling of eg. "was" as
>an auxiliary, which I think may be needed. These words are so recurrent in
>the material, that I think we need to consider carefully how they get
>represented. what is more they are, in our material, regularly metrically
>weak, not strong, so that the full vowel forms don't occur. (But I admit
>to also having a bee in my bonnet on this one -- though the function words
>are so ubiquitous that we've got to get them right.)
>(*) Though I note that "to" has both [t@] and [tu], "an" has [@n] and
>[&n].

This has to do with inconsistencies in the original lexicon (Mitton's
version of OALD), which will need to be corrected manually, though words
like "paddle" may be treated automatically to account for the missing
syllable. Maybe Jill will say some more about this?
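
One possible way of holding strong and weak forms side by side, sketched in
Python purely for discussion. The FunctionWord class, the FORMS table and
the pronounce function are hypothetical names, not part of the existing
lexicon; the only transcriptions used are the two pairs Richard quotes, and
treating the [@] variants as the weak forms is my assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FunctionWord:
    strong: str                  # full-vowel form
    weak: Optional[str] = None   # reduced form, if the lexicon has one


# Only the pairs mentioned in this thread; everything else is left out.
FORMS: Dict[str, FunctionWord] = {
    "to": FunctionWord(strong="tu", weak="t@"),
    "an": FunctionWord(strong="&n", weak="@n"),
}


def pronounce(word: str, metrically_weak: bool,
              forms: Dict[str, FunctionWord] = FORMS) -> str:
    """Pick the weak form in weak position, else fall back to the strong."""
    entry = forms[word]
    if metrically_weak and entry.weak is not None:
        return entry.weak
    return entry.strong
```

For example, pronounce("to", metrically_weak=True) selects "t@", while an
entry without a weak form simply returns its strong form.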

Thanks for the feedback.

Alex