In response to Jill's request for a progress report, here is the result of
Ali's efforts over the past few days. There is a 30,000-word list (!)
(available separately, not in this email - see below), and several scripts
which sort it in various ways, eg according to number of syllables,
location of stress in the word, nucleus of the stressed syllable, coda of
the stressed syllable.
Although the list contains things we're interested in, it also contains a lot of things we're not
interested in, and it certainly doesn't cover everything we ARE interested
in. Ali is presently working on creating a controlled set of sentences
containing some of the sounds etc. that we are interested in.
We're sending this because we recognize that Alex needs something to work
on, but from our point of view it needs more work. Our work is out of
synch with Alex's requirements in this respect, and it would be useful to
discuss this.
What follows here is the summary of the work. The actual word list is
available by ftp from the following site:
login name: prosynth@bess.ling.cam.ac.uk
password: Eugenie1
The word list is in /usr/guest/prosynth/word_lists
(just in case, IP address of bess is 131.111.162.1)
Hope you appreciate our continuing the noble password tradition set up by
Mark. (- Sarah!).
Summary of Alison's work, up to 21st November 1997
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What I've been doing is trying to extract words from the online OALD which
conform to the criteria outlined in a document John sent to me last week,
which attempted to bring together York, Cam and UCL's requirements. (cf.
details below). John's criteria focus on the rimes of stressed syllables.
We also need to make sure that the onsets satisfy various criteria, and
I'm currently working on this.
I've restricted the search to disyllables and trisyllables, more to
make the task tractable and to limit the output than for any other reason.
STRESS PATTERN
~~~~~~~~~~~~~~
I have divided the words firstly according to which syllable is stressed:
2 syllable words with stress on first syllable
2 syllable words with stress on second syllable
3 syllable words with stress on first syllable
3 syllable words with stress on second syllable
3 syllable words with stress on third syllable
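The grouping above can be sketched roughly as follows. This is only an
illustration in Python, not the scripts actually used. It assumes each
transcription is a list of phoneme symbols with a hypothetical "'" marker
placed before the stressed syllable, and it counts syllables by counting
vowel symbols. The vowel inventory is taken from the nucleus table later in
this summary; the example transcriptions are made up for illustration.

```python
from collections import defaultdict

# Vowel symbols from the nucleus table in this summary.
VOWELS = {"ii", "ei", "uu", "ow", "I", "e", "u", "aa", "a",
          "3:", "V", "@", "ai", "au", "oi"}
STRESS = "'"  # assumption: marker placed before the stressed syllable


def stress_pattern(phonemes):
    """Return (number of syllables, 1-based index of the stressed syllable)."""
    n_syll = 0
    stressed = None
    for p in phonemes:
        if p == STRESS:
            stressed = n_syll + 1  # the next vowel is the stressed nucleus
        elif p in VOWELS:
            n_syll += 1
    return n_syll, stressed


def group_by_stress(entries):
    """Group (word, phonemes) pairs; keep only di- and trisyllables."""
    groups = defaultdict(list)
    for word, phonemes in entries:
        n_syll, stressed = stress_pattern(phonemes)
        if n_syll in (2, 3) and stressed is not None and stressed <= n_syll:
            groups[(n_syll, stressed)].append(word)
    return dict(groups)


# Illustrative entries only (transcriptions are not taken from the OALD).
entries = [
    ("better", ["'", "b", "e", "t", "@"]),
    ("about", ["@", "'", "b", "au", "t"]),
    ("banana", ["b", "@", "'", "n", "aa", "n", "@"]),
    ("cat", ["'", "k", "a", "t"]),
]
groups = group_by_stress(entries)
```

Each key of groups is a (syllables, stressed syllable) pair, so (3, 2)
corresponds to "3 syllable words with stress on second syllable".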
RIME STRUCTURE
~~~~~~~~~~~~~~
In my account here in Cambridge I have subdivisions of these groups, which
correspond approximately to the following rime structures of the stressed
syllables, as outlined in John's document.
(C)VV
(C)VVC
(C)VVCC
(C)VCC
(C)VC
(where (C) = zero or more onset consonants)
(I have assumed that VV = long vowel or diphthong)
*** Should note at this point that there is no information directly
about syllable structure in the OALD, so I have had to work on a
simple phoneme string, which means a whole load of junk gets into
these lists along with the stuff we might be interested in.
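To make the problem concrete, here is one naive way of assigning a rime
shape from a plain phoneme string, written in Python. It is a sketch under
assumptions, not a description of the actual scripts: the "'" stress marker
is hypothetical, the vowel sets follow the nucleus table in this summary,
and treating "V" as a short vowel is an assumption of the sketch (the table
lists it on the "long central" row). Every consonant between the stressed
vowel and the next vowel is counted as part of the rime, which is exactly
how junk gets in.

```python
LONG = {"ii", "ei", "uu", "ow", "aa", "3:", "ai", "au", "oi"}
SHORT = {"I", "e", "u", "a", "V", "@"}  # "V" as short is an assumption
VOWELS = LONG | SHORT
STRESS = "'"  # hypothetical marker placed before the stressed syllable
SHAPES = {"VV", "VVC", "VVCC", "VCC", "VC"}


def rime_shape(phonemes):
    """Return one of the five rime shapes, or None for anything else."""
    if STRESS not in phonemes:
        return None
    rest = phonemes[phonemes.index(STRESS) + 1:]
    # Skip the onset: everything up to the first vowel after the marker.
    i = 0
    while i < len(rest) and rest[i] not in VOWELS:
        i += 1
    if i == len(rest):
        return None
    nucleus = "VV" if rest[i] in LONG else "V"
    # Count consonants up to the next vowel or the end of the word.
    n_cons = 0
    for p in rest[i + 1:]:
        if p in VOWELS:
            break
        n_cons += 1
    shape = nucleus + "C" * n_cons
    return shape if shape in SHAPES else None
```

With a made-up transcription of 'paper' as ["'", "p", "ei", "p", "@"], this
gives VVC, even though the second p is arguably the onset of the following
syllable.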
SORTING ACCORDING TO CODA or NUCLEUS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I then wrote three simple awk scripts which, when given one of the above
word lists as an argument, create a number of new files sorted according
to either the coda or the nucleus of the stressed syllable.
The first two scripts use JL's groups, as described in his document.
The third script categorizes words according to the hypothesized
'blocking' ability of the coda of their stressed syllable. (cf. SH's
document on Cam's criteria)
Details of the kind of groups created by each script:
====================================================
1. Sorting words by Coda characteristics
----------------------------------------
clusters:      sonorant + [td]
               fricative + stop
               stop + stop (only after short V)
                 (*** we're not sure why only here eg. what about
                 'leaked, looped' etc)
               sonorant + fricative (only after short V)
                 (*** ditto above query)
non-clusters:  sonorant
               stop
               fricative
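The coda groups above could be expressed along these lines. This is a
Python sketch for illustration rather than the awk script itself. The
consonant classes are assumptions: they use the symbols that appear in this
summary plus k and g, and would need extending to the full OALD inventory.
The function name coda_class and its short_vowel argument are hypothetical.

```python
# Assumed consonant classes (incomplete; extend to the full inventory).
SONORANTS = {"m", "n", "l", "r"}
STOPS = {"p", "b", "t", "d", "k", "g"}
FRICATIVES = {"f", "v", "s", "z", "S", "Z"}


def coda_class(coda, short_vowel):
    """Classify a coda (list of consonant symbols) into one of the groups.

    short_vowel says whether the stressed nucleus is a short vowel, since
    two of the cluster groups are only allowed after a short V.
    """
    if len(coda) == 1:
        c = coda[0]
        if c in SONORANTS:
            return "sonorant"
        if c in STOPS:
            return "stop"
        if c in FRICATIVES:
            return "fricative"
        return None
    if len(coda) == 2:
        c1, c2 = coda
        if c1 in SONORANTS and c2 in {"t", "d"}:
            return "sonorant + [td]"
        if c1 in FRICATIVES and c2 in STOPS:
            return "fricative + stop"
        if short_vowel and c1 in STOPS and c2 in STOPS:
            return "stop + stop"
        if short_vowel and c1 in SONORANTS and c2 in FRICATIVES:
            return "sonorant + fricative"
    return None
```

Under these rules a kt coda after a long vowel (as in 'leaked') falls into
no group, which is the case queried above.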
2. Sorting words by Nuclei
--------------------------
long high nonround/round     ii ei uu ow
short high nonround/round    I e u
long low nonround            aa
short low nonround           a
long central                 3: V
short central                @
real diphthongs              ai au oi
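As a rough illustration, the nucleus groups amount to a lookup table. The
Python sketch below copies the rows of the table as they stand; the names
NUCLEUS_GROUPS, nucleus_group and group_by_nucleus are hypothetical, and the
input format (a word paired with the symbol of its stressed nucleus) is an
assumption.

```python
from collections import defaultdict

# Rows copied from the table above: group label -> vowel symbols.
NUCLEUS_GROUPS = {
    "long high nonround/round": ["ii", "ei", "uu", "ow"],
    "short high nonround/round": ["I", "e", "u"],
    "long low nonround": ["aa"],
    "short low nonround": ["a"],
    "long central": ["3:", "V"],
    "short central": ["@"],
    "real diphthongs": ["ai", "au", "oi"],
}

# Invert the table so that each symbol maps to its group label.
SYMBOL_TO_GROUP = {
    symbol: label
    for label, symbols in NUCLEUS_GROUPS.items()
    for symbol in symbols
}


def nucleus_group(symbol):
    """Return the group label for a vowel symbol, or None if unknown."""
    return SYMBOL_TO_GROUP.get(symbol)


def group_by_nucleus(entries):
    """Group (word, stressed nucleus symbol) pairs by nucleus group."""
    groups = defaultdict(list)
    for word, nucleus in entries:
        label = nucleus_group(nucleus)
        if label is not None:
            groups[label].append(word)
    return dict(groups)
```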
3. Sorting words by 'blocking' ability of coda
----------------------------------------------
strong blockers     s S z Z
medium blockers     d t n
minimal blockers    b p f v m
interesting         l r
                    *note no r's in codas
for clusters I've defined their 'blocking' ability by the
consonant ranked highest in the list above.
eg. st = a hi_block whereas nt = a mid_block
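The ranking rule for clusters can be sketched like this in Python. It is an
illustration rather than the awk script itself: the labels are taken from
the list above, and placing the 'interesting' group last in the ranking
simply follows the order of that list, which is an assumption. The name
blocking_class is hypothetical.

```python
# Groups in ranked order, highest first, following the list above.
BLOCKING_RANKS = [
    ("strong", {"s", "S", "z", "Z"}),
    ("medium", {"d", "t", "n"}),
    ("minimal", {"b", "p", "f", "v", "m"}),
    ("interesting", {"l", "r"}),
]


def blocking_class(coda):
    """Return the label of the highest-ranked consonant in the coda.

    coda is a list of consonant symbols; None is returned when no
    consonant in it belongs to any of the groups.
    """
    for label, members in BLOCKING_RANKS:
        if any(c in members for c in coda):
            return label
    return None
```

So st comes out as strong (the hi_block case) and nt as medium (the
mid_block case).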
END OF SUMMARY
~~~~~~~~~~~~~~
Sarah Hawkins
& Alison Tunley
**********************************************************************
Alison Tunley
Phonetics Laboratory, Department of Linguistics
University of Cambridge, Sidgwick Avenue
Cambridge, CB3 9DA
homepage: http://www.cus.cam.ac.uk/~ajt20
tel: +44 1223 335026; fax: +44 1223 335053
**********************************************************************