Dear All
This message has several parts. Although I hope everyone will read all of
it, I indicate here which items I think are of particular interest to
different people.
1. Jill, Jana and Sebastian.
2. Ditto, plus all Yorkies
3. As 2, and especially Sebastian
4. Mark, Jill, Jana
0. Intro
Daniel and I have had a brief but useful talk about f0 testing. This is
just to put you in touch with two things we discussed, and recommendations
we came up with. If you want me to talk about any more things with him
while I'm here, then it would be best if you could let us know in time for
us to talk on Thursday afternoon.
1. Jill's comparison ("Standard f0"), or "wrong" condition.
Daniel suggests that you should use the average of the properties that
you're trying to see if it's worth distinguishing. E.g. if your "right"
contours distinguish sonorant onsets into subsets (empty, nasal, approx)
then the "wrong" comparison should be with the average of these. If you
reckon that most synthesis systems wouldn't distinguish even obstruent
from sonorant onsets, then you might use an average of all of them, unless
you felt that bog-standard systems tend to use a model developed from one
main type (e.g. those with voiceless obstruent onsets) and generalise
inappropriately from those to all others -- in which case your "wrong"
condition would be contours appropriate for voiceless obstruents, but you
could have no voiceless obstruents in the test set OR those voiceless
obstruents would have to have contours that you yourself have established
as appropriate for sylls with sonorant onsets. If you follow any of these
methods, and get your data from our own database, remember to calculate
the average weighted by the number of items per onset type. E.g. if you
have 10 obstruents, and 50 sonorants, you can't just take the average
alignment point for 60 undifferentiated items, as it won't reflect the
presence of the obstruents very well. (Sebastian says "Of course!".)
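To make the arithmetic concrete, here is a minimal sketch in Python. It
assumes one reading of the point above: that each onset type should
contribute equally to the "wrong" average, however many items it has. The
alignment values (40 ms and 10 ms) and the function names are invented
purely for illustration; they are not from our database.

```python
from statistics import mean


def pooled_mean(groups):
    """Average over all items, ignoring onset type (the approach warned against)."""
    values = [v for items in groups.values() for v in items]
    return mean(values)


def mean_of_type_means(groups):
    """Average of the per-type means, so each onset type counts equally."""
    return mean(mean(items) for items in groups.values())


# Hypothetical alignment points in ms: 10 obstruent items, 50 sonorant items.
alignment_ms = {
    "obstruent": [40.0] * 10,
    "sonorant": [10.0] * 50,
}

print(pooled_mean(alignment_ms))         # 15.0, dominated by the 50 sonorants
print(mean_of_type_means(alignment_ms))  # 25.0, obstruents count as much as sonorants
```

With these made-up numbers the pooled figure sits much closer to the
sonorant value, which is the problem described above.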
2. Methodology for f0 testing, especially in expensive factorial expt.
Daniel agrees that RTs are interesting but there are big problems in
knowing where to measure from. He thinks the idea of testing phoneme
intelligibility in noise might be just as interesting/informative, at
least for the factorial experiment. So, I wonder if we should consider
only one factorial, with phoneme/word intell in noise as the dependent
variable?
This might reduce the risk of finding nothing, and allow more flexibility
in coming up with sentences to test.
We could leave the RT on truth values expt EITHER for a pretest of f0
only, OR for next grant period (if we get one). If we do test RTs in an
f0-only pretest, and if the data are again noisy, we will have gained
knowledge to use for further development when/if we get the "testing RA"
in the next grant. If the data are fine, we might be able to use the same
data in a factorial design (with duration and spectral mods added), if
there were time and available funds etc.
3. Further thoughts.
(3a) If we want to build in a "cognitive" element to the factorial test,
I wonder if we couldn't still do that by having stimuli presented in noise
in the standard way, and we evaluate phoneme intelligibility, but with the
addition of one of the following (or something similar):
(i) an item = context + statement to be evaluated for truth (as we were
planning for the RT expt, in other words, but Ss respond T/F, AND write
down what they hear).
(ii) Ss do something on line (track a moving target?) while listening to
each item, and then write down what they heard.
(iii) Ss look at e.g. coloured shapes on line while listening to the
items. They press a button (or other simple task) every time one
particular type appears (e.g. it's yellow, or it's square, or (much
harder) it's NOT yellow, or NOT a square), and they always write down
what they hear.
(iv) VERY simple mental arithmetic, following Sonntag (Jenolan papers),
while listening, and then write down what they hear.
Of course, we risk any of the above being too hard too, but there might be
time to pilot some of that.
(3b) I know that Richard isn't keen on intell in noise because of the
time it takes to score. He has a point, but on the other hand we know it
produces analysable results. Sebastian will have to advise on time in
scoring versus time in preparation, in analysing noisy data, etc. Also, if
we have long enough sentences, we could try scoring words rather than
phonemes correct. Fowler and Housum found no difference. I think I found a
small difference, but not a huge one. We can't use words correct with very
short phrases though - too granular. It may also be possible to do some of
the scoring algorithmically, if people type in their answers BUT I would
hesitate to use that method because they're far more likely to misspell
words they type than words they write by hand, and misspellings would be
lethal.
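For what it's worth, a rough sketch of what algorithmic words-correct
scoring of typed answers might look like, in Python. Everything here is an
assumption for illustration only: the function name, the example sentence,
the choice to ignore word order, and the use of the standard library's
difflib similarity ratio as a crude way of tolerating misspellings. It is
not a method anyone has agreed on, and the threshold would need checking
against hand scoring.

```python
from difflib import SequenceMatcher


def words_correct(target, response, min_ratio=1.0):
    """Proportion of target words found in the response.

    min_ratio=1.0 demands exact spelling; lower values accept near matches
    (hypothetical tolerance for typing errors). Word order is ignored, and
    each response word can be matched at most once.
    """
    target_words = target.lower().split()
    remaining = response.lower().split()
    hits = 0
    for word in target_words:
        for i, candidate in enumerate(remaining):
            if SequenceMatcher(None, word, candidate).ratio() >= min_ratio:
                hits += 1
                del remaining[i]  # consume the matched response word
                break
    return hits / len(target_words) if target_words else 0.0


# Invented example: the subject mistypes "yellow" as "yelow".
target = "the yellow square moved"
typed = "the yelow square moved"

print(words_correct(target, typed))                 # 0.75 with exact matching
print(words_correct(target, typed, min_ratio=0.8))  # 1.0 with tolerant matching
```

The gap between the two scores shows how much a single typing slip costs
under exact matching. A loose threshold carries the opposite risk of
crediting a genuinely misheard word that happens to be spelt similarly.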
4. COLLABORATION
Daniel has asked me to say that he would very much like to see some
collaboration between the Aix and ProSynth groups. He is especially
interested in talking with Mark about using PS XML-based methods, and
hopes Jill might be persuaded to check out his/Aix's abstract f0 system.
There are some possibilities for funding. I have now put you in touch, and
perhaps you can take it from there.
best
Sarah
_____________________________________________________________________
Dr. Sarah Hawkins Email: sh110@cam.ac.uk
Dept. of Linguistics Phone: +44 1223 33 50 52
University of Cambridge Fax: +44 1223 33 50 53
Sidgwick Avenue or +44 1223 33 50 62
Cambridge CB3 9DA
United Kingdom