Re: prosody testing and Collab with Aix

From: Sarah Hawkins (sh110@cam.ac.uk)
Date: Wed Jan 12 2000 - 18:18:54 GMT

    On Wed, 12 Jan 2000, Sebastian Heid wrote:

    yes, your objection (a) to measuring RT in the factorial could be
    valid, so would be an additional reason not to measure RT in this test,
    since we're under time pressure.

    I don't think I agree with the last part of your point (b), re
    automatically scoring words. I think we HAVE to allow people to write down
    nonsense words, if that's what they hear. Otherwise, they may write
    nothing, and although nothing can be scored as 0 words, I fear that a
    constraint on not writing nonsense may make some people omit words that
    they have in fact heard. e.g.
     there's a severed head here -> there's a feather bed here = no problem
    but
     there's a severed head here -> there's a = a problem, if
    the subject actually heard e.g. "there's a sevver dead ear" and stopped
    writing because there is no word "SEVVER" and they got confused and then
    forgot the rest of the sentence. If this happened a lot, we could end up
    with spuriously bad results, and also risk floor effects, even for our
    "good" items.
      BUT - is there some way of identifying correctly spelled words and
    counting phonemes in them automatically, and just spitting back the
    unidentified words so that we count their phonemes by hand??? (Any
    automatic counting system will have to be very carefully tested for how it
    deals with omissions and substitutions. Andrew and I came up with a fairly
    rigorous procedure, but it did sometimes involve some difficult decisions.
    I suppose we could have a low threshold for the automatic system to ask for
    help?)
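
A minimal sketch of the idea above, assuming responses are typed and a phonetic dictionary is available as a word-to-phonemes lookup. The tiny `PHONE_DICT`, its phoneme symbols, and the function name `score_response` are all hypothetical and for illustration only. The sketch only counts phonemes in recognised words and hands back the rest for hand scoring; it does not attempt the harder step of aligning a response with the target sentence to handle omissions and substitutions.

```python
# Hypothetical mini pronouncing dictionary: word -> list of phoneme symbols.
# A real test would load a full electronic phonetic dictionary instead.
PHONE_DICT = {
    "there's": ["dh", "eh", "z"],
    "a": ["ax"],
    "feather": ["f", "eh", "dh", "ax"],
    "bed": ["b", "eh", "d"],
    "here": ["h", "ia"],
    "severed": ["s", "eh", "v", "ax", "d"],
    "head": ["h", "eh", "d"],
    "dead": ["d", "eh", "d"],
    "ear": ["ia"],
}


def score_response(response, phone_dict=PHONE_DICT):
    """Split a typed response into words found in the dictionary (with
    their phoneme counts) and words that need scoring by hand."""
    counted = []   # (word, number of phonemes) for recognised words
    unknown = []   # words not in the dictionary, returned for hand scoring
    for token in response.lower().split():
        # Strip surrounding punctuation but keep word-internal apostrophes.
        word = token.strip(".,;:!?\"")
        if not word:
            continue
        if word in phone_dict:
            counted.append((word, len(phone_dict[word])))
        else:
            unknown.append(word)
    return counted, unknown


counted, unknown = score_response("There's a sevver dead ear")
print(counted)   # recognised words with phoneme counts
print(unknown)   # nonsense or misspelled words to score by hand
```

With this toy dictionary, the nonsense word "sevver" is the only item passed back for hand scoring, so subjects can still write down whatever they hear.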

    Comment (c). Basically, yes, but just to clarify: if Ss press a button for
    T/F, we don't HAVE to measure RT.

    Re your P.S. Yes, nice idea, but I'm leery of counting words. I don't
    think we have time to make sure that they won't interfere, since there
    are so many different ways in which they COULD interfere. At the very
    least, we'd have to use the same words with all conditions of the same
    sentence, but even then I'd worry. Couldn't we use digits, for
    example? If we had to make it more complicated, we could present
    digit strings (1 through say 9 digits) and count how many
    strings have more than e.g. 5 digits. Or even 1 through 4, and
    count all those that have 3 or 4 (which you could do by assessing by eye
    rather than counting the digits in the string). The latter form might be
    safest, as it would involve less of a learning effect: people could
    probably estimate by eye immediately, whereas with the longer strings,
    they'd probably start by counting, and learn to estimate by eye as the task
    continued. But, you could also do this with shapes, even if the shapes
    are defined by making them with e.g. x's, as people do in their email
    banners etc. e.g. below is (sort of) an O with x's, and one with
    apostrophes.
       x
     x x
    x x
     x x
       x

       '
     ' '
    ' '
     ' '
       '
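
The digit-string variant described above could be generated along these lines. This is only a sketch: the function name `make_digit_trial`, the number of strings shown per trial, and the use of a seeded random generator are assumptions, not something settled in this thread. The defaults follow the "1 through 4 digits, count those with 3 or 4" version; passing `max_len=9, threshold=6` gives the "more than 5 digits" version.

```python
import random


def make_digit_trial(n_strings, min_len=1, max_len=4, threshold=3, rng=None):
    """Build one distractor trial: a list of random digit strings plus the
    correct answer, i.e. how many strings have at least `threshold` digits."""
    rng = rng or random.Random()
    strings = []
    for _ in range(n_strings):
        length = rng.randint(min_len, max_len)  # inclusive on both ends
        strings.append("".join(rng.choice("0123456789") for _ in range(length)))
    answer = sum(1 for s in strings if len(s) >= threshold)
    return strings, answer


# Example: five strings of 1 to 4 digits; subjects report how many have 3 or 4.
strings, answer = make_digit_trial(5, rng=random.Random(0))
print(strings, answer)
```

Fixing the seed per sentence would make it easy to show the same strings with all conditions of that sentence.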

    good luck - I have to get on with Noel's stuff now!

    S

    >
    >
    > Dear Sarah and dear All,
    >
    > just three quick comments:
    >
    > a) It occurred to me that it might be virtually impossible to find a place
    > from which to measure the reaction time in the factorial, because we can't
    > assume that all changes are made on the same structures, i.e. the timing
    > manipulation might be on a different foot from the f0 changes, and the vowel
    > changes due to the resonance effect might be in yet another foot.
    > Or do we want to combine all manipulations on the same parts for
    > every sentence? From what I understood from my conversation with Jill, this
    > might be very difficult (at least to combine York and UCL manipulations,
    > because they seem to have different structures in mind).
    >
    >
    >
    > b) I think the counting of words in an intelligibility test could be done
    > automatically. We could use an automatic spelling checker and even though
    > we would have to do manual corrections of the spelling, this would be
    > much faster than actually having to count the words (at least that's what
    > I assume). I will have a look into the possibility of letting the subject
    > type the sentences into the computer (the program then must provide
    > editing possibilities for them, which gets really tricky, if they have to
    > type more than one line and may go back a line and change a character
    > there).
    >
    > Maybe, if we don't allow nonsense words, we might even count the phonemes
    > automatically, with a lookup procedure and an electronic phonetic
    > dictionary.
    >
    >
    > c) We have to keep in mind that the sentences for such a test must be
    > chosen in a way which makes the individual words not too easy to predict,
    > but that should be feasible in the museum context with many unexpected
    > content words.
    >
    > Since the RT test would need a completely different set of sentences I
    > think we should make the basic decision whether RT or intell as soon as
    > possible.
    >
    > I guess we should go for intelligibility and maybe have a look into
    > possibilities to put additional cognitive load (additional to noise)
    > on the subjects. I don't think it would make sense to combine RT and
    > intell due to the different constraints on the material.
    >
    >
    > Sebastian
    >
    >
    >
    > P.S.:
    >
    > I just had an idea for an additional task, which would be easy to
    > implement. The task is to count words on the screen. That means while the
    > sentence is presented acoustically, a random number of words (between 3
    > and 7) appear at various positions on the screen, and the subjects have to
    > write down the number of words they saw together with the sentences they
    > heard. We could simplify things by using just characters, and complicate
    > things by using words that are especially chosen to distract them (e.g.
    > words which are phonetically or semantically similar to some that occur in
    > the test sentence).
    >
    > S.
    >
    >
    >
    >
    > On Wed, 12 Jan 2000, Sarah Hawkins wrote:
    >
    > > Dear All
    > >
    > > This message has several parts. Although I hope everyone will read all of
    > > it, I indicate here which items I think are of particular interest to
    > > different people.
    > > 1. Jill, Jana and Sebastian.
    > > 2. Ditto, plus all Yorkies
    > > 3. As 2, and especially Sebastian
    > > 4. Mark, Jill, Jana
    > >
    > > 0. Intro
    > > Daniel and I have had a brief but useful talk about f0 testing. This is
    > > just to put you in touch with two things we discussed, and recommendations
    > > we came up with. If you want me to talk about any more things with him
    > > while I'm here, then it would be best if you could let us know in time for
    > > us to talk on Thursday afternoon.
    > >
    > > 1. Jill's comparison ("Standard f0"), or "wrong" condition.
    > > Daniel suggests that you should use the average of the properties that
    > > you're trying to see if it's worth distinguishing. e.g. if your "right"
    > > contours distinguish sonorant onsets into subsets (empty, nasal, approx)
    > > then the "wrong" comparison should be with the average of these. If you
    > > reckon that most synthesis systems wouldn't distinguish even obstruent
    > > from sonorant onsets, then you might use an average of all of them, unless
    > > you felt that bog-standard systems tend to use a model developed from one
    > > main type (e.g. those with voiceless obstruent onsets) and generalise
    > > inappropriately from those to all others -- in which case your "wrong"
    > > condition would be contours appropriate for voiceless obstruents, but you
    > > could have no voiceless obstruents in the test set OR those voiceless
    > > obstruents would have to have contours that you yourself have established
    > > as appropriate for sylls with sonorant onsets. If you follow any of these
    > > methods, and get your data from our own database, remember to calculate
    > > the average weighted by the number of items per onset type. E.g. if you
    > > have 10 obstruents, and 50 sonorants, you can't just take the average of
    > > alignment point for 60 undifferentiated items, as it won't reflect the
    > > presence of the obstruents very well. (Sebastian says "Of course!".)
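
A small numeric sketch of the weighting point, using the 10-obstruent, 50-sonorant example. The alignment values are invented purely for illustration, and the reading of "weighted" taken here (each onset type contributes equally, rather than each item) is an assumption about what is meant. The function names are hypothetical.

```python
def pooled_mean(groups):
    """Mean over all items, ignoring onset type (the 60 undifferentiated items)."""
    values = [v for vals in groups.values() for v in vals]
    return sum(values) / len(values)


def mean_of_type_means(groups):
    """Mean of the per-type means, so each onset type counts equally."""
    means = [sum(vals) / len(vals) for vals in groups.values()]
    return sum(means) / len(means)


# Made-up alignment points (arbitrary units), one value per item.
groups = {
    "obstruent": [40.0] * 10,
    "sonorant": [10.0] * 50,
}

print(pooled_mean(groups))         # 15.0, dominated by the 50 sonorants
print(mean_of_type_means(groups))  # 25.0, obstruents represented equally
```

The gap between the two figures shows how far the larger group can pull a pooled average away from the smaller one.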
    > >
    > > 2. Methodology for f0 testing, especially in expensive factorial expt.
    > > Daniel agrees that RTs are interesting but there are big problems in
    > > knowing where to measure from. He thinks the idea of testing phoneme
    > > intelligibility in noise might be just as interesting/informative, at
    > > least for the factorial experiment. So, I wonder if we should consider
    > > only one factorial, with phoneme/word intell in noise as the dependent
    > > variable?
    > > This might reduce the risk of finding nothing, and allow more flexibility
    > > in coming up with sentences to test.
    > >
    > > We could leave the RT on truth values expt EITHER for a pretest of f0
    > > only, OR for next grant period (if we get one). If we do test RTs in an
    > > f0-only pretest, and if the data are again noisy, we will have gained
    > > knowledge to use for further development when/if we get the "testing RA"
    > > in the next grant. If the data are fine, we might be able to use the same
    > > data in a factorial design (with duration and spectral mods added), if
    > > there were time and available funds etc.
    > >
    > > 3. Further thoughts.
    > > (3a) If we want to build in a "cognitive" element to the factorial test,
    > > I wonder if we couldn't still do that by having stimuli presented in noise
    > > in the standard way, and we evaluate phoneme intelligibility, but with the
    > > addition of one of the following (or something similar):
    > > (i) an item = context + statement to be evaluated for truth (as we were
    > > planning for the RT expt, in other words, but Ss respond T/F, AND write
    > > down what they hear).
    > > (ii) Ss do something on line (track a moving target?) while listening to
    > > each item, and then write down what they heard.
    > > (iii) Ss look at e.g. coloured shapes on line while listening to the
    > > items. They press a button (or other simple task) every time one
    > > particular type appears (e.g. it's yellow, or it's square, or (much
    > > harder) it's NOT yellow, or NOT a square), and they always write down
    > > what they hear.
    > > (iv) VERY simple mental arithmetic, following Sonntag (Jenolan papers),
    > > while listening, and then write down what they hear.
    > >
    > > Of course, we risk any of the above being too hard too, but there might be
    > > time to pilot some of that.
    > >
    > > (3b) I know that Richard isn't keen on intell in noise because of the
    > > time it takes to score. He has a point, but on the other hand we know it
    > > produces analysable results. Sebastian will have to advise on time in
    > > scoring versus time in preparation, in analysing noisy data, etc. Also, if
    > > we have long enough sentences, we could try scoring words rather than
    > > phonemes correct. Fowler and Housum found no difference. I think I found a
    > > small difference, but not a huge one. We can't use words correct with very
    > > short phrases though - too granular. It may also be possible to do some of
    > > the scoring algorithmically, if people type in their answers BUT I would
    > > hesitate to use that method because they're far more likely to misspell
    > > words they type than words they write by hand, and misspellings would be
    > > lethal.
    > >
    > >
    > > 4. COLLABORATION
    > > Daniel has asked me to say that he would very much like to see some
    > > collaboration between the Aix and PRoSynth groups. He is especially
    > > interested in talking with Mark about using PS XML-based methods, and
    > > hopes Jill might be persuaded to check out his/Aix's abstract f0 system.
    > > There are some possibilities for funding. I have now put you in touch, and
    > > perhaps you can take it from there.
    > >
    > > best
    > >
    > > Sarah
    > >
    > > _____________________________________________________________________
    > >
    > > Dr. Sarah Hawkins Email: sh110@cam.ac.uk
    > > Dept. of Linguistics Phone: +44 1223 33 50 52
    > > University of Cambridge Fax: +44 1223 33 50 53
    > > Sidgwick Avenue or +44 1223 33 50 62
    > > Cambridge CB3 9DA
    > > United Kingdom
    > >
    > >
    > >
    >
    >
    > *********************************************************************
    >
    > Sebastian Heid Email: sh276@cam.ac.uk
    > Phonetics Laboratory Phone: +44 1223 33 50 50
    > Dept. of Linguistics Fax: +44 1223 33 50 53
    > University of Cambridge
    > Sidgwick Avenue
    > Cambridge CB3 9DA
    > United Kingdom
    >
    > *********************************************************************
    >
    >
    >

    Sarah

    _____________________________________________________________________

     Dr. Sarah Hawkins Email: sh110@cam.ac.uk
     Dept. of Linguistics Phone: +44 1223 33 50 52
     University of Cambridge Fax: +44 1223 33 50 53
     Sidgwick Avenue or +44 1223 33 50 62
     Cambridge CB3 9DA
     United Kingdom



    This archive was generated by hypermail 2b29 : Wed Jan 12 2000 - 18:19:21 GMT