Re: prosody testing and Collab with Aix

From: Sebastian Heid (sh276@cus.cam.ac.uk)
Date: Wed Jan 12 2000 - 16:36:07 GMT


    Dear Sarah and dear All,

    just three quick comments:

    a) It occurred to me that it might be virtually impossible to find a place
    from which to measure the reaction time in the factorial, because we can't
    assume that all changes are done on the same structures, i.e. the timing
    manipulation might be on a different foot than the f0 changes, and the vowel
    changes due to the resonance effect might be in yet another foot.
    Or do we want to combine all manipulations on the same parts for
    every sentence? From what I understood from my conversation with Jill, this
    might be very difficult (at least to combine York and UCL manipulations,
    because they seem to have different structures in mind).

    b) I think the counting of words in an intelligibility test could be done
    automatically. We could use an automatic spelling checker and, even though
    we would have to do manual corrections of the spelling, this would be
    much faster than actually having to count the words (at least that's what
    I assume). I will have a look into the possibility of letting the subject
    type the sentences into the computer (the program then must provide
    editing possibilities for them, which gets really tricky, if they have to
    type more than one line and may go back a line and change a character
    there).
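
    A minimal sketch of how such automatic word counting might look, assuming
    the typed response is already available as a string. The function names,
    the similarity cutoff of 0.8 and the example sentence are illustrative
    assumptions, and approximate string matching from the Python standard
    library stands in for a real spelling checker.

```python
import difflib
import re


def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def count_words_correct(target, response, cutoff=0.8):
    """Count the target words that are found in the typed response.

    A response word counts as a match when its similarity to the target
    word is at least `cutoff` (an assumed threshold), so small typing slips
    still score. Each response word can be used only once.
    """
    target_words = tokenize(target)
    remaining = tokenize(response)
    correct = 0
    for word in target_words:
        match = difflib.get_close_matches(word, remaining, n=1, cutoff=cutoff)
        if match:
            correct += 1
            remaining.remove(match[0])
    return correct, len(target_words)


if __name__ == "__main__":
    # Hypothetical test sentence and typed response.
    target = "The bronze statue stands in the gallery."
    response = "the bronz statue stands in a galery"
    print(count_words_correct(target, response))
```

    Setting cutoff=1.0 turns the tolerance off and counts exact matches only,
    which gives a rough idea of how many items would still need a manual
    spelling check.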

    Maybe, if we don't allow nonsense words, we might even count the phonemes
    automatically, with a lookup procedure and an electronic phonetic
    dictionary.
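
    The lookup idea could be sketched roughly as follows. The toy dictionary,
    its SAMPA-like transcriptions and the function names are all assumptions
    made for illustration, and the alignment uses the generic sequence matcher
    from the Python standard library, not a dedicated phonetic alignment.

```python
from difflib import SequenceMatcher

# Toy pronunciation dictionary (hypothetical entries, SAMPA-like symbols).
TOY_DICT = {
    "the": ["D", "@"],
    "vase": ["v", "A:", "z"],
    "face": ["f", "eI", "s"],
    "is": ["I", "z"],
    "old": ["@U", "l", "d"],
    "cold": ["k", "@U", "l", "d"],
}


def to_phonemes(sentence, lexicon):
    """Look up every word and return one flat list of phoneme symbols."""
    phonemes = []
    for word in sentence.lower().split():
        if word not in lexicon:
            # Nonsense or unknown words cannot be scored automatically.
            raise KeyError("word not in dictionary: " + word)
        phonemes.extend(lexicon[word])
    return phonemes


def count_phonemes_correct(target, response, lexicon):
    """Align the two phoneme strings and count the matching phonemes."""
    target_ph = to_phonemes(target, lexicon)
    response_ph = to_phonemes(response, lexicon)
    matcher = SequenceMatcher(None, target_ph, response_ph, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched, len(target_ph)


if __name__ == "__main__":
    print(count_phonemes_correct("the vase is old", "the face is cold", TOY_DICT))
```

    Responses containing a word that is missing from the dictionary raise an
    error here, so they would have to be corrected or scored by hand.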

    c) We have to keep in mind that the sentences for such a test must be
    chosen in a way which makes the individual words not too easy to predict,
    but that should be feasible in the museum context with many unexpected
    content words.

    Since the RT test would need a completely different set of sentences I
    think we should make the basic decision whether RT or intell as soon as
    possible.

    I guess we should go for intelligibility and maybe have a look into
    possibilities to put additional cognitive load (additional to noise)
    on the subjects. I don't think it would make sense to combine RT and
    intell due to the different constraints on the material.

     
            Sebastian

    P.S.:

    I just had an idea for an additional task, which would be easy to
    implement. The task is to count words on the screen. That means while the
    sentence is presented acoustically, a random number of words (between 3
    and 7) appear at various positions on the screen, and the subjects have to
    write down the number of words they saw together with the sentences they
    heard. We could simplify things by using just characters, and complicate
    things by using words that are especially chosen to distract them (e.g.
    words which are phonetically or semantically similar to some that occur in
    the test sentence).
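
    A rough sketch of how one trial of this counting task might be generated.
    The screen size, the word pool and the function name are hypothetical, and
    the actual display and its timing against the audio are left out.

```python
import random


def make_distractor_trial(word_pool, rng, screen_size=(640, 480),
                          min_words=3, max_words=7):
    """Pick a random number of words and a random screen position for each.

    Returns the (word, x, y) items to display and the count that the
    subject is expected to write down. Overlap between words on the
    screen is not checked in this sketch.
    """
    n_words = rng.randint(min_words, max_words)  # both limits inclusive
    words = rng.sample(word_pool, n_words)       # no word appears twice
    width, height = screen_size
    items = [(word, rng.randrange(width), rng.randrange(height))
             for word in words]
    return {"items": items, "expected_count": n_words}


if __name__ == "__main__":
    # Hypothetical pool; distracting words could be chosen per test sentence.
    pool = ["vase", "face", "case", "base", "lace", "pace", "race", "maze"]
    trial = make_distractor_trial(pool, random.Random(1))
    for word, x, y in trial["items"]:
        print(word, x, y)
    print("expected count:", trial["expected_count"])
```

    The pool has to contain at least as many words as the maximum count. The
    simplified version with single characters only needs a pool of characters
    in place of words.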

            S.

    On Wed, 12 Jan 2000, Sarah Hawkins wrote:

    > Dear All
    >
    > This message has several parts. Although I hope everyone will read all of
    > it, I indicate here which items I think are of particular interest to
    > different people.
    > 1. Jill, Jana and Sebastian.
    > 2. Ditto, plus all Yorkies
    > 3. As 2, and especially Sebastian
    > 4. Mark, Jill, Jana
    >
    > 0. Intro
    > Daniel and I have had a brief but useful talk about f0 testing. This is
    > just to put you in touch with two things we discussed, and recommendations
    > we came up with. If you want me to talk about any more things with him
    > while I'm here, then it would be best if you could let us know in time for
    > us to talk on Thursday afternoon.
    >
    > 1. Jill's comparison ("Standard f0"), or "wrong" condition.
    > Daniel suggests that you should use the average of the properties that
    > you're trying to see if it's worth distinguishing. e.g. if your "right"
    > contours distinguish sonorant onsets into subsets (empty, nasal, approx)
    > then the "wrong" comparison should be with the average of these. If you
    > reckon that most synthesis systems wouldn't distinguish even obstruent
    > from sonorant onsets, then you might use an average of all of them, unless
    > you felt that bog-standard systems tend to use a model developed from one
    > main type (e.g. those with voiceless obstruent onsets) and generalise
    > inappropriately from those to all others -- in which case your "wrong"
    > condition would be contours appropriate for voiceless obstruents, but you
    > could have no voiceless obstruents in the test set OR those voiceless
    > obstruents would have to have contours that you yourself have established
    > as appropriate for sylls with sonorant onsets. If you follow any of these
    > methods, and get your data from our own database, remember to calculate
    > the average weighted by the number of items per onset type. E.g. if you
    > have 10 obstruents, and 50 sonorants, you can't just take the average of
    > alignment point for 60 undifferentiated items, as it won't reflect the
    > presence of the obstruents very well. (Sebastian says "Of course!".)
    >
    > 2. Methodology for f0 testing, especially in expensive factorial expt.
    > Daniel agrees that RTs are interesting but there are big problems in
    > knowing where to measure from. He thinks the idea of testing phoneme
    > intelligibility in noise might be just as interesting/informative, at
    > least for the factorial experiment. So, I wonder if we should consider
    > only one factorial, with phoneme/word intell in noise as the dependent
    > variable?
    > This might reduce the risk of finding nothing, and allow more flexibility
    > in coming up with sentences to test.
    >
    > We could leave the RT on truth values expt EITHER for a pretest of f0
    > only, OR for next grant period (if we get one). If we do test RTs in an
    > f0-only pretest, and if the data are again noisy, we will have gained
    > knowledge to use for further development when/if we get the "testing RA"
    > in the next grant. If the data are fine, we might be able to use the same
    > data in a factorial design (with duration and spectral mods added), if
    > there were time and available funds etc.
    >
    > 3. Further thoughts.
    > (3a) If we want to build in a "cognitive" element to the factorial test,
    > I wonder if we couldn't still do that by having stimuli presented in noise
    > in the standard way, and we evaluate phoneme intelligibility, but with the
    > addition of one of the following (or something similar):
    > (i) an item = context + statement to be evaluated for truth (as we were
    > planning for the RT expt, in other words, but Ss respond T/F, AND write
    > down what they hear).
    > (ii) Ss do something on line (track a moving target?) while listening to
    > each item, and then write down what they heard.
    > (iii) Ss look at e.g. coloured shapes on line while listening to the
    > items. They press a button (or other simple task) every time one
    > particular type appears (e.g. it's yellow, or it's square, or (much
    > harder) it's NOT yellow, or NOT a square), and they always write down
    > what they hear.
    > (iv) VERY simple mental arithmetic, following Sonntag (Jenolan papers),
    > while listening, and then write down what they hear.
    >
    > Of course, we risk any of the above being too hard too, but there might be
    > time to pilot some of that.
    >
    > (3b) I know that Richard isn't keen on intell in noise because of the
    > time it takes to score. He has a point, but on the other hand we know it
    > produces analysable results. Sebastian will have to advise on time in
    > scoring versus time in preparation, in analysing noisy data, etc. Also, if
    > we have long enough sentences, we could try scoring words rather than
    > phonemes correct. Fowler and Housum found no difference. I think I found a
    > small difference, but not a huge one. We can't use words correct with very
    > short phrases though - too granular. It may also be possible to do some of
    > the scoring algorithmically, if people type in their answers BUT I would
    > hesitate to use that method because they're far more likely to misspell
    > words they type than words they write by hand, and misspellings would be
    > lethal.
    >
    >
    > 4. COLLABORATION
    > Daniel has asked me to say that he would very much like to see some
    > collaboration between the Aix and PRoSynth groups. He is especially
    > interested in talking with Mark about using PS XML-based methods, and
    > hopes Jill might be persuaded to check out his/Aix's abstract f0 system.
    > There are some possibilities for funding. I have now put you in touch, and
    > perhaps you can take it from there.
    >
    > best
    >
    > Sarah
    >
    > _____________________________________________________________________
    >
    > Dr. Sarah Hawkins Email: sh110@cam.ac.uk
    > Dept. of Linguistics Phone: +44 1223 33 50 52
    > University of Cambridge Fax: +44 1223 33 50 53
    > Sidgwick Avenue or +44 1223 33 50 62
    > Cambridge CB3 9DA
    > United Kingdom
    >
    >
    >

    *********************************************************************

      Sebastian Heid Email: sh276@cam.ac.uk
      Phonetics Laboratory Phone: +44 1223 33 50 50
      Dept. of Linguistics Fax: +44 1223 33 50 53
      University of Cambridge
      Sidgwick Avenue
      Cambridge CB3 9DA
      United Kingdom

    *********************************************************************



    This archive was generated by hypermail 2b29 : Wed Jan 12 2000 - 16:36:56 GMT