Sounds and spectrograms of a hierarchy of stimuli, varying in complexity and intelligibility, all constructed using the first two formants of sine-wave speech.



[Figure: sounds and spectrograms of the stimuli, arranged in two columns, each with panels (a) to (e).]
 
The left column contains sine-wave versions of the various manipulations. The top stimulus, (a), is a straightforward version of the original sentence “The clown had a funny face”, with natural formant and amplitude variations. (b) contains interpolated formant tracks from (a), as seen in (c), with the amplitude variations from another sentence imposed. This produces a sound that is unintelligible but has the same spectro-temporal complexity as natural speech. (d) has steady-state formants with the natural amplitude variations, whereas (e), the simplest case, consists of two steady-state formants at a constant amplitude. Once the sine-wave formants have been manipulated, the stimuli are passed through a 16-channel noise-excited vocoder (Shannon et al., 1995), which replaces the sine waves with a continuous spectrum whose envelope is more reminiscent of natural speech. The common excitation also causes the two ‘formants’ to cohere perceptually, leading to a unitary percept.
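As a rough illustration of the sine-wave stage described above (not the procedure actually used to make these stimuli), the following Python sketch sums two sinusoids whose frequencies and amplitudes follow per-sample tracks. The function names, the 16 kHz sample rate, the 500 Hz and 1500 Hz formant values, and the 4 Hz envelope are all assumptions chosen for the example; natural formant and amplitude tracks would come from analysis of a recorded sentence, and the noise-vocoding stage is not shown.

```python
import math


def sine_wave_formants(f1_track, f2_track, a1_track, a2_track, sample_rate=16000):
    """Sum two sinusoids whose frequencies (Hz) and amplitudes follow per-sample tracks."""
    n = len(f1_track)
    if not (len(f2_track) == len(a1_track) == len(a2_track) == n):
        raise ValueError("all tracks must have the same length")
    phase1 = 0.0
    phase2 = 0.0
    samples = []
    for i in range(n):
        samples.append(a1_track[i] * math.sin(phase1) + a2_track[i] * math.sin(phase2))
        # Accumulate phase sample by sample so that a changing frequency
        # track does not introduce phase jumps (audible clicks).
        phase1 += 2.0 * math.pi * f1_track[i] / sample_rate
        phase2 += 2.0 * math.pi * f2_track[i] / sample_rate
    return samples


def constant(value, n):
    """A flat track: the same value for n samples."""
    return [value] * n


def linear_ramp(start, end, n):
    """A track that moves linearly from start to end over n samples."""
    if n == 1:
        return [start]
    return [start + (end - start) * i / (n - 1) for i in range(n)]


SAMPLE_RATE = 16000      # assumed, not taken from the source
N = SAMPLE_RATE // 2     # half a second of signal

# In the spirit of (e): two steady-state formants at constant amplitude.
steady = sine_wave_formants(
    constant(500.0, N), constant(1500.0, N),
    constant(0.5, N), constant(0.25, N), SAMPLE_RATE)

# In the spirit of (d): steady-state formants with a varying amplitude.
# A made-up 4 Hz raised-cosine envelope stands in for natural amplitude variation.
envelope = [0.5 * (1.0 - math.cos(2.0 * math.pi * 4.0 * i / SAMPLE_RATE))
            for i in range(N)]
modulated = sine_wave_formants(
    constant(500.0, N), constant(1500.0, N),
    [0.5 * e for e in envelope], [0.25 * e for e in envelope], SAMPLE_RATE)

# Time-varying formant tracks: simple linear glides as a stand-in for natural tracks.
gliding = sine_wave_formants(
    linear_ramp(300.0, 700.0, N), linear_ramp(2000.0, 1200.0, N),
    constant(0.5, N), constant(0.25, N), SAMPLE_RATE)
```

Each returned list holds floating-point samples with a peak magnitude of at most 0.75 (the sum of the two amplitudes used here), so it could be scaled and written to a WAV file with any audio library. Imposing the amplitude tracks of one sentence on the formant tracks of another, as in (b), would amount to passing tracks from different sources to the same function.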





University College London - Gower Street - London - WC1E 6BT - Telephone: +44 (0)20 7679 2000 - Copyright © 1999-2015 UCL