Sounds and spectrograms of a hierarchy of stimuli, varying in complexity and intelligibility, all constructed using the first two formants of sine-wave speech.

[Stimulus panel: two columns, each with items a, b, c, d and e]

The left column contains sine-wave versions of the various manipulations. The top sentence (a) is a straightforward version of the original sentence “The clown had a funny face”, with natural formant and amplitude variations. (b) contains interpolated formant tracks from (a), as seen in (c), with the amplitude variations from another sentence imposed. This yields a sound that is unintelligible but has the same spectro-temporal complexity as natural speech. (d) has steady-state formants with the natural amplitude variations, whereas (e), the simplest case, consists of two steady-state formants at a constant amplitude. Once the sine-wave formants have been manipulated, the stimuli are passed through a 16-channel noise-excited vocoder (Shannon et al., 1995) to replace the sine waves with a continuous spectrum whose envelope is more reminiscent of natural speech. The common excitation also causes the two ‘formants’ to cohere perceptually, leading to a unitary percept.
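The sine-wave stage of manipulations (a), (d) and (e) can be illustrated with a minimal sketch. This is not the procedure used to make the stimuli on this page: the function names `synthesize` and `steady`, the frame rate, the sample rate and the track values are all hypothetical, and taking the mean of a track as its "steady-state" value is an assumption. Manipulation (b) and the noise-vocoding stage are omitted.

```python
import math

def synthesize(formant_tracks, amplitude_tracks, frame_rate=100, sample_rate=16000):
    """Sum one sinusoid per formant, each following its own frequency (Hz)
    and amplitude track. Tracks hold one value per analysis frame."""
    n_frames = len(formant_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    out = [0.0] * (n_frames * samples_per_frame)
    for freqs, amps in zip(formant_tracks, amplitude_tracks):
        phase = 0.0
        for i in range(len(out)):
            frame = i // samples_per_frame
            # Accumulate phase so that frequency changes between frames
            # do not introduce discontinuities in the waveform.
            phase += 2.0 * math.pi * freqs[frame] / sample_rate
            out[i] += amps[frame] * math.sin(phase)
    return out

def steady(track):
    """Replace a track by its mean value (assumed meaning of 'steady-state')."""
    mean = sum(track) / len(track)
    return [mean] * len(track)

# Hypothetical 10-frame tracks standing in for measured formants and amplitudes.
f1 = [300 + 40 * k for k in range(10)]    # first formant, Hz
f2 = [2200 - 100 * k for k in range(10)]  # second formant, Hz
a1 = [0.0, 0.2, 0.5, 0.5, 0.4, 0.1, 0.3, 0.5, 0.2, 0.0]
a2 = [0.5 * a for a in a1]

# (a) natural formant and amplitude variations
stim_a = synthesize([f1, f2], [a1, a2])
# (d) steady-state formants, natural amplitude variations
stim_d = synthesize([steady(f1), steady(f2)], [a1, a2])
# (e) steady-state formants at a constant amplitude
stim_e = synthesize([steady(f1), steady(f2)], [[0.5] * 10, [0.25] * 10])
```

Each stimulus is returned as a list of floating-point samples. In the pipeline described above, such a signal would then be passed through the noise-excited vocoder.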


UCL PSYCHOLOGY & LANGUAGE SCIENCES Faculty of Brain Sciences