The
perceptual magnet effect is not specific to speech prototypes: new evidence
from music categories.
Sarah BARRETT
Abstract
Previous work on prototypicality in music has led to the claim that
music prototypes act in the opposite way to speech prototypes - as anchors
rather than magnets. In one such study, professional musicians were
given discrimination tasks in which they had to distinguish acoustically-similar
sounds in the context of both a prototypical and non-prototypical C-major
chord (Acker et al., 1995). In contrast to what has been found for American
English listeners with various speech-sound categories (e.g. Kuhl, 1991),
the professional musicians showed enhanced discrimination in the region of
the prototype. The present study questions whether the performance of
such musically-trained subjects is representative of the average listener's
perception of music categories. Here, 10 non-musicians as well as 10
musicians were given a discrimination task in which they were required
to distinguish prototypical and non-prototypical C-major chords from
a series of acoustically-similar variants. Unlike the musicians who
showed enhanced discrimination in the context of the prototype, the
non-musicians showed reduced discrimination. The results have implications
for the applicability of the perceptual magnet effect to domains other
than speech and are interpreted in terms of a new theory called A&R
theory, which suggests that prototypes have a dual role in the perceptual
system depending upon the amount of attention paid to them by the listener.
Designed and built by Martyn Holland, February 2000.
Periodicity
and Pitch Information in Simulations of Cochlear Implant Speech Processing
Andrew FAULKNER, Stuart ROSEN and Clare SMITH
Abstract
Pitch, periodicity and aperiodicity are regarded as important cues for
the perception of speech. However, modern CIS cochlear implant speech
processors, and recent simulations of these processors, provide no explicit
representation of these factors. We have constructed four-channel vocoder
processors that manipulate the representation of periodicity and pitch
information, and examined the effects on the perception of speech and
the ability to identify pitch glide direction.
A vocoder providing highly salient pitch
and periodicity information used a pulse train source during voiced
speech, and a noise source in the absence of voicing. The pulse train
was controlled by voice fundamental frequency. A second condition provided
a salient cue to periodicity but no pitch information,
through the use of a fixed rate pulse source during voicing, and a noise
source at other times. Further processing conditions were independent
of input speech excitation. One such condition used a constant pulse
train throughout, with neither periodicity nor pitch represented. Two
further conditions used a noise source throughout. In one noise condition,
the amplitude envelope extracted from each band was low-pass filtered
at 32 Hz, eliminating pitch and periodicity cues from the envelope.
In the second noise condition, the envelope was low-pass filtered at
400 Hz; this was expected to provide a relatively weak indication of
pitch and periodicity.
The vocoder using a pulse source that followed
the input fundamental frequency gave substantially higher performance
in identification of frequency glides than vocoders using noise carriers,
which in turn showed better performance than processors using a fixed
rate pulse carrier. However, performance in consonant and vowel identification
and sentence recognition was remarkably similar through all of the processors.
Connected discourse tracking rates were affected by the envelope filter
of the noise carrier processors, although this effect was small. We
conclude that whilst the processors achieved the desired control over
the salience of pitch and periodicity, the speech tasks used here show
little sensitivity to this manipulation.
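The four excitation conditions can be summarised as a per-frame carrier selector. The sketch below is illustrative only: the 100 Hz fixed pulse rate, the frame-based interface and the function name are assumptions for exposition, not details of the authors' processors.

```python
import numpy as np

def excitation_frame(n, fs, voiced, f0, condition, fixed_rate=100.0):
    """Return one frame (n samples) of vocoder excitation.

    condition selects among the four processing schemes:
      'f0_pulse'    - pulses at the voice fundamental when voiced, noise
                      otherwise (salient periodicity and pitch)
      'fixed_pulse' - fixed-rate pulses when voiced, noise otherwise
                      (periodicity cue, no pitch)
      'const_pulse' - a constant pulse train throughout (neither cue)
      'noise'       - noise throughout (cues only in the band envelopes)
    """
    def pulse_train(rate):
        x = np.zeros(n)
        x[::max(1, int(fs / rate))] = 1.0    # one pulse per period
        return x

    if condition == 'const_pulse':
        return pulse_train(fixed_rate)
    if condition == 'noise' or not voiced:
        return np.random.randn(n)            # aperiodic (noise) excitation
    return pulse_train(f0 if condition == 'f0_pulse' else fixed_rate)
```

In a full simulation this excitation would be filtered into the four analysis bands and modulated by the per-band amplitude envelopes of the input speech.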
Effects
of the number of channels and speech-to-noise ratio on rate of connected
discourse tracking through a simulated cochlear implant speech-processor
Andrew FAULKNER, Stuart ROSEN and Lucy WILKINSON
Abstract
A number of recent studies have investigated simulations of cochlear
implant speech processors with the aim of establishing the minimum number
of channels required to support speech perception in quiet and in noise.
These studies have all used citation form consonant and vowel stimuli
or simple sentences. Intelligibility measures for such materials, especially
sentences, can often show ceiling effects. The present study has examined
this issue using connected discourse tracking, a task that can be less
subject to ceiling effects and is more representative of everyday communication.
Speech processing employed a real-time sine-excited vocoder having three,
four, eight or 12 channels. Amplitude envelopes extracted from each
band modulated sinusoidal carrier signals placed at each band centre
frequency. Speech-spectrum shaped random noise was added to speech prior
to the vocoder processing to give three signal-to-noise ratios of +7,
+12, and +17 dB. Noise levels were adjusted in real time according to
measurements of speech level. Connected discourse tracking rates through
the vocoders increased significantly with the number of channels up to 12
in both quiet and noise, and decreased significantly with each increase
in noise level from quiet. For natural speech, these levels of noise
had little effect on tracking rate. We conclude that, with connected
speech, optimal performance from a cochlear implant in quiet and
in modest levels of noise is likely to require more than eight independent
frequency channels.
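The processing chain described above (band-pass analysis, per-band amplitude envelope extraction, and a sinusoidal carrier at each band centre frequency) can be sketched as follows. The filter orders, log-spaced band edges and cutoff values are illustrative assumptions, not the parameters of the real-time processor used in the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def sine_vocoder(x, fs, n_channels=4, lo=100.0, hi=4000.0, env_cutoff=32.0):
    """Sine-excited channel vocoder sketch: band-pass analysis filters,
    amplitude-envelope extraction, and a sinusoidal carrier at each band
    centre frequency. All parameter choices here are illustrative."""
    edges = np.geomspace(lo, hi, n_channels + 1)     # log-spaced band edges
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    b_env, a_env = butter(2, env_cutoff / (fs / 2))  # envelope smoothing filter
    for k in range(n_channels):
        b, a = butter(3, [edges[k] / (fs / 2), edges[k + 1] / (fs / 2)],
                      btype='band')
        band = filtfilt(b, a, x)                     # analysis band
        env = filtfilt(b_env, a_env, np.abs(band))   # rectify + low-pass
        fc = np.sqrt(edges[k] * edges[k + 1])        # geometric centre frequency
        out += env * np.sin(2 * np.pi * fc * t)      # modulated sine carrier
    return out
```

Adding speech-spectrum-shaped noise to `x` at a fixed signal-to-noise ratio before calling the function would correspond to the noise conditions of the experiment.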
Intonation
Modelling in ProSynth
Jill HOUSE, Jana DANKOVIČOVÁ and Mark HUCKVALE
Abstract
ProSynth uses a hierarchical
prosodic structure (implemented in XML) as its core linguistic representation.
To model intonation we map template representations of F0 contours onto
this structure. The template for a particular pitch pattern is derived
from analysis of a labelled speech database. For a falling nuclear pitch
accent this template has three turning points: two define the F0 peak
and one marks the end of the F0 fall. Statistical analysis confirmed
that the alignment and shape of the template are sensitive to the properties
of the structure and also provided quantitative values for F0 synthesis.
Our results suggest that phonetic interpretation of the nuclear pitch
accent is best related to the accented Foot rather than to the accented
syllable. In determining parameter values for synthesis, we conclude
that F0 information should be integrated with temporal and segmental
information.
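The template idea can be made concrete as a small data structure: an ordered list of turning points, each pairing an alignment value with an F0 target. The field names and numerical values below are invented for illustration and are not ProSynth's actual representation.

```python
from dataclasses import dataclass

@dataclass
class TurningPoint:
    align: float   # alignment as a proportion of the accented Foot (assumed)
    f0: float      # F0 target in Hz (illustrative value)

# A falling nuclear accent template: two points defining the F0 peak and
# one marking the end of the fall. The numbers are invented placeholders;
# in ProSynth they would come from statistical analysis of the database.
falling_nuclear = [TurningPoint(align=0.1, f0=180.0),
                   TurningPoint(align=0.3, f0=175.0),
                   TurningPoint(align=0.9, f0=110.0)]
```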
Opportunities
for re-convergence of engineering and cognitive science accounts of
spoken word recognition
Mark HUCKVALE
Abstract
This article traces the roots of the divergence between the engineering
and cognitive science communities' accounts of word recognition.
It argues that although there are cultural differences, when looked
at objectively, there is considerable overlap in the desires and motivations
of the two communities. It suggests that the criticisms of engineering
systems that caused the original divergence in the late 1970s are much
less valid today, that re-convergence is timely and will help create
a theory of speech processing which will explain both primary and emergent
phenomena. It proposes that the study of LVCSR (large-vocabulary continuous speech recognition) systems as if they were
human, and the study of humans as if they were LVCSR systems, could
lead to a research agenda which would benefit both communities. It introduces
elements of a programme to encourage joint research and co-operation.
Effect
of interactive visual feedback on the improvement of English intonation
of Japanese EFL learners
Masaki TANIGUCHI and Evelyn ABBERTON
Abstract
This paper is based on research
dedicated to helping to improve the teaching and learning of English
intonation (prosody) for Japanese EFL learners. It attempts to evaluate
the effectiveness of the use of real time interactive visual feedback
on the learners' approximation of their fundamental frequency contours
to those of native speakers. It also attempts to investigate characteristic
features of Japanese EFL learners' English intonation and how their
Japanese accents affect their English intonation. This investigation
enabled us to reaffirm our confidence in the effectiveness of interactive
visual feedback of the voice fundamental frequency pattern in helping
Japanese EFL learners improve their English intonation. We saw that
there was a great difference in improvement between the group of learners
who had the advantage of being exposed to interactive visual feedback
for an hour every day in the two-week course and the group of learners
who did not. We also found that the use of tone marks helped the learners
a great deal, but an important finding was that if no tone marks were
provided, it was extremely difficult for the learners to improve without
any interactive visual feedback. With the use of interactive visual
feedback, the learners were able to improve even in material without
tone marks.
The
intermediate phrase in central Catalan declaratives: a case for questioning
the representation of downstep
Eva ESTEBAS I VILAPLANA & John A. MAIDMENT
Abstract
This paper examines two aspects of the intonation of S(ubject) V(erb)
O(bject) Central Catalan declaratives produced in read speech. First,
it deals with the identification of an intermediate level of prosodic
phrasing in Central Catalan declaratives. Second, it analyses the immediate
implications of this intermediate phrase on the phonological representation
of the F0 contours and, in particular, on the interpretation of downstep.
Three different cues are used for the identification of the prosodic
boundaries: a pause, a local F0 fall, and the lengthening of the boundary
syllable. In all sentences, an intermediate level of prosodic structure,
marked with a H- phrase accent, is observed between the subject and
the verb. The tonal representation of the sentences is determined through
both an auditory analysis and an acoustic analysis of the data. A pitch
reset is observed at the beginning of the second intermediate phrase,
which begins with a drastic lowering of the peak of the first pitch
accent. Evidence for treating this lowering as an intended downstep
movement is provided by an analysis of speaking-rate differences.
Overcoming
phonetic interference
John C. WELLS
Abstract
The phenomenon of phonetic interference in foreign language learning
is addressed by considering first the phonetics of loan-words from Japanese
to English and from English to Japanese and then the specific pronunciation
difficulties it causes Japanese learners of English. There are well-known
problems with individual sounds and with certain phonemic contrasts.
Other difficulties are context-dependent: e.g. both /s/ and /t/ before
high front vowels. Many involve phonotactics: consonant clusters, final
consonants, the Japanese mora vs. the English syllable. Compound stress
is also discussed. In dealing with all of these problems, ear-training
may be as important for the learner as articulation practice.
Pronunciation
preferences in British English: a new survey
John C. WELLS
Abstract
A second poll of BrE pronunciation preferences was carried out in late
1998. It was based on a self-selected sample of nearly 2000 'speech-conscious'
respondents, who answered a hundred questions about words of uncertain
or controversial pronunciation. The findings allow us to answer questions
about lexical incidence and about sound changes in progress.
Auditory
filter nonlinearity in mild/moderate hearing impairment
Richard J. BAKER and Stuart ROSEN
Abstract
Sensorineural hearing loss has frequently
been shown to result in a loss of frequency selectivity. Less attention
has been paid to the level dependency of selectivity that is so prominent
a feature of normal hearing. The aim of the present study is to characterise
such changes in nonlinearity as manifested in the auditory filter shapes
of listeners with mild/moderate hearing impairment. Notched-noise masked
thresholds were measured over a range of stimulus levels at 2 kHz in
hearing-impaired listeners with losses of 20-50 dB. Growth-of-masking
functions for different notch widths are more parallel for hearing-impaired
than for normal-hearing listeners, indicating a more linear filter.
Level dependent filter shapes estimated from the data show relatively
little change in shape across level. The loss of nonlinearity is also
evident in the input/output functions derived from the fitted filter
shapes. Reductions in nonlinearity are clearly evident even in a listener
with only 20 dB hearing loss.
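The abstract does not name a filter model, but notched-noise masking data of this kind are conventionally fitted with rounded-exponential (roex) filter shapes; the sketch below rests on that assumption, with separate slope parameters for the lower and upper skirts so that the fitted shape can change with level.

```python
import numpy as np

def roex_weight(f, fc, pl, pu):
    """Rounded-exponential (roex) auditory filter weighting.
    g = |f - fc| / fc; W(g) = (1 + p*g) * exp(-p*g), with slope parameter
    pl below the centre frequency fc and pu above it."""
    f = np.asarray(f, dtype=float)
    g = np.abs(f - fc) / fc
    p = np.where(f < fc, pl, pu)             # side-dependent slope
    return (1.0 + p * g) * np.exp(-p * g)

def roex_erb(fc, pl, pu):
    """Equivalent rectangular bandwidth of the roex filter: each skirt
    integrates to 2*fc/p, so ERB = 2*fc/pl + 2*fc/pu."""
    return 2.0 * fc / pl + 2.0 * fc / pu
```

A more linear filter, as reported for the hearing-impaired listeners, would show up as pl and pu values that change little as stimulus level is varied.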
The relationship
between speech and nonspeech auditory processing in children with dyslexia
Stuart ROSEN & Eva MANGANARI
Abstract
Although there is good evidence that some dyslexic children show at
least small deficits in speech perceptual tasks, it is not yet clear
the extent to which this results from a general auditory, as opposed
to a specifically linguistic/phonological problem. Here we have investigated
the extent to which performance in backward and forward masking can
explain identification and discrimination ability for speech sounds
in which the crucial acoustic contrast (the second formant transition)
is followed ("ba" vs. "da") or preceded ("ab" vs. "ad") by a vowel.
More specifically, we expect children with elevated thresholds in backward
masking to be relatively more impaired for tasks involving "ba" and
"da" than for tasks involving "ab" and "ad". In order to determine whether
poor performance with speech sounds reflects a general deficit for perceiving
formant transitions, we also constructed nonspeech analogues of the
speech syllables - the contrastive second formant presented in isolation.
Two groups of 8 children matched for age
(mean of 13 years) and nonverbal intelligence were selected to be well
separated in terms of their performance in reading and spelling. All
underwent the same set of auditory tasks: 1) forward, backward and simultaneous
masking with a short (20 ms) 1-kHz probe tone in a broadband and notched
noise; 2) identification as "b" or "d" of synthetic "ba"-"da" and "ab"-"ad"
continua; 3) same/different discrimination of pairs of stimuli drawn
from the endpoints of the two speech continua (e.g., "ba-da", "da-ba",
"da-da", "ba-ba"), as well as their nonspeech analogues.
There were no differences between dyslexic
and control children in forward and simultaneous masking, but thresholds
for backward masking in a broadband noise were elevated for the dyslexics
as a group. Overall speech identification and discrimination performance
was superior for the controls (barely so for identification), but did
not differ otherwise for the two speech contrasts (one of which should
be influenced by backward masking, and one by forward). Thus, although
dyslexics show a clear group deficit in backward masking, this has no
simple relationship to the perception of crucial acoustic features in
speech. Furthermore, the deficit for the nonspeech analogues was much
less marked than for the speech sounds, with three-quarters of the dyslexic listeners
performing equivalently to controls. Either there is a linguistic/phonological
component to the speech perception deficit, or there is an important
effect of acoustic complexity.
Minimising
boredom by maximising likelihood - an efficient estimation of masked
thresholds
Richard J. BAKER and Stuart
ROSEN
Abstract
One of the main problems in carrying out psychoacoustic experiments
is the time required to measure a single threshold. In this study we
compare the accuracy of threshold estimation in a 2I2AFC task for detecting
a 2 kHz tone in either broadband noise or notched noise. Tone thresholds
were estimated in three normal-hearing listeners using either a Levitt
procedure to track 79% correct, or a maximum-likelihood estimation (MLE)
procedure to track 70, 80 or 90% correct. Given the chosen parameters
for the different procedures, the MLE procedure proved to be approximately
2.5 times faster at estimating masked thresholds than the Levitt procedure.
Only thresholds using the 70% MLE procedure were significantly different
in magnitude from those obtained using the Levitt procedure. To test
the repeatability of the measurements the standard deviations (SD) of
the threshold were calculated. Statistical analyses show smallest SDs
for the Levitt and 90% MLE procedures, with significantly larger SDs
for the 70% and 80% MLE procedures.
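The tracking rule of a transformed ("Levitt") staircase can be illustrated in simulation: a 3-down/1-up rule converges near the 79.4%-correct point of the psychometric function, close to the 79% level tracked in the study. The logistic psychometric function, fixed step size and reversal-averaging rule below are illustrative assumptions, not the procedure actually used.

```python
import math
import random

def pcorrect(level, threshold, slope=1.0, chance=0.5):
    """Logistic psychometric function for a 2I2AFC task, rising from
    chance (0.5) towards 1.0 with increasing signal level."""
    return chance + (1.0 - chance) / (1.0 + math.exp(-slope * (level - threshold)))

def levitt_3down_1up(threshold, start=40.0, step=2.0, n_reversals=12, seed=1):
    """Simulated 3-down/1-up transformed staircase: the track converges
    near the level where P(correct)^3 = 0.5, i.e. 79.4% correct."""
    rng = random.Random(seed)
    level = start
    correct_run = 0
    direction = 0                          # +1 rising, -1 falling, 0 undecided
    reversals = []
    while len(reversals) < n_reversals:
        if rng.random() < pcorrect(level, threshold):
            correct_run += 1
            if correct_run == 3:           # three correct in a row: step down
                correct_run = 0
                if direction == +1:
                    reversals.append(level)
                direction = -1
                level -= step
        else:                              # any error: step up
            correct_run = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals[2:]) / len(reversals[2:])   # discard early reversals
```

An MLE procedure of the kind compared in the paper would instead fit a psychometric function to all responses after each trial and place the next stimulus at the estimated target-percent-correct level, which is why it can reach a stable estimate in fewer trials.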