Department of Phonetics and Linguistics


Deborah A. VICKERS and Andrew FAULKNER

Bandwidth is one spectrally contrastive feature of voiceless speech that could be used in a low-frequency speech recoding scheme for hearing-impaired people. This study examines the bandwidth discrimination ability of severe-to-profoundly hearing-impaired listeners for low-frequency bands of noise. Task 1 investigated the discrimination of bandwidth for noises symmetrical around a 300-Hz centre frequency. Two conditions were used, in which the wider or the narrower of two bandwidths was fixed, and the other was varied. The wide fixed bandwidth was 500 Hz, and the narrow fixed bandwidth was 124 Hz. Psychometric functions were measured and fitted by Probit analysis to the logarithm of bandwidth ratio. All subjects were capable to some extent of discriminating bandwidth. 75% correct bandwidth ratios for the hearing-impaired group ranged from 1.32 to 2.91. Threshold bandwidth ratios for normally hearing subjects ranged from 1.16 to 1.74.

In a second task the estimated threshold from Task 1 was used as a reference point to examine whether performance was affected by eliminating either the upper or the lower frequency differences between the stimuli. The results showed that subjects tended to use high frequency edge cues to discriminate the noises when the wide noise band was fixed at 500Hz. When the narrow noise band was fixed at 124 Hz, individual subjects relied on different cues to discriminate the stimuli.

1. Introduction
The work described in this paper is aimed at determining the potential for encoding the spectra of voiceless fricatives for severe-to-profoundly hearing-impaired listeners. A practical objective is to describe a region of hearing in these listeners in which the main spectral features of fricatives could potentially be encoded.

It is well known that severely and profoundly hearing-impaired listeners experience problems in understanding speech, particularly in noisy environments. These difficulties arise mainly from their limited frequency and dynamic range and also from the loss of much of their spectral analytic abilities (e.g. Rosen et al., 1990; Faulkner et al., 1992). These limited abilities mean that many of the acoustic cues essential for speech perception are not available. Many profoundly hearing-impaired listeners cannot use the cues associated with formant structure and transitions. In addition, the high-frequency aperiodic cues important for manner and place distinctions amongst plosive and fricative consonants are lost. Hence these listeners are largely dependent on lip-reading for the perception of these speech contrasts.

Many of these listeners receive useful lip-reading support from conventional amplifying hearing aids, largely from the low-frequency temporal information representing the voicing pattern, fundamental frequency, and amplitude variations. However, psychoacoustic studies have shown that the majority of profoundly hearing impaired people retain potentially useful auditory abilities which are not fully exploited by conventional amplification, and which could be utilised to extend the range of speech information which is perceived by these listeners. Amongst these residual abilities are:

1) significant frequency selectivity in frequency regions of less profound hearing loss (Faulkner et al., 1990);

2) the ability to distinguish between periodic and aperiodic stimuli with a similar frequency content (Rosen et al., 1990);

3) relatively good resolution of amplitude modulations of low-frequency carrier signals at modulation rates of 80 Hz and below (Faulkner and Rosen, 1990).

The SiVo hearing aid (Rosen et al., 1987) was developed to allow profoundly hearing-impaired listeners to utilise these residual auditory abilities to improve speech perception. The aid is based on the speech pattern element approach (Fourcin, 1977). The objective is to select those elements that are most useful for speech perception so as to simplify the speech signal. The elements provided depend on the individual's hearing abilities and are matched to the individual frequency and intensity range. The first implementation of the speech pattern element approach (SiVo-1) represented the voice fundamental frequency as an acoustic sinusoid.

Results with this version of the SiVo aid proved to be promising, with subjects showing improvements in their perception of voicing contrasts and intonation (Faulkner et al., 1992). Additional features have since been added to the SiVo aid. These include speech amplitude envelope and voiceless frication information. Amplitude envelope information is intended to convey loudness variations, which carry suprasegmental cues, and to enhance the perception of some manner and voicing contrasts. Voiceless excitation information provides an audible indicator of the presence of frication to assist with the perception of voiceless fricative and plosive consonants. Laboratory evaluations with English listeners (Faulkner et al., 1992) and Chinese listeners (Wei, 1993) have given promising results for some subjects, due to the addition of extra features providing more speech information. However many subjects with a severe or profound hearing loss still gain more information from a conventional aid than from the SiVo with this limited set of features. It is therefore necessary to increase the number of features on the SiVo to assess if a larger number of simplified features may help with speech perception.

1.1 Simplified representations of voiceless frication
The aim of this work is to simplify the spectra of voiceless fricatives and provide them as a low-frequency noise. There are two essential aspects to the development of such encoding - 1) to discover the basic perceptual limitations of the listeners, and 2) to identify the most important acoustic cues for consonant identification. The latter issue has received some attention from other workers.

Harris (1958) carried out an experiment with normally hearing listeners where she spliced and recombined vocalic and frication sections from 16 CV syllables. In a perceptual test to determine the importance of the vocalic and frication components, she found that the sibilant/non-sibilant classification was based upon the intensity of the frication. The identification of the individual sibilants was determined by the main frequency peak of the frication and the distinction between the non-sibilants was based upon the pattern of vowel formant transitions.

Heinz and Stevens (1961) synthesised a set of fricative stimuli with a range of centre frequencies (from 2 to 8 kHz) and bandwidths. They found that for normally hearing subjects there was little effect of bandwidth but there was an effect of centre frequency. As centre frequency increased, responses changed from [ʃ] to [ç] to [s] to [f, θ]. The addition of a low-frequency noise to the high-frequency stimuli (centre frequencies of 6.5 and 8 kHz) improved the perception of the non-sibilants.

Although we have information about which acoustic cues should be considered when encoding the voiceless fricatives, we cannot apply this information to hearing-impaired listeners until we understand more about their perceptual abilities for discriminating noise-like stimuli within the low frequency region of their hearing.

1.2 Noise centre-frequency discrimination
Vickers and Faulkner (1993; 1996) described the results of an experiment with profoundly hearing-impaired listeners in which spectrally adjacent bands of noise were discriminated at different frequencies and different bandwidths. The stimuli were fixed in bandwidth and frequency within a run and the duration was adaptively altered. A criterion duration for useful discrimination of 80 ms was chosen because it is the shortest duration of fricative noise in running speech (Howell and Rosen, 1983). All subjects could discriminate at least two adjacent pairs of 250-Hz wide noise bands at durations of 80 ms or less. Subjects with better hearing could discriminate up to six such noise bands. With the subjects who had better hearing, it was also possible to test with larger bandwidths and over a wider frequency range.

These results show that for even the most impaired subjects there is a possibility that residual hearing could be used to discriminate between spectrally adjacent bands of noise. This is encouraging for developing a fricative encoding scheme whereby fricatives are represented by simple noise bands, transposed into the subject's residual hearing range.

The experiment reported here examines whether subjects can discriminate the bandwidth of noises when the spectra of the stimuli overlap. Bandwidth is one cue to fricative identity, and would represent a potentially useful additional dimension in a simplified encoding. The better the discrimination abilities of the listener, the more potential there is for flexibility in the encoding strategy, thus allowing the stimuli to be as similar as possible to the original sounds to improve naturalness and facilitate training.

2. Task 1. Bandwidth Discrimination with fixed centre frequency
This experiment examined the discrimination of noise bandwidth for noise bands of fixed centre frequency. The aim was to discover if subjects had retained any useful discrimination abilities which could be transferred to the identification of encoded fricatives.

2.1 Subjects
Five hearing-impaired and four normally hearing subjects took part. They all had previous experience in psycho-acoustic and speech perceptual experiments. The hearing-impaired subjects were aged between 35 and 75 and had post-lingual severe-to-profound sensori-neural hearing losses. The normally hearing group were aged between 25 and 45 years old. All subjects underwent hearing tests to ensure the reported loss was correct. Audiograms for the hearing impaired subjects are shown in Figure 1.

Figure 1. Audiograms for the hearing impaired subjects

2.2 Task and conditions
Subjects were required to discriminate noises of different bandwidths with a fixed centre frequency. The stimuli were band-pass filtered noises centred at 300 Hz with a duration of 80 ms. A low centre frequency was selected to ensure that the stimuli were audible to all subjects. It was assumed that if subjects could discriminate the stimuli at short durations they could probably discriminate them under easier conditions when the durations of the stimuli were longer. Bandwidths of the noises ranged from 124 to 500 Hz. 500 Hz was chosen for the widest bandwidth because it was wide enough for there to be a large difference between the narrow and wide band stimuli but not so wide that the majority of the information fell outside some of the subjects' hearing range. The narrowest bandwidth was chosen to be 124 Hz because this was narrow enough to be useful as a bandwidth for encoding fricatives but not so narrow as to give difficulties with audibility due to the limited dynamic ranges of some of the subjects. When the noise bandwidth is too narrow, subjects have a tendency to choose a lower listening level to avoid discomfort due to the strong inherent fluctuations. This can make the stimuli hard to discriminate because much of the sound falls below absolute threshold, and could also lead to audible loudness fluctuations being used as a cue to discrimination. For the purpose of this experiment we tried to eliminate loudness cues so we could assess how well subjects could discriminate stimuli on a purely spectral basis. However for the purpose of an encoding strategy the use of loudness cues would be beneficial to improve identification.

There were two main conditions; in condition 1 the fixed reference stimulus had the broader bandwidth, this being 500 Hz, and the narrower bandwidth was varied. In condition 2 the reference stimulus had the narrower bandwidth, which was fixed at 124 Hz, and the broader bandwidth was varied.

2.3 Stimulus generation and control
The noise stimuli were derived from a white-noise source, and band-pass filtered using two Kemo VBF/8 filters (48dB/octave). Each stimulus was gated on and off with a raised half-cosine envelope of 5 ms. A Masscomp 5400 was used to control stimulus presentation. The stimuli were gated by multiplying the noise with a gating signal derived from a 12-bit D/A converter. After gating and filtering, the stimuli were sent along a balanced line to a sound attenuating chamber. A Yamaha P2100 amplifier, Hatfield manual attenuators, Charybdis programmable attenuators and a potentiometer wheel were used to control the output levels of the stimuli. The stimuli were balanced in loudness and jittered in level as described below. The stimuli were presented monaurally through Beyer Dynamic DT48 headphones; electronic equalisation was used to give a flat headphone frequency response down to about 70 Hz. An FFT analyser (Ono-Sokki CF-910) was used to monitor the signal levels and spectra of the stimuli via a Bruel and Kjær artificial ear.
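The gating stage described above can be illustrated with a minimal sketch of an 80-ms envelope with 5-ms raised half-cosine onset and offset ramps. The sample rate, the function name, and the choice to include both ramps within the 80-ms duration are assumptions made for illustration; the original gating was performed in hardware and these details are not given in the text.

```python
import math

def raised_cosine_gate(duration_ms=80.0, ramp_ms=5.0, sample_rate_hz=10000):
    """Return a list of gain values (0..1) for one gated stimulus.

    Assumptions (not stated in the text): the total duration includes
    both ramps, and the sample rate is an arbitrary illustrative value.
    """
    n_total = int(round(duration_ms * sample_rate_hz / 1000.0))
    n_ramp = int(round(ramp_ms * sample_rate_hz / 1000.0))
    gate = [1.0] * n_total
    for i in range(n_ramp):
        # Half a cosine cycle, raised so that it runs from 0 up to 1.
        g = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        gate[i] = g                  # onset ramp
        gate[n_total - 1 - i] = g    # mirrored offset ramp
    return gate

gate = raised_cosine_gate()
```

Multiplying a noise waveform sample by sample with `gate` would give the smooth onset and offset that limits spectral splatter at the band edges.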

2.4 Procedure
Psychometric functions were measured in conditions 1 and 2 for a range of bandwidth ratios using a three-interval two-alternative forced-choice paradigm. In condition 1 the 500-Hz wide noise was presented in the first interval and the narrow noise was presented in either the second or third interval with a 500-Hz wide noise in the other interval. In condition 2 the 124 Hz noise band was played in the first interval and the wider noise was presented in either the second or third interval with a 124-Hz wide noise in the other interval. The subjects had to decide whether the second or third stimulus was the odd one out and press the appropriate button on a response box. Feedback lights were used after each trial to indicate the correct response. Within a run the bandwidths of the two noises were fixed and a percent correct score was obtained from a hundred trials. Each point on the psychometric function was repeated in a different session and the scores from the two sessions were combined producing a percent correct score from 200 presentations. All subjects received a minimum of 4 hours training before data collection began. At the start of each session the threshold for detecting the noise bands was confirmed to ensure that the subject's thresholds had not changed.
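As a rough guide to the precision of each point on the psychometric function, the binomial standard error of a percent correct score can be computed for 100 and 200 trials. This sketch assumes independent trials with a constant probability of a correct response, which is an idealisation rather than something established in the text; the function name is hypothetical.

```python
import math

def binomial_se_percent(p_correct, n_trials):
    """Standard error, in percentage points, of a percent-correct score."""
    return 100.0 * math.sqrt(p_correct * (1.0 - p_correct) / n_trials)

# Near the 75% point, for a single session and for two combined sessions.
se_100 = binomial_se_percent(0.75, 100)
se_200 = binomial_se_percent(0.75, 200)
```

Under these assumptions, combining the two sessions reduces the standard error near 75% correct from about 4.3 to about 3.1 percentage points.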

2.4.1 Loudness balancing
Three precautions were taken to ensure that discrimination was not based on loudness differences. For the hearing-impaired subjects, all stimuli were presented through a filter based on equal-loudness judgements. For all subjects, the two stimuli used in a given run were additionally matched in loudness. Finally, a random amplitude jitter was applied to stimuli within each trial.

a) Equal-loudness frequency shaping

Prior to testing, an individually tailored equal-loudness FIR (finite impulse response) filter was set up for each hearing-impaired subject. The filter parameters were individually calculated from a loudness balancing experiment. The procedure involved subjects balancing the level of 200-ms tones with 5-ms raised half-cosine ramps at 125-Hz intervals from 125 to 1125 Hz (where possible). Balancing began by comparing the loudness of a 500-Hz tone at a comfortable listening level with the loudness of a 625-Hz tone; the two tones were played in continuous alternation and the level of the 625-Hz tone was adjusted until it appeared equally loud to the 500-Hz tone. This procedure was repeated for all the audible tones within the 125 to 1125 Hz range above and below 500 Hz. 500 Hz was always kept as the reference tone. Each match was made three times and the average intensity level of the matches was used. The attenuation values required to produce the average intensity level were recorded at all frequencies. A matching filter shape was interpolated from these values and the FIR filter coefficients were generated to implement this response on an EF8 programmable filter.
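The averaging and interpolation steps can be sketched as follows. The match values are hypothetical, the sign convention for attenuation is arbitrary, and linear interpolation between measured frequencies is an assumption; the text does not state which interpolation scheme was used to derive the filter shape.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def interpolate_attenuation(freqs_hz, atten_db, query_hz):
    """Linear interpolation between measured points (assumed scheme)."""
    if not freqs_hz[0] <= query_hz <= freqs_hz[-1]:
        raise ValueError("query frequency outside the measured range")
    for (f0, a0), (f1, a1) in zip(zip(freqs_hz, atten_db),
                                  zip(freqs_hz[1:], atten_db[1:])):
        if f0 <= query_hz <= f1:
            return a0 + (a1 - a0) * (query_hz - f0) / (f1 - f0)

# Hypothetical attenuation settings (dB), three matches per tone.
# 500 Hz is the reference tone, so its attenuation is 0 dB by definition.
matches = {375: [4.0, 5.0, 6.0], 500: [0.0, 0.0, 0.0], 625: [-3.0, -2.0, -1.0]}
freqs = sorted(matches)
atten = [mean(matches[f]) for f in freqs]

# Attenuation at a frequency midway between two measured tones.
midpoint_db = interpolate_attenuation(freqs, atten, 437.5)
```

The interpolated curve would then be sampled on whatever frequency grid the FIR design step requires.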

b) Pair-wise loudness matching of stimuli

At the start of each run subjects were required to rotate a potentiometer wheel whilst hearing the two noises in alternation. The potentiometer controlled the level of one noise while the level of the other noise was fixed. The subjects were instructed to adjust the wheel and find the point at which the two sounds appeared to be equally loud. They were advised to pass through the equal loudness point a few times to ensure accurate balancing. This procedure was carried out three times and the mean level was used as the point of equal loudness. Once the equal loudness point was determined the subject listened to the stimuli again to ensure they were comfortably loud. If necessary the output level was adjusted.

c) Level Jitter

In order to eliminate any residual loudness differences due to balancing errors, levels were randomly varied over a 3 dB range from stimulus to stimulus for the hearing-impaired group and over a 6 dB range for the normally hearing subjects. The amount of jitter was determined from a pilot study, which assessed how much jitter was comfortable for the subject so as to eliminate loudness cues but preserve performance. The normally hearing subjects were able to tolerate a higher level of jitter than the hearing-impaired subjects.
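A minimal sketch of the level jitter is given below. It assumes that the jitter range is centred on the nominal level and that the offsets are uniformly distributed; the text specifies only the total range. The nominal level of 90 dB and the function name are hypothetical.

```python
import random

def jittered_level_db(nominal_db, jitter_range_db, rng=random):
    """Nominal level plus a uniform random offset spanning jitter_range_db.

    Assumption: the range is centred on the nominal level and the offsets
    are uniformly distributed; the text gives only the total range.
    """
    half = jitter_range_db / 2.0
    return nominal_db + rng.uniform(-half, half)

rng = random.Random(0)  # seeded only so that the example is repeatable
# One independently jittered level for each of the three intervals of a trial.
levels = [jittered_level_db(90.0, 3.0, rng) for _ in range(3)]
```

Because each interval receives its own random offset, a small consistent loudness difference between the two noises becomes an unreliable cue.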

2.5 Results of task 1
Psychometric functions (percent correct as a function of the logarithm of the bandwidth ratio) were fitted by Probit analysis (Finney, 1971) and the 75% point estimated. The 75% level was chosen because this point represented a bandwidth ratio where performance was above chance (50 % for a two alternative task) but well below the ceiling level.
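The estimation of a 75% point can be illustrated with a simplified sketch. It does not reproduce the Probit procedure of Finney (1971): it assumes a cumulative normal on log bandwidth ratio, rescaled to run from chance (50%) to 100%, and finds the maximum-likelihood parameters by a coarse grid search. The data values, function names and grid limits are hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(log_ratio, mu, sigma):
    """Assumed two-alternative model: chance (0.5) plus half a cumulative normal."""
    return 0.5 + 0.5 * phi((log_ratio - mu) / sigma)

def neg_log_likelihood(data, mu, sigma):
    """Binomial negative log-likelihood of the scores under the model."""
    total = 0.0
    for ratio, n_correct, n_trials in data:
        p = p_correct(math.log10(ratio), mu, sigma)
        p = min(max(p, 1e-9), 1.0 - 1e-9)   # guard against log(0)
        total -= (n_correct * math.log(p)
                  + (n_trials - n_correct) * math.log(1.0 - p))
    return total

def fit_threshold(data):
    """Coarse grid search for the maximum-likelihood mu and sigma."""
    best = None
    for i in range(1, 500):            # mu from 0.001 to 0.499 (log10 units)
        mu = i / 1000.0
        for j in range(1, 100):        # sigma from 0.005 to 0.495
            sigma = j / 200.0
            nll = neg_log_likelihood(data, mu, sigma)
            if best is None or nll < best[0]:
                best = (nll, mu, sigma)
    _, mu, sigma = best
    # Under this model, 75% correct falls where the normal term is 0.5, i.e. at mu.
    return 10.0 ** mu, sigma

# Hypothetical scores: (bandwidth ratio, number correct, number of trials)
data = [(1.1, 108, 200), (1.25, 130, 200), (1.5, 162, 200), (2.0, 192, 200)]
threshold_ratio, slope_sigma = fit_threshold(data)
```

With this parameterisation the 75% point is simply the bandwidth ratio at the mean of the underlying normal, so no separate interpolation step is needed once the fit has been made.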

The 75% threshold point was used as a way of comparing performance across subjects and between the normally hearing and hearing-impaired groups.

Psychometric functions from conditions 1 and 2 for the normally hearing subjects are plotted in figures 2 and 3 and those for the hearing-impaired subjects are shown in figures 4 and 5. Figures 2 and 4 are the functions when the wide noise band (500 Hz) was fixed and figures 3 and 5 show the functions when the narrow band (124 Hz) was fixed.

Figure 2. Psychometric functions for the normal hearing group in condition 1. The bandwidth ratio is plotted along the abscissa and the percent correct is shown on the ordinate. The mean percent correct scores from the two sessions as a function of bandwidth ratio are shown by the filled circles and the vertical lines represent plus and minus one standard deviation for each point.

Figure 3. Psychometric functions for the normal hearing group in condition 2

Figure 4. Psychometric functions for the hearing-impaired group in condition 1

Figure 5. Psychometric functions for the hearing-impaired group in condition 2

All subjects showed an ability to discriminate the stimuli presented to them to some extent.

For the normally hearing group the mean 75% correct threshold bandwidth ratios were 1.30 and 1.50 for conditions 1 and 2 respectively. The threshold ratio of 1.30 for condition 1 corresponds to being able to discriminate bandwidths of 500 Hz and 385 Hz, which is an absolute edge difference of 57.5 Hz on each edge. In percentage terms this is 12% on the upper edge and 54% on the low frequency edge. A ratio of 1.50 in condition 2 corresponds to the bandwidths of 124 Hz and 186 Hz being discriminable. This corresponds to an edge difference of 31 Hz on each edge, and equivalent percentage differences of 9% on the high frequency edge and 13% on the low frequency edge.
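The arithmetic behind these figures can be checked with a short sketch. It assumes that the percentage differences are expressed relative to the edge frequencies of the narrower noise of each pair, which is the convention that reproduces the values quoted above; the function and variable names are hypothetical.

```python
CENTRE_HZ = 300.0

def edge_differences(fixed_bw_hz, ratio, fixed_is_wide):
    """Edge offset (Hz) and its percentage of each edge of the narrower noise."""
    if fixed_is_wide:
        wide_bw, narrow_bw = fixed_bw_hz, fixed_bw_hz / ratio
    else:
        wide_bw, narrow_bw = fixed_bw_hz * ratio, fixed_bw_hz
    diff_hz = (wide_bw - narrow_bw) / 2.0          # same offset on each edge
    narrow_low = CENTRE_HZ - narrow_bw / 2.0
    narrow_high = CENTRE_HZ + narrow_bw / 2.0
    return {
        "variable_bw_hz": narrow_bw if fixed_is_wide else wide_bw,
        "edge_diff_hz": diff_hz,
        "upper_pct": 100.0 * diff_hz / narrow_high,
        "lower_pct": 100.0 * diff_hz / narrow_low,
    }

cond1 = edge_differences(500.0, 1.30, fixed_is_wide=True)    # normally hearing, condition 1
cond2 = edge_differences(124.0, 1.50, fixed_is_wide=False)   # normally hearing, condition 2
```

Without intermediate rounding the condition 1 edge difference comes out at about 57.7 Hz; the 57.5 Hz quoted in the text follows from first rounding the variable bandwidth to 385 Hz.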

For the hearing-impaired group the mean threshold bandwidth ratios for subjects UJ, RF and JH were 1.67 and 1.80 for conditions 1 and 2 respectively. Subjects MB and AV had much larger bandwidth ratios at threshold in condition 1 than subjects UJ, RF and JH. This could have been because they had difficulty using the high frequency edge of the stimuli because it fell in a region of poorer hearing, so they had to rely on the low frequency edge alone. If the results from all the hearing-impaired subjects were used to compute a mean threshold bandwidth ratio, then the result for condition 1 was 2.16 and that for condition 2 was 1.89. A threshold ratio of 1.67 corresponds to being able to discriminate noise with a bandwidth of 500 Hz from noise with a bandwidth of 299 Hz. This corresponds to an edge difference on both the upper and lower edges of 100.5 Hz, which when translated into a percentage difference (of the cut-off frequency of the narrower noise) is 22% on the upper edge and 67% on the lower edge. The 1.80 threshold ratio for condition 2 corresponds to discriminating bandwidths of 124 Hz and 223 Hz, a difference of 49.5 Hz on each edge, a percentage difference of 14% on the high frequency edge and 21% on the low frequency edge.

3. Task 2. Discrimination of spectra with single edge frequency cues
A second task was run to assess if the subjects were favouring any particular edge of the noise bands for discriminating the stimuli. For the hearing-impaired subjects in particular, where hearing ability is generally better at lower frequencies, it was thought possible that only the lower band edge frequencies were being used as a discrimination cue. Hence, stimuli were generated with either the upper or lower edge frequencies of the stimulus pair set to the same frequency, and the other edge frequencies differing.

3.1 Procedure

The bandwidth ratios corresponding to the 75 % point estimated in conditions 1 and 2 of task 1 were used as the point for assessing if subjects were favouring a particular edge of the stimuli for discrimination. The first stage of testing was to check the accuracy of the 75% point for both conditions 1 and 2 by checking the discrimination of the pair of noises corresponding to the 75% point.

For each of condition 1 and 2, there were two further conditions in which either the upper or lower edge frequencies of the variable stimulus were fixed at the same frequency as those of the reference noise band, i.e. for condition 1 it was 50 Hz for the lower edge and 550 Hz for the upper edge and in condition 2 it was 238 Hz for the lower edge and 362 Hz for the upper edge.

The other edge of the variable stimulus was the same frequency as the corresponding edge in the 75% estimate. A stylised representation of an example set of stimuli is shown in figure 6. The example is based on a 2:1 bandwidth ratio at the 75% point.
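The construction of the task 2 stimuli can be sketched as follows, using the same 2:1 example in condition 1. The function names and dictionary keys are hypothetical labels introduced for illustration.

```python
CENTRE_HZ = 300.0

def band_edges(bandwidth_hz, centre_hz=CENTRE_HZ):
    """Lower and upper cut-off frequencies of a band symmetric about the centre."""
    return (centre_hz - bandwidth_hz / 2.0, centre_hz + bandwidth_hz / 2.0)

def task2_stimuli(reference_bw_hz, variable_bw_hz):
    """Edges (low, high) of the reference and the three variable stimuli."""
    ref_low, ref_high = band_edges(reference_bw_hz)
    var_low, var_high = band_edges(variable_bw_hz)
    return {
        "reference": (ref_low, ref_high),
        "both_edges_differ": (var_low, var_high),
        "upper_edge_fixed": (var_low, ref_high),   # only the lower edges differ
        "lower_edge_fixed": (ref_low, var_high),   # only the upper edges differ
    }

# 2:1 ratio in condition 1: 500-Hz reference against a 250-Hz variable band.
stimuli = task2_stimuli(500.0, 250.0)
```

In this example the reference spans 50 to 550 Hz and the symmetric variable stimulus 175 to 425 Hz, so the upper-edge-fixed stimulus spans 175 to 550 Hz and the lower-edge-fixed stimulus 50 to 425 Hz. Note that the single-edge stimuli are no longer centred on 300 Hz.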

Each point was produced from a presentation of 200 three-interval trials, 100 run in one session and 100 run on a separate day. In all other respects the procedure was the same as that for task 1, as were the subjects.

Figure 6. Stylised diagram of a set of stimuli in task 2.

3.2 Results of task 2.

Hearing-impaired group

     | Condition 1 (wide band fixed)                          | Condition 2 (narrow band fixed)
Subj | Ratio | Re-test % | Upper fixed % | Lower fixed % | Ratio | Re-test % | Upper fixed % | Lower fixed %
JH   | 1.32  | 72.0      | 63.0          | 70.0          | 1.36  | 74.5      | 70.0          | 72.0
RF   | 1.82  | 83.0      | 70.0  72.5  68.0
AV   | 2.91  | 69.0      | 71.5          | 46.5          | 2.18  | 74.0      | 54.5          | 65.0

Normally hearing group

     | Condition 1 (wide band fixed)                          | Condition 2 (narrow band fixed)
Subj | Ratio | Re-test % | Upper fixed % | Lower fixed % | Ratio | Re-test % | Upper fixed % | Lower fixed %
JD   | 1.29  | 80.5      |               |               | 1.41  | 76.5      |               |
KD   | 1.16  | 71.0      | 70.5  63.5  67.5

Table 1. Thresholds estimated from Probit analysis. Ratio is the bandwidth ratio at the 75 % point from the Probit fit; the remaining columns are percent correct scores on re-test at that ratio, with the upper edge fixed, and with the lower edge fixed.

Table 1 shows the estimated 75 % correct bandwidth ratio thresholds interpolated from the Probit analysis. The empty diamonds in figures 2, 3, 4 and 5 show scores obtained when the predicted 75 % points were re-run. The upward-pointing arrows show the scores when the upper edges of the noises were fixed and the downward-pointing arrows show the scores when the lower edges were fixed.

Figure 7 shows the estimated 75 % points for all the subjects. The initials under each point refer to a subject. The open squares indicate the 75% point estimated from condition 1 and the open diamonds from condition 2.

The repeated 75% points ranged from 67 to 83.5%. This good repeatability indicates that the estimates from the Probit analysis were reasonably accurate.

Figure 7. Estimated 75% points for all subjects in conditions 1 and 2

One of the hearing-impaired subjects (MB) only managed to complete the psychometric function in condition 1 because of work pressures.

The results from task 2 using the wider bandwidth reference stimulus showed that all the normal-hearing subjects and the hearing-impaired subject JH performed similarly when only the high frequency edges of the stimuli were available for discrimination as when both edges were available. This indicates that they attended to the high frequency edge to discriminate the stimuli when both edges were available. For the hearing-impaired subjects whose high frequency hearing was not as good as JH's, the results are slightly different. Subjects RF and UJ appeared to use both edges for discrimination, because neither of the asymmetric conditions on its own gave a similar level of performance to that when both edges were available. Subject AV, who had the poorest hearing, relied totally on the low frequency edge of the stimuli.

The results from task 2 using the narrower bandwidth reference stimulus show that the normal-hearing subjects DV and KD and the hearing-impaired subject AV appeared to use both edges for discrimination in the symmetric condition. This is indicated by the fact that neither edge alone produced performance as good as when both edges were available. Probably the subjects were integrating cues from both band edge regions. Subject UJ appeared to use the low frequency edge and the normal-hearing subject AF and the hearing-impaired subjects JH and RF appeared to be able to do the task with either edge. It is hard to know what cues AF, JH and RF were using in the symmetric condition to discriminate the stimuli because they seem to be capable of performing the task with the same degree of success with either edge available to them but did not perform better when both edges were available.

4. Discussion
Speech is a complex and dynamic signal, continuously varying in frequency, amplitude and time. The production of one phoneme can vary dramatically with such factors as the preceding and following speech sounds and the rapidity of speech. Although this work appears to treat speech as a very static entity focusing on specific speech cues, the final aim is to provide an encoded signal that varies in a similar way to the original speech but with some of the features simplified to improve perception for those people who cannot cope with the complexity of the original speech. Such an encoding strategy would include not only bandwidth but also other cues, for example intensity and centre frequency. Bandwidth is expected to play an important role in improving naturalness and providing further cues to improve perception.

The results of the experiments carried out in this paper have been very encouraging. It was interesting to discover that all the subjects tested could, to some degree, perform the tasks. This suggests that even profoundly hearing-impaired subjects have hearing abilities in the low-frequency region which potentially could be used to identify encoded speech stimuli. These abilities are also very relevant to the use of other prosthetic strategies such as frequency transposition (Velmans, 1973; Braida et al., 1976; Rosenhouse, 1989).

In general the hearing-impaired group performed more poorly than the normally hearing group except for JH whose results fall within the normally hearing range. For both groups, thresholds were usually slightly larger when the narrower bandwidth was fixed; the impaired listener AV was the only exception. This may be because, when the narrow bandwidth was fixed, the non-overlapping frequency region of the stimuli was smaller for a given bandwidth ratio. For example for an equivalent ratio of 2:1, this would correspond to a noise of 500 Hz bandwidth being discriminated from a noise of 250 Hz bandwidth in condition 1, or a noise of 124 Hz bandwidth being discriminated from a noise of 248 Hz bandwidth in condition 2.
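The difference in non-overlapping region between the two conditions can be made concrete with a small calculation for the 2:1 example above. The function name is hypothetical.

```python
def non_overlap_hz(fixed_bw_hz, ratio, fixed_is_wide):
    """Total width (Hz) of the region covered by only the wider noise."""
    if fixed_is_wide:
        return fixed_bw_hz - fixed_bw_hz / ratio
    return fixed_bw_hz * ratio - fixed_bw_hz

cond1 = non_overlap_hz(500.0, 2.0, True)    # 500 Hz against 250 Hz
cond2 = non_overlap_hz(124.0, 2.0, False)   # 124 Hz against 248 Hz
```

For the same 2:1 ratio the non-overlapping region is 250 Hz wide in condition 1 (125 Hz at each edge) but only 124 Hz wide in condition 2 (62 Hz at each edge).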

Another interesting finding from this experiment is how similar the performance of the normal-hearing and hearing-impaired subjects was given the degree of hearing loss of some of the subjects. Threshold bandwidth ratios were 1.30 and 1.67 in condition 1 for the normal-hearing and hearing-impaired subjects respectively, and 1.50 and 1.80 in condition 2. It appears that bandwidth could be a very important cue for encoding fricatives because of its accessibility to hearing-impaired subjects.

Is discrimination based on the upper, lower or both edges?

Task 2, where we assessed performance when either the upper or lower frequency edge was fixed, should give us an insight into whether performance was based upon attending to one edge, both edges or either edge. It may be that there was quite a strong edge pitch cue available to the subjects. The noise bands used in this experiment are similar to those used by Fastl (1971). He found that both edges of a 600 Hz wide noise band in the low frequency region elicited a strong pitch sensation for normally hearing listeners. At higher centre frequencies the pitch sensation related more to the centre frequency of the band of noise and not the edges. Our stimuli were low in frequency so it is likely that edge pitch cues were available. We also used stimuli with fairly sharp cut-off slopes (48 dB/octave), which increase the pitch strength of the edge pitch cues (Fastl, 1980).

Outcome of task 2 using the wider bandwidth stimuli.

Where the fixed reference stimulus had the wider 500 Hz bandwidth, the results from task 2 appear to be fairly straightforward. If the subject had useable hearing around 500 Hz, then the high frequency edge was used for discrimination, probably because the low frequency edges would have fallen in a frequency region (50 - 100 Hz) where discrimination abilities are poorer (Usher, 1976; Harris, 1952) than at higher frequencies. The hearing-impaired subjects who did not have such good hearing at 500 Hz, could not make such effective use of the high frequency edges of the stimuli for discrimination. Subjects UJ and RF used both edges of the stimuli and probably integrated cues from both band edge regions, whereas subject AV relied totally on the low frequency edge which may explain why her ratio for the symmetric condition was typically much worse than for the other subjects.

Outcome of task 2 using the narrower bandwidth stimuli.

Where the fixed reference stimulus had the narrower 124 Hz bandwidth, the results from task 2 were not so easy to interpret. This is probably because there were more perceptual cues available to the subjects. The low frequency edge fell in a higher, more useable, frequency region, at approximately 180-240 Hz. Furthermore, the subjects who were not able to use the high frequency edge previously probably could in condition 2 because it was much lower, at approximately 360-420 Hz. Some of the subjects relied on both of the edges whereas other subjects focused their attention on one edge but could have performed just as well with the other edge. Subject UJ was the only subject relying only on the low frequency edge.

Related work

The results from this study are consistent with those of Pickett, Daly and Brand (1965). They measured just-noticeable differences (jnds) for the cut-off frequency of low pass noise at a variety of cut-off frequencies (250, 500, 1000, 1500 and 2000 Hz) in normal and severely cochlear damaged subjects. At 250 Hz they found that the jnd in normally hearing subjects was 12%, which represents a 30 Hz difference. At the higher frequencies their jnds ranged from 3 to 5%. Another interesting result from the Pickett, Daly and Brand study was that the difference between the two groups of subjects was not large. The jnd for the hearing-impaired subjects was 15% at 250 Hz, which corresponds to an absolute frequency difference of 37.5 Hz, and performance worsened at higher cut-off frequencies.

The normal-hearing subjects and subject JH appeared to be attending to the high frequency edge to perform the task in condition 1. If we assume that these subjects were not attending to the low frequency edge, we can compare our results to those of Pickett, Daly and Brand (1965). The mean 75% bandwidth ratio in condition 1 for the normal-hearing subjects and JH was 1.30, corresponding to a frequency difference of 57.5 Hz on each edge. If the jnd for the high frequency edge is calculated as a proportion of the high frequency edge of the narrower noise, then the percentage difference was 11.7%. This is similar to the 12% value calculated by Pickett, Daly and Brand. The cut-off frequency in our experiment was slightly higher (492.5 Hz), so it would perhaps be predicted from their results that the jnd in our experiment would be slightly less. Nevertheless, our results are certainly in a similar range to those of Pickett, Daly and Brand, indicating that normal-hearing subjects can only detect quite large spectral differences in noise stimuli.
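The arithmetic behind the 11.7% figure can be reproduced as follows. This is a minimal sketch, assuming the bands are symmetric on a linear Hz scale about 300 Hz and that the wider band is fixed at 500 Hz; the function name edge_jnd is hypothetical. Without intermediate rounding the per-edge difference and the upper edge of the narrower band come out at about 57.7 Hz and 492.3 Hz, slightly different from the 57.5 Hz and 492.5 Hz quoted above, presumably because the narrower bandwidth was rounded there; the percentage agrees to one decimal place.

```python
CENTRE_HZ = 300.0  # centre frequency of the noise bands


def edge_jnd(fixed_wide_bw_hz, bandwidth_ratio, centre_hz=CENTRE_HZ):
    """Express a threshold bandwidth ratio as a jnd on the high frequency edge.

    Returns (per_edge_difference_hz, narrow_high_edge_hz, jnd_percent).
    Assumption: both bands are symmetric in Hz about the centre frequency.
    """
    narrow_bw = fixed_wide_bw_hz / bandwidth_ratio
    # The bandwidth difference is shared equally between the two edges.
    per_edge_diff = (fixed_wide_bw_hz - narrow_bw) / 2.0
    narrow_high_edge = centre_hz + narrow_bw / 2.0
    jnd_percent = 100.0 * per_edge_diff / narrow_high_edge
    return per_edge_diff, narrow_high_edge, jnd_percent


diff_hz, edge_hz, jnd_pct = edge_jnd(500.0, 1.30)
print(round(diff_hz, 1), round(edge_hz, 1), round(jnd_pct, 1))
```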

It is also interesting to note that the normal-hearing and hearing-impaired subjects in the Pickett, Daly and Brand study performed very similarly to each other, which is also what we found.


These results show that profoundly hearing-impaired subjects were able to detect changes in the bandwidth of noise stimuli under conditions where the loudness of the bands did not provide useable discrimination cues. This ability could potentially be exploited in the encoding of phonetically relevant acoustic cues in aperiodic speech sounds. It remains to be determined whether bandwidth could be used as a cue for absolute identification as opposed to discrimination.

Further work

Further psychophysical investigations into the hearing of the subjects used here are required to determine the mechanisms they used to perform the task and to discover why the difference between the two groups is so small.

Identification experiments will be conducted to determine the usefulness of bandwidth as a cue for speech perception in a speech coding scheme in which acoustic cues in high frequency aperiodic speech are represented as low frequency noise bands.

The authors would like to thank John Deeks, Brian Moore, Marina Rose and Stuart Rosen for helpful comments on earlier drafts of this paper. This work was supported by the Medical Research Council and TIDE project TP1217 OSCAR.

Braida, L.D., Durlach, N.I., Hicks, B.L., Reed, C.M. and Lippman, R.P. (1976) Matching speech to the residual auditory function III - Review of previous research on frequency lowering. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts.

Fastl, H. (1971) Über Tonhöhenempfindungen bei Rauschen (Pitch of noise). Acustica, 25, 350-354.

Fastl, H. (1980) Pitch strength and masking patterns of low-pass noise. In Psychophysical, physiological and behavioural studies in hearing. Ed. G. van den Brink and F.A. Bilsen. 334-340.

Faulkner, A., and Rosen, S. (1990). "Spectral and temporal resolution in the profoundly deaf," Br. J. Audiol., 24, 193-194.

Faulkner, A., Rosen, S., and Moore, B. C. J. (1990) "Residual frequency selectivity in the profoundly hearing-impaired listener," Br. J. Audiol. 24, 381-392. Also in Speech, Hearing and Language, work in progress, University College London, Department of Phonetics and Linguistics, 4, 82-97.

Faulkner, A., Ball, V., Rosen, S. R., Moore, B. C. J., and Fourcin A. J. (1992) "Speech pattern hearing aids for the profoundly hearing-impaired: Speech perception and auditory abilities". Journal of the Acoustical Society of America, 91, 2136-2155.

Finney, D.J. (1971) Probit Analysis. Cambridge: Cambridge University Press.

Fourcin, A. J. (1977) "English speech patterns with special reference to artificial auditory stimulation" In: A review of artificial auditory stimulation: Medical Research Council Working Group Report, edited by A. R. D. Thornton. (Institute of Sound and Vibration Research, University of Southampton), pp 42-44.

Harris, J.D. (1952) Pitch discrimination. Journal of the Acoustical Society of America, 24, 750-755.

Harris, K.S. (1958) Cues for the discrimination of American English fricatives in spoken syllables. Lang. Speech. 1, 1-7.

Heinz, J.M. and Stevens, K.N. (1961) On the properties of voiceless fricative consonants. Journal of the Acoustical Society of America, 33, 589-596.

Howell, P. and Rosen, S. (1983) Closure and frication measurements and perceptual integration of temporal cues for the voiceless affricate / fricative contrast. Speech, Hearing and Language: Work in Progress, Department of Phonetics and Linguistics, University College London, 1, 109-117.

Howell, P. and Rosen, S. (1986) Closure and frication measurements and perceptual integration of temporal cues for the voiceless affricate fricative contrast. Speech, Hearing and Language: Work in Progress, Department of Phonetics and Linguistics, University College London, 1, 108-117.

Pickett, J. M., Daly, R.L. and Brand, S.L. (1965) Discrimination of spectral cutoff frequency in residual hearing and in normal hearing. Journal of the Acoustical Society of America, 38, 923 (A).

Rosen, S., Faulkner, A. and Smith, D. A. J. (1990) The psychoacoustics of profound hearing impairment. Acta Oto-laryngologica (Stockh), Suppl. 469, 16-22.

Rosen, S., and Fourcin, A. (1983) When less is more - Further work. Speech Hearing and Language: Work in Progress. Dept. of Phonetics and Linguistics, University College London. 1, 1-27.

Rosen, S., Walliker, J.R., Fourcin, A.J. & Ball, V. (1987) A micro-processor based acoustic hearing aid for the profoundly impaired listener. Journal of Rehabilitation Research and Development, 24: pp 239-260.

Rosenhouse, J. (1989) The Frequency Transposing Hearing Aid: A new prototype. The Hearing Journal. Feb. 14-15.

Usher, N. (1976) Pitch discrimination at low frequencies. M.Sc. Thesis. Chelsea College, University of London.

Velmans, M. (1973) Aids for deaf persons. U.K. Patent 1340105.

Vickers, D. A. and Faulkner, A. (1993) "Results from noise discrimination tasks with the hearing impaired" Speech, Hearing and Language, Work in Progress, Department of Phonetics and Linguistics, University College London, 7, 257-265.

Vickers, D.A. and Faulkner, A. (1996) Noise spectrum discrimination by severe-to-profoundly hearing-impaired listeners. In Psychoacoustics, Speech and Hearing Aids. Ed. B. Kollmeier. World Scientific.

Wei, J. (1993) A speech pattern processing method for Chinese listeners with profound hearing loss. Ph.D. Thesis, Department of Phonetics and Linguistics, University College London.

© Deborah A. Vickers and Andrew Faulkner

