Speech Perception Studies

The UCLID speech processor (Walliker and Daley, 1997) is a highly flexible experimental tool capable of running a variety of speech processing algorithms.

A programme of research is currently in progress to examine issues in cochlear implant speech processing. At present the work is based on patients implanted with the Ineraid six-electrode intra-cochlear array, which allows a direct electrical connection to the speech processor.

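The three studies presented here compare the UCLID CIS and Ineraid CA processors, examine the effect of the timing of pulses between electrodes, and examine the effect of CIS cycle rate.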

Study 1: Comparison of UCLID CIS and Ineraid CA processors

Subjects

Subjects were three adult users of the Ineraid cochlear implant. In each case only five of the six Ineraid electrodes were usable as the most basal electrode gave rise to non-auditory sensations. The subjects have made use of the UCLID CIS processor only while attending for testing, whereas they have each been using the Ineraid processor daily for several years.

Speech processing

The Ineraid processor band-pass filters the acoustic input after initial compression using filters centred on 500, 1000, 2000 and 4000 Hz. The four filter outputs are delivered to the four most apical electrodes after a gain adjustment to avoid exceeding comfortable stimulation levels.

The CIS processing method extracts the amplitude envelope from each of a set of band-pass filters, and uses a compressed form of the extracted envelope to amplitude modulate a fixed-rate bi-phasic pulse train delivered to each of the electrodes. The pulses are presented non-simultaneously along the electrode array.
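The envelope and compression stages described above can be sketched for a single channel as follows. This is a minimal illustration, not the UCLID implementation: the band-pass filter stage is omitted, and the one-pole smoother, the logarithmic compression map, the threshold and comfort levels, and all function names are assumptions made for the example.

```python
import math

def extract_envelope(band_signal, sample_rate_hz, cutoff_hz):
    """Full-wave rectify, then smooth with a one-pole low-pass filter (assumed design)."""
    # Smoothing coefficient of a one-pole filter for the requested cut-off.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    envelope, state = [], 0.0
    for sample in band_signal:
        state += alpha * (abs(sample) - state)
        envelope.append(state)
    return envelope

def compress(value, threshold, comfort, input_floor=1e-3, input_ceiling=1.0):
    """Map an envelope value logarithmically onto [threshold, comfort] (assumed map)."""
    clipped = min(max(value, input_floor), input_ceiling)
    fraction = math.log(clipped / input_floor) / math.log(input_ceiling / input_floor)
    return threshold + fraction * (comfort - threshold)

def sample_pulse_amplitudes(envelope, sample_rate_hz, pulse_rate_hz, threshold, comfort):
    """Take the compressed envelope value at each pulse onset of a fixed-rate train."""
    n_pulses = int(len(envelope) * pulse_rate_hz / sample_rate_hz)
    amplitudes = []
    for k in range(n_pulses):
        index = int(k * sample_rate_hz / pulse_rate_hz)
        amplitudes.append(compress(envelope[index], threshold, comfort))
    return amplitudes

# Illustrative input: a 1 kHz tone standing in for one band-pass filter output.
sample_rate_hz = 16000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate_hz) for n in range(1600)]
envelope = extract_envelope(tone, sample_rate_hz, cutoff_hz=500)
# Threshold and comfort levels are hypothetical, in arbitrary current units.
amplitudes = sample_pulse_amplitudes(envelope, sample_rate_hz, 1667,
                                     threshold=100.0, comfort=400.0)
```

In a full processor this would be repeated for each band, with the pulse trains offset in time so that no two electrodes are stimulated simultaneously.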

We used a five-band processor, whose first four analysis filters were closely matched to the filters used in the Ineraid processor. The fifth filter was centred on 6 kHz. The envelope smoothing filter used a 500 Hz cut-off frequency. In two of the subjects, the rate of stimulation per electrode was 1667 Hz, and the pulses were typically 35 µs per phase. For subject RC, pulses of 65 µs per phase were used to increase dynamic range, with a stimulation rate per electrode of 1235 Hz. The time interval between the two phases of the pulse to one electrode, and the interval between the offset of the negative phase to one electrode and the onset of the positive phase to the next electrode, were each 10 or 15 µs.

Speech Assessments

An 18-item intervocalic consonant test and the BKB sentence materials have been used in speech assessment. Both tests were used in audio-visual and sound-only forms, and both were run in quiet and with speech-spectrum shaped noise added. Signal-to-noise ratios are based on the rms levels of speech and noise.
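As an illustration of a signal-to-noise ratio defined on rms levels, the sketch below scales a noise signal so that the speech-to-noise rms ratio matches a target value in dB. The function names and the toy signals are assumptions for the example; they are not taken from the test software used in the study.

```python
import math

def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that 20*log10(rms(speech)/rms(noise)) equals snr_db, then add."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise

# Toy signals standing in for speech and speech-spectrum shaped noise.
speech = [math.sin(0.05 * n) for n in range(2000)]
noise = [math.sin(0.31 * n) * 0.2 for n in range(2000)]
mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=10.0)
achieved_snr_db = 20.0 * math.log10(rms(speech) / rms(scaled_noise))
```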

Statistical tests are based on a general linear model analysis of variance (using SPSS) in which the subject factor is treated as a fixed rather than a random effect. This is considered to be reasonable since the subject group is small and individual subjects differ markedly in their hearing abilities using a cochlear implant. Hence, our subjects cannot reasonably be regarded as representative of a larger sample of implant users.


Results

An example of the consonant identification results is displayed above. The data are displayed as box and whisker plots, in which the bar within each box represents the median score. The number of test lists contributing to the data is shown on the X axis. Where there are more than two scores per condition, the box represents the interquartile range, and the whiskers represent the extreme values.
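The plotting convention described above can be sketched as a small summary function. The scores below are made-up numbers for illustration only, not data from the study, and the quartile convention (Python's "inclusive" method) is an assumption; the plotting software used for the figures may compute quartiles differently.

```python
import statistics

def box_summary(scores):
    """Median and extremes always; quartiles only when there are more than two scores."""
    ordered = sorted(scores)
    summary = {
        "n": len(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],   # lower whisker (extreme value)
        "max": ordered[-1],  # upper whisker (extreme value)
    }
    if len(ordered) > 2:
        q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
        summary["q1"], summary["q3"] = q1, q3  # edges of the box
    return summary

# Hypothetical percent-correct scores from five test lists in one condition.
example = box_summary([50, 60, 70, 80, 90])
# Only two lists: median and extremes, no interquartile box.
two_lists = box_summary([55, 65])
```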

There was a significant overall effect of processor, with average scores around 5% higher with the CIS processor.

Significant effects of subject, of audio vs. audio-visual presentation, and a significant interaction between subject and audio/audio-visual presentation were also found.

Processor has an overall effect on performance and also interacts with subject (GA shows some lower sound-alone scores with the CIS processor).

Signal-to-noise ratio and audio/audio-visual presentation both show strong, significant effects. There is also an interaction between audio/audio-visual presentation and the effect of the processor.

Conclusions

The CIS processor gave significantly higher scores overall for both consonant and sentence materials. For sentences, there was rather more variation in the effect of the processor across subjects.

Study 2:

High-rate CIS methods are now being proposed that require pulses to be delivered with intervals of 50 µs or less between electrodes. While electrical interactions are not expected with non-simultaneous stimulation, neural interactions may occur. The effect of the timing of stimulation between electrodes on such interactions is unknown, but it seems a priori likely that close temporal proximity of stimulation will increase the degree of neural interaction, especially where a degree of current spread occurs between adjacent electrodes. This might be expected to reduce the sharpness of effective frequency selectivity, with possible deleterious effects on speech perception, especially in noise.

We have investigated the effects of the temporal proximity of pulse stimulation across electrodes using a CIS processor with a fixed update rate per electrode of 500 Hz and a 200 Hz envelope smoothing filter. The timing of bi-phasic pulses presented sequentially along the electrode array was varied so that the inter-pulse interval (IPI), that is, the time between the offset of the negative pulse phase on one electrode and the onset of the positive pulse phase on its neighbour, was either 10 or 260 µs. For the 10 µs IPI, there was a long interval between stimulation of electrode 5 and electrode 1.
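The two timing conditions can be illustrated with a short calculation of pulse onset times within one CIS cycle. The 500 Hz cycle rate, the five electrodes, and the 10 and 260 µs intervals come from the text; the pulse shape (65 µs per phase with a 10 µs gap between phases) is an assumption chosen for illustration, and the function name is hypothetical.

```python
def pulse_schedule_us(n_electrodes, cycle_rate_hz, phase_us,
                      interphase_gap_us, inter_electrode_gap_us):
    """Return pulse onset times (µs) in one cycle and the gap before the cycle repeats."""
    cycle_us = 1e6 / cycle_rate_hz
    # Duration of one bi-phasic pulse: two phases plus the gap between them.
    pulse_us = 2 * phase_us + interphase_gap_us
    # Onset-to-onset spacing between neighbouring electrodes.
    slot_us = pulse_us + inter_electrode_gap_us
    onsets = [i * slot_us for i in range(n_electrodes)]
    # Time from the end of the last electrode's pulse to the start of the next cycle.
    wrap_gap_us = cycle_us - (onsets[-1] + pulse_us)
    if wrap_gap_us < 0:
        raise ValueError("pulses do not fit within one cycle")
    return onsets, wrap_gap_us

# Assumed pulse shape: 65 µs per phase, 10 µs between phases.
close_onsets, close_wrap = pulse_schedule_us(5, 500, 65, 10, 10)
wide_onsets, wide_wrap = pulse_schedule_us(5, 500, 65, 10, 260)
```

Under these assumed pulse parameters, the 10 µs condition packs all five pulses into the first 740 µs of the 2000 µs cycle, leaving 1260 µs before electrode 1 is stimulated again, whereas the 260 µs condition spaces the pulses evenly across the cycle.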

Inter-electrode CIS pulse timing conditions for Study 2.

Results of Study 2

An example of the consonant identification scores is shown in the figure above.

Counter to the initial expectations, a longer time interval between pulses delivered to adjacent electrodes does not increase speech performance. Rather, the reverse effect is generally found.

An ANOVA showed significant main effects of inter-pulse interval, audio/audio-visual presentation, signal-to-noise ratio and subject.

There were no significant interactions, although the three-way interaction of subject, inter-pulse interval and signal-to-noise ratio was close to significance.

This result suggests that the timing of pulses between electrodes has a significant effect on speech performance. Presumably the effect of pulses to adjacent electrodes being closer in time is to increase the neural interaction between electro-cochlear stimulation channels. This would seem likely to impair spectral resolution, but further investigation is needed to understand these effects.

Higher rate CIS inevitably reduces the scope for allowing longer time intervals between pulses to different electrodes, and this study appears to demonstrate that this does not have deleterious effects on speech perception.

Study 3:

The third study examines the effect of CIS cycle rate with both pulse duration and inter-pulse interval (between electrodes) fixed.

Having established an effect of the time interval between pulses to adjacent electrodes, we wanted to see how much this effect contributed to the overall effect of increasing CIS rate.

Method

CIS cycle rates of 500 and 1000 Hz were compared using the same pulse widths and pulse spacings as in the "closely-spaced" pulses condition of Study 2. The cut-off frequency of the envelope extraction filter was fixed at 250 Hz. Stimulation was, therefore, identical except in respect of the CIS cycle rate.

Only one of our Ineraid subjects (RC) has so far taken part. Testing has so far been with sound-alone consonant identification only. Preliminary results are based on two VCV lists with each CIS cycle rate at three signal-to-noise ratios. Data from Study 2 are also included in the figure from both the 10 µs and 260 µs inter-pulse interval conditions.

Results

The results are shown in the figure below. An ANOVA showed that both CIS cycle rate and signal-to-noise ratio have significant effects on consonant identification. The interaction of rate and signal-to-noise ratio was not significant.

Conclusions

At least in this one subject, an increase of the CIS cycle rate from 500 to 1000 Hz appears to lead to increased speech performance, especially in noise. Other studies of CIS rate have confounded the cut-off frequency of the envelope extraction filter with CIS rate. Here, however, only the CIS rate has been varied. We do not yet have sufficient data to compare the extent of the effect of CIS rate here with the effect of inter-pulse interval.


References

Walliker, J. R. and Daley, J. (1997), BSA Short Papers Meeting on Experimental Studies of Hearing and Deafness

Wilson, B., Finley, C., Lawson, D., Wolford, R., Eddington, D., and Rabinowitz, W. (1991), Nature, 352, 236-238
