Speech Processing by Computer

LAB 4

SPECTRAL ANALYSIS AND MODELLING

This lab session demonstrates spectral analysis applied to vowel sounds. Spectral analyses with different degrees of frequency resolution are compared. All-pole models of vowel spectra are calculated using linear prediction. Finally, the linear prediction filter is inverted to provide an estimate of the glottal input to the vocal tract.

1. Frequency resolution of spectral analysis

(i) Acquire a steady vowel at 20,000 samples/second.

(ii) Using the 'Esection' program, display and print spectral cross-sections and LPC modelled spectral cross-sections corresponding to time windows of 1 ms, 3 ms, 10 ms, 30 ms and 100 ms. You may need to adjust the analysis window size for the longer windows.

(iii) Estimate the accuracy to which a frequency component may be identified in each case and for each type of analysis.

(iv) Explain why different resolutions may be required to determine fundamental frequency as opposed to formant frequency.

(v) Contrast the LPC modelled spectra with DFT spectra. What are the useful characteristics of the modelled spectra?
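
As a rough guide for step (iii), the frequency resolution of a DFT cross-section is approximately the reciprocal of the window duration. The sketch below tabulates this for the five window lengths at the sampling rate used in step (i). It assumes a rectangular window; a tapered window such as Hamming roughly doubles the main-lobe width. The window actually applied by 'Esection' is not stated here, and the function name window_resolution is hypothetical.

```python
FS = 20000  # samples/second, as in step (i)


def window_resolution(duration_ms, fs=FS):
    """Return (window length in samples, approximate resolution in Hz).

    Assumes a rectangular window, for which the spacing between
    independent frequency estimates is about fs / n = 1 / duration.
    """
    n = int(round(duration_ms * 1e-3 * fs))
    return n, fs / n


# Tabulate the five window lengths listed in step (ii).
for ms in (1, 3, 10, 30, 100):
    n, df = window_resolution(ms)
    print(f"{ms:>4} ms  {n:>5} samples  ~{df:7.1f} Hz")
```

Comparing these figures with a typical fundamental frequency and with typical formant bandwidths may help with step (iv).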

2. Inverse-filtered waveforms

(i) Acquire a monophthongal vowel on a falling pitch at 10,000 samples/sec.

(ii) Use the 'Eswin' program to copy about 0.1 seconds of the vowel region to a new file.

(iii) Use the 'invfilt' program to (a) calculate a vocal tract filter from this signal, (b) invert this filter, (c) pass the original signal through the inverted filter, and (d) save the glottal pressure wave and two successive integrations of the glottal pressure wave.

(iv) Display the waveform and a spectrogram of both the original and the inverse filtered signals. Explain what you see. Replay the original and inverse filtered signals. Can the quality of the vowel still be identified?

(v) Repeat steps (i) to (iv) with a different vowel.
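
The processing in step (iii) can be sketched in Python as follows. This is a minimal illustration, not the algorithm used by 'invfilt', whose internals are not described here. It assumes autocorrelation-method linear prediction with a Hamming window, a default order of 12 (a common rule of thumb for 10,000 samples/second), and a leaky integrator for the two integrations. The function names and the leak value are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter


def lpc_coefficients(x, order):
    """Autocorrelation-method LPC. Returns A(z) = [1, a1, ..., ap],
    so that the all-pole vocal tract model is 1 / A(z)."""
    x = np.asarray(x, dtype=float)
    xw = x * np.hamming(len(x))
    # Autocorrelation at lags 0..order (zero lag sits at index len(x) - 1).
    r = np.correlate(xw, xw, mode="full")[len(x) - 1 : len(x) + order]
    # Solve the Toeplitz normal equations R a = -r[1:].
    a = solve_toeplitz(r[:order], -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def inverse_filter(x, order=12, leak=0.99):
    """Return (A, residual, first integration, second integration)."""
    x = np.asarray(x, dtype=float)
    a = lpc_coefficients(x, order)
    # The inverse of the all-pole filter 1 / A(z) is the FIR filter A(z).
    residual = lfilter(a, [1.0], x)
    # Leaky integrator 1 / (1 - leak * z^-1), applied once and then again.
    once = lfilter([1.0], [1.0, -leak], residual)
    twice = lfilter([1.0], [1.0, -leak], once)
    return a, residual, once, twice
```

Here x would be the 0.1 second vowel segment from step (ii), loaded as a one-dimensional array of samples.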

3. Inverse-filtered spectra

(i) Use the 'Esection' program to acquire some different vowels at 10,000 samples/sec.

(ii) Use the 'view source spectrum' option to study the inverse filtered spectrum for the different vowels.

(iii) How well is the inverse filtering able to remove the effects of the filter?
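
One possible way to put a number on step (iii), offered only as a suggestion, is a spectral flatness measure: the ratio of the geometric mean to the arithmetic mean of the power spectrum. Values near 1 indicate a flat spectrum, and strongly peaked (formant-like) spectra give values near 0. This measure is not part of 'Esection', and the function name is hypothetical. A real glottal source also has harmonics and spectral tilt, so even a well inverse-filtered signal will not reach 1.

```python
import numpy as np


def spectral_flatness(x, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum of x.

    A Hann window is applied before the FFT; eps avoids log(0).
    The result lies between 0 and 1.
    """
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2 + eps
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))
```

Comparing the flatness of the original vowel with that of its inverse filtered version gives a rough indication of how much of the formant structure has been removed.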