Wells, A study of the formants of the pure vowels of British English

The tape-recorder used had been found to have a flat frequency response over the range of interest for formant measurements.

From time to time during the analysis of the material, frequency calibrations were carried out on the spectrograph by means of a 1000 cps square wave source. It was found that the frequency scale was extremely constant, and linear at least over the range 0-3000 cps. A factor of 5.138 was fixed and used to convert frequency measurements (in hundredths of a centimetre on the spectrograph marking paper) into cycles per second. This gave a minimum of 986 cps and a maximum of 1012 cps for the 1 kc/s calibration tone, on 36 different measurements of it. This represents a maximum intrinsic spectrograph variation of less than ±1½ percent, that is less than three-tenths of a millimetre above or below the mean on the marking paper.
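As a rough check on the arithmetic above, the following sketch converts the extreme calibration values into percentage deviations and into distances on the marking paper. The factor 5.138 and the 986 and 1012 cps extremes come from the text. Taking the nominal 1000 cps as the reference point (the observed mean of the 36 measurements is not given) is an assumption, and the function names are purely illustrative.

```python
FACTOR = 5.138        # cps per hundredth of a centimetre (from the text)
NOMINAL_CPS = 1000.0  # nominal calibration tone; assumed reference point


def to_cps(reading):
    """Convert a reading in hundredths of a centimetre to cps."""
    return reading * FACTOR


def cps_to_mm(cps):
    """Convert a frequency difference in cps to millimetres on the paper."""
    return cps / FACTOR / 10.0  # hundredths of a centimetre -> millimetres


def percent_deviation(cps):
    """Deviation from the nominal tone, as a percentage."""
    return (cps - NOMINAL_CPS) / NOMINAL_CPS * 100.0


low, high = 986.0, 1012.0  # extremes reported in the text
print(percent_deviation(low), percent_deviation(high))
print(cps_to_mm(NOMINAL_CPS - low), cps_to_mm(high - NOMINAL_CPS))
```

On these assumptions both deviations come out below 1½ percent and below 0.3 mm, in line with the figures quoted.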

It was found simpler to perform the necessary statistical calculations on the frequency readings *before* conversion into cycles per second.

For each formant of each vowel, the sum of the frequency values was obtained and divided by the number of readings (fifty) to give the MEAN. The mean formant frequencies for the three formants of the various vowels studied are listed in Table 2 and shown graphically in fig. 3. It should be noted that this mean is the arithmetic mean, not the geometric mean, of the observed values, although of course the ear perceives frequencies as spaced not linearly but logarithmically.
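The distinction between the two means can be illustrated with a minimal sketch. The two frequency values used are hypothetical and chosen only to make the difference obvious; they are not measurements from this study.

```python
import math


def arithmetic_mean(values):
    """Ordinary mean on a linear frequency scale."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Mean taken on a logarithmic scale, then converted back."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


hypothetical_cps = [200.0, 800.0]  # illustrative values only
print(arithmetic_mean(hypothetical_cps))
print(geometric_mean(hypothetical_cps))
```

For widely spaced values the geometric mean lies noticeably below the arithmetic mean; for the comparatively narrow scatter of a single formant the two would be much closer.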

As well as the sum of the frequency values for each formant of each vowel, the sum of the squares was calculated, and the STANDARD DEVIATION derived from it. This is a useful measure of the scatter or dispersion of the observed values with respect to the mean, and is equivalent to the root-mean-square deviation. Stating a standard deviation (*s*) gives an idea of how densely values are scattered around a mean (*x̄*): slightly over two-thirds of all values are within one standard deviation of the mean, 95 percent within two standard deviations, and over 99 percent within three standard deviations -- always provided that the distribution is reasonably similar to that of the normal curve and not too skew. The standard deviation for each formant of each vowel is tabulated in Table 2; in fig. 4 the length of the lines extending to each side of the mean for each vowel's F_{1} and F_{2} corresponds to one standard deviation in each direction, that is ( x̄ ± s ).
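The calculation described above, working from the sum and the sum of squares, might be sketched as follows. The divisor n is used because the text equates the standard deviation with the root-mean-square deviation; whether the original calculation used n or n - 1 is not stated, so this is an assumption. The readings are hypothetical, and the conversion factor 5.138 is the one given earlier in the text.

```python
import math

FACTOR = 5.138  # cps per hundredth of a centimetre (from the text)


def mean_and_sd(readings):
    """Mean and root-mean-square deviation from the sum and sum of squares."""
    n = len(readings)
    total = sum(readings)
    total_sq = sum(x * x for x in readings)
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, math.sqrt(max(variance, 0.0))  # guard against rounding below zero


# Hypothetical readings in hundredths of a centimetre (not data from the study).
readings = [50.0, 52.0, 54.0, 56.0, 58.0]
mean, sd = mean_and_sd(readings)

# Conversion is a simple scaling, so it can be applied after the statistics.
print(mean * FACTOR, sd * FACTOR)
```

Because the conversion to cycles per second is a multiplication by a constant, computing the mean and standard deviation first and converting afterwards gives the same result as converting every reading individually.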

In fig. 4 the observed mean formant frequencies are plotted on a logarithmically scaled F_{1}/F_{2} graph. The striking similarity between this way of plotting formants and the familiar vowel quadrilateral based on articulatory-auditory criteria is well known [25, p. 54]. Bearing in mind the known phonetic tamber of the vowels studied (see above, The phonetic nature of the vowels studied), one may make certain observations about the relationship between the positions of the vowels on the formant chart and their positions on an ordinary vowel quadrilateral. The main discrepancy appears to be in the ʌ-ɑ-ɒ-ɔ area, where one is surprised to find that on the formant chart /ʌ/ comes out "opener" (in articulatory terms) than /ɑ/, while /ɔ/ is "backer" than /ɒ/.

Now the open-back area is notorious as the part of the vowel quadrilateral where it is most difficult to place sounds by ear. Its extent on a formant chart is limited by the approach of the first and second formants to one another: if they are less than one harmonic apart, they will not be separately distinguishable, and it will be impossible to place the sound on a formant chart. One can therefore readily understand that it is in this area that discrepancies are likely to arise.

The unexpectedly extreme "backness" of /ɔ/ no doubt results from its unusual phonetic nature (see above). The rather wide jaw-opening combined with fairly high tongue raising and close lip-rounding gives it a distinctive "over-back" quality, which is evidently reflected in the comparative proximity of the two lowest formants to one another.

It will be seen from fig. 3 that as one passes from /i/ through the other front vowels to /æ/ the first formant rises while the second and third formants fall. From /æ/ to /ɑ/ the second formant falls sharply, and remains close to F_{1} as one passes from /ɑ/ through /ɒ/ to /ɔ/, while the lower two formants continue to fall and the third formant rises slightly. As one continues to /ɷ/ and /u/ the first formant continues to fall, but the second formant is rather higher and F_{3} starts falling again. Finally, /ʌ/ and /ɜ/ are central vowels with distinctive mid values for F_{2}, F_{1} values comparable to /æ/ and /ɛ/ respectively, and F_{3} values similar to those of /ɑ/ and /ɷ/.

From the standard deviations illustrated in fig. 4 it will be evident that there is a certain amount of overlapping between the various vowels. It will be convenient to distinguish *major* overlapping, where the mean frequency for the formant of one vowel lies within the ( x̄ ± s ) limits of the corresponding formant of another vowel, and *minor* overlapping, where the ( x̄ ± s ) ranges for the corresponding formants of two vowels overlap slightly, but not to the extent that the mean for either vowel lies within the ( x̄ ± s ) range of the other vowel. There is then some degree of overlap in both F_{1} and F_{2} in the case of the following pairs: /i-ɩ/, /ɑ-ʌ/, /ɑ-ɒ/, /ɷ-u/, /ʌ-ɜ/. The following table shows the degree of overlap for each formant, and also the difference in decibels in the amplitudes of the formant concerned (vide infra, ch. IV).

| Pair | F_{1} | dB | F_{2} | dB | F_{3} | dB |
|---|---|---|---|---|---|---|
| /i - ɩ/ | minor | 7 | minor | 14 | none | 11 |
| /ɑ - ʌ/ | major | 2 | minor | 3 | major | 3 |
| /ɑ - ɒ/ | major | 4 | minor | 4 | major | 0 |
| /ɷ - u/ | minor | 2 | major | 12 | major | 5 |
| /ʌ - ɜ/ | minor | 0 | minor | 8 | major | 1 |

It will be seen that the greatest overlaps are in the cases of /ɑ-ʌ/ and /ɑ-ɒ/. But in every pair in the above table there is also a difference in linguistic vowel length: this explains why we are normally able to keep them apart in spite of their similar tambers. There are no cases of F_{1} and F_{2} overlap between pairs of vowels where both are short or both are long.
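The distinction between major and minor overlapping defined above can be expressed as a short classification routine. This is a minimal sketch: the function name is hypothetical, the treatment of values falling exactly on a limit as "within" is an assumption, and the example figures are invented for illustration rather than taken from Table 2.

```python
def classify_overlap(mean_a, sd_a, mean_b, sd_b):
    """Classify the overlap of two (mean +/- one s.d.) ranges."""
    lo_a, hi_a = mean_a - sd_a, mean_a + sd_a
    lo_b, hi_b = mean_b - sd_b, mean_b + sd_b
    # Major: the mean of either vowel lies inside the other's range.
    if lo_b <= mean_a <= hi_b or lo_a <= mean_b <= hi_a:
        return "major"
    # Minor: the ranges intersect, but neither mean lies inside the other range.
    if lo_a <= hi_b and lo_b <= hi_a:
        return "minor"
    return "none"


# Hypothetical means and standard deviations in cps.
print(classify_overlap(300.0, 30.0, 320.0, 40.0))  # mean 300 lies in 280-360
print(classify_overlap(300.0, 30.0, 350.0, 30.0))  # ranges 270-330 and 320-380
print(classify_overlap(300.0, 20.0, 400.0, 20.0))  # ranges do not meet
```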

In the case of five out of the 25 subjects, there had been some doubt whether their speech was completely RP, that is absolutely free from regional influence. The speech of three of them seemed perhaps to have slight Northern or Midlands influence, and that of the other two to have slight London or Home Counties influence. The twenty speakers who definitely spoke RP will be referred to as Group A, and the possibly divergent group of five as Group B. It was desired to ascertain whether the formant frequencies observed for Group B were such as to suggest that their divergence from the Group A mean might be assigned to chance, or whether, on the other hand, the speakers of Group B differed significantly in this respect from the RP population of which Group A was a sample.

Statistical tests were carried out on the four formant frequencies where there was a divergence of 10 percent or more between the two group means, namely the F_{1}'s of /i ɩ ɔ ɷ/. In each case, the means and standard deviations for the two groups were established, and a Student's *t* test (small sample method) carried out [32, p.232]. These tests showed that the divergence in the F_{1} of /i/ was not significant at the 5 percent level, while the divergence in the F_{1}'s of /ɩ ɔ ɷ/ was significant at the 5 percent level but not at the one percent level. It was concluded that Group B speakers, as a group, have an F_{1} frequency for /ɩ ɔ ɷ/ which is probably significantly lower than the mean for Group A; but that for the remaining thirty formants there is no significant difference in mean frequency between the two groups. Group B was therefore accepted as speaking RP (at least as far as vowel formant frequencies are concerned), and, except in Table 3, all values for the two groups have been pooled.
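A test of this kind might be sketched as follows, assuming the usual pooled-variance form of the two-sample Student's *t* test; the exact procedure followed in [32] is not reproduced here. The means and standard deviations are hypothetical, the group sizes are taken as the numbers of speakers (20 and 5) rather than the numbers of readings, which is an assumption, and the critical values 2.069 and 2.807 are the standard two-tailed 5 percent and 1 percent points for 23 degrees of freedom.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic with pooled variance.

    sd_a and sd_b are assumed to be sample standard deviations
    computed with the n - 1 divisor.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


def significance(t, crit_5pc, crit_1pc):
    """Compare |t| with two-tailed critical values for the relevant df."""
    if abs(t) >= crit_1pc:
        return "significant at 1%"
    if abs(t) >= crit_5pc:
        return "significant at 5% but not at 1%"
    return "not significant at 5%"


# Hypothetical F1 figures in cps; not data from Table 3.
t, df = pooled_t(300.0, 30.0, 20, 265.0, 30.0, 5)
print(t, df, significance(t, 2.069, 2.807))
```

With these invented figures the statistic falls between the two critical values, the same pattern as that reported for three of the four formants tested.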

The effect of this pooling on the standard deviation of formant frequency values was varied. The dispersion of F_{1} usually increased; that of F_{2} sometimes increased, sometimes decreased, and sometimes stayed the same; that of F_{3} always decreased. It is of course to be expected that increasing the sample size would cause the sample standard deviation to increase slightly [32, p.225].

Table 3 shows the mean frequency for Group A, for Group B, and for the two groups pooled; also the standard deviation for Group A and for the two groups pooled.

There are several factors which contribute to the divergence of observed individual vowel formant frequencies from the mean.

First, there is the experimental error implicit in measuring distances on the marking paper with a ruler by eye. All measurements were made to the nearest tenth of a millimetre: repeat measurements might well differ by one-tenth of a millimetre or more. Limitations are imposed too by the very nature of a spectrograph, since all parts work only to a certain tolerance, and inaccuracies may arise particularly in the performance of the stylus and marking paper.
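The size of this reading error in frequency terms follows directly from the conversion factor of 5.138 cps per hundredth of a centimetre given earlier. The sketch below is only an illustration of that arithmetic; the function name is hypothetical.

```python
FACTOR = 5.138  # cps per hundredth of a centimetre (from the text)


def mm_to_cps(mm):
    """Convert a distance on the marking paper (mm) to a frequency span (cps)."""
    return mm * 10.0 * FACTOR  # 1 mm = 10 hundredths of a centimetre


step_cps = mm_to_cps(0.1)  # one measuring step of a tenth of a millimetre
print(step_cps)
```

A discrepancy of one-tenth of a millimetre between repeat measurements thus corresponds to roughly 5 cps.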

Secondly, there is a variation due to the fact that the moment chosen for making the section may not be typical of the vowel (see above, ch. II, Analysing procedure).

Thirdly, there is a variation implicit in the examination of only two samples of a given speaker's pronunciation of a particular allophone of a given vowel. To get a genuine idea of the range and central tendency of, say, a given speaker's /i/, it might be necessary to study twenty, fifty, or a hundred instances of it. By taking only a minute sample, namely two instances, one runs the risk of receiving a very imperfect idea of the speaker's norm for that vowel.

Fourthly, individual speakers' norms are scattered around a dialect norm. Even within one dialect, RP for example, there are considerable variations. The use of twenty-five different speakers gives some idea of the range and central tendency: but study of 2500 speakers would give a much better idea.

There is also the complication of differing bases of articulation [25, p. 86]. Even if all speakers were to pronounce /i/ as the closest and most front vowel of which they are capable (which they do not), we should still find that the formant frequencies varied from speaker to speaker.

Lastly there is the scatter of different dialect norms around a possible language norm. But the terms "dialect" and "language" cannot be properly delimited. In any case, this variation is so great that it has been excluded from this present study by examining only one dialect.

The formant frequencies found for RP British English pure vowels were compared with the values found for General American vowels (male speakers) by Peterson and Barney [37].

It is difficult to know how representative Peterson and Barney's subjects were of the non-Eastern, non-Southern dialect area labelled "General American". They state: "The male speakers represented a much broader regional sampling of the United States; the majority of them spoke General American." (p.177) At the very least it is admitted that not all of them had the same phonemic system: "some members of the speaking group ... speak one of the forms of American dialects in which [ɑ] and [ɔ] are not differentiated" (p.178). It would appear to be unfortunate to throw together data referring to different phonological systems. As far as the present study is concerned, however, it can be confidently stated that all speakers have phonemic contrasts between all the vowels studied.

British vowels are not absolutely comparable with American vowels: in particular, RP has an extra vowel phoneme, /ɒ/, which has no counterpart in General American. The following table shows the distribution in different words of the three RP phonemes /ɑ ɒ ɔ/ and the two General American phonemes /ɑ ɔ/.

| Words | RP | Gen.Am. |
|---|---|---|
| father, calm | ɑ | ɑ |
| bother, hod | ɒ | ɑ |
| long, dog | ɒ | ɑ or ɔ |
| pause, hawed | ɔ | ɔ |

Besides, there is of course no /r/ in RP in such words as *hard, hoard*, whilst in General American there is an /r/ after the vowel.

The number of words where RP /ɑ/ corresponds to Gen.Am. /ɑ/ is very small, if we exclude the words which have a following /r/ in Gen.Am. (which often colours the whole vowel phonetically). Nevertheless, the phonetic quality of the two /ɑ/'s is quite similar, and indeed their formant frequencies are not far apart.

RP /ɒ/ excluded, then, the two sets of vowel formant frequencies are compared in Table 4 and fig. 5.

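In the following cases the Gen.Am. mean formant frequency was within half a standard deviation of the RP mean formant frequency:

F_{1} of /i u/; F_{2} of /i æ ɑ u ɜ/; F_{3} of /i æ/.

In the following cases it was more than half a standard deviation, but not more than one standard deviation, from the RP mean:

F_{1} of /ɩ ɛ æ ɑ ʌ/; F_{2} of /ɩ ɷ ʌ/; F_{3} of /ɑ u ʌ/.

In the following cases it was more than one standard deviation from the RP mean:

F_{1} of /ɔ ɷ ɜ/; F_{2} of /ɛ ɔ/; F_{3} of /ɩ ɛ ɔ ɷ/.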

This shows that the Gen.Am. /i/ is acoustically very similar to the RP /i/. In descending order of similarity then follow /æ u/, /ɑ/, /ʌ/, /ɩ/, /ɛ ɷ/, /ɔ/, and lastly /ɜ/. This is in reasonable agreement with one's auditory impressions. The marked divergence of /ɜ/ (/ɝ/) is of course due to its strong r-colouring in Gen.Am.: the main acoustic correlate of this is the sharp lowering of F_{3}.

It is noteworthy that the mean frequencies of F_{1} and F_{2} of Gen.Am. /ɔ/ are within half a standard deviation of RP /ɒ/, and F_{3} within one standard deviation -- as great a similarity as that of /æ/ between the two dialects.
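The comparison underlying these statements, namely how far a Gen.Am. mean lies from the corresponding RP mean in units of the RP standard deviation, can be sketched as below. The function name, the tier labels, the treatment of boundary values, and the example figures are all assumptions made for illustration; none of the numbers is taken from Table 4.

```python
def divergence_tier(rp_mean, rp_sd, ga_mean):
    """Grade a Gen.Am. mean by its distance from the RP mean in RP s.d. units."""
    distance = abs(ga_mean - rp_mean) / rp_sd
    if distance <= 0.5:
        return "within half a standard deviation"
    if distance <= 1.0:
        return "within one standard deviation"
    return "more than one standard deviation"


# Hypothetical RP mean and s.d. (cps) compared with three hypothetical Gen.Am. means.
for ga_mean in (310.0, 330.0, 360.0):
    print(ga_mean, divergence_tier(300.0, 40.0, ga_mean))
```

Ranking vowels by how many of their three formants fall in the closest tier gives an ordering of the kind described in the text.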

In 23 out of the thirty cases, the RP mean formant frequency was found to be higher than the Gen.Am. value. Two possible suggestions are made, either of which could account for this:

- It is purely due to linguistic factors; RP vowels are, in general, articulatorily fronter and opener than Gen.Am. vowels.
- It is due to the differing bases of articulation of the speakers chosen for the two sample groups: they constituted after all rather small samples. This would result in general shifting of the acoustic loop of possible vowels for different speakers [25, p.86].