Overcoming phonetic interference

J.C.Wells, University College London

Article published in English Phonetics, Journal of the English Phonetic Society of Japan, 3.9-21 (2000).

This Unicode text includes both IPA symbols and Japanese katakana. If your screen does not display them properly, you may need to install a Unicode font and/or set your browser to use it. (Download Arial Unicode MS, free: with this font installed and the browser set to Unicode (UTF-8), everything displays correctly in the current versions of IE and NN.)

1. The nature of interference

Phonetics is about describing the sounds of speech and the patterns they make. Among its various practical applications the one that will be uppermost in the minds of most readers is that of teaching and learning the pronunciation of a foreign language. This article is addressed particularly to those concerned with teaching English pronunciation to Japanese learners.

When we encounter a foreign language, our natural tendency is to hear it in terms of the sounds of our own language. We actually perceive it rather differently from the way native speakers do. Equally, when we speak a foreign language we tend to attempt to do so using the familiar sounds and sound patterns of our mother tongue. We make it sound, objectively, rather differently from how it sounds when spoken by native speakers. This is the well-documented phenomenon of phonological interference (Crystal 1987: 372). Our L1 (mother tongue) interferes with our attempts to function in the L2 (target language).

We can easily demonstrate the effects of interference by considering the pronunciation of loanwords. English has borrowed the word futon from Japanese; Japanese has borrowed the word football from English. In each case the loanword has its pronunciation modified so that it accords with the sounds and sound patterns of the language into which it is borrowed.

In Japanese, a futon is pronounced [ɸɯtoɴ] (フトン). The word begins with a rather weak voiceless fricative made with the lips (bilabial). This consonant is followed by a vowel that is usually made with unrounded lips, usually voiceless, and often absent. After the voiceless alveolar plosive comes a mid back rounded vowel, and after that a rather long uvular nasal. Japanese people perceive the word as composed of three pieces (moras): [ɸɯ], [to], [ɴ]. The first and last of these moras are composed of what from an English point of view are exotic sounds - sounds unknown in English. So when English learners of Japanese encounter this word, they tend to replace them by English sounds; and the same happens when the word is borrowed into English. We are not quite all agreed how to anglicize it, but in Britain it is mostly called a [ˈfuːtɒn]. The Japanese bilabial fricative becomes an English labiodental (made with the lower lip pressed against the upper teeth). The vowel gets a degree of lip-rounding, and - perhaps mainly because of the spelling - is identified with English long [uː] rather than with short [ʊ]. It is not only firmly voiced but also stressed. Although in some respects Japanese [o] is more similar to English [ɔː] than to [ɒ], we nevertheless equate it with [ɒ]. Again, although the Japanese moraic nasal in this position is more similar to English velar [ŋ] than to alveolar [n], we render it as the latter, again perhaps mainly because of the transliterated spelling. And we perceive the word as composed of two pieces (syllables), not three: in English a nasal following a vowel sound must be non-syllabic. In this way the distinctively Japanese sounds are replaced by characteristic English sounds, and the Japanese structuring in moras by an English structuring in syllables.

When the Japanese confront English words (something that happens considerably more often than the other way round), the effects are similar. In English football is pronounced [ˈfʊtbɔːl]. The Japanese perceive the unfamiliar [f] and [l] in terms of their own sounds, and render it as [ɸɯˌttoˈboːɾɯ] (フットボール). (I show the Japanese pitch accent by the mark [ˈ] placed before the mora in question, in accordance with IPA principles. The mark [ˌ] denotes a non-contrastive pitch upstep.) Not only do speakers of Japanese replace the exotic English consonants by familiar Japanese ones, they also reorganize the way the sounds are arranged (the phonotactics). Both syllables of the English word end in consonants; but Japanese syllables cannot end in a consonant (other than the moraic obstruent or the moraic nasal). An English short vowel followed by a voiceless plosive is rendered as a Japanese geminate, for reasons that are not exactly clear. The syllable-final consonant has to be followed by a supporting vowel, normally [o] after [t] or [d], and [ɯ] otherwise. Thus [fʊt] becomes [ɸɯtto] (フット). The first vowel is actually likely to be voiceless, just as in futon, since Japanese high vowels tend to get devoiced between voiceless consonants. Japanese does not distinguish between r-sounds and l-sounds, so the English final [l] (usually dark in this position, more narrowly [ɫ]) is rendered as [ɾɯ] or something similar. The English word consists of two syllables. Since it is a compound (foot plus ball), in English it has the early stress characteristic of compounds. If a native speaker of English pronounces it as a one-word answer, the pitch of the voice typically falls from high to low on foot and then remains low on ball. In Japanese, on the other hand, this word has six moras [ɸɯ-t-to-bo-o-ɾɯ] and arguably four syllables. Japanese compounds typically have a word accent on the second element, so the accent gets placed on the mora [bo]. The resultant pitch pattern in a citation form typically involves a high-to-low fall during the long [oː] vowel, which to English ears sounds as if it has the main word stress.

These examples demonstrate that incorporating a loanword from one language into another may involve not only the sounds (phonetic segments, phonemes) of which the word's pronunciation is composed, but also the positions in which those sounds are used (syllable structure, phonotactics), the phonetic processes they undergo (phonological rules) and their accompanying suprasegmental features (duration, stress/accent). (For further discussion of English loans in Japanese, see Ishiwata, 1986: 461-467.)

2. Phoneme difficulties

It is well understood that certain sound-types are intrinsically more difficult than others. According to one phonological theory, they are 'marked' (Chomsky & Halle 1968: ch. 9; Lass 1984: 7.4). Quite apart from this, any sound-types in the L2 that have no obvious counterpart in the L1 are likely to cause problems for learners. Thus the English dental fricatives, [θ] and [ð], are a familiar stumbling-block for beginning learners from many language backgrounds. (They are also a stumbling-block for native speakers, being among the last sounds that children acquire and tending to be replaced by [f, v] or [t, d] in various local accents; see Wells 1982: 96-97.) Teachers and learners of EFL know that they have to devote time and energy to the articulation of these sounds.

Ever since the heyday of structuralist linguistics in the middle of the twentieth century, teachers and textbook writers have known of the usefulness of minimal-pair drills in which the difficult sounds are compared and contrasted with other sounds that might be confused with them. We can practise, for example, with pairs such as [θʌm] thumb - [sʌm] sum, [θɪk] thick - [sɪk] sick, [pɑːθ] path - [pɑːs] pass. (This article is written from a British English perspective. But all the points made remain equally valid for learners of American English, making appropriate changes. In AmE the last example becomes [pæθ] - [pæs].) It is not only that the dental fricatives are problematic in themselves, being articulatorily difficult; they also stand in phonemic contrast with the alveolar fricatives: /θ/ vs. /s/, /ð/ vs. /z/. There are many pairs of words which are distinguished from one another only by this contrast, and there are therefore messages that have the potential for being misunderstood if the contrast is not mastered (Look at that strange moth/moss!).

It can be very helpful for learners to be given an articulatory explanation of what is involved, particularly in cases where the relevant organs of speech can be easily seen. English [v] is another difficult sound for Japanese learners, and it needs to be carefully distinguished from [b]. In the case of [v], the lower lip, as active articulator, is pressed against the upper teeth in such a way as to allow the air expelled from the lungs to continue to pass through: in phonetic terminology, it is labiodental and fricative. With [b], on the other hand, the lower lip articulates with the upper lip and forms a firm contact with it such that the air flow is completely blocked for a moment: it is bilabial and plosive. Learners can easily see the difference if the teacher demonstrates it accurately and confidently, and they can usually manage to reproduce it themselves by imitation.

Sound production, however, is only one side of the coin. We also need to train learners in sound perception. This is where ear-training is vital. The learner must learn to hear the phonemic contrast /v/ vs. /b/. With a picture showing a vote and a boat learners can be drilled to respond correctly to Is this the boat? Is this the vote? Which is the boat? Show me the vote.

The same thing can be done with the very much more difficult contrast /r/ vs. /l/ (difficult, that is, for Japanese learners). Articulatory explanations — /r/ with central air-flow, side rims of tongue in contact with side teeth, tongue tip retracted, some lip-rounding; /l/ with lateral air-flow, side rims of tongue free of contact, tongue tip firmly on the alveolar ridge — must be supplemented by ear-training and minimal pair practice. Is it right? Is it light? A red pencil? A lead pencil? Shall I correct them or collect them?

We can combine the two problems by drilling loving [ˈlʌvɪŋ] and rubbing [ˈrʌbɪŋ]. Students must learn to identify the two words on hearing them, and they must learn to pronounce them in a way that leaves no doubt as to which is which.

Similar considerations apply to vowels and vowel contrasts. Learners must learn to both hear and reproduce the difference between central /ʌ/ and front /æ/: fun and fan, butter and batter, mud and mad, cup and cap, which truck should I follow? and which track should I follow? Likewise the difference between mid /ɜː/ and open /ɑː/: stir and star, curve and carve, occur and a car, burn and barn, hurt and heart.

All pronunciation textbooks offer drills of this kind (e.g. O'Connor and Fletcher, 1989); indeed such a minimal pair is responsible for the title of the well-known Ship or sheep? (Baker, 1977). There are similar drills in many general classroom textbooks of English.
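A teacher assembling such drills could pick out candidate pairs mechanically. The following is only a minimal sketch: the mini-lexicon, its segmentation into phoneme symbols and the function names are illustrative assumptions, not part of any published drill material. It treats two words as a minimal pair when their transcriptions have the same length and differ in exactly one segment.

```python
from itertools import combinations

# Hypothetical mini-lexicon: each word maps to a tuple of phoneme symbols
# (British English transcriptions of examples used in this section).
LEXICON = {
    "thumb": ("θ", "ʌ", "m"),
    "sum": ("s", "ʌ", "m"),
    "thick": ("θ", "ɪ", "k"),
    "sick": ("s", "ɪ", "k"),
    "fun": ("f", "ʌ", "n"),
    "fan": ("f", "æ", "n"),
    "cup": ("k", "ʌ", "p"),
    "cap": ("k", "æ", "p"),
}


def is_minimal_pair(a, b):
    """True if two transcriptions have equal length and differ in exactly one segment."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def minimal_pairs(lexicon, phoneme_a, phoneme_b):
    """Return word pairs distinguished only by the contrast phoneme_a vs. phoneme_b."""
    pairs = []
    for (w1, t1), (w2, t2) in combinations(lexicon.items(), 2):
        if not is_minimal_pair(t1, t2):
            continue
        # Locate the single position at which the two transcriptions differ.
        diff = next((x, y) for x, y in zip(t1, t2) if x != y)
        if set(diff) == {phoneme_a, phoneme_b}:
            pairs.append((w1, w2))
    return pairs


print(minimal_pairs(LEXICON, "θ", "s"))
print(minimal_pairs(LEXICON, "ʌ", "æ"))
```

With a larger pronunciation lexicon the same check would also surface pairs that no textbook happens to list, though the results would still need screening for word frequency and classroom suitability.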

3. Allophonic difficulties

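In all languages phonemes are pronounced somewhat differently according to the phonetic context in which they are found: that is, they comprise a number of distinct allophones. There are two kinds of interference problem this can give rise to for the learner. In the first, the L2 has an allophonic rule that the L1 lacks, and the learner fails to apply it. In the second, the L1 has an allophonic rule that the L2 lacks, and the learner carries it over inappropriately.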

Instances of the first kind include a failure to apply the appropriate distribution of aspiration for English /p, t, k/ (e.g. aspirated in pin, tanned, come but unaspirated in spin, stand, scum), a failure to distinguish clear and dark /l/, or a failure to apply pre-fortis clipping (Wells 1990: 136, 2000: 149).

More important, and more insidious, are instances of the second kind. Japanese [s], for example, does not occur before [i], being replaced there by an [ʃ]-like sound (actually [ɕ] - but in this article I shall represent Japanese [ɕ] and [ʑ] throughout as [ʃ] and [ʒ] respectively). Hence in Japanese [s] and [ʃ] can be regarded as allophones of the same phoneme. The consequence for Japanese learners of English is difficulty in producing sequences such as those in seat [siːt], receive [rɪˈsiːv], sick [sɪk]. They tend to pronounce seat in a way that sounds to English speakers like sheet [ʃiːt], and indeed fail to distinguish such minimal pairs as seat-sheet. This is particularly unfortunate in the case of the words sit and city.

Some learners, as is to be expected, have difficulty not only in producing the [si-ʃi] distinction but also in hearing it. They can benefit from ear-training as well as from articulation practice.

Other Japanese consonants, too, have special positional allophones before [i]. They are [z], [t] and [d], which are replaced by [dʒ], [tʃ], and [dʒ] respectively. For [z], this leads to problems with words such as lazy [ˈleɪzi] and resist [rɪˈzɪst]. Although there are no English words pronounced [ˈleɪdʒi] or [rɪˈdʒɪst], these forms are sufficiently different from the native-English pronunciation to give rise to serious problems of intelligibility.

With [t] and [d], fortunately, there are English loanwords in Japanese which may provide a model. The English word team [tiːm] comes into Japanese not necessarily as the expected [tʃiːmɯ] chiimu (チーム) but often as [tiːmɯ] (ティーム). Although this violates the usual Japanese allophonic rule, knowledge of English pronunciation is sufficiently widespread for it to have become established. This means that only the more naïve learner will mispronounce team in English as [tʃiːm]; most can produce [tiːm], and this can serve as a model for teach [tiːtʃ], teeth [tiːθ], tip [tɪp], stick [stɪk], plenty [ˈplenti] and so on. Another English loanword that may provide a suitable model is tissue(-paper), borrowed as [tiˌʃʃu(ˈpeːpaː)] tisshu(-peepaa) (ティッシュペーパー). Minimal pairs for reinforcing the English phonemic distinction include tease [tiːz] and cheese [tʃiːz], tip [tɪp] and chip [tʃɪp], tears [tɪəz] and cheers [tʃɪəz].

Building on this, the learner can also cope with [d] before a high front vowel, rather than [dʒ], in words such as deep [diːp], different [ˈdɪfrənt], discuss [dɪˈskʌs], dear [dɪə], lady [ˈleɪdi].
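The Japanese allophonic rule discussed in this section can be written out as a small substitution table, which may help a teacher anticipate which English words are at risk. This is a rough sketch under stated assumptions: the segment lists are simplified (stress marks omitted), the function name is invented for illustration, and it is assumed that English [iː], [ɪ] and [i] all trigger the rule.

```python
# Japanese consonants and the sounds that replace them before [i], as described
# in this section (with Japanese [ɕ] and [ʑ] written as [ʃ] and [ʒ]).
BEFORE_I = {"s": "ʃ", "z": "dʒ", "t": "tʃ", "d": "dʒ"}

# Assumption: all English high front vowels trigger the Japanese rule.
HIGH_FRONT = {"i", "iː", "ɪ"}


def predict_interference(segments):
    """Apply the Japanese pre-[i] rule to a list of English segments."""
    out = []
    for idx, seg in enumerate(segments):
        nxt = segments[idx + 1] if idx + 1 < len(segments) else None
        if seg in BEFORE_I and nxt in HIGH_FRONT:
            out.append(BEFORE_I[seg])  # consonant replaced before a high front vowel
        else:
            out.append(seg)            # everything else passes through unchanged
    return out


print("".join(predict_interference(["s", "iː", "t"])))       # seat
print("".join(predict_interference(["l", "eɪ", "z", "i"])))  # lazy
print("".join(predict_interference(["t", "iː", "m"])))       # team
```

The output is a prediction of the naive learner's form only; as noted above, many learners already have [ti] and [di] available from loanwords.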

4. Phonotactic difficulties: consonant clusters

At the beginning of a word, a Japanese consonant must be followed either immediately by a vowel or else by a palatal semivowel [j] and then a vowel. An English initial consonant, on the other hand, may well form part of a consonant cluster comprising two or three consonants. Typical examples of two-consonant initial clusters that may be difficult for Japanese learners are those in play [pleɪ], tree [triː], clear [klɪə], brain [breɪn], draw [drɔː], glue [gluː], free [friː], through [θruː], shrink [ʃrɪŋk]. These tend to be resolved by inserting a vowel between the two consonants, thus [pɯleɪ] etc. To achieve an English-style pronunciation the learner must eliminate this inserted vowel, while also taking care to make the appropriate English distinction between [r] and [l]. The aim should be a close transition from the first consonant to the second. Remember that native English speakers think of these words as consisting of just one syllable.

It may be helpful to practise hearing and making the difference between pairs such as prayed [preɪd] and parade [pəˈreɪd], plight [plaɪt] and polite [pəˈlaɪt], Clyde [klaɪd] and collide [kəˈlaɪd], drive [draɪv] and derive [dɪˈraɪv, dəˈraɪv].

Clusters involving [w], on the other hand, tend to be resolved by replacing the semivowel by a vowel - not necessarily a disaster, provided that the result has lip rounding. If not, the English monosyllabic twin [twɪn] becomes a Japanese three-mora [tɯ i ɴ] (トゥイン), queen [kwiːn] a four-mora [kɯ i i ɴ] (クイーン).

Another group of difficult initial clusters are those involving [s]. Examples are found in the words spin [spɪn], steep [stiːp], school [skuːl], smile [smaɪl], snow [snəʊ]. There are also the three-consonant clusters exemplified in spray [spreɪ], split [splɪt], straight [streɪt], screen [skriːn]. These, too, tend to be resolved by the insertion of extra vowels, as when English screen is borrowed into Japanese as sukuriin (スクリーン). Again, learners must aim at close transition between the consonants. Ideally, spin, smile, spray etc. should be felt as one syllable rather than as three or more moras. The word screen, which when borrowed into Japanese comprises five moras [sɯ kɯ ɾi i ɴ], also comprises just one syllable in English.
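The mora counts quoted for the katakana forms can be checked with a very small routine. This is a minimal sketch, assuming the input contains katakana only (no spaces or hyphens) and that the small kana listed are the only ones that merge with the preceding kana; the function name is illustrative.

```python
# Small kana that combine with the preceding kana into a single mora.
SMALL_KANA = set("ァィゥェォャュョ")


def count_moras(katakana):
    """Count the moras in a katakana string.

    Each full-size kana counts as one mora, as do the moraic obstruent ッ,
    the moraic nasal ン and the length mark ー; small kana add nothing.
    """
    return sum(1 for ch in katakana if ch not in SMALL_KANA)


for word in ["フトン", "フットボール", "トゥイン", "クイーン", "スクリーン"]:
    print(word, count_moras(word))
```

Comparing such a count with the number of syllables in the English source word (one for twin, queen and screen) gives the learner a concrete measure of how much material has to be suppressed.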

It may be helpful to practise hearing and making the difference in pairs such as sport [spɔːt] vs. support [səˈpɔːt], scum [skʌm] vs. succumb [səˈkʌm].

5. Phonotactic difficulties: final consonants

A major problem for Japanese learners of English is the fact that English consonants frequently occur in a position from which Japanese consonants are excluded, namely at the end of a word or syllable.

All the English consonant phonemes except /h/ can be found in word-final position. Thus we have words such as map [mæp], rub [rʌb], net [net], good [gʊd], back [bæk], egg [eg], rough [rʌf], love [lʌv], death [deθ], smooth [smuːð], face [feɪs], cheese [tʃiːz], push [pʊʃ], beige [beɪʒ], rich [rɪtʃ], edge [edʒ], come [kʌm], pen [pen], sing [sɪŋ], sell [sel].

The only one of these that is easy for a Japanese-speaking learner is final [ŋ], as in sing. If it is pronounced Japanese-style as a uvular nasal [ɴ], it may sound slightly odd but will not cause any problems of intelligibility. The main problem for the learner is keeping the nasal velar if the next word or syllable begins with a consonant made at some other place of articulation, as in sing badly [ˈsɪŋ ˈbædli] or sing today [ˈsɪŋ təˈdeɪ]. The Japanese moraic nasal always assimilates in place to a following consonant, so that the learner will want to say [sɪm] for the first and [sɪn] for the second. The former merely risks being unintelligible; the latter might give rise to real misunderstanding, since to sin is something different from to sing.
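The assimilation that the learner has to resist here can be sketched as a lookup from the place of articulation of the following consonant. The consonant inventory below is deliberately incomplete and the function name is an illustrative assumption; the sketch only predicts the nasal a learner is likely to produce if the Japanese habit is carried over.

```python
# Place of articulation for a few consonants (simplified, illustrative only).
PLACE = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
    "k": "velar", "g": "velar",
}
NASAL_FOR_PLACE = {"bilabial": "m", "alveolar": "n", "velar": "ŋ"}


def assimilated_nasal(target_nasal, next_consonant):
    """Nasal likely to be produced if it assimilates to the next consonant.

    If the next consonant is not in the table, the target nasal is kept.
    """
    place = PLACE.get(next_consonant)
    return NASAL_FOR_PLACE.get(place, target_nasal)


print("sɪ" + assimilated_nasal("ŋ", "b"))  # sing badly
print("sɪ" + assimilated_nasal("ŋ", "t"))  # sing today
```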

With the other final consonants, the natural tendency of Japanese learners is to support them by a following vowel: [o] in the case of [t] and [d], [ɯ] otherwise. Plosives, as noted above, tend to be interpreted as geminated after short vowels; thus map [mæp] may be rendered as [mappɯ] (マップ), net [net] as [netto] (ネット), back [bæk] as [bakkɯ] (バック), good [gʊd] as [gɯddo] (グッド). After long vowels, consonants do not get geminated, but do get supported by an inserted vowel, thus cheese [tʃiːz] may become [tʃiːzɯ] (チーズ), loop [luːp] may become [ɾɯːpɯ] (ループ), etc.
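The two tendencies just described, gemination of a plosive after a short vowel and the choice of supporting vowel, can be combined in a short sketch. It assumes that the material up to and including the vowel has already been adapted to Japanese sounds and is passed in as a string; the function and parameter names are invented for illustration.

```python
PLOSIVES = {"p", "b", "t", "d", "k", "g"}


def japanize_final(stem, final_consonant, vowel_is_short):
    """Sketch the Japanese-style rendering of a word ending in a single consonant.

    stem: the already adapted material up to and including the vowel, e.g. "ma".
    """
    # Plosives tend to be interpreted as geminated after a short vowel.
    if vowel_is_short and final_consonant in PLOSIVES:
        consonant = final_consonant * 2
    else:
        consonant = final_consonant
    # Supporting vowel: [o] after [t] or [d], [ɯ] otherwise.
    support = "o" if final_consonant in {"t", "d"} else "ɯ"
    return stem + consonant + support


print(japanize_final("ma", "p", True))     # map
print(japanize_final("ne", "t", True))     # net
print(japanize_final("tʃiː", "z", False))  # cheese
```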

Even when the learner has learnt to suppress the extra vowel as such, it may still remain in his mind, giving an inappropriate coloration to the consonant. It would usually be better, if possible, to imagine a suppressed [ə] after the consonant rather than an [ɯ].

English final consonant clusters combine the difficulties associated with single final consonants and those associated with clusters. There are various subtypes. Some involve a lateral. Examples include help [help], belt [belt], milk [mɪlk], health [helθ], else [els]. In English, all these laterals are actually dark, [ɫ], but the matter of clear and dark laterals is definitely something that can be left to advanced students. For most Japanese learners, the important thing is to achieve a lateral articulation of some kind. All the examples just given are monosyllables, and the Japanese tendency is as always to break up the consonant cluster with a vowel, giving three-mora results such as [he ɾɯ pɯ, be ɾɯ to] (ヘ ル プ, ベ ル ト).

A radical solution which I think deserves consideration (though it may shock some teachers) is to follow the Estuary English habit of vocalizing the lateral, that is replacing it with a vowel of the [o] type. Learners could aim at [heop, beot, mɪok, heoθ, eos]. They would be in good company, since millions of English and American speakers do the same thing.

Those aiming for American English (GenAm) pronunciation must also acquire clusters with [r], for example in harp, cart, dark, course, north. One advantage of choosing British English (RP) as the model is that this particular difficulty is avoided, since their RP forms are [hɑːp, kɑːt, dɑːk, kɔːs, nɔːθ].

Another type of final consonant cluster involves a nasal. Examples include lamp [læmp], month [mʌnθ], hunt [hʌnt], think [θɪŋk], fence [fens, fents], lunch [lʌntʃ, lʌnʃ]. These are relatively easy for Japanese learners to produce, since the nasal is homorganic with the following consonant, just like the Japanese moraic nasal. But it is still important to try to think of each word as consisting of one syllable, not three moras, and to suppress any added vowel after the final consonant.

In cases such as warmth [wɔːmθ] and length [leŋθ] it seems to me to be entirely acceptable for the learner to adopt the easier variants [wɔːmpθ] and [leŋkθ] or [lenθ], which are used by many native speakers. In this way the difficulty of a nasal followed by a non-homorganic consonant can be avoided. However, inflected forms such as tamed [teɪmd], banged [bæŋd], comes [kʌmz], hangs [hæŋz] cannot be avoided in this way: for them the learner must learn to produce an appropriate nasal, bilabial or velar, even though it is not homorganic with the following consonant.

Then there are final clusters involving [f] or [s], for example lift [lɪft], soft [sɒft], wasp [wɒsp], list [lɪst], desk [desk]. Again, the main error to be avoided is that of inserting extra vowels, as in three-mora [ɾi ɸɯ to] (リフト) instead of single-syllable [lɪft]. It may make it easier to practise first words such as lifting [ˈlɪftɪŋ], softer [ˈsɒftə].

There are other tricky final clusters ending in alveolar fricatives or plosives. Some of these are in morphologically simple words such as lapse [læps], box [bɒks], next [nekst]. But the majority arise in inflected forms such as plurals and past tenses: groups [gruːps], cats [kæts], takes [teɪks], laughs [lɑːfs] (AmE [læfs]), births [bɜːθs], wasps [wɒsps], tents [tents], desks [desks], risked [rɪskt], touched [tʌtʃt]; cabs [kæbz], heads [hedz], dogs [dɒgz], loves [lʌvz], breathes [briːðz], runs [rʌnz], pulled [pʊld], judged [dʒʌdʒd].

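In this connection it will be useful to recall the pronunciation rules for regular plurals and past tenses. They depend on the phonetic classification of the last segment in the stem to which they are attached. The plural ending is pronounced [s] after a voiceless non-sibilant consonant, [z] after a voiced non-sibilant sound, and [ɪz] after a sibilant ([s, z, ʃ, ʒ, tʃ, dʒ]).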

The three types are illustrated in cats, dogs, and horses, which are respectively [kæts], [dɒgz], [ˈhɔːsɪz]. This reflects the fact that from an English point of view final clusters such as [ts, θs] and [gz, vz] are fine, but [ss, dʒz] would be impossible.

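The regular past tense ending is pronounced [t] after a voiceless consonant other than [t], [d] after a voiced sound other than [d], and [ɪd] after [t] or [d].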

The three types are illustrated in missed, turned, and waited, which are respectively [mɪst], [tɜːnd], [ˈweɪtɪd]. So again we see a connection with the constraints on possible English final clusters: those such as [st, pt] and [nd, dʒd] are fine, but [tt, dd] would be impossible.
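The standard rules for the two endings, which the illustrations above follow, can be stated compactly as a pair of functions. This is a minimal sketch: the consonant sets cover only the segments needed for the rules, and the function names are illustrative.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}


def plural_ending(last_segment):
    """Pronunciation of the regular plural ending after a given stem-final segment."""
    if last_segment in SIBILANTS:
        return "ɪz"   # horses
    if last_segment in VOICELESS:
        return "s"    # cats
    return "z"        # dogs (any other voiced segment, including vowels)


def past_ending(last_segment):
    """Pronunciation of the regular past tense ending after a given stem-final segment."""
    if last_segment in {"t", "d"}:
        return "ɪd"   # waited
    if last_segment in VOICELESS:
        return "t"    # missed
    return "d"        # turned


print(plural_ending("t"), plural_ending("g"), plural_ending("s"))
print(past_ending("s"), past_ending("n"), past_ending("t"))
```

In both functions the extra vowel appears exactly where the stem-final segment is too similar to the ending for the two to form a permissible final cluster.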

6. Concatenation, coarticulation

Beginners can practise word-final consonants by putting them in phrases where the next word begins with a vowel sound. It may be helpful to think of the final consonant as actually belonging to the next word. Thus step up [step ʌp] can be imagined as [ste pʌp], leave out [liːv aʊt] as [liː vaʊt], end it all [end ɪt ɔːl] as [en dɪ tɔːl].

Useful as it may be for elementary students, this technique can however only be a half-way stage, for two reasons:

So word-final consonants also need to be practised both in absolute-final position (before a pause or the end of the utterance) and also in phrases where the next word begins with a consonant.

Examples of phrases for practising this are keep calm [ˈkiːp ˈkɑːm], nice time [ˈnaɪs ˈtaɪm], rich food [ˈrɪtʃ ˈfuːd], bad thing [ˈbæd ˈθɪŋ]. In each case there should be no kind of vowel sound - not even a voiceless one - between the last consonant of the first word and the first one of the second word. It may help, too, to try and feel these phrases, mentally, as consisting of two syllables each, not of six moras (ki-i-pu-ka-a-mu キープカーム etc.).

Particular care needs to be taken when the two abutting consonants are ones which tend to be confused. They may, for example, be dental and alveolar fricatives, as in both sides [ˈbəʊθ ˈsaɪdz], with salt [wɪð ˈsɔːlt]; or bilabial plosive and labiodental fricative, as in love bite [ˈlʌv baɪt], they've beaten [ðeɪv ˈbiːtn̩], (I like the) club very (much) [ˈklʌb ˌveri], and within the word obviously [ˈɒbviəsli].

Where the same plosive is repeated at the end of one word or syllable and at the beginning of the next one, we get gemination. That is to say, there is no audible release of the first, and no audible approach to the second: the two phonemes are realized by a single articulatory gesture, a plosive with a long hold phase. Fortunately, Japanese learners have a model for this in the Japanese moraic obstruent, as in arappoi (アラッポイ) 'rough', in which the articulation is very comparable to that between the first and second syllables of English stop pointing [ˈstɒp ˈpɔɪntɪŋ]. Examples for the other plosives might be (put the) web back [ˈweb bæk], night-time [ˈnaɪt taɪm], stood down [ˈstʊd ˈdaʊn], milk crate [ˈmɪlk kreɪt], big gun [ˈbɪg ˈgʌn]. However, unlike other obstruents, English affricates are not geminated, so that each chair [ˈiːtʃ ˈtʃeə] and orange juice [ˈɒrɪndʒ dʒuːs] should be pronounced with two complete affricates each.

It may also be necessary to emphasize that voiceless plosives are not geminated in English words such as copy [ˈkɒpi], happy [ˈhæpi], atom [ˈætəm], better [ˈbetə], jacket [ˈdʒækɪt].

Repeated fricatives in English are articulated like single ones, except that they last longer. Japanese provides a model for geminated [s] in words such as bessoo (ベッソウ) 'villa' and for geminated [ʃ] in issho (イッショ) 'same', but not for other double fricatives. English examples for practice might be rough fight [ˈrʌf ˈfaɪt], Faith thinks [ˈfeɪθ ˈθɪŋks], Miss Sykes [mɪs ˈsaɪks], push shut [ˈpʊʃ ˈʃʌt], love visiting [ˈlʌv ˈvɪzɪtɪŋ], with these [wɪð ˈðiːz]. Repeated nasals and liquids, too, are like single ones but longer: the same method [ðə ˈseɪm ˈmeθəd], ten names [ˈten ˈneɪmz], I feel lazy [aɪ fiːl ˈleɪzi]. In all such cases it is inappropriate for there to be any kind of vowel or break-and-make of articulation as we pass from one consonant to the next.

It may be helpful to do some ear-training and production practice on pairs such as this count [ˈðɪs ˈkaʊnt] vs. this account [ˈðɪs əˈkaʊnt], (I'm not going to) rush now [ˈrʌʃ naʊ] vs. Russia now [ˈrʌʃə naʊ].

Recall that many word-final clusters readily undergo simplification in connected speech through processes of assimilation and elision, which are well described in textbooks of English phonetics (e.g. Wells 1990: 46, 240; 2000: 49, 254). So although it is necessary for the learner to practise next [nekst] with all three of its final consonants present (for use when the word is said in isolation or in phrases such as next item), it is also helpful to be aware that in phrases such as next time, next contestant its last consonant, [t], can confidently be omitted: [ˈneks ˈtaɪm], [ˈneks kənˈtestənt]. Although ten must be pronounced with an alveolar nasal in isolation or in a phrase such as ten answers, it can be allowed to assimilate, Japanese-style, in ten boys [ˈtem ˈbɔɪz] or ten girls [ˈteŋ ˈgɜːlz].

7. Compound stress

At University College London we run a summer course in English phonetics that attracts a fair number of Japanese students. Some already have an excellent grasp of English pronunciation, but many do not. The latter group strikingly tend to refer to the summer course as [saˌmaː ˈkɔːs(ɯ)]. But in native English it is called a [ˈsʌmə kɔːs]. (This was first pointed out to me by Masaki Taniguchi.)

I touched on the reason for this at the beginning of the article. It is due to interference from Japanese, in which compound nouns tend to receive a pitch accent on the second element. In English, on the other hand, the majority of compound nouns receive word stress on the first element (early stress).

Although in general Japanese students of English cope pretty well with word stress (unlike, say, French students), they do tend to err in the stressing of compounds, and the reason is evidently interference from the usual Japanese pattern. The error is made worse by the further pitch feature characteristic of late-accented (or unaccented) Japanese words, namely the non-contrastive step-up in pitch, here shown as [ˌ], typically imposed on the second mora of the word. In the case of summer course, this gives the impression (to English ears) of a pre-tonic stress on the second syllable of summer, which in English of course is wholly unstressed.

This applies equally whether the English compound noun is one of those written as two words, as summer course, or one of those written as one. An example of a compound written as one word is passport [ˈpɑːspɔːt], AmE [ˈpæspɔrt]. Note that the word stress is located, as expected, on the first syllable. Borrowed into Japanese, however, it becomes [paˌsɯˈpooto] (パスポート), and this leads Japanese learners of English to tend to produce something that sounds to English ears like final-stressed [pʌsˈpɔːt] rather than the correct initial-stressed [ˈ]passport.

We see the same thing in credit card [ˈkredɪt kɑːd]; compare Japanese ku[ˌ]rejitto-[ˈ]kaado (クレジットカード).

There are thousands of compound nouns in English. The vast majority of them bear early stress. Any good dictionary will supply copious examples. Here are just a few: alarm clock, baby-sitter, bank account, bookcase, bus stop, car park, contact lens, dining room, fairy tale, heart attack, letter-box, pen-friend, police station, post office, swimming pool, washing machine, youth hostel.

Tiresomely, however, not all English compound nouns are early-stressed. The principal exceptions, late-stressed, fall into one of the following categories:

But... summer course is an exception to this exception. On the one hand, this perhaps means that it is best for the student to consult a dictionary or a native speaker for the stress pattern of any compound noun falling into these categories; on the other, it means that there is a degree of uncertainty and variability about the stressing of compounds that can only be seen as good news for the learner. Of the different manifestations of phonetic interference at which we have looked, this is the one that has the least impact on intelligibility in practice.


References

Baker, A., 1977. Ship or sheep? Cambridge: Cambridge University Press.

Chomsky, N. and Halle, M., 1968. The sound pattern of English. New York: Harper.

Crystal, D., 1987. The Cambridge Encyclopedia of Language. Cambridge: Cambridge University Press.

Cruttenden, A., 1994. Gimson's Pronunciation of English. Fifth edition of A.C. Gimson, Introduction to the pronunciation of English. London: Edward Arnold.

Ishiwata, T., 1986. English borrowings in Japanese. In Viereck, W. and Bald, W.-D., 1986.

Lass, R., 1984. Phonology. Cambridge: Cambridge University Press.

O'Connor, J.D. and Fletcher, C., 1989. Sounds English. Harlow: Longman.

Viereck, W. and Bald, W.-D., 1986. English in contact with other languages. Budapest: Akadémiai Kiadó.

Wells, J.C., 1982. Accents of English. Three volumes. Cambridge: Cambridge University Press.

Wells, J.C., 1990. Longman Pronunciation Dictionary. Harlow: Longman.

Wells, J.C., 2000. Longman Pronunciation Dictionary. Second edition. Harlow: Longman.

I am grateful to the many colleagues and students who have educated me over the years about Japanese phonetics. In particular I would mention Paul Takei, Kazuhiko Matsuno, Yuko Shitara, Masaki Taniguchi and Mitsuhiro Nakamura. JCW
