UCL Division of Psychology & Language Sciences

John Wells’s phonetic blog
16-31 October 2006, archived


To see the phonetic symbols in the text, please ensure that you have installed a Unicode font that includes all the IPA symbols, for example Charis SIL (free download).


Browsers: Some versions of Internet Explorer have bugs that prevent the proper display of certain phonetic symbols. I recommend Firefox (free) or, failing that, Opera (also free).



Tuesday 31 October 2006

Degemination: the follow-up

Kensuke Nanjo reminds me that Jack Windsor Lewis, in his article ‘Weakform words and contractions’, mentions the degemination that can occur with the word some:

When some occurs in a weakform immediately before a substantive beginning with /m/, there is very often DEGEMINATION of the two /m/s to only one and then OPEN SYLLABLE PREFERENCE prompts the insertion of a schwa vowel, eg /sə `mɔː/ Some more?

And in his ‘Reduced forms of English words’ Jack claims that

what might be written good `eel, take `air, pry `minister and extra `tension for good `deal, take `care, prime `minister and extra at`tention are commonplace.

Degemination in prime minister (as [praɪˈmɪnɪstə]) is certainly widespread: I am not so convinced about take care, which doesn’t feel right to me.

The possible elision of /t/ in sit down, let go, whaddya want is not degemination, though it is obviously similar, and similarly lexically restricted.

Now to the matter of -ly. When this suffix is attached to a stem ending in /l/, there are several possibilities, partly lexically determined. (Degemination is the loss of one of two identical consonants; compression is a reduction in the number of syllables.)

  • A. stems ending in syllabic /l/ (or /əl/)
    1. degemination with compression, so that the adverb has the same number of syllables as the adjective. Examples: gentle → gently, simple → simply, single → singly, noble → nobly, able → ably (and similarly for the suffix -able or -ible: possible → possibly, visible → visibly, reasonable → reasonably, understandable → understandably etc.)
    2. degemination with variable compression. Sometimes compressed, sometimes pronounced with /-əli/ or /-l̩i/. Adjectives ending in -ical, thus historic(al) → historically, surgical → surgically usually /-ɪkli/. Adjectives ending in -ful, thus peaceful → peacefully, beautiful → beautifully. Also double → doubly, special → specially
    3. degemination without compression. Other adjectives ending in -al, thus natural → naturally, vital → vitally, racial → racially
  • B. stems ending in non-syllabic /l/
    1. degemination. Example: full → fully /ˈfʊli/ (are there any other categorical examples?)
    2. variable degemination. Examples: whole → wholly, dull → dully
    3. no degemination. Examples: pale → palely /ˈpeɪlli/, vile → vilely, cool → coolly, futile → futilely

Does this about cover it?
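
As a rough summary, the classification above can be written down as a small lookup table. This is only a sketch in Python: the labels A1 to B3 and the names LY_CLASSES and classify are invented here for illustration, and the table holds just the example words listed above. It records the lexical facts; it is not a rule that predicts how a new adverb behaves.

```python
# Hypothetical encoding of the -ly classification; labels and names are illustrative.
LY_CLASSES = {
    # A: stems ending in syllabic /l/ (or /əl/)
    "A1": ("degemination with compression",
           ["gently", "simply", "singly", "nobly", "ably",
            "possibly", "visibly", "reasonably", "understandably"]),
    "A2": ("degemination with variable compression",
           ["historically", "surgically", "peacefully", "beautifully",
            "doubly", "specially"]),
    "A3": ("degemination without compression",
           ["naturally", "vitally", "racially"]),
    # B: stems ending in non-syllabic /l/
    "B1": ("degemination", ["fully"]),
    "B2": ("variable degemination", ["wholly", "dully"]),
    "B3": ("no degemination", ["palely", "vilely", "coolly", "futilely"]),
}


def classify(adverb):
    """Return (label, description) for a listed adverb, or None if it is not listed."""
    for label, (description, examples) in LY_CLASSES.items():
        if adverb in examples:
            return label, description
    return None
```

Calling classify("fully") returns the B1 entry, while an unlisted word such as "quickly" returns None, since its stem does not end in /l/ and the table has nothing to say about it.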

Kensuke Nanjo

Jack Windsor Lewis

Monday 30 October 2006


Geminated (double) consonants are quite common in English. They are never found within a morpheme, but arise (i) across morpheme boundaries and (ii) across word boundaries, wherever one element ends in a given consonant and the following element begins with the same consonant:

(i) meanness ˈmiːnnəs, guileless ˈɡaɪlləs, nighttime ˈnaɪttaɪm, midday ˌmɪdˈdeɪ

(ii) nice sort naɪs sɔːt, big girl bɪɡ ɡɜːl, bad dog bæd dɒɡ

Phonetically, geminated consonants are pronounced like ordinary ones but with extra duration. In the case of plosives, there is a single articulatory gesture but with a longer hold phase.

same man /seɪm mæn/ = [seɪmːæn], stop pushing /stɒp pʊʃɪŋ/ = [stɒpːʊʃɪŋ]
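
The mapping from a phonemic sequence to a lengthened phonetic consonant can be sketched in a few lines of Python. The function name geminate_join is made up for this illustration, and the sketch assumes that each consonant is written with a single character, so it would not cope with two-letter symbols such as tʃ.

```python
LENGTH_MARK = "\u02d0"  # IPA length mark ː


def geminate_join(first, second):
    """Join two transcribed elements, marking a geminate at the boundary.

    If the first element ends in the same symbol that the second begins
    with, the pair is written as one symbol plus a length mark; otherwise
    the two elements are simply separated by a space.
    Assumption: each consonant is a single character.
    """
    if first and second and first[-1] == second[0]:
        return first + LENGTH_MARK + second[1:]
    return first + " " + second


print(geminate_join("seɪm", "mæn"))     # same man
print(geminate_join("stɒp", "pʊʃɪŋ"))   # stop pushing
```

The two calls reproduce the bracketed forms given above; a pair with different consonants at the boundary is left as two separate words.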

This much is covered by our textbooks. But what I don’t remember seeing much discussion of is degemination in English, the process whereby a geminate is simplified, i.e. two consonants are reduced to one.

Degemination is the norm in derivational (fossilized) morphology, but the exception in inflectional (productive) morphology. So inside the lexicon there are plenty of cases of degemination such as

dis + sent = dissent dɪˈsent (cf. consent), in + numerable = innumerable ɪˈnjuːmərəbl̩

abbreviate, addiction, aggregation, allocate, connotation, immigration, immature

—though even here we sometimes override degemination to emphasize the meaning of a prefix, as in dissuade dɪ(s)ˈsweɪd (cf. persuade), illegal ɪ(l)ˈliːɡl̩.

Germanic affixes are not subject to degemination. So alongside the Latinate innumerable we have unnecessary with geminated /nn/ and cases like meanness, guileless in (i) above. (Entombment is an interesting case: -ment, despite its Latin origin, ‘came to be treated as an English formative’ (OED), which means that entombment ɪnˈtuːmmənt has geminate /mm/.)

The behaviour of -ly when attached to a stem ending in /l/ is complicated: we’d better leave it to another day.

What got me thinking about all this was hearing a Conservative MP on the radio pronouncing the phrase a good deal better with just a single, ungeminated [d]. This option is something I drew attention to in LPD, and something I sometimes do myself, though I admit it is a minority choice and probably becoming less frequent. It is also irregular, in the sense that we don’t degeminate other /dd/ sequences (good dog, bad deal).

LPD entry for good

Are there any other cases of degemination across a word boundary?

Cannot is a special case. When can and not come together the result is not can not [kæn nɒt] but [kænɒt] (which in spoken styles is then usually further reduced to can’t).

But apart from these two (and the Australian g’day) I can’t think of any other cases. Can anyone?


Friday 27 October 2006


Two correspondents have reproached me for misusing the term ‘Pinyin’ to refer to what should properly be called ‘Hanyu Pinyin’.

The more helpful one was Victor Mair, Professor of Chinese Language and Literature at the University of Pennsylvania, who points out that Pinyin essentially just means ‘spelling’.

“This did not use to be much of a problem, but now there are various other PINYIN schemes, some of which are becoming increasingly conspicuous. For example, the Taiwan Ministry of Education has recently approved the use of TAILUO PINYIN for the writing of Hoklo (often called "Taiwanese"), and there will be more and more textbooks published using this romanization during the coming years. There is also a competing TONGYONG PINYIN which is *supposed* to be applicable to all the languages of Taiwan.” (Read more here and here.)

“In any event,” continues Victor, “these are exciting, eventful times for romanization in Taiwan and, indeed, in mainland China.”

You might also be interested to read Victor’s demolition of the widespread belief that the Chinese word for ‘crisis’ is composed of elements that mean ‘danger’ and ‘opportunity’. It isn’t. (I note by the way that the website where this article appears — ‘a guide to the writing of Mandarin Chinese in romanization’ — is called just (sic).)

So: in future I’ll try to remember to refer to Hanyu Pinyin (or, with tones, Hànyǔ Pīnyīn). In Chinese that’s 汉语拼音 (simplified) or 漢語拼音 (traditional).

Victor Mair

Thursday 26 October 2006


Yesterday I was indeed in Cologne (Köln), Germany, as a guest on a television programme that hopefully represents the absolutely last gasp of the cow dialect story. We had a live cow (trained for film work) on the set in the studio, and I chatted about the inaccuracy of media reports with the presenter Günther Jauch. There was also the footage we filmed in London last month.

I made use of the three hours between arriving at my hotel and being picked up to go to the studio by doing a little shopping. I succeeded in buying the current (6th) edition of Max Mangold’s Duden Aussprachewörterbuch, the standard German pronunciation dictionary. (When compiling LPD it was the third edition that I consulted.) As well as giving authoritative IPA transcriptions of the words of German, this book is also particularly informative about the pronunciation of words from other languages, and indeed about the phonetics of other languages. It gives tabulations or summaries of reading rules for over twenty different languages, including Estonian, Lithuanian and Latvian. It offers a table of all possible Mandarin Chinese syllables, in Pinyin, Wade-Giles and IPA.

The dictionary tells us about the tones not only of Chinese but also of Burmese, Japanese, Lithuanian, Serbo-Croatian, Swedish, Thai, and Vietnamese.

It tells us nothing at all, however, about the phonetics of Arabic or Hindi — a strange omission.

It also has a list of other languages in which word stress is

  • initial (Estonian, Faeroese, Finnish, Georgian, Icelandic, Latvian, Lower Sorbian, Upper Sorbian, Slovak; Czech is among the languages with detailed phonetic descriptions),
  • antepenultimate (Macedonian),
  • final (French, Cambodian, Persian),
  • on a specified syllable for each word (Afghan, Bulgarian, Hebrew, Indonesian, Catalan, Lithuanian, Malagasy, Romansch, Russian, Ukrainian, Belarussian), and
  • “similar to German” (Danish, English).

What, I wonder, happened to

  • penultimate (not only Polish, but also Welsh and Swahili)?




Duden Aussprachewörterbuch

Wednesday 25 October 2006

Now that I am back I have been able to check for the source of the claim that the IPA sanctions the use of [c, ɟ] to symbolize palatoalveolar affricates (blog, 18-19 October). Here is the passage I was thinking of, from the 1949 booklet The Principles of the International Phonetic Association, pp. 14-15.

text from the 1949 handbook


The booklet has long been out of print, and has in any case been superseded by the current Handbook of the International Phonetic Association. In the latter there is very little discussion of the symbolization of affricates: in fact the only mentions I can find are on p. 22:

Affricates and double articulations
k͡p, t͡ʃ etc. Eng. chief;...

and on p. 27:

...letters may also be combined to make a phoneme symbol (for instance /tʃ/, as at the beginning and end of English church; if necessary the phonological unity of the two segments can be shown by a tie bar: /t͡ʃ/).

In the listing of symbols and their computer coding, p. 179, it is noted that the t-esh ligature (or ‘tesh digraph’, sic Unicode) ʧ and the d-ezh ligature (‘dezh digraph’) ʤ have been superseded by the simple sequences tʃ and dʒ respectively. Inasmuch as the Handbook does not mention the possible use of [c, ɟ] for these affricates I suppose we must consider this superseded too. The specimen of Hindi (pp. 100-103) uses tʃ and dʒ.

The s-tailed t and the barred-2 seen in the extract above were withdrawn in 1976.

STOP PRESS: viewers in Germany can see me tonight at 22:15 on stern tv: ‘Eine Meldung und ihre Geschichte: Muhen Kühe Dialekt?’. As you know, the answer is ‘Nein!’

Tuesday 24 October 2006

Among the post awaiting me on my return was a package that proved to contain an advance copy of a new book sent to me by the publishers. It is the Oxford BBC Guide to Pronunciation, by Lena Olausson and Catherine Sangster (OUP 2006).

At first I thought it was a new edition of Graham Pointon’s BBC Pronouncing Dictionary of British Names (OUP, 1990). It has the same compact format, and adopts the same practice of showing pronunciations both in IPA and in a respelling system.

However the Guide is entirely new, and very different in content from Pointon’s dictionary. You will look in vain here for names of obscure British people and places. Rather, it covers names, words and phrases from all over the world. For example, the first three entries under G are

Ga Ghanaian language gah /ɡɑː/
Gaarder, Jostein Norwegian writer yoo-stayn gor-duhr /ˌjuːsteɪn ˈɡɔː(r)də(r)/
gaberdine worsted or cotton cloth gab-uhr-deen /ˌɡabə(r)ˈdiːn/

There are also page-length articles on such topics as Accents (= written marks), Clicks, French, Latin, Top ten complaints about pronunciation, and Tone.

The authors work in the BBC Pronunciation Unit and the book’s main source is the Unit’s own database accumulated over many years. The book is “not so much a dictionary as a collection of particular pronunciations which are tricky, much debated, curious, or exotic”.

I look forward to many happy hours examining it in detail. Meanwhile you may like to know that for the next 48 hours at least it can be pre-ordered from at 34% off the publisher’s price. Publication is on Thursday.

Oxford BBC Guide to Pronunciation

Monday 23 October 2006

I’m on my way back from Hong Kong. While I was there my former student Prof. Cheung Kwan-hin, who looked after me so kindly and assiduously during my visit, pointed out to me something I did not know about my laptop computer: that it already has built-in software making possible the easy inputting of Chinese characters. Having once activated this, all you have to do in order to bring a character into your document and onto the screen is to type in the Pinyin romanization of the word in question.

The operating system I use is Windows XP. To activate the Chinese input method, I just had to go to Control Panel | Regional and Language Settings | Languages and check the box Install files for East Asian languages. Then go to Details and select Chinese (PRC), Add. OK my way out and wait a short while.

Once that’s done, I can select Chinese on the language bar and then type, say, shan1 — and on the screen the character 山 (mountain) duly appears. Notice that the tone is entered as a trailing numeral (1, 2, 3 or 4) rather than as a Pinyin diacritic. (Actually, you don’t always need a tone number. Just shan plus return is enough to produce 山.)

Instead of the standard Pinyin ü you have to type v. So nv produces 女 (woman), which in Pinyin would normally be written nǚ.
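
Going the other way, from the typed form with a trailing numeral to Pinyin with diacritics, is easy to sketch in Python. This is an illustration only, not how the Windows input method works: the function name numbered_to_pinyin is invented, and the placement rule used (mark a or e if present, the o of ou, otherwise the last vowel) is the usual textbook convention. Only tones 1 to 4 are handled, and the example nv3 assumes the third tone of 女.

```python
import unicodedata

# Combining diacritics for the four tones: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}
VOWELS = "aeiouü"


def numbered_to_pinyin(syllable):
    """Convert one typed syllable such as 'shan1' or 'nv3' to marked Pinyin."""
    syllable = syllable.replace("v", "ü")  # v is the keyboard stand-in for ü
    if not syllable or syllable[-1] not in TONE_MARKS:
        return syllable  # no tone numeral given: leave the syllable unmarked
    base, mark = syllable[:-1], TONE_MARKS[syllable[-1]]
    if "a" in base:
        pos = base.index("a")
    elif "e" in base:
        pos = base.index("e")
    elif "ou" in base:
        pos = base.index("o")
    else:
        vowel_positions = [i for i, ch in enumerate(base) if ch in VOWELS]
        if not vowel_positions:
            return base  # nothing to carry the tone mark
        pos = vowel_positions[-1]
    # Put the combining mark after the chosen vowel, then compose to one character.
    return unicodedata.normalize("NFC", base[:pos + 1] + mark + base[pos + 1:])


print(numbered_to_pinyin("shan1"))
print(numbered_to_pinyin("nv3"))
```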

In Word, highlighting a character and toggling Alt-X changes it to its Unicode number, which is handy for writing HTML for this blog.
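
For anyone without Word to hand, the same lookup can be done in Python. The sketch below is an assumption about what is wanted for HTML, namely a hexadecimal numeric character reference; the function names codepoint_hex and html_reference are invented for the example.

```python
import html


def codepoint_hex(ch):
    """Return the Unicode number of a single character as hexadecimal digits."""
    return f"{ord(ch):04X}"


def html_reference(ch):
    """Return an HTML numeric character reference for a single character."""
    return f"&#x{ord(ch):X};"


print(codepoint_hex("山"))
print(html_reference("山"))
# html.unescape turns the reference back into the character.
print(html.unescape(html_reference("山")))
```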

All that remains for me to do is to set about actually learning Chinese.

Cheung Kwan-hin

Tai Mo Shan

Friday 20 October 2006

Unusual pronunciations observed recently in the mouths of native speakers of English:

  • from a CNN newsreader, to obey someone sl/æ/vishly. Since the point being made was the possible offensiveness of this expression towards an African-American, perhaps the speaker could not bring himself to say sl/eɪ/vish. Or was it contamination from lavish? Compare the difference between the two meanings of slaver: ‘slave trader’ with /eɪ/, but ‘dribble, foam at the mouth’ with /æ/.
  • from a phonologist discussing the formal interface between syntax and intonation, acoustics with /aʊ/. That’s actually what the BBC Pronunciation Advisory Committee recommended back in the 1930s. Until now I had thought it an excellent example of the lack of influence of this committee, since despite its recommendation (I had assumed) we all say acoustics with /uː/.
  • from the same speaker, the name of David Brazil, the guru of discourse intonation, as /brəˈzɪl/. But our late colleague called himself /ˈbræzəl/.
  • from a scientist giving an academic paper at a conference, asterisk (the punctuation mark * ) with /-ɪks/ instead of /-ɪsk/. I’d always assumed that this contamination from Asterix (the Gaul) was either illiterate or ironic. But I don’t think the speaker was being ironic.

CNN logo

Thursday 19 October 2006

Two correspondents have commented on yesterday’s posting about affricates.

Wyn Roberts of Simon Fraser University in Canada queries my claim that the use of [c] for a voiceless palatoalveolar affricate is ‘IPA-sanctioned’. Where, he asks, is this ‘sanctioning’ (approval) stated? Since I am away from reference books at the moment I can’t quote chapter and verse, but I am pretty sure you will find it somewhere in the 1949 Principles of the IPA booklet. I think the Council made a decision about this sometime in the 1920s or 1930s. Whether it is referred to in the current Handbook I am not sure.

Biljana Čubrović of Belgrade University in Serbia writes “You rightly emphasised that one of the scripts used by the Serbs, namely the Cyrillic script, is characterised by an almost perfect, one-to-one spelling to sound correspondence. In this script, all five Serbian affricates are represented by simple letters:

  • 1. ц in цев (Serbian Latin cev, Eng. pipe)
  • 2. ч in чип (Serbian Latin čip, Eng. chip)
  • 3. џ in џип (Serbian Latin džip, Eng. Jeep)
  • 4. ћ in кућа (Serbian Latin kuća, Eng. house)
  • 5. ђ in луђа (Serbian Latin luđa, Eng. madder)

(I should explain that Serbian can be written with either the Cyrillic alphabet or the Latin alphabet. Both orthographies are recognized.)

Biljana continues, “In the Latin script, only one of the five Serbian affricates is represented by a digraph (namely the voiced postalveolar affricate, or no. 3 above). I am unsure about how this 'deviation' arose, but don't think that one of the compound phonemes is indeed more complex than the other four. A possible answer may lie in the etymology of words containing this particular affricate. As a native speaker of Serbian, I see these five affricates as wholes.

Another observation connected to affricates in Serbian concerns their perception. Most native speakers of Serbian recognise and appreciate the difference between the voiceless postalveolar affricate and its palatal counterpart, but fail to articulate these two correctly. The Belgrade male idler jargon, for instance, is characterised by the neutralisation of these two in favour of the postalveolar affricate.”

We’d better not pursue the meaning of ‘correctly’ here.

Wyn Roberts

Biljana Čubrović

Wednesday 18 October 2006

Someone wrote to me last week worrying about the phonetic symbolization of affricates. Unfortunately I seem to have inadvertently deleted the email before replying to it, so this public reply will have to suffice. I think the person’s name was Sylvain or Sylvaine.

What worried this correspondent was that the IPA allows for the representation of the palatoalveolar affricates as [c, ɟ] instead of what s/he considers to be the correct way of writing them, namely [t͡ʃ, d͡ʒ].

Why does this provision exist? It is because there are some languages in which it doesn’t seem very satisfactory to write affricates with the plosive-plus-fricative notation. A speaker of Italian, for example, was telling me the other day that he is very conscious of the difference in tongue configuration between the ordinary Italian plosive [t] and the first element of the Italian affricate spelt c(i), c(e), usually represented in IPA as [tʃ]. He would be happier with a notation that does not imply their equivalence. That is what the IPA-sanctioned use of [c] for the affricate provides. (You can only do this, of course, in a language in which you do not need to symbolize a voiceless palatal plosive, which is the default general-phonetic meaning of [c].)

Another such language is Hindi, in which the affricates very obviously pattern as single units, not as sequences.

Many linguists use the symbols [č, ǰ] for these affricates, although they do not have the approval of the IPA.

In ordinary orthography, although the Latin alphabet offers no way to write affricates without using diacritics or digraphs, other alphabets do: for example, in Cyrillic the voiceless palatoalveolar affricate is written Ч ч and the voiced one (in Serbian) as Џ џ. There is clearly a perceived need for a unitary way of writing these affricates.

Returning to the two-symbol notation, my correspondent also assumed that the correct way to write them is with a tie bar: [t͡ʃ, d͡ʒ]. Personally I normally omit the tie bar, and write just [tʃ, dʒ]. That is what you find in most pronunciation dictionaries and textbooks, too. It does involve the convention that a sequence of plosive plus fricative that does NOT form an affricate must be written some other way. Daniel Jones does this with a hyphen (see his article ‘The hyphen as a phonetic sign’, 1955, Zeitschrift für Phonetik 9), thus [t-ʃ, d-ʒ]. This enables us to show the difference in the Polish minimal pair trzy [t-ʃɨ] vs. czy [tʃɨ]. In English any such sequence must straddle a syllable boundary, so you can show it using a full stop, as in Wiltshire /ˈwɪlt.ʃə/ vs. vulture /ˈvʌltʃ.ə/ (if you think I am right about English syllabification) or /ˈvʌl.tʃə/ (if you don’t).

Another way is to use the ligatured symbols [ʧ, ʤ] for the affricates, leaving the separated [tʃ, dʒ] for the non-affricate plosive plus fricative sequences.
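
If you need to bring text written in these different two-symbol conventions into line, the tie bar and the ligatures are each a matter of one or two Unicode characters. The following Python sketch rewrites both as the plain sequences; the name to_plain_sequence is invented for the example. Note that doing this deliberately throws away the affricate versus sequence distinction that the ligature convention is meant to preserve, so it only makes sense where that distinction is not being relied on.

```python
TIE_BAR = "\u0361"  # combining double inverted breve, as in t͡ʃ

# Ligature symbols and the plain two-letter sequences that replace them.
LIGATURES = {
    "\u02a7": "t\u0283",  # ʧ becomes tʃ
    "\u02a4": "d\u0292",  # ʤ becomes dʒ
}


def to_plain_sequence(text):
    """Rewrite tie-barred and ligatured affricates as plain sequences."""
    text = text.replace(TIE_BAR, "")
    for ligature, sequence in LIGATURES.items():
        text = text.replace(ligature, sequence)
    return text


print(to_plain_sequence("t͡ʃ, d͡ʒ"))
print(to_plain_sequence("ʧ, ʤ"))
```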

t͡ʃ, d͡ʒ

c, ɟ

č, ǰ

Tuesday 17 October 2006

Each of the universities we are assessing here in Hong Kong has been asked to produce a statement of its research strategy. Some of them have generated prime examples of gobbledegook.

[…] our research policy […] nurtures a culture for innovation development and technology transfer where the University is in the early stages of development […]

I can’t even parse this. What might ‘innovation development’ be? What is the antecedent of where the university is in the early stages of development? Is it a restrictive relative clause or a non-restrictive one? Perhaps the whole thing has been badly translated from a Chinese original that was clear in Chinese. Perhaps the sentence is simply short of punctuation, as in Eats shoots and leaves, and the reference was meant to be to

[…] innovation, development, and technology transfer […]

with an afterthought admitting that the institution is only just starting to address the matter of technology transfer.

Fortunately we are not being asked to make any kind of judgment on these statements. I say ‘fortunately’, because they are couched in the style of corporate management-speech that can be seen as both on the one hand ludicrous and on the other obscurely threatening. Here is a fragment of what one higher education institution says.

The research policy has been implemented using the following key strategies:

[…] Encouraging academic staff to develop research initiatives that are directly related to the mission and strategic plans of [name of institution] and its Faculties and Departments […]

Is this to be read as implying that the institution discourages its academic staff from ‘developing research initiatives’ that do not fit exactly into its master plan? How does that square with academic freedom?

Compare another institution we have to assess, which claims

[…] to provide and maintain a vibrant environment for new and important research areas to emerge […]

—which might seem to imply precisely the opposite aim, with everyone being encouraged to search out quite new and perhaps unforeseen ‘research areas’.

I wonder if this difference in claimed research strategy leads to any difference in practice. I suspect that it doesn’t; but if it does,

I \know which institution \/I would prefer to work for.

Eats, shoots and leaves

Monday 16 October 2006

This is the first blog entry to come to you direct from vibrant Hong Kong, where I am working on a Hong Kong universities Research Assessment Exercise and will then have a brief holiday.

Tami Date has been in touch again about the ‘empty word’ things (blog, 10 October). He offers the following counter-examples, in which things receives the nucleus.

(1) A: Mary, I’m sorry to hear about your father.
    B: Thank you, John. | It was 'one of those \things.
    A: When did he pass away?
    B: He was buried on December 20th.
(2) A: What does your father do for a living?
    B: He’s into 'many \things. (= jobs)
    A: Does he have a major line of work or...?
    B: Yes, he’s a waste material dealer.
(3) 'Get your \things. (= belongings) You’ll be leaving with the police!
(4) We’ll drive out right after dinner | and 'get your \things. (ditto)

Numbers (1), (3) and (4) are clear: we have to accent things. You could consider just one of those things an idiom, and exempt things ‘belongings’ from the category of empty words. It seems to me that (2), though, could go either way: the nucleus can go on things or, alternatively, on many, given that the speaker can imply that jobs is given (= predictable from the context).

There are similar difficulties with people. Although we say

(5) I 'want to \meet people.
(6) I 'want to com\municate with people.
(because a nucleus on people might suggest a contrast with, say, machines or even animals), we nevertheless say
(7) There were a 'lot of \people in the room.

Hong Kong harbour

Hong Kong street
