My phonetic blog — June 2006

John Wells


To see the phonetic symbols in the text, please ensure that you have installed a Unicode font that includes all the IPA symbols, for example Charis SIL (free download).

UCL Division of Psychology & Language Sciences
Stop press: Congrats to Michael Ashby and the UCL Phonline team, featured today on the UCL website.
Friday 30 June 2006

I’ve been doing a consultancy job for a publisher, listening to synthesized speech automatically generated for a handheld computerized dictionary currently under development — not isolated headwords, but phrases and sentences. Speech synthesis is much better these days than it used to be, but it still has some way to go, particularly in the matter of working out a plausible intonation pattern.

I’ve listened to a thousand randomly selected samples. So I can tell you both what the commonest type of error was, and what the worst type was. (How would the software have coped with the sentence I’ve just written? It ought to have been able to recognize the need for contrastive tonicity: both what the commonest type of error was, | and what the worst type was. Would it have succeeded? No.)

OK, then: the worst type of error was the wrong choice between homographs. English has a number of words where the same spelling is used for two words of different sound and different meaning. Thus wind (air movement) is /wɪnd/, but wind (turn, twist) is /waɪnd/; entrance (way in) is /ˈentrəns/ but entrance (bewitch) is /ɪnˈtrɑ:ns/; present (noun or adj.) is /ˈprezənt/, present (verb) is /prɪˈzent/. In the thousand samples I listened to there were five errors of this kind. Obviously, choice of the wrong pronunciation would be seriously misleading for the dictionary user.

However, the most frequent error was failure on the part of the speech synthesis software to recognize compound nouns and therefore to apply compound stress. In the great majority of cases of noun + noun in English it is appropriate to put the primary stress on the first element, as for example adˈventure story, appliˈcation form, ˈmurder charge. (Of course, you need to be aware of the occasional exceptions, such as strawberry ˈjam, gang ˈwarfare.) The software got this wrong in 65 cases, constituting just over a quarter of all errors identified.

You could see this, alternatively, as a demonstration of a fault in English spelling conventions. In the other Germanic languages, compound nouns are written solid, with no space (thus German Abenteuergeschichte, Anmeldeformular, Mordanklage). So you can recognize them as such. English sometimes does this (wintertime, railway) but very often doesn’t. If we wrote adventurestory, applicationform, murdercharge and so on, then the software (and millions of EFL students) might get the stress right.
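As a rough illustration only (this is not the publisher's software, and the function name and exception list are invented for the example), the default compound-stress rule with a short list of exceptions could be sketched like this:

```python
# Hypothetical sketch of the default rule for noun + noun compounds.
# The exception set holds only the two end-stressed examples quoted in the post.
END_STRESSED = {("strawberry", "jam"), ("gang", "warfare")}


def stressed_element(first: str, second: str) -> str:
    """Return the word that carries primary stress in a noun + noun compound."""
    if (first, second) in END_STRESSED:
        return second  # listed exception: stress on the second element
    return first       # default: stress on the first element


for pair in [("adventure", "story"), ("murder", "charge"), ("strawberry", "jam")]:
    print(pair, "->", stressed_element(*pair))
```

The hard part, of course, is not this lookup but recognizing that two words written with a space form a compound in the first place.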

Thursday 29 June 2006

The bacterium Clostridium difficile has been in the news recently. It is a cause of diarrhoea, usually acquired in hospital.

My interest, as always in this blog, is in how it is pronounced. Like all organisms, it has a Latin binomial name, the first part referring to the genus and the second to the species. There’s no difficulty about how to say the generic name: clearly, /klɒˈstrɪdiəm/. This can also be abbreviated to C., and in this form is often pronounced just as /siː/.

It’s the specific name that raises the interesting question. Apparently this species of Clostridium is called difficile because when it was first discovered it was difficult to grow in the laboratory. In Latin, difficile is the neuter singular nominative of the adjective difficilis ‘difficult’ — neuter because it agrees in gender with the neuter Clostridium. (With a masculine or feminine noun it would be difficilis.) In accordance with the usual rules for pronouncing Latin words in English, you would expect it to be /dɪˈfɪsɪleɪ/ or perhaps /dɪˈfɪkɪleɪ/.

But it so happens that difficile is also the French word for ‘difficult’. And in French it is pronounced /difisil/, anglicized as /ˌdɪfɪˈsiːl/. Because doctors and other health care professionals these days are much more likely to know French than to know Latin, it is perhaps not surprising that the bacterium is generally known in English as /klɒˈstrɪdiəm ˌdɪfɪˈsiːl/.

Or C. diff. /ˌsiː ˈdɪf/ for short.

Wednesday 28 June 2006

As a consequence of doing a Google search on svarabhakti (yesterday’s blog entry) I have come across a most informative site dealing with the (Scottish) Gaelic language.

It includes an exhaustive account of the remarkable phonetics of this language (go to Fuaimean na Gàidhlig - The Sounds of Gaelic). The site makes extensive use of IPA symbols, and is supported by .mp3 sound files. As far as I can tell, it seems thoroughly accurate and well-informed.

The name of the language, in Gaelic, is Gàidhlig, pronounced [g̊aːlɪg̊ʲ]. The consonant system includes dental and palatal plosives, palatal and velar fricatives, three laterals and three r-sounds. There are no voiced plosives, but fortis consonants are distinguished from lenis by being aspirated or (in non-initial position) preaspirated. Consonants are divided into ‘slender’ (palatalized) and ‘broad’ (non-palatalized), and come in slender-broad pairs.

The vowel system includes back unrounded vowels [ɯː] and [ɤː], as in caol ‘narrow’ and ladhran ‘sandpiper’ respectively.

Like the other Celtic languages, Gaelic has initial mutation (lenition) of consonants in various grammatical or positional environments. Now I understand why the Gaelic for James, written Seumas, has come into English as Hamish. In the vocative case, masculine names undergo both lenition of the initial consonant and ‘slenderization’ of the final consonant. Hence Seumas [ʃeːməs] becomes a Sheumais [ə heːmɪʃ].

One niggle: I wish the author would spell pronunciation correctly. I drew his attention to his “pronounciation” on the front page, which he has corrected; but the error remains elsewhere, e.g. twice on the Vowels page.

And if I were organizing the recordings, I would have insisted that the speaker use the same intonation on all example words. As it is, his tendency towards listing intonation (rise, rise, fall) might make the listener imagine that Gaelic is a tone language.

Tuesday 27 June 2006

“Eng-ger-land! Eng-ger-land! Eng-ger-land!” chant the patriotic football crowds, cheering on the English team.

But why do they feel the need to add an extra syllable? Why do they change /ˈɪŋ.glənd/ into /ˈɪŋ.gə.lənd/?

There’s nothing phonotactically difficult for English speakers in syllable-initial /gl/. Why break up the cluster with an epenthetic schwa? Why the svarabhakti?

(Some speakers don’t have a /g/ in England. They have even less excuse.)

Here is a Scottish gloss on the matter.

There’s only one other instance of the phenomenon in English that comes immediately to mind: the widespread pronunciation of athlete, athletics as /ˈæθəli:t, ˌæθəˈletɪks/ instead of the standard /ˈæθli:t, æθˈletɪks/.

The other day I did hear someone pronounce template as /ˈtempəleɪt/ instead of the usual /ˈtempleɪt/. (A hundred years ago it was a /ˈtemplɪt/. But nowadays the spelling pronunciation prevails.)

The explanation must be that the phonetic C+l sequences could also arise from a process of compression (schwa deletion), as when tingling (from the verb to tingle) is compressed from /ˈtɪŋgəlɪŋ/ to /ˈtɪŋglɪŋ/, /ˈkæθərɪn/ Catherine to /ˈkæθrɪn/ or /ˈdʌbəlɪt/ double it to /ˈdʌblɪt/. (I ignore here the possibility of syllabic [l̩] from /əl/, which is irrelevant to the argument.) Apply this process in reverse, and we get the svarabhakti forms. (But it doesn’t explain why it applies only to a very few lexical items.)


This, together with morphological regularization, presumably explains the occasional /ˈwɪntəri/ for wintry or /rɪˈmembərəns/ for remembrance.

Eng-ger-land fans, don’t let it spread any further. Please! Or as we say nowadays, puhleeze!

Monday 26 June 2006

“Final alveolar consonants, i.e. /n/, are often subject to assimilation”, writes an undergraduate in an examination answer.

But she doesn’t mean i.e., she means e.g. The first stands for ‘that is’, the second stands for ‘for example’. The consonant /n/ is not the same as (all) alveolar consonants; it is an instance of an alveolar consonant.

Younger native speakers often confuse i.e. and e.g. Both these abbreviations stand for Latin phrases, id est and exempli gratia respectively. You could blame their confusion on the decline of classical learning: not many people learn Latin these days. On the other hand there are millions of people who don’t know Latin but who manage to use them correctly.

You might also think that we ought to have English abbreviations for English phrases. Foreign learners, in fact, sometimes suppose that to be the case. I have several times come across “f.e.” or “f.ex.” in English essays written by Germans. After all, the German equivalent of for example is zum Beispiel, abbreviated as z.B. I have to explain that “e.g.” is the only way of abbreviating this expression in English.

Another English abbreviation I have seen invented by Germans is “a.s.o.”. What does it stand for? Obviously, “and so on” (German und so weiter, u.s.w.). But the English for that is etc. It’s Latin again: et cetera.

A popular (mis)pronunciation of this latter phrase is /ɪkˈsetrə/, as if it began ex-.

Saturday 24 June 2006

How do you pronounce artisanal?

That was a question a correspondent asked me two or three years ago. At the time I had to reply that I did not know the word: it wasn’t part of my vocabulary and I’d never heard anyone say it. However, despite the fact that artisan is /ˌɑ:tɪˈzæn/, I thought the regular stress effect of the suffix -al ought to yield /ɑ:ˈtɪzənəl/. Adjectives in -al, of three or more syllables, have penultimate stress if the penultimate vowel is long (archetypal, primeval, universal) or followed by a consonant cluster (dialectal, incidental), but otherwise have antepenultimate stress (personal, industrial, medicinal). (There are one or two exceptions in which the penultimate vowel lengthens on adding -al, as adjectival /-ˈtaɪvəl/.)
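The stress rule for adjectives in -al can be written out as a small decision function. This is a minimal sketch under stated assumptions: the function name is made up, and the caller is assumed to know already whether the penultimate vowel is long and whether a consonant cluster follows it. The handful of exceptions such as adjectival are not modelled.

```python
def al_adjective_stress(penult_long: bool, penult_before_cluster: bool) -> str:
    """Predict stress placement for an -al adjective of three or more syllables.

    penult_long: the penultimate vowel is long (or a diphthong).
    penult_before_cluster: the penultimate vowel is followed by a consonant cluster.
    """
    if penult_long or penult_before_cluster:
        return "penultimate"
    return "antepenultimate"


# Examples from the post, with the phonological facts supplied by hand.
print("universal ->", al_adjective_stress(True, False))
print("incidental ->", al_adjective_stress(False, True))
print("personal ->", al_adjective_stress(False, False))
print("artisanal ->", al_adjective_stress(False, False))
```

For artisanal the penultimate vowel is short and followed by a single /n/, so the function returns antepenultimate stress, which matches /ɑ:ˈtɪzənəl/.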

The word artisanal is not in the OED. It seems to have come into use quite recently as a kind of antonym of industrial, referring to small-scale production methods, agricultural or other. On Google it has four million hits. would have us believe that it is /ˈɑ:tɪzənəl, ˌɑ:tɪˈzænəl/. I don’t think so. That looks like a guess by someone who hasn’t fully absorbed the English stress rules (blog, 21 June).

Early this morning I was jolted out of my half-sleep by hearing someone on my bedside radio, on the farming programme, use the word not once but twice. I can’t remember what they were talking about, but my semi-conscious mind did note that the pronunciation used was indeed /ɑ:ˈtɪzənəl/. Result!

Friday 23 June 2006

I hope you like the minor redesign of this page. The narrower columns should make the text more readable.

I hope too that you can read all the various characters correctly: not only the IPA symbols but also the odd bits of Greek or other writing systems. There are two requirements here: you must have a Unicode font with the relevant symbols installed on your system, and your browser must be set up in such a way as to make use of them.

My own preferred browser is Firefox, running under Windows 2000/XP. It has the distinction of being much more tolerant of faulty HTML code than Internet Explorer or Netscape. This is generally an advantage to the user, who sees the intended symbols even if they haven’t been coded exactly correctly. But it is a disadvantage to the web page author, who seeing that everything is correctly displayed on his own browser may suppose that other browsers too are displaying it correctly even when this may not be the case.

In the style sheet (.css) that accompanies this page I specify the font Trebuchet MS for the text. I like this font. I find it clear and legible and aesthetically more pleasing than the ubiquitous Arial. In particular, I like the fact that a lower-case L (l) looks different from an upper-case I (I). However, if you haven’t got Trebuchet MS on your system then you’ll see the text in whatever is the default font for you.


Trebuchet MS does not contain IPA symbols. So in the style sheet there is a line of code as shown below. This should cause the browser to select one of the named fonts whenever I mark a bit of text as phonetic by tagging it with the markup code <span class="phon">. These fonts do contain the IPA symbols, and your browser should go through them in turn until it finds one that is installed, and then use that one.

span.phon {font-family:"Charis SIL", "Doulos SIL", "Gentium", "Lucida Sans Unicode", "Arial Unicode MS", "MS Mincho"; color:darkgreen;}

ˈhɪər ə səm fəˈnetɪk ˈsɪmbəlz
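The way the browser walks through the font-family list can be modelled in a few lines. This is a simplified sketch, not how any real browser is implemented: the function name and the sets of installed fonts are invented for the example, and real browsers also fall back character by character.

```python
from typing import Iterable, Optional

# The font stack from the span.phon rule, in the order given.
PHON_STACK = [
    "Charis SIL",
    "Doulos SIL",
    "Gentium",
    "Lucida Sans Unicode",
    "Arial Unicode MS",
    "MS Mincho",
]


def pick_font(stack: Iterable[str], installed: set) -> Optional[str]:
    """Return the first font in the stack that is installed, or None."""
    for name in stack:
        if name in installed:
            return name
    return None


# Hypothetical machine with two of the listed fonts installed.
print(pick_font(PHON_STACK, {"Arial Unicode MS", "Gentium", "Trebuchet MS"}))
# Hypothetical machine with none of them: the browser is left to its defaults.
print(pick_font(PHON_STACK, {"Trebuchet MS", "Arial"}))
```

Order matters: the first match wins, which is why the preferred phonetic fonts come first in the list.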


Similarly, I mark up bits of Greek with the code <span class="gk">. This actually doesn’t matter for Modern Greek, the symbols for which are included in Trebuchet MS. But it is vital for classical Greek, which requires ‘polytonic’ symbols involving quite complex combinations of diacritics (rough and smooth breathings; acute, grave and circumflex accents; iota subscript). Not many fonts are equipped for polytonic Greek, and if you haven’t got one of the fonts named below your system probably cannot display it.

span.gk {font-family:"Palatino Linotype", "Gentium", "Arial Unicode MS", "DejaVu Sans";}

ἐν ἀρχῇ ἦν ὁ λόγος


For things to be displayed correctly in Internet Explorer, all this extra coding is essential. It doesn’t matter for Firefox, because clever Firefox, if it comes across a symbol not found in the current font, searches for a font that does contain it. IExplorer merely displays a blank square, and Netscape a boxed-in question mark. If you see a strange string of symbols like &#x1f10, it means that I have forgotten the final semicolon in the code for a Unicode symbol I can’t keyboard directly (but Firefox would nevertheless display the correct character).
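The missing-semicolon case can be tried out with Python's standard html module, whose unescape function is lenient about numeric character references in much the same way. This only illustrates the tolerant behaviour; it says nothing about how any particular browser is implemented.

```python
import html
import unicodedata

# The reference from the post, with and without its final semicolon.
with_semicolon = html.unescape("&#x1f10;")
without_semicolon = html.unescape("&#x1f10")

print(with_semicolon == without_semicolon)
print(f"U+{ord(without_semicolon):04X}", unicodedata.name(without_semicolon))
```

Both forms decode to the same character, the epsilon with smooth breathing that begins ἐν ἀρχῇ. A strict parser would leave the second form as literal text.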

Thursday 22 June 2006

Kensuke Nanjo reminds me that the question of s → ʃ / _ r was addressed in three of the items in Bert Vaux’s North American dialect survey, available on-line. Of 10,981 respondents who voted there on the s in anniversary, 94% identified it as [s], 6% as [ʃ].


As you can see from the map (reproduced, reduced in size, from the University of Wisconsin Milwaukee website), there was no particular geographical basis to the distribution of preferences: the blue dots, representing [ʃ], are scattered thinly across the east, midwest, and west. The red dots, [s], are everywhere. (Go to detailed maps.)


For the s in nursery, the figures were similar, but with a few more votes for [ʃ]. 88% thought the fricative was [s], 11% thought it was [ʃ], and 1% thought it was something else.


For the c in grocery, however, things were much more evenly divided: 52% thought it was [s], 45% [ʃ], and 2% ‘other’. It is not at all clear why the figures for grocery are so different from those for anniversary and nursery. I wonder how the respondents would have voted on miserable (not in the survey).

Wednesday 21 June 2006

If you go along with Chomsky and Halle in The Sound Pattern of English (1968), native speakers of English have as part of their grammar (= their implicit knowledge of the language) a complex set of rules allowing them to produce the appropriate word stress and vowel qualities for English words, while the words themselves are specified in our mental lexicon without indication of stress placement and possible vowel reduction.

Chomsky and Halle were wrong, of course. The fact that anyone was uncertain about how to pronounce hypernymy (19 June) bears this out. The relatively familiar words anonymous, eponymous, synonymous, synonymy show us clearly that the -nym- element has a short vowel. This plus the single consonant /m/ gives us a so-called weak cluster, so that when -y or -ous is added the stress is thrown back onto the antepenultimate. So the same thing must happen in hypernymy (despite its being ill-formed).

I have recently noticed another counter-example to the SPE principle. Every adult native speaker knows the words paralysis and analysis. Some also know one or more of catalysis, dialysis, electrolysis or some fifty other technical words with the same ending. All have antepenultimate stress: /pəˈræləsɪs, əˈnæləsɪs, kəˈtæləsɪs, daɪˈæləsɪs, ˌelekˈtrɒləsɪs/ etc.

Two years ago I was diagnosed with a heart condition and underwent an angioplasty procedure. Since then I have attended the Cardiac Support Group run by my local hospital. One of the techniques for treating a heart attack is known as thrombolysis. But I have noticed that the cardiologists, cardiac surgeons, paramedics and nurses don’t call this /θrɒmˈbɒləsɪs/, as you would predict from all the other -lysis words. They call it /ˌθrɒmbəʊˈlaɪsɪs/. Ah well, language doesn’t obey rules as much as some linguists would like.

In my lectures I sometimes talk about allophony. Guess how I pronounce it. Remember that everyone agrees that telephony is /təˈlefəni/ and cacophony is /kəˈkɒfəni/, and that I am someone who plays by the rules.

Tuesday 20 June 2006

Kensuke Nanjo writes from Japan, “I have a question about the pronunciation of the word classroom. If I remember correctly, Jennifer Jenkins pronounced the word classroom like claashroom ([ʃ] instead of [s] before [r]) when she gave a lecture a few years ago in Osaka. Although this pronunciation isn’t recorded in any of the dictionaries, including pronunciation dictionaries, is it a common pronunciation in RP in particular and British English in general?”

I can’t find any discussion of this possibility in the literature, although I think it is by no means unknown. I do mention the possibility in one of my lecture handouts, where I list various environments in which the assimilation of /s/ to [ʃ] may occur.


You can get this wherever /s/ and /r/ abut: across word boundaries as in this reason, between the parts of a compound as in the classroom that Prof Nanjo heard, or indeed within a word as in the compressed version of nursery [ˈnɜːs(ə)ri, ˈnɜːʃri] or grocery.

There can be assimilation of /z/ to [ʒ] in the same environments, e.g. miserable [ˈmɪʒrəbɫ̩].

I have no statistics about how widespread assimilation before /r/ is. I don’t do it myself, and it’s certainly not a mainstream pronunciation.

I suspect that the assimilation product may in some cases not be identical with an ordinary [ʃ, ʒ], but rather a kind of retroflex [ʂ, ʐ], anticipating the place of articulation of the /r/.


Rather more widespread, and on the increase, is /s/ → [ʃ] before /tr/, as in strong. Many of my native-English students have this at least as an optional variant.

The str → ʃtr assimilation joins the stj → stʃ → ʃtʃ assimilation as phonostylistic phenomena we shall soon have to teach our EFL students. I call them s-affricate assimilation.

Does anyone know of any studies of these phenomena?

Monday 19 June 2006

One of our Modern English Language MA students has chosen hypernymy as a dissertation topic. There is some hesitation about how to pronounce this word: where does the stress go? I would say that following the pattern of synonymy /sɪˈnɒnəmi/ it must clearly be /haɪˈpɜːnəmi/.

This got me thinking about the etymology of this group of words. The classical Greek for ‘name’ is ὄνομα ónoma, stem ὀνοματ- onomat-, which we see in onomatopoeia and onomastics. So why isn’t it *synonom, *synonomy etc? Where does the y in -nym come from?

Dictionaries say that synonym is actually from an adjective συνώνυμος synó:nymos ‘having the same name’, which is composed of syn- ‘with’ and the ‘name’ element. I haven’t found a clear answer to the question why -onom- changes to -onym- in the course of adjective formation: but it appears that ὄνυμα ónyma was the Aeolian or Doric dialect version of ὄνομα ónoma. Using this dialectal variant obviates a possible confusion that might otherwise have arisen with -nom- ‘law’, as in metronome, astronomy, economy.

According to the OED, the earliest meaning of hyponym is ‘a name made invalid by the lack of adequate contemporary description of the taxon it was intended to designate’ (1904), i.e. a biological name that is unusable because its meaning is unclear. As a linguistic term — with a quite different meaning — it is rather recent, the earliest citation being from John Lyons in 1963. Hyponymy is slightly earlier (1955), and in linguistics refers to the relationship between such pairs as scarlet and red, tulip and flower. Scarlet is a hyponym of red because anything that is scarlet is also necessarily red, although the converse does not hold.

Hypernym and hypernymy are so new that they are not included in the second edition of the OED. They describe the converse relationship to hyponym and hyponymy. Thus red is a hypernym of scarlet, flower is a hypernym of tulip. A synonym of ‘hypernym’ is ‘superordinate term’.

Hypernym and hypernymy are obviously modelled on hyponym(y). However, from the point of view of etymological morphology, they are actually not well formed. The elements of hyponym are not hypo- and -nym but hyp- and -onym, as indeed we see when we replace hyp(o)- by syn- to give syn-onym. (So also an-onym-ous, ‘nameless’.) Like other Greek prefixes ending in o, hypo- loses the o when followed by a stem beginning with a vowel, thus hyp-onym. So when we change hyp(o)- to hyper- we ought to get not hypernym but hyperonym, not hypernymy but hyperonymy. And I think we would all agree on how to pronounce the latter. It would have to be /ˌhaɪpəˈrɒnəmi/.
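The morphological point can be made concrete with a toy function. It is a sketch working on the Latin-letter spellings only: the function name is invented, and the rule is deliberately reduced to the single case discussed here (a prefix ending in o loses the o before a stem beginning with a vowel).

```python
VOWELS = set("aeiou")


def attach_prefix(prefix: str, stem: str) -> str:
    """Join prefix and stem, dropping a final 'o' of the prefix before a vowel."""
    if prefix.endswith("o") and stem[0] in VOWELS:
        prefix = prefix[:-1]
    return prefix + stem


print(attach_prefix("hypo", "onym"))    # hyp-onym
print(attach_prefix("syn", "onym"))     # syn-onym
print(attach_prefix("hyper", "onym"))   # the etymologically regular form
print(attach_prefix("hypo", "thesis"))  # consonant-initial stem keeps the o
```

With the stem taken as -onym rather than -nym, hyper- yields hyperonym, not hypernym.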

Friday 16 June 2006

    Paul Kerswill, of the Dept of Linguistics and English Language at Lancaster, has just sent me an email expressing his interest in Sam Wood’s findings (14 June). He says: “You comment in your blog that the Afro-Caribbeans have most th-fronting, and Asians very little. This is exactly the pattern for Birmingham according to my PhD student Arfaan Khan”.

    He goes on to give me an update on the work he is doing on London English along with Jenny Cheshire and Sue Fox of QMUL and Eivind Torgersen of Lancaster. The project is called Linguistic innovators: the English of adolescents in London. It appears that quite a few of the traditional Cockney characteristics are in sharp decline among younger speakers. Just look at the change in the formant frequencies for TRAP and STRUT (/æ, ʌ/) in this plot for speakers in the inner-London borough of Hackney.

    Paul says: The phonetic half of the project is just moving onto consonants — not sure what we’ll find. The most striking thing is the almost complete lack of h-dropping among many of the youngsters, especially those of ‘non-Anglo’ origin. We actually noted this in Milton Keynes and Reading, but not in Hull where h-dropping remained robust. Also, for London we hear a uvular initial /k/ before low back vowels, especially among non-Anglos. I’m not convinced that the /r/ we hear is all that labiodental. I think it has lingual involvement as well as labial. And there is what I perceive as syllable timing in the inner city, vs. stress timing in the outer city. I wonder how significant the backing/lowering/lengthening of turn-final schwa is — it's very striking, and I think it is sociolinguistically stratified — more white outer-city.

    Paul is currently planning a large Northern Englishes project with colleagues at three northern universities.

Thursday 15 June 2006

    My mention of George Bush’s pronunciation of nuclear (9 June) has evoked some discussion. Olle Kjellin suggests that what Bush says is not so much “nucular” as “newkiller”, and that this may be a folk etymology rather than a genuine mispronunciation. He may well be right. We’d have to check if Bush does the same thing in circular (“sir-killer”) etc.

    Jamie Kirchner reports hypercorrections in the other direction: cellular pronounced as [ˈsɛlijər] rather than [ˈsɛljələr]. (Americans use this word much more than Brits, because in this supposedly globalizing world of English as an International Language the object known in Britain as a mobile (phone) and in Germany and Japan as a Handy(phone) is known in North America as a cell(ular) phone.)

    Jamie adds his favo(u)rite bits of folk etymology: “volts-wagon” and “kiddie-garden”, ‘the latter of which turns out to be a rather accurate calque’.

    Apropos of schnitzel, Olle asks whether /-ts-/ is really a phonotactic problem, given Patsy (and, we could add, ritzy, ditzy, and Betsy, not to mention curtsey). In Swedish, he says, it can also become [ˈsnɪtsel], and he asks whether we can’t also get [ˈsnɪtsəl] in English. Good question. But I don’t think I have ever heard it.

Wednesday 14 June 2006

    As we know, in England the dental fricatives /θ, ð/ are on their way out.

    One of our second-year BA Linguistics students, Sam Wood, reports some interesting findings about TH fronting in London. He carried out a small-scale Labov-style survey in three London department stores, and found that the use of [f] rather than [θ] in third (floor) correlated not, as expected, with the speaker’s social class, but rather with ethnicity. Salespeople categorized by their appearance as black (= of African descent, including West Indians) used [f] in 40% of cases, those judged to be white (= European) in 31%, east Asian (= Chinese etc) in 17%, and west Asian (= Indian etc) in 13%. The pronunciation [fɜːd] rather than [θɜːd] also correlated, much more highly, with (estimated) age: it was used 80% of the time by those judged to be up to 20 years old, but 33% or less by all older age groups. So in London the sound change seems to be being spearheaded by young blacks.

    The fact that it was the blacks who came out as most likely to use TH fronting is all the more striking given that in Caribbean and African English the tendency is to replace dental fricatives not by labiodental fricatives but by alveolar plosives.

    Nevertheless, geography rather than ethnicity remains the strongest influence in determining most British speakers’ accents. Anecdotal evidence: I was standing at a bus stop in London when I overheard a middle-aged black lady talking. My ears immediately pricked up, because I seemed to be hearing the comfortable and familiar accent of Upholland, where I lived as a child (though when I were a lad there were no black people living in that part of Lancashire). I asked her where she was from. The answer was Chorley, ten miles from Upholland but a world away from London.

Tuesday 13 June 2006

    Yesterday saw the launch of UCL’s new online distance learning course in English phonetics, Phonline.

    Twenty-five students from five continents have been enrolled for this pilot course. Many others had to be turned away.

    The two-month course will cover all the elements of a traditional on-campus course in phonetics as taught at UCL, with a weekly lecture and extensive practical work including ear training, transcription and acoustic analysis.

    Yesterday’s launch party (physical, but non-alcoholic) in the Department of Phonetics and Linguistics was accompanied by a virtual party in the course chatroom on-line. There were students logged in from Russia, Brazil and Thailand, and even one of the UCL tutors was logged in from distant Cornwall.

    Is this the way of the future?

    Congratulations to my colleagues Michael Ashby, John Maidment, Jill House and Mark Huckvale, who set it all up, and to their research assistant Kayoko Yanagisawa.

    Phonline goes live: Michael Ashby, Kayoko Yanagisawa

Monday 12 June 2006

    My summer course colleague Jack Windsor Lewis, listening to the radio news, reports hearing a disturbing account of police discrimination against less well-to-do homes. Apparently, a house in Forest Gate was raided by anti-/ˈterəst/ forces. The many British people who live in terraced houses (for Americans, that’s row houses) rather than in semis or detached houses might rightly feel aggrieved. (In case you haven’t caught up, this was meant to be anti-terrorist.)

    For some reason it reminded me of the Scotsman who took a week off work because he had a wee cough. If you’re teaching about pre-fortis clipping, this minimal pair makes a change from plump eye—plum pie and might I—my tie.

    When I listen to aircraft cabin announcements I often do a double-take when I hear about “our co-chair partners” (actually, code-share partners). Because the [d] in code-share gets devoiced, the distinction between co-chair and code-share depends on nothing more than (i) fortis versus lenis and (ii) syllabification, both of which are pretty inaudible in this context: /ˈkəʊ.tʃeə, ˈkəʊd.ʃeə/. Oh, and perhaps the difference between an affricate on the one hand and a plosive plus a fricative on the other. Pre-fortis clipping is blocked by the morpheme boundary.

    PS: Further to prenuptial (blog, 9 June), yesterday’s London Sunday Times, on page 4 of the Money section, actually had the spelling mistake prenuptual.

Saturday 10 June 2006

    In the village outside Birmingham where my brother lives there is a dress shop. The name board on the fascia reads MAΨA.

    My schooling was unusual in that I started to learn (Classical) Greek at the age of twelve. I went on to read Classics at university. So I have been very familiar with the Greek alphabet ever since childhood. Naturally, therefore, I read this name as /ˈmæpsə/. After all, the letter between the two As is a Greek psi.

    But no, this is just a fancy way of writing MAYA. The name of the dressmaker is Maya, /ˈmaɪə/, like the Central American Indians.

    Maya is following the same path as people who invite you to visit GRΣΣCΣ, where the Greek capital sigma does duty for E. Indeed, there is a well-known chain of toyshops called TOYS ’Я’ US, whose logo exploits the similarity between the Cyrillic capital Ya and the Latin R.

    And phoneticians who ought to know better sometimes use a Greek eta, η, instead of the velar nasal symbol [ŋ].

Friday 9 June 2006

    President Bush’s pronunciation of nuclear as if it were spelt nucular is well-known and widely condemned as a mispronunciation. He says /ˈnuːkjəlɚ/ instead of the standard AmE /ˈnuːkliɚ/. This (mis)pronunciation presumably arose through the influence of the large number of familiar words ending in /‑kjəlɚ/ (BrE /‑kjʊlə/ or /‑kjələ/), among them circular, particular, spectacular, molecular, secular, perpendicular, jocular.

    Two other mispronunciations that one hears from time to time are percolator as /ˈpɜːkjəleɪtə/ instead of /ˈpɜːkəleɪtə/ and escalator as /ˈeskjəleɪtə/ instead of /ˈeskəleɪtə/. In these, a yod has been inserted between /k/ and /əl/ under the influence of words such as speculator and perhaps even our own technical term articulator.

    And then there’s prenuptial and nuptials pronounced /‑ˈnʌptʃuəl(z)/, as if spelt prenuptual, nuptuals.

    Two days ago I heard someone referring to a defibrillator—which is usually pronounced /diːˈfɪbrɪleɪtə/—as a /diːˈfrɪbjəleɪtə/, i.e. as if it were spelt defribulator.

    But my all-time favourite of this type of thing has to be the Wiener schnitzel masquerading as a /ˈsnɪtʃəl/, thereby solving at a stroke what are from the English point of view two phonotactic problems in the German-style pronunciation: initial /ʃn‑/ and morpheme-internal /‑ts‑/.

Thursday 8 June 2006

    I return to the matter of showing intonation markup in word processing or on the web. Years ago, using the old, single-byte technology, I designed the font inton-d specifically for creating O’C & A markup. The symbols are convenient to enter, everything being available with a single keystroke on the standard UK computer keyboard, and it does the job. It is based on the Doulos font, using SIL Encore Typecaster software.

    The downside about using this font on the web is that, as with all such non-ASCII single-byte fonts (now known as ‘legacy’ fonts) you have to make sure your reader has installed it. If they don’t have the font, they can’t read the symbols. (What you see above is a gif, a graphic, which is why you can see it even if you don’t have the font.) And the use of non-Unicode fonts on the WWW is now deprecated.

  • Wednesday 7 June 2006
  • In the O’Connor and Arnold notation system for English intonation, rhythmic stress not associated with any significant change of pitch is marked with an open circle symbol. O’C & A distinguish, however, between those syllables that are low in pitch, which they mark with a low symbol, and those that are non-low, written with a raised symbol.

    How can these marks be reproduced in web documents, or indeed in word-processed documents in general, using regular Unicode fonts?

    The raised small circle is not problematic. It is one of the basic extended-ASCII characters, U+00B0, the degree sign [°]. It may not have a dedicated key on your computer keyboard, but you can easily enter it (in Windows) as Alt+0176, or in modern versions of Word as B0 followed by Alt-x. Or of course you can use Character Map or the Insert Symbol function. (Presumably Mac users have similar techniques which they will know about but I don’t.) In HTML you can write &#x00B0;.

    /What did you °say your °name was?

    In the notation I use in my forthcoming book I make the place of the nuclear tone more explicit by underlining, thus

    /What did you °say your °name was?

    The lowered small circle, however, does present a problem. This symbol is not to be found in Unicode in places where you might expect to find it. The subscript zero, U+2080, is not a satisfactory substitute. (If it is included in one of your installed fonts, it should appear here: [₀].) Among the IPA symbols we have the devoicing diacritic, U+0325; but it is a combining character, appearing underneath the previously entered character, so for our purpose you would have to precede it by a non-break space, thus [ ̥], which you would enter as &nbsp;&#x0325;. This works OK as long as you don’t forget the non-break space.

    Hel\lo, Mr  ̥Jenkins.
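    For anyone producing such markup by program rather than by hand, here is a minimal sketch in Python of how the two marks could be turned into HTML character references. The helper name to_html_refs and the constants HIGH_MARK and LOW_MARK are invented for this illustration; the sketch escapes only non-ASCII characters and does not handle & or <.

```python
def to_html_refs(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal HTML character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):04X};" for ch in text)


# Hypothetical constants for the two stress marks discussed above.
HIGH_MARK = "\u00b0"        # degree sign, standing in for the raised circle
LOW_MARK = "\u00a0\u0325"   # no-break space carrying the combining ring below

high_example = to_html_refs(f"/What did you {HIGH_MARK}say your {HIGH_MARK}name was?")
low_example = to_html_refs(f"Hel\\lo, Mr {LOW_MARK}Jenkins.")

print(high_example)
print(low_example)
```

    The no-break space comes out as &#x00A0;, which a browser treats the same as &nbsp;. Keeping the space and the diacritic together in one constant is a way of never forgetting the space.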

    Another substitute that we could consider—at least for those who have a Japanese or Chinese font installed—is the Ideographic Full Stop, U+3002. If your computer can display it, it looks like this: [。]. But the problem here is that it occupies the same horizontal space (is exactly as wide) as any other Chinese character (thus 亀。事 if you have a Chinese font), and is not centred within that horizontal space but sort of left-justified. When surrounded by Latin letters it looks as if it has a following space, which makes it inappropriate as an intonational stress mark:

    Hel\lo, Mr 。Jenkins.

    SIL comes to our rescue with its phonetic fonts Charis SIL and Doulos SIL, both of which contain U+02F3, MODIFIER LETTER LOW RING, [˳]. This is not a Private Use character but a regular Unicode character in the Spacing Modifier Letters block, and it is exactly what we require: a spacing low circle. DejaVu doesn’t have it.

    Hel\lo, Mr ˳Jenkins.
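    The properties that make each candidate suitable or unsuitable can be looked up in the Unicode character database. The following is a rough sketch using Python's standard unicodedata module; the list CANDIDATES and the function describe are names made up for the illustration, and the results depend on the Unicode version bundled with your Python.

```python
import unicodedata

# The five characters considered in this entry, in the order discussed.
CANDIDATES = ["\u00b0", "\u2080", "\u0325", "\u3002", "\u02f3"]


def describe(ch: str) -> dict:
    """Collect the properties relevant to using ch as a stress mark."""
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # A non-zero combining class means the mark attaches to the preceding character.
        "combining": unicodedata.combining(ch) != 0,
        # "W" (wide) means the character takes a full East Asian character cell.
        "width": unicodedata.east_asian_width(ch),
    }


for ch in CANDIDATES:
    info = describe(ch)
    kind = "combining" if info["combining"] else "spacing"
    print(info["code"], info["name"], kind, info["width"])
```

    Of the candidates for the low mark, only U+0325 is reported as combining, which is why it needs a no-break space to sit under, and only U+3002 is reported as wide, which accounts for the apparent trailing space.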

  • Tuesday 6 June 2006
  • The IPA symbol for the vowel of English foot, put, good, [ʊ], sometimes gives trouble to authors, printers and publishers. It tends to get confused with the symbol for the labiodental approximant, [ʋ].

    So it is important that font designers (and the rest of us) differentiate clearly between the two symbols. The foot symbol, known to Unicode as U+028A LATIN SMALL LETTER UPSILON, is symmetrical about the vertical axis. Despite the name, it is not so much a Greek upsilon (υ) as an inverted small capital omega (Ω).

    The labiodental approximant symbol, on the other hand, is asymmetrical, with the right side having a leftward-pointing lip or hook. In Unicode it is U+028B LATIN SMALL LETTER V WITH HOOK.

    The new symbol for the labiodental flap (see entry for 25 March) has a rightward-pointing hook on the right side. It does not yet have an official Unicode number, and the fonts that have it treat it as a Private Use Character, U+F25F.

    Here they are in four Unicode phonetic fonts. In red you see in order the ordinary small letter u, the foot vowel symbol, the ordinary letter v, the labiodental approximant symbol, and (if present) the labiodental flap symbol.

    For other easily confused phonetic symbols, see here.
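    When the glyphs are hard to tell apart on screen, the code points are not. As a small sketch (Python, standard library only; the dictionary SYMBOLS and the helper is_private_use are names invented here), one can print what each character in a transcription actually is and so check that a transcription contains the intended symbol:

```python
import unicodedata

# The three symbols discussed in this entry, keyed by a descriptive label.
SYMBOLS = {
    "foot vowel": "\u028a",
    "labiodental approximant": "\u028b",
    "labiodental flap (Private Use code point cited above)": "\uf25f",
}


def is_private_use(ch: str) -> bool:
    """True if ch lies in a Private Use area (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


for label, ch in SYMBOLS.items():
    # Private Use characters have no standard name, so supply a fallback.
    name = unicodedata.name(ch, "<no standard name>")
    status = "private use" if is_private_use(ch) else "standard"
    print(f"U+{ord(ch):04X}  {name}  [{status}]  {label}")
```

    A Private Use code point has no name in the standard and no guaranteed meaning outside the fonts that agree on it, so text using U+F25F is readable only with those fonts.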

  • Monday 5 June 2006
  • A month ago I wrote appreciatively of the Unicode phonetic fonts made freely available to everyone by SIL. But there is no sans serif font among them: Charis SIL, Doulos SIL and Gentium are all serifed. Until now the only sans serif Unicode phonetic font generally available was the rather blocky Lucida Sans Unicode often supplied with Windows.

    Now I have discovered a new set of free Unicode fonts that include IPA symbols. They are called DejaVu, and are the result of a “process of collaborative development [...] by many contributors [...] organized through a wiki and a mailing list”, and originated by one Štěpán Roh. They are described here, and you can download them free (“free as in speech and as in beer”) here.

    Above you can see a sample of three of the fonts included in the package: DejaVu Sans, DejaVu Sans Mono, and DejaVu Serif. In my view only the first is satisfactorily designed. In the other two, as you can see, the phonetic symbols are not quite the same height as the ordinary Latin letters, so that text looks untidy. Worse, in the Mono (mono-spaced) font the diacritics do NOT sit over the letter they go with but occupy a separate space (not shown here). When I want a serifed phonetic font I shall continue to use one of SIL’s. But for a sans serif font I think I prefer DejaVu Sans over Lucida Sans Unicode.

    In the samples I have included a selection of symbols that sometimes give difficulty. Ought the alveolar tap symbol to have a baseline serif even in a sans serif font? (Yes, I think so, for legibility.) Does the font contain the newly approved IPA symbol for the labiodental flap? (Yes, in the case of the DejaVu and SIL fonts. In Gentium and Lucida Sans, no.) Is the tilde properly centred over the cardinal-3 symbol? (Best in DejaVu Sans and Doulos.) Is the syllabicity mark properly centred under [l]? (No.)

  • Friday 2 June 2006
  • I quite often receive emails—particularly, for some reason, from people in Argentina—asking for links to sound recordings of Estuary English. Unfortunately EE is not well defined linguistically. Joanna Przedlacka has demonstrated that it is not a coherent variety of English, but rather a journalistic cover term applied to all sorts of popular English in the southeast of England: see her book Estuary English? and my handout and lecture slides. And no claimed native speaker has yet offered us any pedagogically-oriented recordings (though I did try to persuade Gary to make some).

    So what I usually do is recommend people to go to the BBC radio website and listen to local radio from London and the southeast of England. Earlier this week, by the way, there was a link to a useful clip of Jamie Oliver there, but it has now been removed.

    More enterprising researchers will already have found Paul Coggle’s book Do you speak Estuary? and its nice illustration, in which Michael Caine, Jonathan Ross, Bob Hoskins and Janet Street-Porter are shown as typical EE speakers. A Google search on any of them should throw up usable sound files.

    To them I would now add David Beckham and, particularly, his wife Victoria, whose words can currently be found in abundance in the media.

  • Thursday 1 June 2006
  • Yesterday I went to a talk given by one of our PhD students, Mary Pearce, about her fieldwork. She is working on Kera, a language spoken by some 50,000 people in Chad. It is a tonal language, with three tones (high, mid, low). It has no voicing contrast in consonants, but VOT varies depending on the tone of the following vowel: so a velar plosive before a high or mid tone sounds like [k], but before a low tone like [ɡ].

    Working under the auspices of SIL, she is involved in both Bible translation and a literacy project. She showed us the New Testament in Kera (JAA TƏMARWAŊ) and a reading primer (Aŋ aŋkə ku keera la). As you can see, Kera orthography uses the Latin alphabet supplemented by IPA symbols: not only ə and ŋ but also ɓ and ɗ. Following the conventions of Latin-based orthographies, each letter requires an upper-case as well as a lower-case form: hence Ə Ŋ Ɓ Ɗ.
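    The upper-case and lower-case forms are paired in Unicode's case mappings, so software that follows them can capitalize Kera text correctly. As a quick sketch in Python (the constants KERA_LOWER and KERA_UPPER are my own labels, and I assume the schwa pair is U+0259 and U+018F):

```python
# The four supplementary Kera letters, written as escapes to avoid look-alike code points.
KERA_LOWER = "\u0259\u014b\u0253\u0257"   # schwa, eng, b with hook, d with hook
KERA_UPPER = "\u018f\u014a\u0181\u018a"   # their capital counterparts

for lower in KERA_LOWER:
    upper = lower.upper()
    print(f"{lower} U+{ord(lower):04X} -> {upper} U+{ord(upper):04X}")

# Round trip: upper-casing and then lower-casing returns the original letters.
round_trip = KERA_LOWER.upper().lower()
print(round_trip == KERA_LOWER)
```

    Note that the capital schwa U+018F is distinct from the similar-looking U+018E, which pairs with a different lower-case letter.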

    Some Kera proper names have clearly passed through French. The names of the four evangelists are Matiye, Markə, Liki, and Zaŋ.
