Transcription conventions

These guidelines are based on those used by Van Engen et al. (2010), with minor adaptations.

1. General procedure

Each channel was transcribed separately using Wavescroller (Northwestern University Linguistics software). Transcription started at the point where the transcriber judged the diapix conversation to begin. The speech was transcribed verbatim, with no punctuation except apostrophes for contractions and possessives. Numbers were written out (e.g. one hundred and fifty five). No hyphenation or abbreviation was used, and full dictionary spellings were used for all words except those listed in section 2 below. All speech was transcribed in lower case letters.
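
As an illustration only, the general rules above could be checked automatically along the following lines. This is a minimal sketch and not part of the original transcription workflow: the function name general_rule_violations and the pattern PLAIN_WORD are hypothetical, and the check deliberately ignores the capitalised exceptions in section 2 and the symbols in section 3, so those would be reported as violations here.

```python
import re

# Assumption: a plain word is lower-case letters with optional apostrophes
# inside or at the end (contractions and possessives, e.g. he's, dogs').
PLAIN_WORD = re.compile(r"[a-z]+(?:'[a-z]+)*'?")

def general_rule_violations(line):
    """Return (token, reason) pairs for tokens that break the general rules."""
    problems = []
    for token in line.split():
        if any(ch.isdigit() for ch in token):
            problems.append((token, "number not written out"))
        elif token != token.lower():
            problems.append((token, "not lower case"))
        elif not PLAIN_WORD.fullmatch(token):
            problems.append((token, "punctuation other than apostrophe"))
    return problems

# A line that follows the rules yields an empty list.
print(general_rule_violations("he's gonna do it"))
# Digits, hyphens and full stops are reported.
print(general_rule_violations("we saw 155 fifty-five dogs."))
```

A check of this kind would only flag candidate problems for a human transcriber to review; it cannot tell whether the speech was transcribed verbatim.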

2. Special spellings for words

The following words are exceptions to the general rule that full dictionary spellings are used:

  • Collocations:
    • gonna (going to, as in "he's gonna do it")
    • wanna (want to, as in "do you wanna stop?")
    • yknow (you know, as in "I found, yknow a lot more mistakes", but not in "how do you know?")
    • kinda (kind of, as in "I kinda forget")
    • sorta (sort of, used in the same way as kinda)
  • Fixed spelling for certain special words:
    • okay
    • TV (no space)
    • oops
    • thingy
    • yeah
    • yep
    • nope
    • "all right" (not "alright")
    • "a while" (not "awhile")
  • Hesitation sounds/filled pauses & yes/no sounds:
    • uh, um, er, ah, eh, oh, mm, hm, yuh, or huh, whichever the sound most closely resembles. These can also be combined to give forms such as uhhm, uhhuh, mmhm, mmmm, uhuh, yuhhuh, etc.
  • Spelling:
    • When a speaker says a sequence of letters or spells a word, the letters are written in capitals and separated by spaces: U C L
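
Two of the rules in this section lend themselves to a small helper, sketched below under stated assumptions. The names PREFERRED, preferred_spelling and spell_out_letters are hypothetical; the lookup table contains only the two variants that the list above explicitly rules out, and any further variants would need to be added by hand.

```python
# Variants named in the list above, mapped to the required spelling.
PREFERRED = {
    "alright": "all right",
    "awhile": "a while",
}

def preferred_spelling(token):
    """Return the required spelling for a known variant, else the token unchanged."""
    return PREFERRED.get(token, token)

def spell_out_letters(letters):
    """Format a spelled-out letter sequence as capitals separated by spaces."""
    return " ".join(ch.upper() for ch in letters if ch.isalpha())

print(preferred_spelling("alright"))
print(spell_out_letters("ucl"))
```

Reduced forms such as gonna or wanna are not included in the table on purpose: whether a speaker said "gonna" or "going to" is a verbatim judgement that only the transcriber can make.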

3. Other symbols

<SILP>     silence by current speaker while interlocutor is talking
<SIL>     silence by the current speaker between that speaker's own turns, not due to the other person talking (within-talker pause; guideline minimum duration 0.5 seconds)
dash     word is spoken partially (even for unknown words)
<LG>     laughter that is NOT part of any word
<LG_word>     laughter that is part of a word
<BR>     breaths, sighs
<GA>     garbage; noise that is not from the speaker, such as microphone pops and background noise. <GA> is also used for anything produced by the speaker that does not fit into <LG>, <BR>, <LS> (e.g. throat clearing).
<GA_word>     noise not from the speaker that occurs while the speaker is saying a word.
<?>     transcriber has not understood which word is intended
<word_?>     transcriber has attempted to transcribe a word, but is unsure which word is intended
[word]     overlap between interlocutors in the conversation transcripts
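
For anyone processing the transcripts by script, the symbols above can be told apart with a handful of regular expressions. The following is a rough sketch, not a tool used in producing the transcripts. The label names and the classify function are hypothetical, and it rests on three assumptions that the list above does not state: that "word" in <LG_word>, <GA_word>, <word_?> and [word] is a single lower-case word, that a partially spoken word is written with a trailing hyphen (e.g. diap-), and that a bare hyphen marks a partial unknown word.

```python
import re

# Patterns are tried in order; tag names are taken from the list above.
TOKEN_PATTERNS = [
    ("silence_other_talking", re.compile(r"<SILP>")),
    ("silence_within_talker", re.compile(r"<SIL>")),
    ("laughter", re.compile(r"<LG>")),
    ("laughed_word", re.compile(r"<LG_[a-z']+>")),
    ("breath", re.compile(r"<BR>")),
    ("noise", re.compile(r"<GA>")),
    ("noise_over_word", re.compile(r"<GA_[a-z']+>")),
    ("not_understood", re.compile(r"<\?>")),
    ("uncertain_word", re.compile(r"<[a-z']+_\?>")),
    ("overlap", re.compile(r"\[[a-z']+\]")),
    # Assumption: partial words end in a hyphen; a bare "-" is also accepted.
    ("partial_word", re.compile(r"[a-z']*-")),
    ("word", re.compile(r"[a-z']+")),
]

def classify(token):
    """Return the label of the first pattern that matches the whole token."""
    for label, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "unknown"

for tok in "<SIL> yeah <LG_okay> diap- [mm] <dog_?> <?>".split():
    print(tok, classify(tok))
```

Overlaps that span several words, and the capitalised forms from section 2 (TV, spelled-out letters), fall outside this sketch and would come back as "unknown".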