ANTRANS

NAME

antrans - phonetically transcribe orthographic annotations

SYNOPSIS

antrans (-I) (-i item) (-x exceptions.txt) (-m missing.lst) (-s) (-w) (-A|-J) file

DESCRIPTION

antrans uses an inbuilt English pronunciation dictionary to transcribe orthographic annotations into phonetic symbols. The operation changes only the label text of each annotation; it does not change the position of any annotation.

By default, the phonetic annotation is output in SAMPA symbols, but options are available to output ARPABET or JSRU symbols instead.

Unknown words are not transcribed, and may be reported to a file for processing by hand. You may then build an exceptions dictionary file and include this in processing.
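The two-pass workflow described above can be sketched as follows. This is only an illustration: the file name "speech.sfs" and the helper antrans_command are hypothetical, and the sketch only builds and prints the command lines using the -m and -x flags listed under Options. It does not run antrans.

```python
import shlex

def antrans_command(sfs_file, missing=None, exceptions=None):
    """Build an antrans argument list using the -x and -m flags."""
    cmd = ["antrans"]
    if exceptions is not None:
        cmd += ["-x", exceptions]   # user-supplied exceptions dictionary
    if missing is not None:
        cmd += ["-m", missing]      # file to receive the unknown words
    cmd.append(sfs_file)
    return cmd

# Pass 1: transcribe and collect the words the dictionary does not know.
first_pass = antrans_command("speech.sfs", missing="missing.lst")

# (Edit exceptions.txt by hand, adding a pronunciation for each missing word.)

# Pass 2: transcribe again, this time consulting the exceptions file.
second_pass = antrans_command("speech.sfs", exceptions="exceptions.txt")

print(shlex.join(first_pass))
print(shlex.join(second_pass))
```

Each list could be passed to subprocess.run on a system where antrans is installed.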

antrans assumes that the annotations describe chunks of the signal separated by pauses (as generated by "npoint -a", for example). To this end, it adds a pseudo "silence" symbol "/" at the start and end of each chunk, and also converts any chunk that is annotated only with "/" to the SAMPA pause symbol "...".
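The chunk handling described above might be modelled roughly as below. This is a sketch of the documented behaviour, not the program's own code: the function name pad_chunk is hypothetical, and separating symbols with single spaces is an assumption.

```python
SILENCE = "/"    # pseudo "silence" symbol named in the text
PAUSE = "..."    # SAMPA pause symbol named in the text

def pad_chunk(label, transcription):
    """Return the output label for one pause-delimited chunk.

    label         -- the original orthographic annotation text
    transcription -- its phonetic transcription (space-separated, assumed)
    """
    if label.strip() == SILENCE:
        # A chunk annotated only with "/" becomes the pause symbol.
        return PAUSE
    # Otherwise the transcription is wrapped in silence symbols.
    return f"{SILENCE} {transcription} {SILENCE}"

print(pad_chunk("/", ""))
print(pad_chunk("hello", "h @ l @U"))
```

With the -w option the wrapping step would be skipped, since the annotations are then assumed to be words from a connected stream.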

Options:

-I Identify the program name and version.

-i item Select input item number.

-x exceptions.txt A text file of pronunciation exceptions. These are used in preference to the inbuilt dictionary. Each line of the file has the format <word><TAB><pronunciation><NEWLINE>, where the pronunciation is written in SAMPA symbols, e.g.

Amsterdam	%{mst@"d{m

-m missing.lst Write a list of the words missing from the dictionary to the supplied file.

-s Include stress markings (where available) in output transcription.

-w Do intra-word alignment only. Assumes input annotations are words from a connected stream, and so doesn't add silence around them.

-A Output symbols in ARPABET format, as used in the BEEP dictionary.

-J Output symbols in JSRU format.
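As an illustration of the exceptions file format given under -x, the following sketch checks that each line has the form <word><TAB><pronunciation>. The function name read_exceptions is hypothetical, and skipping blank lines is an assumption; antrans itself may be stricter or more lenient.

```python
import io

def read_exceptions(stream):
    """Parse <word><TAB><pronunciation> lines into a dictionary."""
    table = {}
    for line_no, raw in enumerate(stream, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # tolerating blank lines is an assumption
        word, sep, pron = line.partition("\t")
        if not sep or not word or not pron:
            raise ValueError(f"line {line_no}: expected word<TAB>pronunciation")
        table[word] = pron  # pronunciation is kept verbatim (SAMPA symbols)
    return table

# The sample entry from the -x description.
sample = io.StringIO('Amsterdam\t%{mst@"d{m\n')
print(read_exceptions(sample))
```

Running a check like this before passing the file to -x can catch entries where a space was typed instead of a TAB.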

VERSION/AUTHOR

1.0 - Mark Huckvale
Wed Jul 17 22:31:36 2013