ANALIGN

NAME

analign - automatic alignment of phonetic transcription to speech signal

SYNOPSIS

analign (-I) (-i item) (-f|-p) (-l pauselabel) (-A|-J) file

DESCRIPTION

analign performs phonetic alignment of a transcription to a speech signal. It takes as input an annotation item and a speech item and generates an aligned annotation item. It has two modes of operation. In "fixed label" mode, the boundaries of the input annotations are not changed, and alignment only takes place within each labelled region. In this case each label is expected to consist of a sequence of transcription symbols. In "fixed pause" mode, labels identified as pauses are not moved by the alignment, so that alignment only takes place from one pause to the next. In this case each label is expected to consist of an individual segment. You can use the first mode to get a rough alignment, and the second mode to refine it.

analign is implemented as a wrapper for a hidden Markov model alignment performed by a version of the HVite tool that is part of the Cambridge HTK toolkit. The HMMs used are stored in $(SFSBASE)/data/analign.hmm, and the HTK configuration file in $(SFSBASE)/data/analign.cfg. These can be changed if required.
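The rough-then-refine workflow described above might be scripted as follows. This is a minimal sketch, not part of the distribution: the function name two_pass_align and the file name demo.sfs are hypothetical, and it assumes that the second run picks up the annotation item written by the first. If it does not, select that item explicitly with -i.

```shell
# Sketch only: run analign twice on the same SFS file.
# ANALIGN can be overridden for a dry run, e.g. ANALIGN=echo.
two_pass_align() {
    file="$1"
    # Pass 1: fixed label mode, aligns symbols inside each existing label.
    "${ANALIGN:-analign}" -f "$file" || return 1
    # Pass 2: fixed pause mode, re-aligns segments between pauses.
    # Assumption: this run reads the annotation produced by pass 1.
    "${ANALIGN:-analign}" -p "$file"
}

# Usage (demo.sfs is a hypothetical file name):
#   two_pass_align demo.sfs
```

Setting ANALIGN=echo prints the two command lines instead of running them, which is a convenient way to check the sequence before touching a real file.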

By default, input transcription is expected to be in SAMPA format. Symbols should be separated by spaces. Stress markers are stripped out. The -A and -J options allow input of ARPABET and JSRU symbols, in which case translation takes place both before and after alignment.
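To illustrate the expected input, the sketch below shows what a space-separated SAMPA label looks like once stress markers are removed. It is not analign's own code: the function name strip_stress is hypothetical, and it assumes the standard SAMPA stress markers, " for primary and % for secondary stress.

```shell
# Sketch only: remove SAMPA stress markers from a space-separated label.
# Assumption: stress is marked with " (primary) and % (secondary).
strip_stress() {
    printf '%s\n' "$1" |
        tr -d '"%' |            # delete the stress markers
        tr -s ' ' |             # squeeze any doubled spaces left behind
        sed 's/^ //; s/ $//'    # trim a leading or trailing space
}

# Usage:
#   strip_stress 'h @ "l @U'
```

analign performs this step itself, so there is no need to pre-process labels; the sketch only shows which symbols are actually passed to the aligner.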

Options:

-I Identify the program name and version.

-i item Select input item number.

-f Set fixed label mode. Alignment is only performed within the boundaries of the input labels.

-p Set fixed pause mode. Labels identified as pauses are not moved, and alignment is only performed between one pause and the next.

-l pauselabel Specify the label used to identify pauses in fixed pause mode. Default is the SAMPA pause symbol "...".

-A Input symbols are in ARPABET format, as used in the BEEP dictionary.

-J Input symbols are in JSRU format.
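As an illustration of how the options combine, the sketch below runs fixed pause mode with a caller-supplied pause label. The function name align_between_pauses, the file name demo.sfs and the label sil are hypothetical; only the -p and -l options and the default "..." label come from the descriptions above.

```shell
# Sketch only: fixed pause alignment with a custom pause label.
# ANALIGN can be overridden for a dry run, e.g. ANALIGN=echo.
align_between_pauses() {
    file="$1"
    pause="${2:-...}"   # fall back to the documented default label
    "${ANALIGN:-analign}" -p -l "$pause" "$file"
}

# Usage (file name and label are hypothetical):
#   align_between_pauses demo.sfs sil
```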

FILES

analign.hmm HMM definitions in HTK format

analign.cfg HMM configuration in HTK format

analign.lst List of HMM models

analign.dct Translation from SAMPA to the phone names used in the HMMs

VERSION/AUTHOR

1.0 - Mark Huckvale
Wed Jul 17 22:31:35 2013