npoint - endpoint N utterances in a single speech waveform


npoint (-i item) (-n numutter|-a) (-r mark_space_ratio%) (-b backofftime) (-w windowtime) (-W wordlist) (-l labelstem) file


npoint is a program that automatically annotates the endpoints of multiple utterances in a speech waveform. The input is a speech signal containing one or more utterances separated by silence; the number of utterances, N, is specified on the command line. The output is 2N annotations, marking the beginning and end of each utterance. The program uses a dynamic programming procedure to find exactly N utterances and N+1 silences.
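The idea of finding exactly N utterances and N+1 silences by dynamic programming can be illustrated with a small Python sketch. This is not npoint's own procedure, which is not described here: the function names, the per-window energy input, the fixed threshold and the cost function are all assumptions made for illustration.

```python
def segment(energies, n_utt, threshold):
    """Label each analysis window with a state 0..2N.

    Even states are silences and odd states are utterances, so a path
    that starts in state 0 and ends in state 2N contains exactly
    N utterances and N+1 silences. (Illustrative assumption only.)
    """
    n_states = 2 * n_utt + 1
    n_frames = len(energies)
    if n_frames < n_states:
        raise ValueError("too few windows for the requested utterances")
    inf = float("inf")

    def cost(energy, state):
        # Hypothetical penalty: speech should lie above the threshold,
        # silence below it.
        if state % 2 == 1:
            return max(0.0, threshold - energy)
        return max(0.0, energy - threshold)

    best = [[inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    best[0][0] = cost(energies[0], 0)
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            move = best[t - 1][s - 1] if s > 0 else inf
            prev, origin = (move, s - 1) if move < stay else (stay, s)
            if prev < inf:
                best[t][s] = prev + cost(energies[t], s)
                back[t][s] = origin

    # Trace back from the final silence state.
    states = [0] * n_frames
    s = n_states - 1
    for t in range(n_frames - 1, -1, -1):
        states[t] = s
        s = back[t][s]
    return states


def endpoints(states):
    """Return (start, stop) window indices per utterance; stop is exclusive."""
    pairs = []
    start = None
    for t, s in enumerate(states):
        if s % 2 == 1 and start is None:
            start = t
        elif s % 2 == 0 and start is not None:
            pairs.append((start, t))
            start = None
    return pairs
```

Multiplying the window indices returned by endpoints() by the analysis window size would give annotation times in seconds.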


-I Identify program name and version number.

-i item Select input item number.

-n numutter Specify number of utterances. Default 1.

-a Automatically estimate the number of utterances in the file. Uses an energy-based criterion to look for pauses. Use this instead of "-n".

-r mark_space Specify the mark-to-space ratio, that is, the ratio of speech to silence in the signal. This is expressed as a percentage in the range 0-100. Thus for 2 second utterances typically separated by 5 seconds of silence, specify a mark-space ratio of 40. Default 50.
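The arithmetic behind the -r value can be sketched in Python. The function name is hypothetical, and the formula (speech duration divided by silence duration, as a percentage) is inferred from the 2 second / 5 second example above rather than taken from the program source.

```python
def mark_space_percent(speech_seconds, silence_seconds):
    """Estimate a value for -r from typical utterance and gap lengths."""
    if silence_seconds <= 0:
        raise ValueError("silence duration must be positive")
    # Assumed definition: speech time relative to silence time, in percent.
    return 100.0 * speech_seconds / silence_seconds


ratio = mark_space_percent(2, 5)  # the example from the -r description
```

For 2 second utterances and 5 second gaps this gives 40, matching the value quoted for that case.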

-b backofftime Specify the time in seconds by which the markers are 'backed off' from the located start and stop points. The start markers are moved earlier by this time and the stop markers are moved later. No check is performed to see whether this causes a start to overlap the previous stop. Default 0.1s.
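The effect of the back-off, and the overlap that npoint does not check for, can be sketched in Python. Both function names are hypothetical, and the overlap check is an addition for illustration, not something the program performs.

```python
def apply_backoff(endpoints, backoff=0.1):
    """Move each start earlier and each stop later by `backoff` seconds."""
    return [(start - backoff, stop + backoff) for start, stop in endpoints]


def find_overlaps(marked):
    """Return indices of utterances whose start precedes the previous stop."""
    return [
        i
        for i in range(1, len(marked))
        if marked[i][0] < marked[i - 1][1]
    ]


# Two utterances only 0.25 s apart, backed off by 0.25 s each way.
marked = apply_backoff([(1.0, 2.0), (2.25, 3.0)], backoff=0.25)
clashes = find_overlaps(marked)
```

A check like find_overlaps() could be run on the resulting annotations when utterances are closely spaced relative to the back-off time.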

-w windowtime Specify the size of each analysis window in seconds. Annotations are positioned at multiples of this size. The maximum size of speech file that can be processed is limited by memory, which grows with the square of the number of analysis windows required to cover the input. So for an input signal of 30 seconds, a window of 0.05 seconds will require 600x600 = 360kbytes of memory. To analyse long speech signals, use a larger analysis window. Default 0.05 seconds.
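The memory estimate above can be reproduced with a short Python sketch. The function names are hypothetical, and the figure of one byte per table entry is inferred from the 600x600 = 360kbytes example, not confirmed from the program source.

```python
import math


def window_count(duration_seconds, window_seconds=0.05):
    """Number of analysis windows needed to cover the input."""
    # Round first so floating-point noise does not add a spurious window.
    return math.ceil(round(duration_seconds / window_seconds, 9))


def table_bytes(duration_seconds, window_seconds=0.05, bytes_per_entry=1):
    """Estimated memory for a table with one entry per pair of windows."""
    n = window_count(duration_seconds, window_seconds)
    return n * n * bytes_per_entry


default_cost = table_bytes(30)       # 30 s input, default 0.05 s window
coarser_cost = table_bytes(30, 0.1)  # same input, 0.1 s window
```

Doubling the window size halves the number of windows and so cuts the estimated memory to a quarter, which is why a larger window is recommended for long signals.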

-W wordlist Specify a file containing a list of the N words to be found. These are then used as the basis for the annotations for the start points. The stop points are annotated with '/'.

-l labelstem Specify a different stem for the start point annotation. The default is 'start'. When this mode is selected, the stop points are labelled with '/'.


SP Speech pressure waveform.


AN Endpoint annotations: {startN,stopN}.


utterances number of utterances.

markspace mark-to-space ratio

windowtime analysis window size (s).

backoff back-off time (s)

type set to 'endpoints'.


1.4 Mark Huckvale


Fri Jul 09 14:54:35 2004