SFS Manual for Users

1. Introduction

The tutorial below covers the use of SFS from the command line, for example from the Unix shell or from the MS-DOS prompt on Windows. Windows users now have the SFSWin program as an alternative to the command line.

1.1 Command Line Tutorial

In this section we demonstrate a simple sequence of SFS commands to show how SFS is used in practice, here on a Unix system. Before studying this, you may like to run the canned demonstration that is available from the SFS distribution site. In the dialogue below, the character '%' is the operating system prompt, and user input is the text that follows a prompt.

First, let's create an empty SFS file called demo.sfs:

% hed demo.sfs
new file
SPEECH FILE HEADER EDITOR
Fill in the details below.
Type <details><RETURN> to enter details.
Type <RETURN> to skip question.
Type !<number> to jump to question.
Type <SPACE><RETURN> to delete an answer.
1) source of recording [] : ucl
2) name of database [] : temp
3) name of speaker [] : mark
4) session ref [] : _
5) session date [] : _
6) token [] : /laIf/
7) repetition [] : _
8) recording conditions [] :
9) comment [default header] : __
Is the data correct (y/n/q) ? y

The program hed is the SFS main header editor. Each SFS file contains space for information about the source of the data which it contains. In this instance, we only need to record that the file is for test purposes. We can always look at the main header details of an SFS file by running the program sdump with the '-h' flag:

% sdump -h demo.sfs
Main
File id: uc1-4766 created Mon Feb 15 12:41:23 1988 by mark
Database: temp    Speaker: mark
Session:          Session Date: 
Repetition:       Environment: 
Token: /laIf/
Comment:

Let us now copy an existing speech waveform into our test file. We assume that there is an existing file called /usr/sfs/demo/life.sfs (a file with this name may be found in the demonstration package at the SFS distribution site):

% scopy -isp. /usr/sfs/demo/life.sfs demo.sfs

The scopy utility copies data sets within and across files. Which data set to copy is specified by a program switch or flag. In this operation the '-isp.' switch informs the program to copy the first speech data set in the input file.

If we only had a binary waveform file (life10.bin containing speech sampled at 12800 samples/sec), we could have 'linked' this to our SFS file with a command such as:

% slink -isp -f12800 life10.bin demo.sfs

Alternatively, if the file was in another signal file format, say WAV format, slink may be run with its '-t' option. For example:

% slink -isp -tWAV life.wav demo.sfs

Read about slink in manual section 3.2.

To listen to the waveform, SFS needs to be configured for your Digital-to-Analogue Converter (DAC) capability. There are two stages to this: at compile time, support for various DAC systems is included in SFS programs; at run time, the user chooses between the supported methods by setting the environment variable 'DAC'.

To select replay through the speaker-box on Sun-4 systems:

% setenv DAC sun16

To select replay through a SoundBlaster-16 card on IBM-PC systems:

c:\sfs\> set DAC=sb16

To listen to the data set we use:

% replay demo.sfs

The program replay is a simple program that replays waveforms through the currently selected DAC. All being well, we hear the signal replayed. We can take a quick look at the contents of demo.sfs using the program summary:

% summary demo.sfs
1. SPEECH (1.01) 8960 frames from
scopy(file=/usr/sfs/demo/life.sfs,item=1.02,history=agc(1.01))

This tells us that the file demo.sfs contains a single data set: a speech waveform of 8960 samples that was copied from the file /usr/sfs/demo/life.sfs. The 'data set number' or item number of the speech data set in this file is 1.01, and it originated from item 1.02 in the file life.sfs. Section 2.4 gives more detail about the output of summary.
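The layout of a summary entry can be illustrated with a short Python sketch that splits one entry into its fields. This is not part of SFS: the function name parse_summary_entry and the regular expression are assumptions inferred only from the listings in this tutorial, including the '-' marker on removed items that is explained later.

```python
import re

# Layout inferred from the summary listings in this tutorial, e.g.
# "1. SPEECH (1.01) 8960 frames from scopy(...)". It is an assumption,
# not a format taken from an SFS specification.
ENTRY = re.compile(
    r"(?P<index>\d+)\.\s+(?P<type>[A-Z]+)\s+(?P<removed>-?)"
    r"\((?P<item>\d+\.\d+)\)\s+(?P<frames>\d+) frames from\s+(?P<history>.+)",
    re.DOTALL,
)

def parse_summary_entry(text):
    """Split one summary entry into fields; wrapped history lines are rejoined."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"not a summary entry: {text!r}")
    history = "".join(part.strip() for part in match["history"].splitlines())
    return {
        "index": int(match["index"]),
        "type": match["type"],
        "item": match["item"],
        "frames": int(match["frames"]),
        "removed": match["removed"] == "-",
        "history": history,
    }

entry = parse_summary_entry(
    "1. SPEECH (1.01) 8960 frames from\n"
    "scopy(file=/usr/sfs/demo/life.sfs,item=1.02,history=agc(1.01))"
)
print(entry["type"], entry["item"], entry["frames"])  # SPEECH 1.01 8960
```

The history field is kept as a single string here, since its internal structure depends on the program that created the item.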

More details of the file contents can be obtained using the program sdump:

% sdump demo.sfs
Item 1
Data Type: 1.01 SPEECH (copied from </usr/sfs/demo/life.sfs>)
History: scopy(file=/usr/sfs/demo/life.sfs,item=1.02,history=agc(1.01))
Parameters: 
Process Date: Mon Feb 15 12:42:20 1988
Format: 2 byte integer
Frame size: 1        Frame count: 8960
Total Length: 17920  Frame Duration: 7.8e-05(12800Hz)
Window size: 1       Overlap: 0
Offset: 0            Comment: 
Data Starts:
2  0 -5  0  2  0  0  2  5  0 
2  0 -2  0 -2  5  5  5  5  0 
5  2 -7 -5  2  5 -5 -5  0  0 
0  0  5  5  5  2  2  2  5  0 
0  5  0  2  5  0 -5  2  5  5

The sdump program has many options, but its default action is to summarise the data sets in the file. Each data set is preceded by a header (the item header) detailing the size and form of the data set. Taken together, the header and data set are said to constitute an item of data. Thus in the above dump of the file demo.sfs, the parameters of the speech data set (the processing date, the number of speech samples, the sampling frequency, etc) are taken from the item header, and the first few sample values are taken from the data set. A fuller description of the fields in the item header is given in section 2.5.
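The numeric fields in the item header above are related by simple arithmetic, which the following Python sketch checks. It assumes that 'Total Length' is measured in bytes and that the figure in brackets after 'Frame Duration' is the sampling rate; both readings are consistent with the listing but are assumptions made for this illustration.

```python
# Values copied from the sdump listing in this tutorial.
frame_count = 8960        # "Frame count"
frame_size = 1            # "Frame size": samples per frame
bytes_per_sample = 2      # "Format: 2 byte integer"
sample_rate_hz = 12800    # from "Frame Duration: 7.8e-05(12800Hz)"

# Assumption: "Total Length" is the size of the data set in bytes.
total_length_bytes = frame_count * frame_size * bytes_per_sample

# The frame duration is the reciprocal of the sampling rate.
frame_duration_s = 1.0 / sample_rate_hz

# Duration of the whole waveform in seconds.
signal_duration_s = frame_count * frame_duration_s

print(total_length_bytes)           # 17920
print(f"{frame_duration_s:.1e}")    # 7.8e-05
print(f"{signal_duration_s:.2f}")   # 0.70
```

So the demonstration waveform lasts about 0.7 seconds.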

We can display this waveform along with a simultaneous spectrogram by using the program Es. Es is the general-purpose display program of SFS.

Graphics in SFS are configured in two stages: first, a number of supported graphics terminals are selected at compile time and built into SFS programs; then users select which of the supported graphics devices they want to use through the environment variable 'GTERM'.

If GTERM is not set, SFS will attempt to guess the terminal type from the TERM environment variable and the file $SFSBASE/data/digmap. For details, refer to the DIG manual page.

For Unix/X-windows, the setting is:

% setenv GTERM xterm

For DOS with Super VGA:

c:\sfs\> set GTERM=svga-256

More details are given in the installation instructions. To operate Es, you will need a mouse with at least two buttons (three buttons are preferred). Es allows you to zoom and scroll the waveform and spectrogram. The mouse buttons operate cursors on the screen and there is a menu of options available as buttons at the top right of the screen. The command to display the waveform and spectrogram is:

% Es -isp -gsp demo.sfs

If the display of the spectrogram is rather slow, an alternative is to precompute the spectrogram and store it in the file. The commands to do this are:

% spectran demo.sfs

% dicode demo.sfs

The program spectran is a speech processing program that performs spectral analysis on waveform data. The program dicode is a program that constructs a grey level display from analysed spectra. If we summarise the file demo.sfs now, we find:

% summary demo.sfs
1. SPEECH (1.01) 8960 frames from scopy(file=/usr/sfs/demo/life.sfs,
item=1.02,history=agc(1.01))
2. COEFF (11.01) 532 frames from spectran(1.01;window=8, overlap=6)
3. DISPLAY (9.01) 532 frames from dicode(11.01;dbr=50.00, nump=128)

There are now three data sets (items) in the file, the speech data, the spectral coefficients and the grey-level picture, each with its own item header. These three types of speech data are given the names: SPEECH (SP for short), COEFF (CO), and DISPLAY (DI). Note that a processing record is displayed on the right-hand side of the summary, which indicates that the COEFF item was calculated by the program spectran operating upon item 1.01 in the file. Similarly, the DISPLAY item was calculated by the program dicode operating upon the COEFF item 11.01. Note that the processing record also includes a number of parameters of the calculations done by spectran (namely the analysis frame sizes) and dicode (namely the dB range and the number of pixels). These are parameters that can be altered by command line switches to the programs.
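The processing records in this summary can be followed mechanically from one item back to its source. The Python sketch below is an illustration only: the helper names parse_history and trace are hypothetical, and the record layout 'program(inputs;key=value, ...)' is inferred from the three records shown here rather than from a definition of the format.

```python
def parse_history(record):
    """Split 'program(inputs;key=value, ...)' into program, inputs, params."""
    program, _, body = record.partition("(")
    body = body.rsplit(")", 1)[0]          # drop the final closing bracket
    if ";" in body:
        inputs_text, _, params_text = body.partition(";")
    elif "=" in body:                      # parameters only, as for scopy
        inputs_text, params_text = "", body
    else:                                  # input items only
        inputs_text, params_text = body, ""
    inputs = [item.strip() for item in inputs_text.split(",") if item.strip()]
    params = {}
    for pair in params_text.split(","):
        if "=" in pair:
            key, _, value = pair.partition("=")
            params[key.strip()] = value.strip()
    return program, inputs, params

# Item number -> processing record, copied from the summary listing.
histories = {
    "1.01": "scopy(file=/usr/sfs/demo/life.sfs,item=1.02,history=agc(1.01))",
    "11.01": "spectran(1.01;window=8, overlap=6)",
    "9.01": "dicode(11.01;dbr=50.00, nump=128)",
}

def trace(item):
    """Follow input items back until a record names no input in this file."""
    chain = [item]
    while True:
        _, inputs, _ = parse_history(histories[chain[-1]])
        if not inputs:
            return chain
        chain.append(inputs[0])

print(" <- ".join(trace("9.01")))  # 9.01 <- 11.01 <- 1.01
```

The trace stops at the SPEECH item because its record refers to another file rather than to an item in demo.sfs.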

We can now display the speech waveform and the grey level spectrogram. Assuming we are sitting at an X-Windows terminal:

% setenv GTERM xterm
% Ds -isp -idi demo.sfs

The program Ds is a general purpose display program that can give simple plots of speech data on a range of graphics devices. The program flags request that the SPEECH item and the DISPLAY item make up the picture. Note that the name of the speaker and the token description are taken from the main header of the file.

Although the COEFF item itself is used by some programs, in most instances it is used simply on the processing path to some other goal: here we used it to get to a grey-level display. Since we no longer require the COEFF item, we can delete it with:

% remove -ico demo.sfs

The SFS utility program remove is used to edit the contents of an SFS file, allowing the selective removal of data sets. If we now summarise the file demo.sfs we get:

% summary demo.sfs 
1. SPEECH (1.01) 8960 frames from scopy(file=/usr/sfs/demo/life.sfs, 
item=1.02,history=agc(1.01))
2. COEFF -(11.01) 532 frames from spectran(1.01;window=8, overlap=6)
3. DISPLAY (9.01) 532 frames from dicode(11.01;dbr=50.00, nump=128)

At first appearance, it doesn't look as though remove has done its job. In fact the '-' sign after 'COEFF' is the indicator of the removal of the COEFF data set. The program remove has kept a record of the COEFF item in the file, whilst removing the spectral data itself, so that the processing history of the DISPLAY item can be traced back to the original speech waveform. If however we remove the DISPLAY item, and summarise, we get:

% remove -idi demo.sfs 
% summary demo.sfs 
1. SPEECH (1.01) 8960 frames from scopy(file=/usr/sfs/demo/life.sfs,
item=1.02,history=agc(1.01))

Since the DISPLAY item was not part of the processing history of any other item, it can be completely removed. Also, since the removal of the DISPLAY item makes the keeping of the COEFF item stub redundant, it too is removed.
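The removal behaviour seen in the two steps above can be captured in a small model. The Python sketch below is a toy illustration of the rule as described in this tutorial, not the implementation used by remove; the function remove_item and the dictionary layout are hypothetical.

```python
def remove_item(items, target):
    """Toy model: mark `target` as removed, then prune stubs nobody needs.

    `items` maps item number -> {"source": item number or None, "stub": bool}.
    A stub is an item whose data has gone but whose header is kept.
    """
    items = {number: dict(info) for number, info in items.items()}
    items[target]["stub"] = True
    changed = True
    while changed:
        changed = False
        # Items still named as the source of some remaining item.
        needed = {info["source"] for info in items.values()}
        for number in list(items):
            if items[number]["stub"] and number not in needed:
                del items[number]
                changed = True
    return items

demo = {
    "1.01": {"source": None, "stub": False},      # SPEECH
    "11.01": {"source": "1.01", "stub": False},   # COEFF
    "9.01": {"source": "11.01", "stub": False},   # DISPLAY
}
after_coeff = remove_item(demo, "11.01")
after_display = remove_item(after_coeff, "9.01")
print(sorted(after_coeff))    # ['1.01', '11.01', '9.01']
print(sorted(after_display))  # ['1.01']
```

In the first call the COEFF entry survives as a stub because the DISPLAY item still names it as its source; in the second call both entries disappear, matching the two summary listings.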

The procedure for running SFS programs follows the outline we see above. SFS programs take some data in an SFS file and return the results of calculation back to the file. If there is a choice about what item of data in the file is to be processed then the programs need to be supplied with the item number using '-i' switches. The processing history of the data sets is maintained inside the file, and when data sets are removed, sufficient information is retained to be able to trace the complete processing history of all the items left.
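Because each program reads from and writes back to the same file, a sequence of SFS commands is easy to drive from a script. The following Python sketch collects the spectrogram commands used in this tutorial; it assumes that the SFS programs are on the search path and that they report failure through a non-zero exit status, neither of which is stated in this manual. The function names are hypothetical.

```python
import subprocess

def spectrogram_commands(sfs_file):
    """Command lines from this tutorial that precompute and show a spectrogram."""
    return [
        ["spectran", sfs_file],              # spectral analysis -> COEFF item
        ["dicode", sfs_file],                # grey-level coding -> DISPLAY item
        ["Ds", "-isp", "-idi", sfs_file],    # plot SPEECH and DISPLAY items
    ]

def run_all(commands):
    """Run each command in order, stopping at the first one that fails."""
    for command in commands:
        # Assumption: SFS programs exit with a non-zero status on error.
        subprocess.run(command, check=True)

# Print the commands rather than running them, so this sketch works
# even where SFS is not installed.
for command in spectrogram_commands("demo.sfs"):
    print(" ".join(command))
```

Calling run_all(spectrogram_commands("demo.sfs")) would then execute the same three steps that were typed by hand above.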

Section 2.3 gives more detail about item numbering, section 3.5 describes Es, and section 4.2 lists the most popular SFS processing programs.

This has been a brief introduction to the working of SFS, highlighting only a small but important fraction of its operation. The SFS documentation gives much more information about how SFS works, but unfortunately, a guide to speech processing is outside its scope.
