SFS

NAME

sfs - speech filing system library

OVERVIEW

The SFS library contains the essential routines for manipulating data files. Other utility routines are provided in this library, but these are documented in individual manual pages.

SYNOPSIS

SFS library routines are:

#include <sfs.h>
int     sfsstruct[];

char *  sfsbase()

char *  sfsfile(filename)
char    *filename;          /* speech file pathname */

int     sfsopen(filename,mode,head)
char    *filename;          /* speech file pathname */
char    *mode;              /* access mode requested */
struct main_header *head;   /* returned main header */

int     sfsdup(fid)
int     fid;                /* file descriptor */

int     sfsclose(fid)
int     fid;                /* file descriptor */

int     sfsnextitem(fid,item)
int     fid;                /* file descriptor */
struct item_header *item;   /* returned item header */

int     sfsitem(fid,datatype,spec,item)
int     fid;                /* file descriptor */
int     datatype;           /* generic data type */
char    *spec;              /* item specification */
struct item_header *item;   /* returned item header */

char *  sfsbuffer(item,numf)
struct item_header *item;   /* item header */
int     numf;               /* buffer size in frames */

int     sfsread(fid,start,numf,buff)
int     fid;                /* file descriptor */
int     start;              /* index of first frame */
int     numf;               /* number of frames to load */
char    *buff;              /* buffer address */

void    sfsheader(item,datatype,floating,datasize,framesize,
               frameduration,offset,windowsize,overlap,lxsync)
struct item_header *item;   /* item header */
int     datatype;           /* item type */
int     floating;           /* structured/integer/floating flag */
int     datasize;           /* datum size */
int     framesize;          /* frame size */
double  frameduration;      /* frame duration */
double  offset;             /* data set offset */
int     windowsize;         /* fixed frame window size */
int     overlap;            /* fixed frame overlap */
int     lxsync;             /* excitation synchronous flag */

int     sfschannel(filename,item)
char    *filename;          /* speech file pathname */
struct item_header *item;   /* output item header */

int     sfswrite(fid,numf,buff)
int     fid;                /* output file descriptor */
int     numf;               /* number of frames to write */
char    *buff;              /* buffer address */

int     sfsupdate(filename)
char    *filename;          /* speech file pathname */

int     sfsaddlink(item,numf,link,filename)
struct item_header *item;   /* output item header */
int     numf;               /* number of frames in link item */
struct link_header *link;   /* link description header */
char    *filename;          /* output filename */

DESCRIPTION

The include file "sfs.h" includes the definitions of main_header and item_header, all the structures used in structured data items, standard sizes and SFS library definitions. The array "sfsstruct" (included in sfs.h) contains the lengths of the header portions of structured data items.

sfsbase returns a pointer to a static area containing the pathname of the base subdirectory for the sfs software. The default directory is built into the software at compile time, but may be overridden with the environment variable SFSBASE.
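The override rule described for sfsbase can be sketched in a few lines of standard C. This is an illustration of the documented behaviour only, not the library's own code: the names choose_base and resolve_base and the DEFAULT_SFS_BASE value are hypothetical, since the real default is fixed when SFS is compiled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical compile-time default; the real value is set when SFS is built. */
#ifndef DEFAULT_SFS_BASE
#define DEFAULT_SFS_BASE "/usr/local/sfs"
#endif

/* Prefer the override when one is present, otherwise use the fallback. */
const char *choose_base(const char *override, const char *fallback)
{
    assert(fallback != NULL);
    return override != NULL ? override : fallback;
}

/* Apply the rule to the SFSBASE environment variable. */
const char *resolve_base(void)
{
    return choose_base(getenv("SFSBASE"), DEFAULT_SFS_BASE);
}
```

Unlike sfsbase, this sketch does not copy the result into a static area; it returns either the environment string or the built-in default directly.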

sfsfile returns a pointer to a static area containing the full pathname of a speech file. The environment variable SFSPATH specifies a list of directories separated by ':'. These directories are searched in turn and if the given file is found, then the full pathname is returned. If the file is not found, the original filename is returned unchanged.
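The search that sfsfile is described as performing can be approximated as follows. This is a sketch under stated assumptions rather than the library's implementation: search_path_list is a hypothetical name, the result is written to a caller-supplied buffer instead of a static area, and existence is tested with the POSIX access() call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Search a ':'-separated directory list for name. On a hit, write the full
 * pathname to out and return 1. Otherwise copy name unchanged and return 0. */
int search_path_list(const char *dirs, const char *name,
                     char *out, size_t outsize)
{
    assert(name != NULL && out != NULL && outsize > 0);

    while (dirs != NULL && *dirs != '\0') {
        size_t len = strcspn(dirs, ":");      /* length of this entry */

        if (len > 0) {
            int n = snprintf(out, outsize, "%.*s/%s", (int)len, dirs, name);
            if (n > 0 && (size_t)n < outsize && access(out, F_OK) == 0)
                return 1;                     /* first match wins */
        }
        dirs += len;
        if (*dirs == ':')
            dirs++;                           /* skip the separator */
    }
    snprintf(out, outsize, "%s", name);       /* not found: name unchanged */
    return 0;
}
```

A caller would typically pass getenv("SFSPATH") as dirs; a NULL or empty list simply yields the original filename.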

sfsopen attempts to open filename to check that it is a valid speech file. There are four supported operations, requested as a string passed in mode:

"r" read main header into head, check file ok for reading, return file identifier.

"w" read main header into head, check file ok for updating, return file identifier.

"h" write main header from head, return success code.

"c" create new file using head, return success code.

For any operation, the default action is performed if head is supplied as NULL. On success, sfsopen returns the file identifier (modes "r" and "w") or 0 (modes "h" and "c"). It returns -1 if the file does not exist, and -2 for any other error.

sfsdup duplicates a file descriptor opened on a file using sfsopen. This routine should be preferred to opening the file twice. The duplicate descriptor is positioned in the file identically to the original descriptor at the time sfsdup was called.

sfsclose closes a file descriptor opened with sfsopen. Note that there is a limit on the number of files that may be open simultaneously.

sfsnextitem locates the next data set in a file opened with sfsopen, and returns the item header if found. If item is supplied as NULL, sfsnextitem repositions at the start of the file. sfsnextitem returns 1 on success and 0 if there is no next data set.

sfsitem attempts to locate an item, in a file opened with sfsopen, that meets the specification given by datatype and spec. datatype should contain the generic 'datatype' code for the data set. spec should contain either the number of the data set expressed as a string (with "0" for last, "-1" for first) or a string match expression to be matched against the history fields of the items of type datatype in the file, using the routine histmatch(SFS3). If datatype is given as 0, the routine locates the first item of any type in the file. sfsitem returns 1 on success and 0 on failure. If the item is found and item is not NULL, the item header is returned in item.
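The two forms of spec can be told apart before calling sfsitem, for example to validate a command-line argument. How the library itself distinguishes them is not stated here; the sketch below assumes that a spec which parses entirely as a decimal integer is a data set number and that anything else is a match expression. The name spec_as_number is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Return 1 and store the value if spec is entirely a decimal integer.
 * Return 0 otherwise, in which case the caller would treat spec as a
 * history match expression. */
int spec_as_number(const char *spec, long *value)
{
    char *end;
    long n;

    assert(spec != NULL && value != NULL);

    errno = 0;
    n = strtol(spec, &end, 10);
    if (end == spec || *end != '\0' || errno == ERANGE)
        return 0;               /* empty, trailing text, or out of range */
    *value = n;
    return 1;
}
```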

sfsbuffer creates a memory buffer to hold data frames for any item in a suitable format for reading or writing data from/to a file with sfsread/sfswrite. sfsbuffer takes details about the item frame format from the item header, and the size of the buffer in frames (for framesize equal to 1, 1 frame = 1 sample). The buffers are constructed as arrays of the primitive data structure for the item: e.g. short sp[]; struct lp_rec lp[]; struct fm_rec fm[]; etc. Note that creating a general output buffer for annotation items is very inefficient, so you should consider using a (sfs) buffer of length 1, or creating an array of an_rec within your own program.

sfsread loads the whole or part of a data set into memory once it has been located by sfsitem or sfsnextitem. The routine takes the file descriptor, the start index of the data in frames, the number of frames to be loaded, and the address of a memory buffer in which to place the data. This buffer should be created with sfsbuffer. sfsread returns the number of frames actually loaded, or zero if there is a read error. Read access to unstructured and fixed-length structured items can be made in any frame order. Read access to variable-length structured items (currently only annotations) must be made in serial order.

sfsheader should be used to initialise all item headers. The entire item header is first cleared, then the arguments are used to initialise the appropriate fields. sfsheader also adds machine-specific information to the header so that the exact format of floating point numbers, etc. can be determined by sfsread. You should therefore not reuse old item headers for new data, since these may have been created on a different machine.

sfschannel opens an output file to hold data to be added to a speech file. The name of the speech file is used to create a temporary file in the same directory, and to associate an output item with a particular file. The item header should be created with sfsheader and have the "history" and "params" fields initialised separately. The routine returns a file descriptor greater than 0 on success, and -1 on failure.

sfswrite should be used to write new information to the output channel opened by sfschannel. The routine takes the file descriptor provided by sfschannel, the number of frames to be written, and a pointer to the buffer where the data is held. Note that the buffer should be created with sfsbuffer, or should be constructed using the same conventions. sfswrite automatically updates the "numframes" and "length" fields in the appropriate item header. sfswrite returns the number of frames written on success and 0 on failure.

sfsupdate is the main routine for adding new data items to speech database files. Apart from a few special utility programs, all file update programs should use this routine to add data sets to the file. Once data has been stored using sfschannel and sfswrite, this routine should be called to update the speech file. sfsupdate returns 1 on success and 0 on error. In either case, all temporary files created by sfschannel for the given file are deleted.

sfsupdate checks the contents of the new datasets against the contents of the datafile. It uses the criterion that two items are duplicated if their history fields are identical. If no duplicate items are found, the data sets are appended to the datafile. If duplicate items are found, the file is restructured. If the duplicate datafile item has not been subsequently processed it is deleted. If the duplicate item has been used as input to a subsequent item, it is "truncated" to its item header only (and with its datatype field negated). In all cases the subtype fields of the new items are automatically given numbers according to their position in the file.
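The duplicate rule applied by sfsupdate can be summarised as a small decision function. This is a sketch of the logic as described above, not the library's code: the enum, resolve_duplicate and truncated_datatype are hypothetical names, and the caller is assumed to know already whether the existing item was used as input to a later item.

```c
#include <assert.h>
#include <string.h>

enum dup_action {
    KEEP_ITEM,      /* histories differ: existing item is untouched */
    DELETE_ITEM,    /* duplicate, not used by any later item */
    TRUNCATE_ITEM   /* duplicate, used as input: keep header only */
};

/* Decide the fate of an existing item when a new item is added.
 * used_as_input is nonzero if a later item was derived from it. */
enum dup_action resolve_duplicate(const char *old_history,
                                  const char *new_history,
                                  int used_as_input)
{
    assert(old_history != NULL && new_history != NULL);

    if (strcmp(old_history, new_history) != 0)
        return KEEP_ITEM;       /* identical history is the only criterion */
    return used_as_input ? TRUNCATE_ITEM : DELETE_ITEM;
}

/* A truncated item keeps its header with the datatype field negated. */
int truncated_datatype(int datatype)
{
    return -datatype;
}
```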

sfsaddlink allows users to add "virtual" or "linked" items to files. The routine accepts the item header of the data set to be linked to, and the number of frames in that data set (or portion of data set). It also accepts the link header structure detailing where the item is actually stored, and the filename of the file into which the virtual item is to be stored.

VERSION/AUTHOR

1.0 - Mark Huckvale
Fri Jul 09 14:54:51 2004