Department of Phonetics and Linguistics


Helen FIRMIN¹, Sheena REILLY², Adrian FOURCIN

¹ UCL and Camden and Islington Health Authority
² Institute of Child Health
Two standard techniques are used for the clinical examination of abnormal swallowing: Videofluoroscopy, which depends on irradiation, and Cervical Auscultation, which makes use of a stethoscope. Both of these techniques have important disadvantages. The first does not lend itself to routine use and the second provides no reliable quantitative information. The aim of this work was to investigate the utility of some of the methods used in Speech and Hearing Sciences. These methods do not use radiation and have the potential to give more accurate timing information than can be derived from auditory/acoustic monitoring. Pilot data were obtained from the simultaneous use of four sensors: an ear-plug microphone of the type used successfully for the detection of otoacoustic emissions; a standard miniature electret microphone ordinarily used for speech recording; a miniature accelerometer of the type sometimes used for monitoring nasality; and a standard electrolaryngograph. Swallow measurements were made with twenty normal adult subjects. The most effective single signal was that provided by standard electrolaryngograph hardware and software. A small but significant increase in reliability came from the combined appraisal of two signals, from the laryngograph and the accelerometer.

1. Background
It may seem a little odd that work in the fields of speech, hearing and language should impinge on, and stand to benefit from, a knowledge of the physiological processes basic to deglutition. The intricately coordinated mechanisms of voice production are, however, linked to those of swallowing and there is a potentially useful overlap of experimental approaches, equipment, and understanding.

1.1 Characteristics of a normal swallow
The act of swallowing is ordinarily described as consisting of four stages (eg Logemann, 1983):

oral initial
oral final
pharyngeal
oesophageal

The first two of these stages are under conscious control. The second two stages are normally autonomic and occur as parts of a complete peristaltic gesture in which food and drink are, as it were, swept from pharynx to stomach. Although the pharyngeal and oesophageal stages have here been given special experimental attention, the following brief overview of all four stages is intended to place them in the context of the complete process (figures based on Logemann, 1983).

2. Phases of a swallow

2.1 Oral initial phase

Figure 1

During the oral initial phase of swallowing, tongue movements differ between subjects; the oral enclosure is, however, relatively consistently defined. Typically a labial seal prevents the escape of liquid from the front of the mouth. Escape of liquid into the pharynx is prevented by a rear oral cavity enclosure produced by the positioning of the velum against the back of the elevated tongue.

2.2 Oral final stage

Figure 2

The oral final stage occurs when the tongue is moved so as to squeeze the bolus or liquid volume against the hard palate so that it is propelled past the anterior faucal arches. It is at this stage that the automatic reflexive gesture of swallowing is triggered. Normally this gives rise to the coordinated peristaltic reflex sequence described below with reference to figures 3 and 4.

2.3 Pharyngeal stage

Figure 3

The triggering of the peristaltic reflex is the beginning of the pharyngeal stage of swallowing. This stage has four main phases:

2.4 Final stage

Figure 4

In the final stage of the swallow, the bolus is transferred, in a continuation of the peristaltic gesture, from the cricopharyngeal juncture to the gastro-oesophageal juncture at the entrance to the stomach.

2.5 Aspects of abnormal swallowing
Disorders of swallowing may manifest themselves in a variety of ways. During the neonatal period, typical signs are persistently poor feeding, characterised by weak sucking, coughing and choking, and the prolonged feeding times that result. There may be associated respiratory difficulty and even apnoea. Alternatively, difficulties may emerge when weaning is attempted with the introduction of spooned solids. More unusually, swallowing dysfunction in childhood may present as an isolated weakness in the absence of other signs; in adults it presents as a part of other neurological disorders.

The clinical sequelae of swallowing dysfunction may include repeated "penetration", when food gets between the vocal folds, and "aspiration", when food is inhaled into the tracheal airway. Poor oral intake leads to malnutrition and consequent failure to thrive, adversely affecting the child's growth and development. Abnormal swallowing can also lead to recurrent chest infections and the development of chronic lung disease.

Clinically, it is essential that swallowing be monitored in individuals in order to ensure that safe feeding protocols can be established. A number of approaches have been developed to assess swallowing.

3. Methods of monitoring

3.1 Established approaches
All methods depend on at least an initial assessment which involves taking an appropriate history together with making a clinical examination. This approach has many advantages but it is not quantifiable, may not define the precise nature of the trouble and may not detect silent aspiration - when the client inhales food into the lungs but does not cough or show discomfort. Complementary methods have been introduced to reduce these disadvantages. The most frequently used methods are listed below in order of relative importance.

3.1.1 Videofluoroscopy
This method makes it feasible to examine both the structure and function of the organs involved (Ekberg, 1992). Dysphagia can be identified and silent aspiration detected. Recordings can be made and used for reference and, if required, measurement. Its success depends, however, on the swallowing of a controlled quantity of radio-opaque material, which may be unpleasant, and exposure to radiation, which must be brief. The method cannot be used frequently. It may be rather daunting in application for the child, and it is not a basis for interactive therapy.

3.1.2 Endoscopy
Fibreoptic Endoscopic Examination of Swallowing (FEES: Langmore, Schatz, & Olsen, 1988) makes use essentially of a nasopharyngolaryngoscope - a flexible endoscope of the type used in the voice clinic. Although it is an invasive procedure, it is more acceptable for some adult patients and avoids the gagging associated with rigid oral endoscopy. It can also be used for biofeedback, but it is in essence a cumbersome technique and not adapted for ready use with children.

3.1.3 Electromyography
EMG (Cooper & Perlman, 1996) recordings are also used and they confer the advantages of precision in identification and accuracy of temporal measurement. This method is, however, dependent to a large extent on signals derived from electrodes inserted subcutaneously. Whilst EMG is a valuable research tool it is not well suited to routine clinical investigation.

3.1.4 Cervical Auscultation (CA)
This method involves placing a sensor (originally a stethoscope) on the neck of the subject and listening to and/or recording, via a microphone, the acoustic signals that are produced as by-products of the swallowing processes. These signals are typically presented visually and examined as waveforms or spectrograms. The first work using these signals led to the description by Bosma and his colleagues of the normal swallow as being associated with two discrete and perceptually distinct sounds - the "Initial Discrete Sound" (IDS) and the "Final Discrete Sound" (FDS) (Bosma, 1992; Heinz et al., 1994). Many other investigators have explored the use of this approach for the examination of dysphagia. A variety of sensors have been employed in attempts to define an optimal configuration, but the best choice of sensor, or of sensor combination, is still not clear.

3.2 Other sensors
The essential advantage of the CA method is that it provides useful information in a rapid and simple non-invasive fashion. Its essential disadvantage is that the auditory/acoustic information that it provides is not readily linked to particular physiological sources and is often not very clear physically. Data are generated not as a direct result of the processes of swallowing but rather as an adventitious side product. It is the intrinsic and relative movements of the organs associated with deglutition that are important in the understanding of normality and the detection of pathology. Following Bosma's initial lead, auditory CA has, nevertheless, made a substantial contribution to the clinical detection and management of swallowing problems and this has led to the investigation of the possible use of other sensors in place of the microphone.

3.2.1 Ear Probe
This sensor has been shown to respond to the "intrinsic sounds of swallowing". An advantage of the ear probe is its site of placement. Whilst scars and skin changes post-radiation can make neck mounting of a sensor difficult (poor mounting can lead to extraneous noise in the signal) no such difficulties are encountered with the ear probe. This sensor also has the advantage that it may be more acceptable for children than the more standard techniques (and it might be useful for biofeedback).

3.2.2 Accelerometer
The accelerometer is a neck mounted sensor which is held in place by a lightly adhesive strip. It has a wide frequency response range and can be obtained in miniature unobtrusive formats. It is vibrated by the "epidermal vibrations caused by internal sounds and vibrations reaching the surface where it is attached" (Kuhn, 1995), i.e. it responds to the movements of internal organs. The acoustic information the accelerometer provides, whilst not 'sounding' like a swallow mediated by CA, appears closely related to the discrete swallow sounds detected on the neck by a stethoscope or a microphone.

3.2.3 Electrolaryngograph
The time constants used in the design of the normal laryngograph were chosen to give useful responses to impedance changes associated with vocal fold vibration. Any neck impedance change between the electrodes will, however, have an effect on the Lx signal output as a function of the magnitude and rapidity of the change. The size of the effect produced by swallowing therefore depends directly on these internal time constants. In the limit it is possible to arrange for a response down to zero frequency (Gx setting). It has previously been suggested within the dysphagia literature (Sorin, McClean, Ezerzer & Meissner-Fishbein, 1985; Perlman & Grayhack, 1991) that including an electroglottograph as a sensor in swallowing studies may be of benefit to clinical work. Perlman earlier (personal communication) attempted to convert a standard Laryngograph in her own laboratory so as to obtain Gx for swallowing studies. The results were not encouraging and this possibility is revisited later in the present description.
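
As a rough illustration of why the time constant matters, the input stage can be modelled as a first-order high-pass filter with time constant τ. This model is an assumption made here for illustration only; the circuit actually used in the laryngograph is not specified in the present description. The symbols H, f, f_c and τ are likewise introduced only for this sketch.

```latex
|H(f)| = \frac{2\pi f \tau}{\sqrt{1 + (2\pi f \tau)^{2}}},
\qquad
f_c = \frac{1}{2\pi \tau}
```

Under this model, impedance changes much slower than f_c are strongly attenuated, so a short time constant suited to vocal fold vibration reduces the slower changes associated with swallowing. As τ grows without limit, f_c tends to zero, which corresponds to the zero frequency (Gx) limit mentioned above.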

The output of an environmental microphone was also included in the set of recorded data. It was used to register the name of the subject, the condition of the test, and also to provide clear information regarding the moment when the pharyngeal stage of the swallow was initiated on a separate monitor channel. This ensured that no 'dry' swallows, coughs or vocalisations were included within this study accidentally.

3.3 Outstanding problems in sensor selection
Although the use of each of these sensors has been investigated in many separate swallowing studies, no clear best choice has as yet emerged. One major difficulty arises from the lack of a basis for rigorous cross comparison. This is only feasible when sensors are used simultaneously to monitor the same swallow and when each of a representative group of subjects undertakes the same swallowing task. Two further difficulties will then remain. The first is to arrive at a videofluorographic calibration of the best sensor or sensor combination. The next arises, in the present context, from the need to have a system which is readily applicable with children.

4. Experimental aims and procedures
The present work was necessarily limited in scope and the main part of the investigation was designed only to provide information directly relating to the optimal choice of sensor. This led to the definition of two primary objectives.

4.1 Objectives

4.1.1 Which sensor provides the most accurate, consistent and reliable basis for the identification of the characteristics of a 'normal' swallow?
And, as a closely allied aim:

4.1.2 Using this approach, what are the main characteristics of the 'normal' swallow?

4.2 Sensors used
Each of the four sensors described above was included in the study using techniques which had already been proved in related applications.

4.2.1 The ear probe (EM 3046)
The ear probe was provided by courtesy of Otodynamics. This is a standard audiological wide band acoustic receiver ordinarily used for the detection of otoacoustic emissions in a combined transmitter and receiver housing. Otodynamics also provided the range of ear plugs needed for adaptation to individual subjects.

4.2.2 The accelerometer (BU 1771)
The accelerometer was obtained from Laryngograph Ltd. It was used together with a processing circuit originally designed for the detection of nasal wall vibration as part of a system for the display and measurement of nasality in speech.

4.2.3 The Laryngograph
The Laryngograph used in the main investigation was typical of the current range employed in voice clinics and had a "normal Lx" time constant. In order to provide background information concerning the use of the laryngograph, a small amount of swallowing data was also obtained using a Gx, zero frequency coverage, laryngograph. Although true Gx requires care in adjustment, which makes it difficult to apply in a clinical environment, it can be helpful in the interpretation of the Lx waveform itself.

4.2.4 Neck mounted microphone (EK3132)
The fourth sensor was a miniature neck mounted microphone (EK3132) of the type used for free field speech recordings. This followed the tradition of CA itself without, however, using a stethoscope coupling.

4.2.5 Audio microphone
During the data gathering a fifth sensor, provided by a standard audio microphone, was used to monitor the experimental sequences.

4.3 Subjects and sensor placement
Twenty normal young adult volunteers, 7 men and 13 women, with no history of dysphagia took part in the tests. Each subject was seated and requested to swallow 20 ml of water on three separate occasions. On each occasion the subjects held the liquid in their mouths until a signal was given for them to swallow. This procedure was followed in an attempt to ensure that a common initial swallowing starting point was obtained for all observations.

Figure 5 Sensor Placement

The placement of the sensors was consistent for all subjects and resulted from reference to prior work (Takahashi, Groher & Michi, 1994; Reddy, Gupta, Prahbu, Green & Canilang, 1994; Hamlet, Penney & Formolo, 1994) and our own initial exploratory experiments.

4.4 Signal acquisition, assessment and evaluation
The outputs from the four main sensors plus the monitoring microphone were saved simultaneously for each swallow on an ADAT recorder (an eight channel digital audio recorder using special VHS tape to give standard 16 bit CD audio quality for each channel). Subsequent display and analysis of the data were based on the use of the Laryngograph Ltd SPG program.

Figure 6 Sensor outputs for a typical swallow

The four signal presentations shown in this figure are those used for the measurements reported here. Whereas 200 Hz analysis bandwidth spectrograms were used for the two microphone based sensors, ear probe at the top (1) and neck microphone (3), the raw waveforms are shown for the accelerometer (2) and laryngograph (4) signals. These modes of presentation were chosen heuristically, simply to obtain the clearest bases for interpretation - further work is quite likely to lead to more sensitive techniques.

The data obtained for the sixty individual swallows from these normal subjects were very consistent both within and between subjects. This made it feasible to assess the outputs from each of the sensors on the basis of a simple rating protocol based on the application of subjective visual criteria.

Poor (Rating = 0): No change in signal, and therefore no indication of the presence of the precursor, IDS, FDS or end of swallow.

Variable (Rating = 0): During some swallows there were indications of the presence of the precursor, IDS, FDS or end, but this was unreliable.

Good (Rating = 1): Reliable indication of different aspects of swallow, but needs confirmation from another sensor.

Very Good (Rating = 2): Sensor which accurately indicates the presence of different aspects of each swallow and does not require another sensor to confirm this.

Table 1

4.5 Table 1 Rating Protocol
The criteria defined in Table 1 were applied for the family of outputs from each subject, rather than for each separate sensor type. This approach was chosen since it gave a practical clinically relevant coherence to the overall appraisal.
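
The rating protocol of Table 1 can be sketched in code as a simple lookup from label to score. The following is a minimal illustration, not the procedure actually used in the study: the function names, the example labels and the summary measure (the percentage of swallows rated Good or Very Good) are all assumptions introduced here.

```python
# Scores taken from Table 1; note that Poor and Variable both score 0.
RATING_SCORES = {"poor": 0, "variable": 0, "good": 1, "very good": 2}


def score_ratings(labels):
    """Convert a list of Table 1 labels into their numeric scores."""
    return [RATING_SCORES[label.lower()] for label in labels]


def percent_reliable(labels):
    """Percentage of swallows rated Good or Very Good (score >= 1).

    Hypothetical summary measure, used here only for illustration.
    """
    scores = score_ratings(labels)
    if not scores:
        raise ValueError("no ratings supplied")
    return 100.0 * sum(score >= 1 for score in scores) / len(scores)


# Hypothetical ratings for five swallows (not data from the study).
example = ["Very Good", "Good", "Variable", "Poor", "Good"]
print(score_ratings(example))
print(percent_reliable(example))
```

Keeping the scores in a single table makes it easy to see that a Variable rating contributes no more to a summary than a Poor one.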

Quantitative measurements were made on the combined plots of the form shown in Figure 6. Each swallow was treated separately and its data used in combination to get the best estimates of the three main temporal intervals shown - precursor to IDS; IDS to FDS and FDS to end of swallow.
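
The derivation of the three temporal intervals from the four marked events can be sketched as follows. This is a minimal illustration under stated assumptions: the event times are taken to have been read off the combined plot by hand, in milliseconds, and the class name, field names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwallowEvents:
    """Times (ms) of the four events marked on one swallow (hypothetical layout)."""
    precursor_ms: float
    ids_ms: float
    fds_ms: float
    end_ms: float


def swallow_intervals(events):
    """Return the three intervals and the total duration, in milliseconds."""
    times = [events.precursor_ms, events.ids_ms, events.fds_ms, events.end_ms]
    # The events must be in chronological order for the intervals to make sense.
    if any(later < earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("event times must be in chronological order")
    return {
        "precursor_to_ids": events.ids_ms - events.precursor_ms,
        "ids_to_fds": events.fds_ms - events.ids_ms,
        "fds_to_end": events.end_ms - events.fds_ms,
        "total": events.end_ms - events.precursor_ms,
    }


# Hypothetical event times for a single swallow (not data from the study).
example = SwallowEvents(precursor_ms=100.0, ids_ms=250.0, fds_ms=850.0, end_ms=1000.0)
print(swallow_intervals(example))
```

By construction, the three intervals sum to the total duration, which gives a simple consistency check on any table of mean values.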

5. Experimental results

5.1 Sensor output comparisons
When the results of applying the protocol of Table 1 to every one of the sixty sets of individual swallow data were collated, the rather unexpectedly clear set of overall rating comparisons shown in Figure 7 emerged.

Figure 7 Sensor Rating Comparisons

The electrolaryngograph and accelerometer outputs were both distinctly more useful in identifying three of the four main swallowing events than either of the other two sensors. The exception was in respect of the precursor. The beginning of a swallow was better identified, from these data, when outputs from either the ear probe or the neck microphone were used than when the accelerometer was employed. On a single sensor basis, the Lx signal from the electrolaryngograph was the most reliable source of information for each main temporal event. Visual appraisal of the laryngograph and accelerometer signals in combination produced an improvement in detection score for the end of the swallow, from 67% for the laryngograph alone to 80% in combination with the accelerometer. There was a reduction, however, in the identification of the precursor, from 70% using the Lx signal alone to 50% when both signals were used in combination.

5.2 Subject differences

Figure 8 Individual Subject Mean Swallow Durations

Mean (ms)               Male (n = 7)   Female (n = 13)   p level*
Total duration          956.33         880.62            0.08
Precursor to IDS        112.34         151.44            0.10
IDS to FDS              701.78         585.69            0.00 **
FDS to end of swallow   142.23         143.47            0.96

* independent samples t-test (2-tailed)

Table 2 Mean Swallow durations for male and female subjects

When an independent samples t-test was applied to the four sets of temporal intervals taken for each swallow for each subject, the only significant difference between male and female subjects was for the IDS to FDS interval (see also Figure 7 in regard to the dominance of these individual events).
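
For readers unfamiliar with the test, the statistic behind an independent samples t-test can be sketched as below. This is an illustration only: the pooled-variance (equal variances) form is an assumption, since the variant used is not stated, and the sample values are hypothetical rather than data from the study. Only the t statistic and degrees of freedom are computed; the two-tailed p level would then be read from the t distribution.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance, weighted by each sample's own df.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("both samples have zero variance")
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical IDS to FDS intervals in ms (not data from the study).
group_a = [700.0, 690.0, 715.0, 705.0]
group_b = [580.0, 600.0, 575.0, 590.0]
t_value, dof = pooled_t_statistic(group_a, group_b)
print(t_value, dof)
```

A large absolute t value relative to the degrees of freedom corresponds to a small p level, as reported for the IDS to FDS interval in Table 2.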

6. Discussion
Qualitatively, on the basis of the present observations, the electrolaryngograph has provided the most consistent and reliable indications relating to the main swallow events which have been established by prior work in the field of Cervical Auscultation. Quantitatively, the timing characteristics which have been measured here correspond to those found using other indirect methods of observation. Absolute, as opposed to relative, accuracy can, however, only be established with simultaneous use of videofluorography. This has not been done in this necessarily limited study, although the present results certainly make this extension worthwhile.

The second most useful sensor found in this study was the accelerometer and it appears that there could be a slight advantage in the combination of the outputs from the two sensors. A further significant advantage could come from the simultaneous use of more than one set of electrolaryngograph electrodes. Perhaps an especially significant aspect of this result, however, is that both of the best sensors respond directly to internal organ movement as opposed to the acoustic by-products of movement. This may prove to be an essential indicator in regard to the best way forward in respect of the choice and future development of non-invasive means for the monitoring and measurement of swallowing. Finally, the use of biofeedback could be another fruitful result of introducing speech-based techniques into the management of dysphagia, by the provision of clear real-time visual and/or auditory displays of abnormal and normal function.

We would like to thank Mahen Goonewardane and David Cushing for their technical support. This work was done partly with the help of funding from the North West Thames Health Authority.

Allaire, J.H., Riordan, B. & Gillies, G.T. (1994) A Swallowing Frequency Device. Presented at the Second Workshop on Cervical Auscultation, Ritz Carlton Hotel, Tyson's Corner, VA, October 13, 1994

Bosma, J. (1992) Introduction to the Cervical Auscultation Workshop, Department of Paediatrics, University of Maryland, Baltimore, Maryland, April 22, 1992

Cooper, D.S. & Perlman, A.L. (1996) Electromyography in the Functional and Diagnostic Testing of Deglutition. In Deglutition and its Disorders, pp 255-285. Singular, London, UK

Ekberg, O. (1992) Radiographic Evaluation of Swallowing. In Dysphagia: Diagnosis & Management (2nd edition) ed Groher, M.E. Butterworth Heinemann, USA

Hamlet, S. (1992) Auscultation of Feeding Sounds at the Ear. Presented at the Cervical Auscultation Workshop, Department of Paediatrics, University of Maryland, Baltimore, Maryland, April 22, 1992

Hamlet, S., Penney, D.G. & Formolo, J. (1994) Stethoscope Acoustics and Cervical Auscultation of Swallowing. Dysphagia 9 : 63 - 68

Heinz, J.M., Vice, F.L. & Bosma, J.F. (1994) Components of Swallow Sounds. Presented at the Second Workshop on Cervical Auscultation, Ritz Carlton Hotel, Tyson's Corner, VA, October 13, 1994

Kenny, D., McPherson, K., Kasis, M. & Judd, P. (1992) Possible Applications of Cervical Auscultation in Observations of Swallow: Respiration Interaction in Dysphagic Children. Presented at the Cervical Auscultation Workshop, Department of Paediatrics, University of Maryland, Baltimore, Maryland, April 22, 1992

Kuhn, P.M. (1995) A Review of sensing Devices for Cervical Auscultation. Presented at the Year in Cervical Auscultation, Tyson's Corner, McLean Virginia, October 26, 1995

Langmore, S.E., Schatz, K. & Olsen, N. (1988) Fiberoptic endoscopic examination of swallowing safety: A new procedure. Dysphagia 2 : 216 - 219

Logemann, J.A. (1983) Evaluation and Treatment of Swallowing Disorders. College-Hill Press

Morrell, R.M. (1992) Neurological Disorders of Swallowing. In Dysphagia: Diagnosis and Management (2nd ed) Eds Groher, M.E. Butterworth-Heinemann, USA

Perlman, A.L. & Grayhack, J.P. (1991) Uses of Electroglottograph for Measurement of Temporal Aspects of the Swallow: Preliminary Observations. Dysphagia 6 : 88 - 93

Reddy, N.P., Gupta, V., Prahbu, D.N.F., Green, P. & Canilang, E.P. (1994) Acceleration Measurements During Swallowing & Coughing. Presented at the Second Workshop on Cervical Auscultation, VA, October 13, 1994

Sorin, R., McClean, M.D., Ezerzer, F. & Meissner-Fishbein, B. (1985) Electroglottographic evaluation of the swallow. Archives of Physical Medicine and Rehabilitation

Takahashi, Groher & Michi (1994) Methodology for Detecting Swallowing Sounds. Dysphagia 9 : 54 - 62

© Helen Firmin, Sheena Reilly and Adrian Fourcin.

