Whereas intonational focus in English has been the object of a large number of studies, its existence in Spanish has not often been recognized. Most analyses admit that accents other than the last one may be made emphatic for various reasons, but not that they signal the information structure of a sentence in terms of old vs. new information, nor that they entail a reduction of following accents (Navarro, 1948; Alarcos, 1950). However, some studies on Latin American Spanish go against this general trend and report foci realized by nuclear shifts and deaccenting (Bolinger, 1954; Contreras, 1978; Ortiz-Lira, 1994). My informal observations suggested that people intuitively accept early accentual prominences as a focus mechanism in Castilian Spanish, particularly when the focused item is at the beginning of a sentence. Accordingly, a test was designed in order to find out whether early focus is possible in Spanish. In the sentences analyzed here, all speakers signalled the information structure that was being prompted by placing the greatest pitch prominence in the sentence at the beginning, that is, on the accented syllable of the focus domain. In what follows I will analyze and compare the realization of early focal accents in English and Spanish. The analysis of post-focal material falls outside the scope of this paper. The number of English sentences and informants was smaller since English was used as a control against which to compare the Spanish results.
For the test, 13 sentences for Spanish and 6 sentences for English were analyzed, all of them presenting focus on the grammatical subject in domains with a single stressed lexical item. The sentences used were simple declarative sentences because it was felt that other types might exhibit inherent focus structures (House 1981). Sentences were constructed so that the syllables where focus was expected to fall were domain initial in some cases (sentences 1-7 in Spanish and 1-3 in English) and preceded by other unstressed syllables in other cases (sentences 8-13 in Spanish and 4-6 in English). Included in both types there were words in which the stressed focused syllable was followed by other unstressed syllables in the same word, and words where there were no syllables following the stressed one in the same word and domain. As can be seen in the appendix below, there were no voiceless sounds within the target domains. This was contrived to avoid gaps in their pitch traces.
The stimuli used to elicit focus in Spanish took the form of contrastive or wh-questions, so as to find out whether one of these reasons for focus proved to be a better trigger or produced differences in focus realization.
The type of Spanish analyzed was Mainland-Northern Spanish as spoken in Vitoria. Four speakers were chosen for analysis. A separate test proved that none of these speakers were identifiable as Basque. There were three men and one woman.
The English subject was a female mainstream RP speaker, born in the greater London area.
All speakers were between 25 and 40 years old at the time of the recording and had at least a secondary school education. None of the subjects were bilingual although all of them had studied a foreign language at secondary school.
Subjects were recorded in a soundproof room. A microphone was set at an angle from their mouth at a distance of about 20 centimetres. They wore a set of headphones attached to a personal stereo in which the tape with prompt questions was being played. A pair of electrodes was set on the surface of subjects' necks by means of a collar band to which the electrodes were attached. The electrodes were connected to a portable laryngograph. Speech signals were entered into one of the channels of a tape recorder and laryngeal signals were fed into the other channel.
3. Analysis Procedure
Before the analysis proper, all speakers' productions were input, processed and stored in the PCLX programme. An individual file was created for each speaker which consisted of the laryngeal signal recorded for the whole test - between one and three minutes of spoken material. These files were used for statistical analysis of the speakers' laryngeal characteristics. PCLX provided the necessary statistical analyses. Second order distribution (Dx 2) statistics provided information about each speaker's fundamental frequency mean, mode, median, standard deviation and ranges.
The laryngograph signal recorded on one of the channels of the tape was analyzed in the Phonetics laboratory at UCL using PC-Pitch mode of PCLX. All sentences were analyzed individually with this programme. Measurements were taken in Hertz at all points where there seemed to be changes in pitch direction and at all starting and ending points neighbouring gaps -voicelessness- in the signal.
A second trace analysis, also carried out at the Phonetics laboratory in UCL, was done with the Speech Filing System Programme (SFS) devised at UCL by Dr. M. Huckvale (Huckvale 1988). The version used allowed for the simultaneous presentation on screen of the following synchronized signals: speech waveform, laryngeal waveform, HQTX excitation period measurements and pitch trace. Sentences were analyzed again for problem areas and segmentation purposes. If period measurements appeared to be wrong, they were discarded and a median of the five nearest correct periods was obtained as measurement for a particular point. In the case of peaks following voiceless consonants the first two or three periods, the onset of voicing, were discarded and peak measures were taken of the steady part of the vowel, or, in the case of kinetic vowels, a median measurement was obtained using the five contiguous highest frequency periods. SFS analysis of troughs was intended to filter out nodes which might correspond to micro intonation, particularly in the case of voiced obstruents.
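The repair rule for faulty period measurements described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the sample periods and the `valid` flags are hypothetical, and in the study the decision that a period was "wrong" was made by inspecting the synchronized traces, not by code.

```python
from statistics import median


def repair_point(periods_ms, valid, index, k=5):
    """Median of the k valid periods nearest to position `index`.

    periods_ms: excitation periods in milliseconds (hypothetical data).
    valid: parallel list of booleans; False marks a period judged wrong.
    """
    candidates = [i for i, ok in enumerate(valid) if ok]
    if len(candidates) < k:
        raise ValueError("not enough valid periods to take a median")
    # Nearest first; ties are resolved in favour of the earlier period.
    nearest = sorted(candidates, key=lambda i: (abs(i - index), i))[:k]
    return median(periods_ms[i] for i in nearest)


def period_ms_to_hz(period_ms):
    """Convert a period in milliseconds to a frequency in Hertz."""
    return 1000.0 / period_ms


# Hypothetical stretch of periods; the third one is a halved (faulty) reading.
periods_ms = [8.3, 8.2, 4.1, 8.0, 7.9, 7.8, 7.7]
valid = [True, True, False, True, True, True, True]

repaired = repair_point(periods_ms, valid, index=2)
print(repaired, period_ms_to_hz(repaired))
```

Using a median rather than a mean keeps a single remaining outlier among the five neighbours from pulling the repaired value away from the local pitch level.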
It was thought appropriate to characterize accents according to three points. These relevant values were labelled as follows: On-Glide -the turning point immediately preceding a peak and excluding micro-intonational effects-; Peak -highest point on the accented vowel- and Off-Glide -turning point immediately following the peak and excluding micro-intonational effects. There were no cases of accents characterized by downward pitch prominence so we can define accent prominence as peaks.
Values were standardized to make inter-speaker comparisons possible. Fundamental frequency measures were converted into percentages of each speaker's mean fundamental frequency. The speakers' mean fundamental frequencies were the following:
Spanish speaker 1= 119.60 Hz, Spanish speaker 2= 201.30 Hz, Spanish speaker 3= 126.40 Hz, Spanish speaker 4= 110.20 Hz, English speaker= 231 Hz.
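A minimal sketch of this normalization is given below. It assumes that a value is expressed as its percentage deviation from the speaker's mean, so that the mean itself corresponds to 0%, which is how the percentages are referred to in the rest of the paper. The function and dictionary names are illustrative only.

```python
# Mean fundamental frequencies reported for each speaker, in Hertz.
SPEAKER_MEAN_F0_HZ = {
    "Spanish 1": 119.60,
    "Spanish 2": 201.30,
    "Spanish 3": 126.40,
    "Spanish 4": 110.20,
    "English": 231.0,
}


def to_percent_of_mean(f0_hz, mean_hz):
    """Deviation of f0_hz from the speaker's mean, in percent (mean = 0%)."""
    return (f0_hz - mean_hz) / mean_hz * 100.0


def to_hz(percent, mean_hz):
    """Inverse conversion: percentage deviation back to Hertz."""
    return mean_hz * (1.0 + percent / 100.0)


# A peak 60% above the mean of Spanish speaker 4 is about 176.3 Hz.
print(round(to_hz(60.0, SPEAKER_MEAN_F0_HZ["Spanish 4"]), 1))
```

Expressing every measurement relative to the speaker's own mean is what allows a male speaker with a mean of about 110 Hz and a female speaker with a mean of about 201 Hz to be compared on one scale.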
Fundamental frequency percentages were divided in three registers: High, Mid and Low. It is impossible to give a precise definition of those categories in terms of percentages or Hertz values since, even if dealing with normalized measures, the range of each category will vary depending on the speaker's pitch range and factors such as declination. Nevertheless, a simple working parameter is proposed according to which a cut-off point was chosen for the range of values that can be considered to fall within speakers' mid registers. All percentages that were higher than this register were classified as high and percentages below the mid register as low. The cut-off point for mid register limits was chosen in the following manner. Accent peak percentages for all the speakers analyzed were noted. Peak measures were preferred to on-glides or off-glides because of their being the most stable ones. It was observed that peak heights fell into four quite homogeneous groups for each speaker. Means were calculated for each group and speaker and for all speakers in each height group. This was considered necessary for the same reasons that prompted normalization of Hertz values, namely, to allow for cross-speaker comparison of patterns. Spanish means were calculated separately from English ones because of the greater size of the Spanish sample and the obvious differences in ranges between the two languages (Navarro Tomás 1948, Stockwell and Bowen 1965, Cruttenden 1986). The height group that fell closest to the speakers' mean fundamental frequency (0%) was considered to represent the range of speakers' mid registers. For English this height group had a mean value of 8.80% above the speaker's mean F0 whereas for Spanish speakers it was 4.90%. Accordingly for English we will consider mid range those values which fall between ±9% and, in Spanish, values which fall between ±5%.
If a given value should be verging between two registers, the one it belongs to according to the above procedure is indicated in capital letters and the register it borders with is indicated in lower case. The characterization of pitches as high, mid or low was done for purely practical reasons. It is not intended to represent functional units of intonation but a simple system of labels by which to refer to pitch. Therefore percentages of the mean frequencies will be used for most of the analysis.
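The labelling scheme can be sketched as a simple threshold function. The cut-off values are those given above; treating a value that lies exactly on the boundary as Mid is an assumption of this sketch, and the capital/lower-case notation for borderline values is not modelled. The example values are the language-level means reported later in the comparison of the two languages.

```python
# Half-width of the mid register in percent of the speaker's mean F0:
# +/-5% for Spanish and +/-9% for English, as stated in the text.
MID_HALF_WIDTH = {"Spanish": 5.0, "English": 9.0}


def register_label(percent, language):
    """Label a normalized pitch value as High, Mid or Low.

    Assumption: values exactly on a cut-off point count as Mid.
    """
    half_width = MID_HALF_WIDTH[language]
    if percent > half_width:
        return "High"
    if percent < -half_width:
        return "Low"
    return "Mid"


# Mean on-glide, peak and off-glide values for each language.
spanish = [register_label(v, "Spanish") for v in (-2.92, 18.54, -15.02)]
english = [register_label(v, "English") for v in (12.50, 31.33, -19.17)]
print(spanish)  # ['Mid', 'High', 'Low']
print(english)  # ['High', 'High', 'Low']
```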
Table 1. Spanish focal accent characterization in percentages of the speakers' mean fundamental frequencies.
Table 2. Spanish focal accent label characterization
According to the labels in table 2 we can do a broad characterization of Spanish focal accents. The preferred on-glide to accents lies within the mid register. There are also some cases where the starting point to a high to low fall is also in the high register. Interestingly, these cases always correspond to accents where the accented syllable is not preceded by unstressed syllables (sentences 1, 2, 3 and 4) and all the accents that have a mid peak are preceded by a low on-glide. We shall see below if this is a significant correlation.
The most common pattern is for focal accents to be realized as falls from high to low. This is particularly the case for speakers 3 and 4. Speaker 2 shows a nearly similar tendency for her falls to go into the low register or to stay between the low and mid registers. There is one accent (speaker 4's nº 5) whose off-glide is given two numerical values in table 1 and two labels in capital letters in table 2. This is because it is a falling-rising accent, the only one found amongst focal accents. The labels represent a fall from high to mid which rises again to high. There are only three accents that fall from mid to low and no accents that stay within the low register.
Therefore, in most subject focal accents the pitch glides up from mid to high and then falls into the low register. The depth of the fall varies amongst speakers: three of them tend to reach down into the low register quite consistently, whereas speaker 2, the female speaker, quite often has less deep falls, staying within or around the mid register.
The following figure shows the average configuration of Spanish focal accents.
Figure 1. Spanish focal accent realisation in percentages of the speakers' mean fundamental frequency. The bars represent +/- standard deviation about the mean.
We will now examine sentences according to different variables to see if they affect the behaviour of focal accents. These variables will be information type (contrastive or new) and syllable structure. When relevant, speaker-bound differences will be noted.
For those paired variables in which one of the groups had fewer samples than the other a recoding was done in which missing values were substituted with the mean value for that group. This method affected the standard deviation -which will therefore not be noted- but it did not affect the value of t nor the probability.
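The recoding step and the paired t statistic used in the comparisons that follow can be sketched as below. The data are hypothetical, since the individual measurements and the exact pairing used in the study are not reproduced here, and the function names are illustrative. The sketch returns only t and the degrees of freedom; obtaining a probability requires a t table or a statistics library (for instance `scipy.stats.ttest_rel`).

```python
from math import sqrt
from statistics import mean, stdev


def fill_with_group_mean(values):
    """Replace missing values (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    group_mean = mean(observed)
    return [group_mean if v is None else v for v in values]


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two equal-length samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical peak heights (percent of mean F0) for two conditions.
new_info = [10.0, 12.0, 14.0, 16.0]
contrastive = [12.0, 15.0, 15.0, 20.0]
t, df = paired_t(new_info, contrastive)
print(round(t, 3), df)

# Hypothetical case with one missing value in the second group.
contrastive_gappy = fill_with_group_mean([12.0, None, 15.0, 20.0])
print(paired_t(new_info, contrastive_gappy))
```

A negative t here simply means that the second group has the higher values, which matches the sign convention of the figures reported in the text.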
4.1.1 Peak Height
Applying a two-tailed paired t-test to new information focal accents versus contrastive ones, we find that the difference between their peak heights is significant at the 5% level (t= -2.06, p= 0.05, DF= 27), contrastive accents being higher. However, if we ignore the extra high value produced by speaker 4 in sentence 10 (its peak is 60% above the speaker's mean and it might be judged to be an exceptional case) the difference is non-significant (t= -1.93, p= 0.06). This leads us to believe that there are not enough grounds to conclude that peak height varies according to information structure type.
The difference between accent peaks in sentences with unstressed syllables preceding the peak and those in which the accented syllable was sentence initial is not significant (t= 0.91, p= 0.37). This means that a desired peak height can be attained irrespective of the number of syllables preceding it (but see (ii) below).
Speaker 1 has a significantly lower mean peak height than both speaker 4 (ignoring the extra high value in sentence 10: t= 2.32, p= 0.04, DF= 12) and the female speaker, number 2 (t= -3.5, p= 0.004, DF= 12). If accent prominence, in the sense of peak height, is considered to be dependent on the degree of emphasis, we can conclude that speaker 1 was the least emphatic whereas speaker 4 was the most emphatic one.
It can be seen in table 1 that there are four cases where no on-glide values are provided (sentences 1 and 4 by speakers 1 and 3). This is because in these cases the pitch did not glide up to the accent: the fundamental frequency started at its peak value. In these cases missing values will be substituted by the mean.
There are no significant differences between the on-glides to contrastive and new information focal accents (t= 0.22, p= 0.57). On the other hand, there is a significant difference between the on-glides in sentences with unstressed syllables preceding the peak and those in which the accented syllable was sentence initial, the former being lower (t= 5.43, p= 0.0001). Therefore, the presence or absence of unstressed syllables before the accented one is a determining factor for the height of the on-glide.
As far as speakers' mean values are concerned, the only significant difference in on-glide values was found between the female speaker and speaker 4 (t= 2.50, p= 0.028).
In order to establish whether the on-glide's height is related to the peak height, a Pearson's correlation coefficient was obtained. The result indicates that there is a positive correlation between both variables at a 5% significance level (r= 0.41, n= 48).
For off-glides, the only variable which proved to be relevant was "speaker". The female speaker produces higher off-glides than the male speakers, but this difference is only significant in the case of speaker 3 (t= 6.50, p= 0.0001). On the other hand, speaker 3 also produces significantly lower off-glides than speaker 1 (t= 3.60, p= 0.004). No other differences are statistically significant.
A Pearson's correlation coefficient proves that there is no correlation between peak height and off-glide measures at a 5% significance level (r= -0.05, n= 52).
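As a hedged illustration of how the two correlation results can be checked against a 5% significance level, the sketch below computes Pearson's r from raw data and converts a reported r and sample size into a t statistic with n - 2 degrees of freedom, using t = r·sqrt((n - 2) / (1 - r²)). The function names are illustrative; the raw measurements are not available here, so only the reported coefficients are used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Reported coefficients: on-glide vs. peak, and peak vs. off-glide.
t_on_glide = t_from_r(0.41, 48)
t_off_glide = t_from_r(-0.05, 52)
print(round(t_on_glide, 2), round(t_off_glide, 2))
```

With 46 and 50 degrees of freedom the two-tailed 5% critical value of t is roughly 2.01, so a t of about 3.05 for the on-glides is significant while a t of about -0.35 for the off-glides is not, in line with the conclusions stated in the text.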
By excursion we mean the distance between the peak and both its on-glide and its off-glide. These are measures that indicate how much pitch movement there is in the realization of a particular accent. The following are the mean excursions for each of the speakers:
Speaker 1: On-glide/Peak= 12.15; Peak/Off-glide= 23.53
Speaker 2: On-glide/Peak= 18; Peak/Off-glide= 29.46
Speaker 3: On-glide/Peak= 17.61; Peak/Off-glide= 36.76
Speaker 4: On-glide/Peak= 26.15; Peak/Off-glide= 44.48
As can be seen, both types of excursion are consistent in speakers 1 and 4. Speaker 1 shows the least pitch movement to and from accented syllables. Speaker 4 shows the greatest pitch movement both to and from accented syllables even if his extra high peak is ignored. The other two speakers are between both extremes.
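The excursion measures and the speaker ranking just described can be sketched as follows. The per-speaker means are those listed above; the single accent passed to `excursions` is a hypothetical example, and the names are illustrative.

```python
def excursions(on_glide, peak, off_glide):
    """Return (on-glide-to-peak rise, peak-to-off-glide fall) in percent."""
    return peak - on_glide, peak - off_glide


# Hypothetical accent: on-glide -3%, peak 18%, off-glide -15% of mean F0.
print(excursions(-3.0, 18.0, -15.0))  # (21.0, 33.0)

# Mean excursions reported per speaker: (On-glide/Peak, Peak/Off-glide).
MEAN_EXCURSIONS = {
    1: (12.15, 23.53),
    2: (18.00, 29.46),
    3: (17.61, 36.76),
    4: (26.15, 44.48),
}

# Speakers ordered from least to most pitch movement on each measure.
by_rise = sorted(MEAN_EXCURSIONS, key=lambda s: MEAN_EXCURSIONS[s][0])
by_fall = sorted(MEAN_EXCURSIONS, key=lambda s: MEAN_EXCURSIONS[s][1])
print(by_rise)  # [1, 3, 2, 4]
print(by_fall)  # [1, 2, 3, 4]
```

On both measures speaker 1 comes first and speaker 4 last, while speakers 2 and 3 swap places between the two orderings.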
To sum up, the characterization of Spanish focal accents would be as follows:
There is no difference between focal accents that indicate contrast and those that signal new information. The only significant difference to be found, that related to peak height, disappears if the extra high accent produced by speaker 4 in sentence 10 is ignored. Syllable structure is related to on-glide height. On-glides which occur on unstressed syllables preceding the accented one are significantly lower than those which occur on the accented syllable itself. It seems that in the absence of unstressed material before the accent the on-glide starts higher, probably because preceding unstressed syllables, when present, give the pitch more room to attain the desired peak height from a lower starting point. On the other hand, on-glides are also related to peak heights. There is a positive, albeit weak, correlation between both variables, so that higher peaks tend to be realized with higher on-glides. This could also be interpreted the other way around: peak height is related to on-glide height in the sense that higher on-glides tend to result in higher peaks. However, we favour the idea that it is the peak which influences the on-glide, since it seems more reasonable to assume that speakers decide what height to give to an accented syllable rather than the height of the break point before it.
For off-glides, only the "speaker" variable showed significant differences. It is worth noting, however, that off-glides present less standard deviation than the other two points.
As for speaker bound trends, it can be seen that Speaker 4 produces the highest peaks, even if his extra high production in sentence Nº 10 is excluded, whereas speaker 1 produces peaks lower than the other three speakers. If we take peak height to be a measure of emphasis it can be concluded that speaker 4 was the most emphatic whereas speaker 1 was being the most restrained. Additionally, if emphasis is seen as manifested by pitch excursion, it is apparent that again speaker 4 is the most emphatic and speaker 1 the least so.
The highest on-glides and off-glides are produced by speaker 2. There are not enough data to conclude whether this is related to the fact that it is a female speaker.
4.2 English Focal Accent Characterization
According to the procedures detailed above, the following values (table 3) were obtained for the focal accents:
Table 3. English focal accent characterization in percentages of the speaker's mean fundamental frequency.
Pitch values above the mean fundamental frequency (0%) by +10% or more will be considered high; values below the mean by -10% or more will be considered low for English. Mid will be the label applied to those values that depart from the mean by less than ±10%, that is, by a maximum of ±9%. The choice of ±9% as a cut off point was decided in the manner indicated above.
According to the system of labels proposed, subject sentences would be characterized as follows (table 4):
Table 4. English focal accent labels.
As can be seen in the table above, all focal accents are realized as falls from high to low. On-glides to those falls vary according to syllable structure. Thus, in the first three sentences, in which the accented syllable was not preceded by any other syllables, there is a small rising glide onto the accent which starts at approximately the same level, high. In sentences 4, 5 and 6 the accented syllable is preceded by unstressed syllables, which provide more material for the speaker to attain the high level required in the accent in a more gradual manner. Therefore the speaker starts at her mid or slightly lower than mid level and glides up to the accented high syllable. More particularly, there is a tendency for the actual pitch at which the on-glide starts to be related to the height attained in the peak too. Thus, the sentence with the highest peak (nº 3) is also the one with the highest on-glide. This is, however, only a tendency.
On the other hand, off-glides do not seem to be related either to peak height or to syllable structure. There is altogether, as can be seen in figure 2, far less variation in the pitches attained in the off-glides (standard deviation: 4.446). The speaker usually goes well below her mid register. Sentence number one is the only exception: its off-glide is just below the mid register. This would be an example of a non-low fall (Cruttenden 1986: p. 53), in which the pitch descends from high to a level just below mid. The off-glide is realized in the second (sentences 1 and 6) or first (sentences 2, 3, 4 and 5) unstressed syllable after the accented one. Neither alternative seems to be related to syllable structure or to peak height.
Figure 2. English focal accent realisation in percentages of the speaker's mean fundamental frequency. The bars represent +/- standard deviation about the mean.
The following comparison between accent realization values in the two languages studied must be of a tentative nature because the amount of English data analyzed was considerably more limited than the data for Spanish. Nevertheless, our data for English largely agree with descriptions found in the literature (for instance Kingdon 1958, Schubiger 1958, O'Connor and Arnold 1973 and Cruttenden 1986) and therefore we presume that our speaker's production is quite a good representative of an RP accent.
Considering normalized percentage values, there appear to be important differences between focal accent mean values in the two languages. These differences are most striking as far as the on-glides (English= 12.50%; Spanish= -2.92%) and peaks (English= 31.33%; Spanish= 18.54%) are concerned. Off-glides are not too dissimilar (English= -19.17%; Spanish= -15.02%). Indeed, it is in accent off-glides where the languages are most similar. In neither case were off-glides seen to be related to linguistic variables such as syllable structure or preceding peak height. In both languages off-glides are the points which show least variation as compared to the other two nodes.
Therefore, in English focal accents show much wider movements to and from the accented syllable than Spanish ones. It might be thought that this difference is attributable to the fact that in English we were considering a female speaker whereas in Spanish there were three males and one female speaker. Although this fact may have had some bearing on the results, the Spanish female values also differ considerably from those of the English speaker. To a certain extent, these differences may be due to the two languages having different pitch ranges (see above) but I consider them too large to be totally attributed to this factor. Presumably, either emphasis or other reasons must be partly responsible for the larger pitch prominence found in English focused items.
On-glides to focal accents in the two languages are related to syllable structure since they are higher when they are realized on the accented syllable itself and lower if they occur on preceding unstressed syllables. It is suggested that when there are unstressed syllables preceding the accent, on-glides start lower because there is more time for the pitch to attain the desired peak height. Additionally, there is a tendency in both languages for on-glides to be related to peak heights too.
Descriptive labels show more similarities. In both English and Spanish the preferred realization for focal accents is a fall from high to low. In English there is only one realization that goes only just below the mid register, whereas in Spanish falls to mid or just below mid are much more frequent. Nevertheless, these shallow falls correspond mainly to one speaker, so that they may be an idiosyncratic rather than a language characteristic. Therefore it seems reasonable to conclude that in English and Spanish subject focal accents usually fall to points well down in speakers' low registers. According to O'Connor and Arnold (1973), falls would be "high falls" in all the English subject sentences and in a majority of the Spanish ones, there being also five cases of accents that might be classified as low falls and one accent which constitutes an example of an "arrested" fall (Hultzén 1964).
Appendix
English sentences (target sentence / prompt question)
1. Isabel paid the waiter / Who paid the waiter?
2. Andy came for a meal / Who came for a meal?
3. I ordered those dishes / Who ordered those dishes?
4. My neighbour gave a reward / Who gave a reward?
5. Miranda studies languages / Who studies languages?
6. The boy plays the violin / Who plays the violin?
Spanish sentences (target sentence / prompt question)
1. 5. Iñigo vino a la comida / ¿Quién vino a la comida?
2. 6. Ella pidió un helado / ¿Quién pidió un helado?
3. 7. Nuria toma manzanilla / ¿Susana toma manzanilla?
4. 8. Alvaro tiene fiebre / ¿Pedro tiene fiebre?
5. 9. Diego correrá el maratón / ¿Alguien correrá el maratón?
6. 10. Luis bebe agua / ¿Quién bebe agua?
7. 11. Yo llevé la bebida / ¿Quién llevó la bebida?
8. 12. La niña tiene sueño / ¿Quién tiene sueño?
9. 13. Don Manuel criaba ganado / ¿Quién criaba ganado?
10. 14. Yolanda cenó verdura / ¿Cristina cenó verdura?
11. 15. El lobo huyó al monte / ¿El jabalí huyó al monte?
12. 16. El Mar olía muy bien / ¿Qué olía muy bien?
13. 17. Miguel tiene hambre / ¿Tomás tiene hambre?
I would like to thank all the staff in Wolfson House for helping me so much with these analyses.
Alarcos Llorach, E., 1950, Fonología Española, 4th edn. 1971, Madrid: Gredos.
Bolinger, D.L., 1954, "English prosodic stress and Spanish sentence order", Hispania, 37: 2, 152-156.
Contreras, H., 1978, El Orden de Palabras en Español, Madrid: Cátedra.
Cruttenden, A., 1986, Intonation, Cambridge: Cambridge University Press.
House, J., 1981, Links Between Intonation and Information, Unpublished Post-Graduate Diploma Dissertation, PCL. School of Languages, London.
Hultzén, L.S., 1964, "Grammatical intonation", In Honour of Daniel Jones, 85-95, D. Abercrombie, D.B. Fry, P.A.D. MacCarthy, N.C. Scott, and J.L.M. Trim (eds.), London: Longman.
Navarro Tomás, T., 1948, Manual de Entonación Española, Madrid: Guadarrama.
Ortiz Lira, H., 1994, A Contrastive Analysis of English and Spanish Sentence Accentuation, Unpublished Ph.D. thesis, University of Manchester, Manchester.
Stockwell, R.P. and J.D. Bowen, 1965, The Sounds of English and Spanish, Chicago: University of Chicago Press.
© 1996 Mª Luisa García Lecumberri