Department of Phonetics and Linguistics


Frederika HOLMES

1. Introduction
The experiment described here formed part of a series of tests designed to investigate the nature of language representation in bilinguals and second language learners.

For individuals who have acquired two languages to a high level, and who use both languages on a regular basis, the question arises to what extent the two language systems are separate, and to what extent they share mental representations and mental processes. Current research in this area is no longer informed by a desire to prove "independent" or "interdependent" representation: clearly bilingual ability must involve both. On the one hand, bilinguals can generally translate freely between their languages and often suffer interference from one to the other, indicating the connectedness of the two systems; on the other hand, many are able to use each language without apparent interference, indicating that each is able to function separately. Experimental evidence exists to support both these models (see, for example, Caramazza et al., 1974; Grosjean, 1985; 1989; Guttentag et al., 1984; Macnamara, 1967; Soares & Grosjean, 1984).

It appears that the coexistence of two languages in an individual is a complex phenomenon in which the type of task used in experimental investigation, as well as circumstances of language acquisition and language usage, individual variability and the inherent characteristics of the particular languages concerned all play a part.

The present study was designed to investigate cross-language influences at the lexical level of processing. The starting point was an experiment by Altenberg & Cairns (1983) which examined the use of phonotactic constraints during an on-line lexical decision task. The present investigation used a similar technique, but attempted to correct some fundamental flaws in Altenberg & Cairns' design.

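Altenberg & Cairns' test made use of a corpus of monosyllabic nonwords with varying phonotactic status in English and German. These nonwords were grouped into four categories, according to the phonotactic status of the initial consonant cluster: legal in both languages (LL), legal in English but illegal in German, legal in German but illegal in English, and illegal in both languages (II).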

In addition, four classes of analogous real words were used; these consisted of real words in one language which were either phonotactically legal in both languages (e.g. FLAT for English, FLUG for German) or else were illegal in the other language (e.g. THRIFT for English, ZWANG for German).

Subjects were presented with these stimulus items individually via a computer screen, and were asked to press keys labelled Yes or No according to whether they judged each item to be a real word or not. Their decisions and the time taken to react were recorded.

With monolingual subjects the general finding for this type of test is that phonotactically illegal nonwords are rejected faster than legal ones, since a string beginning with a cluster which is illegal in a particular language will be blocked at an early stage of lexical retrieval. For bilingual subjects, Altenberg & Cairns found that reaction time was affected not just by the phonotactic status of an item in the test language, but by its status in the latent language as well, as shown by longer RTs for items which were illegal in the test language but legal in the other language. The delay in rejecting such items as fast as would be expected for a monolingual speaker suggests that for bilingual subjects phonotactic information from the latent language is activated even when it is not relevant to the task and that this level of language at least is not characterised by a strong degree of separation between the two language systems.

However, there are several problems with the design of Altenberg & Cairns' experiment, which must cast some doubt on the validity of the results. These criticisms relate to the corpus of stimulus material used in the experiment, and to the selection of subjects for testing.

The nonword stimulus material used by Altenberg & Cairns is given in Table I below.

Table I

Nonword items used in lexical decision task by Altenberg & Cairns (1983)
Legal in both (LL) | Legal in | Legal in | Illegal in both (II)

Primary among the corpus-based problems is the claim that "other than the initial consonant cluster, the rest of the word was always legal in both languages". Since we are dealing with written rather than auditory material here, the problem of phonetic differences between the two languages does not arise, but there is an analogous difficulty which arises from the failure of Altenberg & Cairns to make and maintain a clear distinction between phonemes (units of sound) and graphemes (units of writing).

For example, the initial grapheme sequences SPR- and STR- are classified as legal in both languages; however, although they occur as legal grapheme sequences in both languages, they represent different phonemes, namely /Spr-/ and /Str-/ respectively in German, and /spr-/ and /str-/ respectively in English.

Analogous problems occur with grapheme sequences such as TW-, realised in English as /tw-/ which does not occur in native German words but is common in loanwords such as twin, twen etc., where it is realised as /tv-/. A similar problem occurs in reverse with the German initial sequence ZW-, realised as /tsv-/, which is not found in English words, but is not unfamiliar to English speakers through its use in proper names, where its pronunciation is anglicised to /zw-/.

Similar problems were found with the medial vowels used in the test items, where digraphs such as AU or EI represent different vowels in English and German, and also with the final consonant of the test words, where graphemes such as -Z or -K are of questionable legality in at least one language, as well as representing different phonemic entities in each.

Non-linearities such as these must call into question the validity of results gained with this material since there is no way for the experimenter to ascertain whether subjects were responding to the stimuli as graphemes or as phonemes and, in the latter case, which of the possible phonemic implementations the subjects were using. In addition, there are inconsistencies between the two languages; again, the experimenter cannot know which version in which language bilingual subjects are using to form their judgments.

A different kind of flaw in the design of Altenberg & Cairns' experiment is its unidirectional focus. Control subjects were used for the English condition only, which means that no conclusions can be drawn about the performance of bilinguals in the German condition relative to monolingual standards. A further problem is that only group results are provided for a very heterogeneous subject population, consisting of childhood bilinguals and L2 learners with widely differing ages of acquisition.

2. Method
2.1 Test material
The overall structure of the stimulus material remained the same, with a total of 48 nonwords in four categories (one category of words legal in both languages, one illegal in both, and one each legal in English and German respectively, but illegal in the other language). Another 48 real words were used, divided into four categories: real English words legal and illegal in German, and real German words legal and illegal in English (see Tables II and III below). Several adaptations were made to remedy the most serious of the problems outlined above. Not all were easily soluble, and in some cases problematic material was retained in the interests of keeping a reasonable body of stimulus material.

  1. Adjustments were made to initial clusters to ensure that their classification was accurate and where possible grapheme-to-phoneme ambiguities were resolved.
  2. Medial vowel digraphs (and the diphthongs they may represent) were replaced with single vowel graphemes representing monophthongs.
  3. The final consonant was changed so that it was legal both as a grapheme and a phoneme in both languages.
  4. Finally, any items which became real words through these adjustments were replaced by analogous nonwords, and care was taken to ensure an equal distribution of vowel sounds and final consonants.

The adapted set of stimulus material used for the tests is shown in Tables II and III below.

Table II

Adapted nonword stimulus material for lexical decision task
Legal in | Legal in | Legal in | Illegal in

Table III

Real words used in lexical decision task
English words: Legal in | Legal in
German words: Legal in | Legal in

2.2 Subject selection
In selecting subjects for testing, theoretical as well as practical criteria were taken into account. In the context of the wider test battery of which the present experiment formed a part, the aim was to test the validity of the concept of dominance across a range of competences and language acquisition patterns. As a result, the bilingual subject population was relatively heterogeneous.

12 bilingual subjects were tested. No rigid distinction was made between childhood bilinguals and those who had become bilingual later in life. The criteria for selection were: a high degree of fluency in both languages; that subjects should use both their languages on a regular, although not necessarily daily, basis; that they should have spent time living in both countries; that their English accent should fall within the range of Standard Southern British, and their German accent should be a standard northern German one (Hochdeutsch). The mean age of the bilingual subjects was 39 years. Three were childhood bilinguals, who had acquired both languages before the age of five; the others had begun learning the second language at school and had attained a high level of fluency in adulthood. Most were living in London on a long-term basis; all but one were tested in London.

The bilingual subjects were matched for relative strength of their two languages using a language-dominance scoring system and subjects' own evaluation of their dominance. On this basis, 6 subjects were judged to be stronger in English and 6 stronger in German.

12 monolingual controls were tested, 6 English and 6 German. The English monolinguals were students and members of staff at the Phonetics Department, University College, and the German monolinguals were students and members of staff at the Fachbereich Computerlinguistik at the Universität des Saarlandes in Saarbrücken. All monolingual subjects had lived all their lives in their country of origin, and had parents who were both native speakers of their language. All the monolingual subjects had basic phonetic training. None of the English monolinguals had any significant knowledge of German; inevitably it was difficult to find German monolinguals without some knowledge of English, but those who were accepted as subjects had a chiefly passive knowledge, and claimed to use their English very rarely. The English monolinguals had a mean age of 30 years; the German monolinguals a mean age of 26 years.

2.3 Test procedure
The complete set of real word and nonword stimuli was randomised, and four training items (two nonwords, one real English word and one real German word) added at the beginning to give a set of 100 test items.
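The assembly of the trial list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the original reaction-time program is not described in detail, the function name `build_trial_list` and the seed are hypothetical, and the stimulus strings are placeholders rather than the actual test items.

```python
import random

def build_trial_list(nonwords, real_words, training_items, seed=None):
    """Shuffle the test stimuli together and put the training items first.

    Hypothetical helper: the paper only states that the stimuli were
    randomised and that four training items were added at the beginning.
    """
    rng = random.Random(seed)
    test_items = list(nonwords) + list(real_words)
    rng.shuffle(test_items)  # real words and nonwords are mixed together
    return list(training_items) + test_items

# Placeholder stimuli (NOT the items used in the experiment).
nonwords = [f"nonword_{i:02d}" for i in range(48)]
real_words = [f"word_{i:02d}" for i in range(48)]
training = ["train_nonword_1", "train_nonword_2",
            "train_english_word", "train_german_word"]

trials = build_trial_list(nonwords, real_words, training, seed=1994)
print(len(trials))  # 100
```

Fixing the seed makes the order reproducible; whether every subject saw the same order is not stated in the text.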

This corpus was used as input to a reaction-time measurement program, which displayed one item at a time on the screen. Subjects were told that their task was to decide as quickly as possible whether each item was or was not a real word in their language, and to respond by pressing the key on the computer keyboard labelled yes or no. In each condition the labels on the keys were appropriate to the language in which the test took place. Subjects were told that the computer would measure the time taken to reach a decision.

Testing took place over one session for the monolingual subjects, and two sessions for the bilingual subjects, one in each language. Half the bilinguals were tested in English first, the other half in German first; in any case, the two sessions were conducted at least a fortnight apart. All instructions/conversation took place in the language being tested. Testing took place in soundproofed rooms at the Phonetics Department, University College London and in the Fachbereich Computerlinguistik at the University of Saarbrücken from April to July 1994. Subjects were paid for their participation, and reimbursed for travel expenses.

3. Results
The group results for the bilingual subjects do not provide a clear impression of performance, since the bilinguals were not a homogeneous group, and the mixing within the group of subjects with different dominance patterns and different degrees of bilingualism means that potentially significant effects may cancel each other out. However, the group results do give an idea of the range of performance among the bilinguals, and of the amount of variance present in comparison with the monolingual controls.

3.1 Raw data
Individual reaction time scores were assembled and collated to give mean RTs per subject group in each of the different stimulus categories (Table IV below). Only the RTs for correct responses were included in these calculations. The following four figures (I-IV) display the results graphically. Different categories of stimulus are coded as follows:

Code  Status                                                  Example
LL    legal in both languages                                 FLID
LI    legal in test language, illegal in other language       THRESS in English, ZWUCK in German
IL    illegal in test language, legal in other language       ZWUCK in English, THRESS in German
II    illegal in both languages                               TLON
LO    legal in test language, real word in other language     FLUG in English, FLAT in German
IO    illegal in test language, real word in other language   ZWECK in English, THRIFT in German

Table IV

Mean reaction times (secs) for nonword stimulus categories in lexical decision task
English condition | German condition | English condition | German condition
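The collation step described in section 3.1 (mean RT per subject group and stimulus category, correct responses only) might look roughly as follows. This is a sketch under assumptions: the record layout, the function name and the sample values are invented for illustration, and the paper does not say whether RTs were first averaged within each subject.

```python
from collections import defaultdict

def mean_rt_by_group_and_category(trials):
    """Return {(group, category): mean RT}, ignoring incorrect responses."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for trial in trials:
        if not trial["correct"]:
            continue  # only RTs for correct responses enter the means
        key = (trial["group"], trial["category"])
        sums[key] += trial["rt"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Invented example records (NOT data from the experiment).
trials = [
    {"group": "bilingual", "category": "LL", "rt": 0.90, "correct": True},
    {"group": "bilingual", "category": "LL", "rt": 1.10, "correct": True},
    {"group": "bilingual", "category": "LL", "rt": 2.50, "correct": False},
    {"group": "monolingual", "category": "II", "rt": 0.50, "correct": True},
]

means = mean_rt_by_group_and_category(trials)
for key, value in sorted(means.items()):
    print(key, round(value, 4))
```

The incorrect 2.50 s response is excluded, so the bilingual LL mean here is based on the two correct trials only.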

Figure I

Box plot of RT for legal categories: monolinguals and bilinguals English condition

Figure II

Box plot of RT for illegal categories: monolinguals and bilinguals English condition

Figure III

Box plot of RT for legal categories: monolinguals and bilinguals German condition

Figure IV

Box plot of RT for illegal categories: monolinguals and bilinguals German condition

3.2 Statistical analysis
A two-way General Linear Models procedure, which is more suitable than ANOVA for unbalanced groups, with one between factor (language) and one within factor (the stimulus category) was performed on the data for the monolingual subjects.

The main effect of stimulus category was highly significant (F(5,25)=7.47, p<0.001), confirming the finding described above that monolinguals reacted differently to stimulus items according to their phonotactic legality in the test language. Comparisons of means for different stimulus categories were made using the post-hoc Duncan-Waller range test. The resultant groupings are shown below: underlined categories are not significantly different from each other.

Table V

Reaction times for stimulus categories: monolingual subjects

A GLM procedure was computed with two between factors (subject group (monolingual vs bilingual) and language (English vs German)) and one within factor (stimulus category) to compare the reaction times of monolingual and bilingual subject groups to the different stimulus categories for each language condition.

The first finding was that the main effect of subject group was significant (F(1,80)=38.55, p<0.001), indicating that the performance of the bilinguals differed significantly from that of the monolinguals.

The interaction of subject group with stimulus category was also significant (F(5,105)=2.85, p<0.05), confirming that the monolinguals and the bilinguals showed different patterns of reaction time according to the phonotactic status of the test item. Duncan-Waller groupings for bilingual subjects in each of the two language conditions confirmed that the bilinguals' reactions to the various categories of stimulus were more complex than was the case for the monolinguals. The bilingual groupings are given below.

Table VI

Reaction times for stimulus categories: bilingual subjects

4. Analysis and discussion

4.1 Analysis of group data
The statistical analyses confirm that bilinguals differed from monolinguals in their processing of the phonotactic information, showing that information from the latent language affected processing of the test language. This cross-influence is shown firstly in the longer overall reaction times for bilingual subjects, which are assumed to be a correlate of a search of two lexica, or of an extended lexicon, before an item can be rejected, a finding which corresponds to that of Soares & Grosjean (1984). It also suggests that the difference between the monolinguals and the bilinguals in the present study was more robust than that found by Altenberg & Cairns, who found no overall difference in reaction time between their English-dominant bilingual subjects and the English monolingual controls. However, as discussed above, Altenberg & Cairns provide neither results from subjects with different dominance patterns nor a comparison with monolinguals of the other language (as undertaken in the present study), an omission which is likely to have affected their results.

Cross-language influence is also shown in the breakdown of the clear two-way groupings of different stimulus categories found with both groups of monolinguals. As with the monolinguals, the three legal categories (LL, LI and LO) prompted the longest reaction times but, unlike the monolinguals, the bilinguals showed no clear division between legal and illegal categories. This corresponds to Altenberg & Cairns' finding that the bilinguals showed different patterns of responses from the monolinguals despite the lack of difference in overall RT.

Reactions to the LO category are particularly interesting. This category consisted of items which were real words in the latent language but not in the test language. Despite the longer overall reaction times, items in this category were rejected relatively more quickly by the bilinguals than by the monolinguals, suggesting that the bilinguals' knowledge of the status of the item in the latent language was cutting short the search of the lexicon in the test language. In the English condition, this knowledge was sufficient for the LO category to be grouped statistically with the illegal rather than the legal categories; in the German condition the groupings shown by the Duncan-Waller range test are rather more complicated, but LO is part of a grouping which includes illegal as well as legal categories, a pattern which can be explained only by cross-influence from the bilinguals' latent language.

These findings are in accordance with the findings of other studies that bilinguals are unable completely to exclude information from the latent language, even when it is clearly irrelevant to the task (Hamers, 1973; Altenberg & Cairns, 1983; Soares & Grosjean, 1984). These studies also suggest that different levels of activation of the latent language are possible, and that factors such as language-set, frequency of occurrence of lexical items and use of linguistic precursors might affect the readiness with which information from the latent language was activated.

The present findings show that, for bilinguals as a group, phonotactic information from both languages is accessed when monitoring for items in one language only. They further show that, although the effect of this latent language information seems to be an increase in overall reaction times, bilinguals were also able to use this information to their advantage, as when LO items were rejected relatively more quickly than was the case for monolinguals. However, the group results do not show us to what extent these behaviours were true of all bilinguals, or whether individual subjects differed in the strategies they employed for processing test-language and other-language information.

4.2 Analysis of individual data
The principal problem with group analyses of the type described above for the results of the bilingual subjects is that they fail to reflect in a useful way the amount and the nature of the within-group variability. Table IV indicates that the results for the bilinguals in most of the categories were subject to a substantially greater degree of variation than was the case for the monolingual subjects. It seems likely that for such interactions as subject group x stimulus category, the rather complex and opaque groupings produced by the Duncan-Waller procedure are concealing a variety of important and interesting sources of variance. Possible sources of such variance include the language-dominance and, if applicable, the degree of such dominance, of the bilingual subjects, as well as various factors relating to language-acquisition and patterns of language-use such as age of acquisition and length of residence.

In order to investigate these factors further, a GLM analysis was performed on all subjects, monolingual and bilingual, with one between factor (language) and one within factor (subject), and a Duncan-Waller grouping for individual performances in each language condition was obtained. The intention was to provide a better insight into the natural groupings into which the subjects fell: by analysing individual bilinguals together with individual monolinguals it would be possible to see whether some bilinguals performed like monolinguals, or whether all bilinguals formed a separate group together. In the light of the significant effect of overall RT between bilinguals and monolinguals it was hoped that comparing RTs across different stimulus categories would make it possible to pinpoint individual differences among the subjects.

For both language conditions the main effect of individual subject was highly significant, at the level F(17,85)=10.95, p<0.0001 in the English condition, F(17,85)=15.83, p<0.0001 in the German condition, indicating that individual subjects differed significantly from one another in their overall response times to the nonword stimuli. The Duncan-Waller groupings for each subject condition are given below. The figure given for each subject represents the mean reaction time across all stimulus categories.

Table VII

Cross-category RTs for individual subjects [1]

English condition          German condition
Subject  Mean RT (secs)    Subject  Mean RT (secs)
b02      1.5067            b02      1.1666
b04      1.4650            b04      0.9050
b06      1.2633            b05      0.8783
b09      1.0200            b07      0.8066
b03      0.9550            b06      0.8066
b05      0.8850            b10      0.7733
me01     0.8217            b03      0.7333
b07      0.8100            b11      0.7266
b10      0.7800            mg06     0.6816
me02     0.7583            mg02     0.6750
b01      0.7517            b01      0.6733
b08      0.7417            b09      0.6633
me04     0.7383            b08      0.6400
b11      0.7300            mg04     0.6316
me06     0.7233            b12      0.5866
me05     0.5800            mg03     0.5800
b12      0.5633            mg05     0.5766
me03     0.5633            mg01     0.5333

[1] The codes for individual subjects are as follows: bilingual subjects are indicated by the code 'b' followed by a number from 1 to 12; monolingual subjects are indicated by the initial code 'm' followed by 'e' or 'g' for English or German respectively, and a number from 1 to 6.

These results show a complex set of groupings, but some features stand out quite clearly.

There appears to be a considerable amount of individual variability in the reaction time across all categories of stimulus. Some subjects (b02, b04) showed overall far longer response times than other bilingual subjects, and maintained this pattern across both language conditions. Other bilingual subjects maintained consistently faster response times across both language conditions (b12, b08). If a faster overall response time is taken as a correlate of the efficiency with which negative influence from the latent language is kept to a minimum, then these results suggest that some subjects are able to effect a more complete separation between the two language systems than others, and that this ability has a positive effect on processing efficiency in both languages. Subsequent correlation analysis showed that overall reaction times for individual subjects were positively correlated across language conditions, suggesting that the degree of cross-language influence manifested by individual subjects on this task was not a language-specific phenomenon, but possibly a higher-level strategy for keeping the two languages separate during processing. Unfortunately, comparisons with the findings of other studies are not possible here, since other authors have not undertaken analysis of the individual variability within their group results.
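As an illustration of the kind of cross-language correlation mentioned above, the following sketch computes a Pearson coefficient from the bilingual subjects' mean RTs as listed in Table VII. It is an assumption that the correlation analysis reported in the text was of this form; the paper does not specify the coefficient or the exact data it used, and `pearson_r` is a helper written for this example.

```python
import math

# Mean RTs (secs) per bilingual subject from Table VII: (English, German).
TABLE_VII_BILINGUALS = {
    "b01": (0.7517, 0.6733), "b02": (1.5067, 1.1666),
    "b03": (0.9550, 0.7333), "b04": (1.4650, 0.9050),
    "b05": (0.8850, 0.8783), "b06": (1.2633, 0.8066),
    "b07": (0.8100, 0.8066), "b08": (0.7417, 0.6400),
    "b09": (1.0200, 0.6633), "b10": (0.7800, 0.7733),
    "b11": (0.7300, 0.7266), "b12": (0.5633, 0.5866),
}

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

english = [e for e, _ in TABLE_VII_BILINGUALS.values()]
german = [g for _, g in TABLE_VII_BILINGUALS.values()]
r = pearson_r(english, german)
print(f"r = {r:.2f}")
```

On these twelve pairs the coefficient comes out positive, in line with the direction of the relationship described in the text; no significance test is attempted here.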

This analysis is confirmed by the Duncan-Waller groupings shown above. If the longest response time for any monolingual subject is taken as a cut-off point between bilingual and monolingual performance, it is clear that some bilinguals have scores that group them with monolingual subjects (grouping D for the English condition; grouping G in German condition), while those whose mean response times were longer than those of the slowest monolingual form a separate group. These individual groupings are striking evidence of the variability concealed within the group results given earlier; although as a group the bilinguals showed longer response times, these individual statistics show that some bilinguals were functioning within monolingual norms while others were clearly outside the monolingual range. As indicated above, some individuals were unable to match monolingual performance in either language, while others were within the monolingual range in both conditions, suggesting that cross-influence from the latent language, at least on this type of task, is not a language-specific phenomenon, but rather the correlate of some higher-level processing strategy.
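The cut-off idea described above (the slowest monolingual in each condition as the boundary of the monolingual range) can be sketched with the Table VII means. Note the assumption: this is a simple threshold on mean RTs, not the Duncan-Waller range test, so the resulting sets need not coincide with the statistical groupings reported in the paper. The function name is hypothetical.

```python
# Mean RTs (secs) from Table VII, one dictionary per language condition.
ENGLISH = {
    "b02": 1.5067, "b04": 1.4650, "b06": 1.2633, "b09": 1.0200,
    "b03": 0.9550, "b05": 0.8850, "me01": 0.8217, "b07": 0.8100,
    "b10": 0.7800, "me02": 0.7583, "b01": 0.7517, "b08": 0.7417,
    "me04": 0.7383, "b11": 0.7300, "me06": 0.7233, "me05": 0.5800,
    "b12": 0.5633, "me03": 0.5633,
}
GERMAN = {
    "b02": 1.1666, "b04": 0.9050, "b05": 0.8783, "b07": 0.8066,
    "b06": 0.8066, "b10": 0.7733, "b03": 0.7333, "b11": 0.7266,
    "mg06": 0.6816, "mg02": 0.6750, "b01": 0.6733, "b09": 0.6633,
    "b08": 0.6400, "mg04": 0.6316, "b12": 0.5866, "mg03": 0.5800,
    "mg05": 0.5766, "mg01": 0.5333,
}

def within_monolingual_range(condition):
    """Bilinguals whose mean RT does not exceed the slowest monolingual's."""
    cutoff = max(rt for subj, rt in condition.items() if subj.startswith("m"))
    return {subj for subj, rt in condition.items()
            if subj.startswith("b") and rt <= cutoff}

english_ok = within_monolingual_range(ENGLISH)
german_ok = within_monolingual_range(GERMAN)
bilinguals = {subj for subj in ENGLISH if subj.startswith("b")}

both = sorted(english_ok & german_ok)        # within range in both conditions
one_only = sorted(english_ok ^ german_ok)    # within range in one condition only
neither = sorted(bilinguals - english_ok - german_ok)

print("both:", both)
print("one only:", one_only)
print("neither:", neither)
```

With this threshold, the subjects falling within the monolingual range in one condition only are b07, b09, b10 and b11, the same four subjects named in the text as showing native-like performance in one language but not the other.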

Some language-effects were apparent, however, since some subjects (b10, b11, b09, b07) showed a native-like performance in one language but not in the other.

5. General discussion and conclusions
Within the context of the wider test battery of which the present experiment formed a part, there was a wide variation in the degree of cross-linguistic interference found at different levels of processing. The present investigation, which tested on-line processing of phonotactic information in a lexical decision task, seemed to tap the level that was most robust in the face of cross-language influence.

When compared with the performances of monolingual speakers, three subjects performed within monolingual standards on this test, while six others were outside monolingual standards in both languages; in only three subjects was there evidence of linguistic bias towards a dominant language in the form of a native-standard performance in one language and non-native standard in the other.

A cross-language correlation analysis showed that performances in this test were positively correlated across languages, with higher performance in one language associated with a higher performance in the other. However, this finding does contrast with the cross-language correlation results of other types of test in the battery, where there was a strong tendency for bilinguals to perform within native standards in the dominant language, and outside native limits in the second language.

It seems likely that the results found here stem from inherent characteristics of the type of test and the nature of the processing level they measure. It can be hypothesised that this test might have been measuring some supralinguistic facility for separating the two languages, of which the performance correlate was a faster reaction time, and that this facility might be more developed in some subjects than in others. Certainly, for the purposes of the present analysis, it seems that the lexical level, at least as measured by a lexical decision task of this type, may be relatively resistant to cross-language influence, so that subjects who otherwise have pronounced dominance patterns here seem relatively balanced across their languages, regardless of whether this balance was within or outside monolingual norms.

These findings, both alone and within the context of the test battery of which they form a part, have important theoretical implications for models of second language learning. In particular, they provide clear counter-evidence to the maturational state hypothesis, since the concept of an age-related limitation on language learning predicts a pattern of performance characterised by a dominant L1, spoken to native standards in all modalities, and a subordinate L2, which can never be acquired to L1 standards if acquired after the critical period has passed. Such a model does not allow for the complex patterns of cross-language influence found here. In particular it does not allow for the development, such as that posited here, of an improved ability to keep the two languages functionally separate during on-line processing.

It seems that an awareness of the multiplicity of forms which cross-language interaction may take, and of the potential separateness of the different levels of encoding involved in speaking and listening, is crucial to an appropriate characterisation of the processes of second language acquisition, and that future work should build such an awareness into the structure and design of research investigations.

6. Bibliography
Altenberg, E.P., & Cairns, H.S. (1983) The effects of phonotactic constraints on lexical processing in bilingual and monolingual subjects. Journal of Verbal Learning and Verbal Behaviour, 22: 174-188.

Caramazza, A., Yeni-Komshian, G. & Zurif, E. (1974) Bilingual switching: The phonological level. Canadian Journal of Psychology, 28 (3): 310-318.

Grosjean, F. (1985) The bilingual as a competent but specific speaker-hearer. Journal of Multilingual and Multicultural Development, 6: 467-477.

Grosjean, F. (1989) Neurolinguists beware! The bilingual is not two monolinguals in one person. Brain & Language, 36: 3-15.

Guttentag, R. et al. (1984) Semantic processing of unattended words by bilinguals: A test of the input switch mechanism. Journal of Verbal Learning and Verbal Behaviour, 23: 178-188.

Hamers, J. F. (1973) Interdependent and Independent States of a Bilingual's Two Languages. Unpublished PhD Diss. (McGill University, Montreal).

Macnamara, J. (1967) The linguistic independence of bilinguals. Journal of Verbal Learning and Verbal Behaviour, 6: 729-736.

Soares, C. & Grosjean, F. (1984) Bilinguals in a monolingual and bilingual speech mode: The effect on lexical access. Memory and Cognition, 12: 380-386.

© 1996 Frederika Holmes
