Eton F. Churchill, Jr.

Any intelligent agent incarnated in matter, working in real time, and subject to the laws of thermodynamics must be restricted in its access to information. (Pinker, 1997, p. 138)

Restricted access to information is indeed the bane of many a language learner. Language learners, as "agents incarnated in matter", are necessarily restricted by environmental factors in the quantity and quality of input they receive. As beings following the laws of thermodynamics, they are further restricted by internal factors and by the constraints of real time in the amount and nature of the input that becomes intake available for further processing. Furthermore, when attempting to convey meaning in their second language, these information processing agents are impaired not only by the limited nature of their current interlanguage system but also by limits on their ability to match ideas, in real time, to their currently available interlanguage syntax, lexicon and phonology. Finally, the interlanguage system by definition is limited and, as such, inhibits the facility with which a given string of intake may add to or restructure the learner's knowledge of the language. Aside from environmental factors, all of the impediments to information access listed above deal with language processing. In studies in cognitive psychology, the construct that is considered to be most central to language processing is working memory (Baddeley, 1986; Daneman, 1991; Daneman & Merikle, 1996; Gathercole, Willis, Emslie & Baddeley, 1992; Miyake & Friedman, in press; Service, 1992). It should thus follow that the language learner's access to information would be significantly moderated by working memory.

The purpose of this paper is to investigate theoretically the contributions that working memory may make to the language learning process. To this end, working memory shall first be defined, and a brief rationale for the importance of working memory to language learning and comprehension submitted. Then, prerequisites for the inclusion of working memory in a theory of second language acquisition shall be investigated. Evidence will be put forth for the independence of second language working memory from intelligence, first language working memory, and second language proficiency. Next, this paper shall examine how working memory has been used in theories of SLA, and how the inclusion of working memory may inform other models that have been proposed to date. The shortcomings of these present models shall then be exposed through a review of the available literature on first and second language working memory. Finally, a new model of SLA that accounts for the dynamic nature of working memory will be proposed. In addition, the research tools needed to test this theory and an overview of the important questions that emanate from the model will be outlined. Throughout this discussion, the author will draw on available literature on first language working memory, second language memory and recent studies in neurological science.

Definition and the Validation of Working Memory as a Construct

The concept of working memory dates back to Newell (1973). Working memory is commonly defined as a complement to long-term memory that allows for short-term activation of information while permitting the manipulation of that information. In layman's terms, working memory can be seen as the "workspace of the mind". Baddeley and Hitch (1974) proposed the now well-known tripartite model of working memory, derived from research they were doing on the short-term/long-term memory controversy. Their model consists of a central executive and two separate slave systems, as seen in the illustration below.

Figure 1. A simplified representation of working memory (Baddeley, 1986)

Of particular interest to the field of second language acquisition are the central executive and the phonological loop. While the central executive is described as an attentional control system linked to long-term memory, the phonological loop holds and manipulates language-based information, as opposed to the visuo-spatial information handled by the visuo-spatial sketchpad. Baddeley and others (Daneman, 1991; Daneman & Carpenter, 1980; Daneman & Green, 1986; Gathercole, Willis, Emslie & Baddeley, 1992; Service, 1992) have not only found that the phonological loop is important to speech production and to reading and listening comprehension, but have also demonstrated that it plays a substantial role in the acquisition of language.

To briefly summarize the findings in first language studies regarding comprehension, individual differences in working memory have been shown to affect the ability to integrate information, to find a pronoun's referent, to monitor for semantic inconsistencies between texts, to resolve lexical ambiguity, to abstract a main theme, to make comparisons, and to perform well on general measures of comprehension (see Carpenter, Miyake & Just, 1994 and Daneman & Green, 1986 for a review). In most of these cases, working memory was operationalized through a reading span test developed by Daneman and Carpenter (1980). According to a meta-analysis of 77 studies on the relation between working memory and reading comprehension (Daneman & Merikle, 1996), the reading span test correlated with global measures of L1 reading comprehension (e.g., Verbal SAT scores and the Nelson-Denny Reading Test) at .41, with a confidence interval of .38 - .44.

In terms of language acquisition, working memory has been found to be instrumental in the acquisition of new vocabulary and in more global measures of acquisition. Daneman and Green (1986) found that it played a significant role in determining how easily elementary school children extracted word meanings from context. Noting that readers use context to enrich their understanding of words that are only partly known, Daneman and Green also proposed that working memory may facilitate vocabulary growth in an indirect manner. Gathercole and Baddeley (1990) showed that subjects with a high memory span were able to learn a new name in three trials, whereas subjects with a low span took more than five trials to do the same task. Service (1992) and Ando, Fukunaga, Kurahashi, Suto, Nakano, and Kage (1992) (cited in Miyake and Friedman, in press), two studies which will be reviewed more extensively later, found that working memory span played a significant role in predicting foreign language acquisition. In short, the literature provides convincing evidence for the importance of working memory to first language comprehension and acquisition, and there is emerging evidence supporting the role of working memory in foreign language acquisition.

A word of caution must be issued, however, because of the methodology used in these studies. Typically, the approach has been to use some measure of working memory (e.g., non-word repetition task, reading span test) and to correlate the working memory scores with performance on some other task (e.g., reading comprehension, vocabulary tests, overall language proficiency measures, etc.). Because of the correlational approach to these studies, it is entirely possible that a common underlying factor, such as intelligence or overall language ability, can account for the shared variance in the respective scores. This is particularly important when one considers the nature of one of the most commonly used measures of working memory, the reading span test (see the instrumentation section of this paper). Thus, prior to making any strong claims for the centrality of working memory to language comprehension and acquisition, it is of utmost importance that the validity of the construct be established by demonstrating its independence from intelligence and proficiency. Furthermore, in the case of second language acquisition, it will be important to understand the relation between L1 and L2 working memory.
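The concern about a common underlying factor can be made concrete with a first-order partial correlation, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The following is only a minimal sketch on simulated data, not a reanalysis of any study cited here: the variable names (general_factor, span, reading), the sample size, and the data-generating assumptions are all hypothetical. It shows how two scores that both depend on a third variable can correlate substantially and yet share almost nothing once that variable is partialed out.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y with z partialed out of both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Simulated (hypothetical) data: both scores depend on one common factor
# plus independent noise, and on nothing else.
rng = random.Random(0)
n = 2000
general_factor = [rng.gauss(0, 1) for _ in range(n)]
span = [g + rng.gauss(0, 1) for g in general_factor]
reading = [g + rng.gauss(0, 1) for g in general_factor]

raw_r = pearson(span, reading)                            # about .5 in expectation
adjusted_r = partial_corr(span, reading, general_factor)  # near zero

print(f"raw r = {raw_r:.2f}, partial r = {adjusted_r:.2f}")
```

If, in real data, the partial correlation remains sizable after intelligence or proficiency is removed, that is the kind of evidence for independence that the following sections look for.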

Working Memory and Intelligence

The evidence regarding working memory and intelligence tends to paint a picture of independent constructs for L2 learning. Geva and Ryan (1993), working with 73 fifth to seventh grade children in a Hebrew immersion program in Canada, found that correlations between two L2 working memory measures and L2 reading measures remained significant even when intelligence (as measured by the Otis-Lennon test of Mental Ability) was partialed out. Interestingly, however, their L1 working memory measures, an opposite word task and a listening span task, did not correlate with the L1 linguistic measures after intelligence was removed through multiple regression. Ando, Fukunaga, Kurahashi, Suto, Nakano, and Kage (1992) measured reading and listening spans of Japanese sixth graders who had not had any English instruction. After twenty hours of instruction emphasizing a traditional, grammar-oriented approach, they found that the L1 spans significantly predicted posttest performance in L2 reading (r = .60) and listening (r = .72); yet their intelligence measure, the Raven Progressive Matrices Test, did not correlate with the posttest measures.

In two experiments, and in a reanalysis of an earlier study (Kyllonen & Christal, 1990), Jurden (1995) also found L1 working memory to be independent of intelligence. In the first study, with 84 undergraduates, Jurden found that working memory correlated with the ACT reading test (r = .26), but his computational span (C-span), a measure of non-verbal memory, did not. In his second study, with 52 college students, the reading span test again correlated with verbal measures, while the C-span correlated with measures of non-verbal intelligence (Wechsler Adult Intelligence Scale and Raven's Progressive Matrices). In the reanalysis of Kyllonen and Christal's work with 723 military recruits, Jurden found that intelligence and verbal working memory were highly correlated (r = .78), but the verbal and other working memory factors maintained their "conceptual autonomy". This led Jurden to propose that "the working memory system may be a parallel (possibly distributed) processing system" (p. 101). To conclude this section on working memory and intelligence, the preliminary evidence suggests that working memory and intelligence are independent at least for younger learners of a second language, and that they may be more highly correlated in L1 for adults while still maintaining some independence. Needless to say, more studies confirming these results would be welcome.

L1 and L2 working memory

A second important question involving the validity of L2 working memory as a construct involves its independence from L1 working memory. As Harrington and Sawyer (1992) note, a question of considerable interest is the relationship between L1 and L2 working memory. Working with 34 Japanese subjects with a mean TOEFL score of 534, Harrington and Sawyer found significant correlations between the L2 reading span and performance on the TOEFL (grammar = .57 and reading = .54), but the L1 and L2 working memory measures only correlated at a moderate (r = .39) level. In a study with 60 French university students learning English with an average TOEIC score of 613, Berquist (1997) found that the L1 reading span was greater than the L2 reading span and that the two correlated at r = .48. Furthermore, the L2 working memory measure correlated with TOEIC at r = .41. Of note for issues involving instrumentation, Berquist used two working memory spans in L2 and actually found that the word span measure he used correlated slightly better (r = .46) with the TOEIC scores than the reading span measure. Hummel (1998) also studied French native speakers learning English and found results similar to Berquist's. The L1 and L2 working memory reading span tests correlated at .55, while the L1 working memory span test correlated with L2 reading comprehension as indexed by the Michigan Test of English Language at r = .32. Ikeno (1997) used a slightly different approach and conducted a multiple regression using L1 and L2 reading span scores as independent variables and the TOEFL reading test as a dependent variable. This analysis revealed that the L2 working memory measure correlated with the TOEFL at r = .28 after L1 working memory was partialed out. Finally, Miyake and Friedman (in press) conducted a study with 59 Japanese university subjects and found that L1 and L2 working memory correlated at r = .58.
While these results appear to support the independence of L1 and L2 working memory, work by researchers in Japan suggests that L1 and L2 WM may be more highly correlated.

Two studies involving four different languages (Osaka & Osaka, 1992; Osaka, Osaka & Groner, 1993) found high correlations between L1 and L2 working memory. With 15 Swiss German-French bilinguals, Osaka and Osaka (1993) found that the L1 and L2 reading spans were correlated at r = .85. They also found high correlations with their Japanese college students of English at the "near bilingual level" (.84 for the L1 and 'ESL' measure and .72 for the L1 and L2 measure based on Daneman and Carpenter's (1980) span test). While one may at first assume that the higher correlations are due to the "bilingual" proficiency of the subjects, another interpretation is possible. In these two studies, Osaka and Osaka scored the reading span on a 5-point scale based on meeting the criterion of 3 out of the 5 sets of sentences at each level. When subjects only got 2 out of the 5 sets correct, they were given half a point. This is the scoring rubric used by Daneman and Carpenter (1980) in their original study using the reading span test.

However, Daneman and Green (1986) later found that with reading and speaking span tests, a total performance measure calculated by the total number of words correctly recalled was more sensitive than the maximum set size measure used in Daneman and Carpenter's (1980) original study. The studies mentioned above which found lower correlations between L1 and L2 working memory scores all used this more sensitive measure. Since this measure is more sensitive, it would be expected to yield greater variation in scores and thus lower correlations in the studies using this approach. On the other hand, the measurement used by Osaka and Osaka (1993), based on the original Daneman and Carpenter measurement scheme, would favor higher correlations. This is precisely what was found. Thus, it is probable that the results obtained by Osaka and Osaka would have been quite different had they used the one word equals one point system of analysis. Assuming that the correlations found by Osaka and Osaka are artificially high due to issues involving measurement, we may conclude that the correlation between L1 and L2 working memory is only moderate (.39 - .58) and that L2 WM contributes independently to L2 reading comprehension.
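The difference between the two scoring schemes can be illustrated with a short sketch. Everything here is a hypothetical reading of the rubric as described above, not a reproduction of any published scoring procedure: the data layout (a mapping from set size to the number of target words recalled in each of five sets), the rule that testing stops at the first level not passed, and the two invented subjects are all assumptions made for illustration.

```python
def max_span_score(results):
    """Maximum set size score: a level is passed when at least 3 of its 5
    sets are fully recalled; 2 of 5 earns half credit for that level.
    `results` maps set size (level) to the words recalled in each set."""
    score = 0.0
    for level in sorted(results):
        correct_sets = sum(1 for recalled in results[level] if recalled == level)
        if correct_sets >= 3:
            score = float(level)
        elif correct_sets == 2:
            score = max(score, level - 0.5)  # half a point for this level
            break
        else:
            break
    return score

def total_words(results):
    """Total performance score: one point per word correctly recalled."""
    return sum(sum(sets) for sets in results.values())

# Two invented subjects (hypothetical data, not from any cited study).
subject_a = {2: [2, 2, 2, 2, 2], 3: [3, 3, 3, 2, 1], 4: [2, 1, 3, 2, 1]}
subject_b = {2: [2, 2, 2, 1, 2], 3: [3, 3, 3, 1, 0], 4: [1, 0, 1, 0, 0]}

for name, subject in (("A", subject_a), ("B", subject_b)):
    print(name, max_span_score(subject), total_words(subject))
```

In this invented case the two subjects receive the same maximum set size score but clearly different total word scores, which is the sense in which the coarser scale discards variation between subjects.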

Working Memory and Proficiency

Finally, for working memory to be of interest to the field of SLA, it must be demonstrated that working memory contributes independently to linguistic knowledge or overall proficiency. The results here are difficult to consolidate into a clear interpretation because of the diversity of measures, but there may be some evidence for the unique contribution of working memory to overall proficiency. Working with 51 adults ranging in age from 18 to 60, Baddeley, Logie, Nimmo-Smith, and Brereton (1985) found that L1 working memory correlated with L1 reading comprehension as measured by the Nelson-Denny reading test at .463. Then, through a stepwise regression, they showed that L1 working memory accounted for 10.6% of the variance in comprehension, entering after a lexical decision task (26.1%). In a subsequent study reported in the same article, Baddeley's research team worked with 107 subjects and found that the reading span test accounted for 19.4% of the variance in reading after the effect of vocabulary had been removed. Similarly, Geva and Ryan (1993), in the previously mentioned study on young learners of Hebrew, found through a multiple regression that their L2 working memory word opposites test accounted for 6.8% of the variance in L2 reading as measured by their cloze test after the effect of oral proficiency, as measured by a Foreign Service type exam, was partialed out. Harrington (1992) also found that L2 working memory still accounted for variance in TOEFL reading scores even after the effects of L2 vocabulary and grammatical knowledge were removed.
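The logic of "variance contributed after another predictor has been removed" can be sketched for the two-predictor case, where the squared multiple correlation follows directly from the three pairwise correlations. This is a minimal illustration only: the function names are hypothetical, and the three correlation values are invented for the example and are not taken from any of the studies reviewed here.

```python
import math

def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a criterion regressed on two predictors, computed from the
    criterion-predictor correlations (r_y1, r_y2) and the correlation
    between the predictors (r_12)."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def incremental_r_squared(r_y1, r_y2, r_12):
    """Variance added by predictor 2 once predictor 1 is already entered."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2

def semipartial(r_y1, r_y2, r_12):
    """Semipartial correlation of predictor 2 with the criterion;
    its square equals the incremental R^2."""
    return (r_y2 - r_y1 * r_12) / math.sqrt(1 - r_12 ** 2)

# Invented values: reading with vocabulary, reading with working memory,
# and vocabulary with working memory.
r_read_vocab, r_read_wm, r_vocab_wm = 0.50, 0.40, 0.30

delta = incremental_r_squared(r_read_vocab, r_read_wm, r_vocab_wm)
print(f"Unique variance attributed to working memory: {delta:.1%}")
```

Because the increment depends on the order of entry and on the correlation between the predictors, figures such as 6.8%, 10.6% and 19.4% are comparable only to the extent that the control variables entered first are comparable across studies.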

In contrast to the evidence in these studies supporting the claim that working memory contributes independently to the variance in proficiency, there is one study that appears to argue for no significant correlation. Ikeno (1997) used 52 paid Japanese university students and looked into the effect of working memory and language processing efficiency on L2 reading comprehension on the TOEFL reading section. He found no significant partial correlation between the L2 memory span test and L2 reading after the effect of L2 processing efficiency had been statistically removed. While this result would appear to qualify the findings listed above, it is worth discussing briefly here the measures that Ikeno used.

To evaluate L2 processing efficiency, Ikeno used a word matching test, a lexical semantic judgment task, a grammatical judgment task and a sentence verification task. Of these, the lexical semantic judgment task and the sentence verification task were found to move the correlation between L2 working memory and L2 reading to non-significant levels. However, both of these tasks would appear to require the mechanisms involved in working memory. With the lexical semantic judgment (e.g., car/automobile, sit/stand), subjects would be required to hold the first term in working memory while waiting for the second lexical item. Then, the process of accessing the semantic representations for these terms and the subsequent comparison would involve the processes of working memory. With the sentence verification task (e.g., Tokyo is the capital of China), working memory would appear to be similarly involved, as the subjects would need to maintain the surface structure of the phrase while accessing the real world knowledge needed to confirm or refute the statement. It is likely that the significant covariance between L2 working memory and L2 reading was removed by these processing measures because of the underlying L2 working memory mechanisms required to perform these tasks. Thus, summarizing this discussion of L2 working memory and linguistic knowledge, it may be cautiously submitted that L2 working memory has been found to account for approximately 10% of the variance in reading comprehension measures after the effects of L2 linguistic knowledge (vocabulary and grammar) or oral proficiency have been removed. Clearly, however, more evidence will be required to confirm the findings reported in the studies above.

A tentative conclusion regarding the validity of working memory as an independent variable can be posited here. The research suggests that working memory and intelligence are independent for younger learners of a second language, that the correlation between L1 and L2 working memory is only moderate (.39 - .58), and that L2 working memory contributes approximately 10% (6.8% - 19.4%) of the variance in L2 reading comprehension after linguistic knowledge (vocabulary and grammar) has been partialed out. Until evidence is provided to the contrary, there appears to be ample reason to treat working memory as an independent variable in language comprehension.

Working Memory and Current Theories of SLA

A working memory perspective shifts the scientific schema within which language and language behaviour have been studied to focus on the dynamics of language processing, in addition to its structural aspects (Carpenter, Miyake & Just, 1994, p. 1107).


If working memory can indeed be validated as an independent variable that can account for language comprehension and language acquisition, the obvious question is how working memory fits into a theory of second language acquisition. As suggested by the eloquent statement made by Carpenter, Miyake and Just above, one area in which to look for such a compatible theory of SLA is the area of processing. Indeed, there appears to be a growing consensus in the field of SLA (Harrington, 1992; Sagarra, 1998) that working memory is highly compatible with information processing models of L2 learning such as those proposed by Pienemann and Johnston (1987, 1988), VanPatten (1996), McLaughlin, Rossman, and McLeod (1983), and Hulstijn and Hulstijn (1984). Some of these theories have more recently been incorporated into the work by Skehan (1998) on a cognitive approach to language learning. To illustrate, it may be useful to look at a few of these models that inform the model proposed in this paper.

Pienemann and Johnston (1987, 1988) elaborated on an earlier proposed Multidimensional Model (Meisel, Clahsen, & Pienemann, 1981) put forth to account for observed stages of development and the variation between learners. Central to their proposal is the notion that the developmental sequences in language learning reflect the systematic manner in which learners overcome processing constraints. The corollary to this theory is the teachability hypothesis, which claims that teaching of the developmental structures can only be successful if learners have mastered the prerequisite processing operations. Larsen-Freeman and Long (1991) criticized this model because it does not account for how learners overcome processing constraints. If one allows for a dynamic portrayal of L2 working memory that develops over time as more and more linguistic information gets incorporated into long-term memory, then the incorporation of L2 working memory into the Multidimensional Model may counter the criticism made by Larsen-Freeman and Long (1991). This point shall be developed further later in this paper; however, a brief outline of the underlying processes can be found in two recent models.

VanPatten and Cadierno (1993) and VanPatten (1996) portray the role of processing from input to output in the following schematic model.

         I                  II                             III
 input -------->  intake ----------> developing system ----------> output

Figure 2. Model of second language acquisition (VanPatten and Cadierno, 1993)

According to this model, at each stage (I, II, and III) in the process, the learner engages in linguistic processing that is constrained by the limited capacity of the learner's L2 WM. At all three stages, there is a competition between meaning and form for the overall computational resources of the learner. Thus, the facility with which new information can be integrated into the developing system is governed by the learner's ability to deal with the processing demands at the time the new information is encountered (VanPatten, 1996).

VanPatten and Cadierno's (1993) model and its subsequent elaboration by VanPatten (1996) shed some light on the processes that may contribute to change in the interlanguage system; however, the question of what is involved in working memory remains unanswered. Drawing from the seminal work by Schmidt (1990, 1994) on the role of consciousness in language learning, Skehan (1998) puts forth the following model that elaborates on the processes involved in L2 working memory. The elegance of this model is that it begins to describe the complexity of working memory and long-term memory. The model also sheds some light on the connection between working memory and long-term memory, and it suggests that working memory must be able to engage in a number of different tasks (Skehan, 1998).

In explaining the role that working memory plays in the development of interlanguage, Skehan (1998) claims that the metaprocesses that operate on working memory are essential to instigating change in the knowledge system of the learner. Implicit in this model is the assumption that there are various types of consciousness-enhanced processing. Moreover, the processes operating on various tasks, such as matching, feedback appreciation and so forth, may be in competition for a limited amount of overall capacity. Skehan (1998) does not state it explicitly, but he is in effect suggesting that working memory may not be a uniform construct, a point that shall be elaborated on shortly.

Figure 3. Influences on noticing and components of working memory and long-term memory (Skehan, 1998)

The appeal of VanPatten and Cadierno's (1993) model is that it suggests that the competition of resources plays a central role in the integration of input into the developing interlanguage. Furthermore, it contributes to the dimension of the Multidimensional Model that accounts for developmental sequences in learning by suggesting that the capacity of working memory at a given time restricts the ability to acquire a linguistic form new to the interlanguage system. The shortcoming of the model is twofold. First, it views working memory as a unified system that must devote limited resources to both meaning and a variety of forms. Secondly, it claims that there is a competition between meaning and form and yet does not explain how this happens given that meaning is in fact encoded in the form (Sawyer, personal communication).

On the other hand, Skehan's (1998) approach begins to dissect working memory by the underlying tasks that it might have to carry out, but it does not address the competitive demands of the input that VanPatten and Cadierno suggest. Skehan's model attempts to address the external competition by employing Schmidt's (1994) noticing, but noticing alone will not guarantee that an utterance will become intake available for processing in working memory. Prior to being ready to actually process a form, a learner may notice the form in the input and yet not have a sufficiently well-developed system to maintain the form in working memory. In other words, it is quite possible that a noticed item might decay before it is actively processed in working memory.

Moreover, the place of "noticing" within cognition is not clear. In Skehan's schematic portrayal of working memory and noticing, factors external to the learner are represented by the three boxes to the left of the "noticing" box. Skehan then has internal factors influencing noticing, as represented by the box below noticing; and then the noticed input moves into working memory and short-term memory. In short, Skehan has factors external and internal to the learner separated with noticing between the two, suggesting that noticing is neither or both, or something separate from the internal factors. Clearly, the act of noticing is the interface between the input and the internal system, working memory and long-term memory, but it is an act much in the same way as matching, recombining and transforming are acts.

While Skehan's model falls short in these areas, it should be credited with suggesting that working memory may involve several different tasks, or routines; and we may infer that there would be competition between the various tasks in working memory. VanPatten and Cadierno's (1993) approach, on the other hand, offers the notion of competition that leads to changes in the interlanguage system, but inappropriately claims that competition for resources exists between meaning and form. One may thus posit that there are features in the input that compete for the language processing system's resources and that there are routines within working memory that are in competition to interpret the intake. A complete model of working memory must account for both the external and internal competitive demands on the system while explaining how this competitive environment can account for changes in the interlanguage system in long-term memory over time.

The Nature of Working Memory

Prior to elaborating further on the model under construction in this paper, a few points need to be made regarding the discussion of the VanPatten and Cadierno (1993) and Skehan (1998) models. The first point that needs to be confirmed is the non-unitary nature of working memory. As stated above, the VanPatten and Cadierno model appears to assume that working memory is a unitary construct, while Skehan implies a model of working memory made of several subcomponents. A second issue that must be investigated is the developmental nature of working memory. VanPatten and Cadierno (1993) and Pienemann and Johnston (1987, 1988) specifically state that changes in the ability to process language result in developmental modifications to the interlanguage system, and they claim that there are differences in processing capacities over time. Skehan's model, on the other hand, does not elaborate on the question of developmental changes in working memory. To better understand the nature of working memory and its development, we will need once again to refer to the literature in cognitive psychology. Here, studies in the neurological sciences may also be informative.

A Multicomponential System

Is working memory a unitary or multicomponential system? The question has been widely debated in the field of cognitive psychology since working memory was offered as an alternative to short-term memory. Authors such as Case (1987), Turner and Engle (1989) and others view working memory as a unitary resource that is independent of task content (see also Jurden, 1995). In these models, competition for this shared resource between the need to store information and the act of processing is believed to account for differential performance on tasks. On the other hand, a growing number of researchers are beginning to favor a multicomponential model that would be consistent with Skehan's (1998) description of working memory. Daneman (1991) found that L1 reading span did not predict speech production well and thus concluded that working memory was not unitary. Furthermore, in a recent study investigating task effect, Towse, Hitch and Hutton (1998) proposed that harder tasks extend the processing time period and thus make it more difficult to retain an item in memory. They did not find any consistent interference effects in their study and concluded that their results did not support a resource sharing model. Rather, a task-switching model was favored, and they were "inclined toward a decay interpretation" (p. 196). Research in this vein supports the view expressed earlier that noticing alone does not guarantee that an item will be maintained in working memory long enough to be processed.

Further support for a multicomponential model of working memory can be found in research in the neurological sciences. Using positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) while subjects conduct tasks involving working memory, several teams of researchers have demonstrated that various components of working memory are anatomically distinct. Paulesu, Frith, and Frackowiak (1993) found that the phonological store involved the supramarginal gyri (BA 40) whereas the subvocal rehearsal system involved in phonological processing was in the superior temporal gyri (BA 22/44). Shallice, Fletcher, Frith, Grasby, Frackowiak, and Dolan (1994) found that encoding and retrieval of information activated different parts of the brain (left dorsolateral prefrontal and retrosplenial cortex vs. bilateral precuneus and right prefrontal cortex), and they claim that these results were consistent with other PET studies. D'Esposito, Detre, Alsop, Shin, Atlas, and Grossman (1995) found results suggesting that the central executive system may involve several components, and they found neurological overlap between the central executive and slave systems (e.g., phonological loop). Cohen, Perlstein, Braver, Nystrom, Noll, Jonides, and Smith (1997) claim that their study provides evidence for a complex relationship (i.e., not a clean dissociation) between the executive system and maintenance processes (the phonological loop). Furthermore, their study revealed the interesting finding that there was a quantitative difference in the amount of rehearsal at lower and higher loads of information. This, they assert, "raises the possibility of a disassociation between… explicit rehearsal, and other mechanisms for actively maintaining information that may reside within the dorsolateral PFC" (p. 607).

These neurological studies far from confirm the assumption by Skehan (1998) that working memory may be subdivided into processing routines such as matching, feedback appreciation, recombination and rule-based exemplar generation, but they do paint a clear picture of working memory as a multicomponential system that is far more complex than the tripartite model put forth by Baddeley (1986). This conclusion is supported by the work in cognitive psychology investigating differences across tasks. Perhaps with further studies involving a variety of tasks, we may eventually be able to make a claim for the independent processing routines put forth by Skehan, or we may yet identify other cognitive routines in working memory. Once these processing mechanisms in working memory are defined, we may be able to trace the development of the processing mechanisms as working memory matures.

Developmental Changes in Working Memory

The question of the developmental changes in working memory is of particular interest to the language teacher and learner. As Berquist (1997) points out, however, there is no study in the literature in SLA that has set as its goal tracing the development of working memory. In contrast, the literature in the field of psychology provides some evidence of changes over time, but these changes are only broadly described. First, we may outline the changes that have been demonstrated to occur with changes in age. The ability to hold verbal material in working memory has been shown to increase significantly between the age of four and adolescence. At this stage, the rate of increase levels off considerably and then declines again with older adults (for an overview, see Carpenter, Miyake, & Just, 1994). To the limited knowledge of this author, there are no studies investigating the development of processing mechanisms within working memory over time. There are however a few interesting studies that might indicate an approach to further research in this area.

In a longitudinal study of 80 children, tracking children from age four to age eight, Gathercole, Willis, Emslie, and Baddeley (1992) investigated their subjects' ability to repeat non-words and correlated the scores on this task with vocabulary knowledge. They computed cross-lagged partial correlations and found that performance on the non-word repetition task was a better predictor of vocabulary for the ages four to six than vice versa. At later time intervals, however, this directional causality was not found. This led Gathercole's research team to conclude that the relationship between phonological retention and vocabulary acquisition is complex in nature. They further posit that the causal relationship changes during the early school years with linguistic knowledge, as indexed by vocabulary scores, positively influencing performance on the non-word repetition task. Thus, they understand that developmental changes in working memory entail an increased ability to hold phonological signals in store, leading to gains in lexical knowledge. This increased lexical knowledge then becomes a basis for better retention of "lexical and nonlexical sequences" in the phonological memory system.
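The logic of a cross-lagged partial correlation can be illustrated with a minimal sketch. The data below are synthetic and the variable names (`rep_t1`, `vocab_t2`, and so on) are hypothetical; nothing here reproduces the original data set or the authors' exact procedure. The sketch only shows how an earlier repetition score can be related to a later vocabulary score while earlier vocabulary is held constant, and how the reverse direction is computed in the same way.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held constant."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def cross_lagged(rep_t1, vocab_t1, rep_t2, vocab_t2):
    """Return (repetition -> later vocabulary, vocabulary -> later repetition)."""
    forward = partial_corr(rep_t1, vocab_t2, vocab_t1)
    backward = partial_corr(vocab_t1, rep_t2, rep_t1)
    return forward, backward


# Synthetic scores for 80 hypothetical children (assumption: illustration only).
rng = random.Random(0)
rep_t1 = [rng.gauss(0, 1) for _ in range(80)]
vocab_t1 = [0.5 * r + rng.gauss(0, 1) for r in rep_t1]
rep_t2 = [0.7 * r + rng.gauss(0, 1) for r in rep_t1]
vocab_t2 = [0.5 * v + 0.6 * r + rng.gauss(0, 1) for v, r in zip(vocab_t1, rep_t1)]

forward, backward = cross_lagged(rep_t1, vocab_t1, rep_t2, vocab_t2)
print(f"repetition(t1) -> vocabulary(t2): {forward:.2f}")
print(f"vocabulary(t1) -> repetition(t2): {backward:.2f}")
```

A larger forward than backward coefficient would be read as repetition ability leading vocabulary growth; a reversal at later intervals would correspond to the changing relationship described above.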

The importance of phonological memory has also been demonstrated in L2 research. Service (1992) followed the second language development of nine- to eleven-year-old Finnish children learning EFL. Service used a pseudo-word repetition task, a pseudo-word copying task and a task involving the comparison of L1 syntactic-semantic structures similar to the syntactic comparison task on the MLAT. The major finding was that repetition accuracy of English pseudo-words was a good predictor of English learning during the first two to three years. The repetition task correlated highly with listening comprehension (r = .62), reading (r = .74) and production (r = .58). In a multiple regression of the three tasks and the subjects' English grades, Service found that 44% of the variation in English grade could be accounted for by the repetition task, whereas only 6% and 3% of the variance was attributable to the syntactical judgment and copying tasks respectively. Based on the results of this multiple regression, Service notes that the metalinguistic task of syntactic comparisons relies on different processing resources, thus lending further support to the multicomponent model of working memory.
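As a reading aid, the bivariate correlations reported above can be converted into proportions of shared variance by squaring them. The short sketch below does only that arithmetic; the dictionary and function names are illustrative. These squared values are not the same quantity as the 44%, 6% and 3% figures, which come from a multiple regression on English grades.

```python
# Correlations between the repetition task and each skill, as reported above.
correlations = {
    "listening comprehension": 0.62,
    "reading": 0.74,
    "production": 0.58,
}


def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2


explained = {skill: round(shared_variance(r), 2) for skill, r in correlations.items()}
for skill, r2 in explained.items():
    print(f"{skill}: r^2 = {r2:.2f}")
```

On this reading, the repetition task shares roughly a third to a half of its variance with each of the three skill measures.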

Although Service (1992) does not discuss how the syntactic task may be related to working memory, she claims that the weak correlation between the repetition task and English proficiency could be due to one of three factors. The subjects could have been hampered in their ability to reproduce the pseudo-words because of a rapidly fading phonological trace, because of a retarded transformation from input to output or as a result of problems in encoding material into the phonological store or in the rehearsal process. Service notes that these factors would all be involved in the learning of new words. However, she seems to favor the notion of trace quality in the input store because it is "related to durability and affected by long-term memory support, i.e. the familiarity of the material" (p. 44).

As for the role of working memory in the syntactic comparison task, one could hypothesize that the role of syntactic judgment in predicting language learning would start exhibiting higher correlations with measurements of proficiency later in the learning process. There are three possible reasons why this may be the case. First, phonological decoding processes would be most demanding of working memory resources early in the acquisition phase since there is a need to discern word boundaries prior to assigning syntactic function based on word order. Secondly, only after phonological memory has become sufficiently developed to hold bound morphemes in working memory can these features of the language be used to inform syntactic judgment. Finally, only after the learner's working memory has developed enough efficiency with the new phonology can processing resources be devoted to the metalinguistic task of analyzing syntax. Service's (1992) findings of a rising correlation between the L1 syntactic comparison task and English grade coupled with a falling correlation with the pseudo-word task would be consistent with all three of these points.

In several studies, the processing of syntax has been found to be more demanding of processing resources. For example, VanPatten (1990) examined the ability of 202 students of Spanish to comprehend a listening passage, indexed by an idea unit analysis of free recall, while counting various parts of speech. Highest performance was found for the control group that had no counting task, and this group's scores were not found to be significantly different from those of the group whose task it was to count the occurrences of a lexical item. The group assigned to counting the frequency of a lexical item in the text performed better than the group counting the definite article, but this difference was only significant for the second-year learners in the study. The group instructed to count a bound morpheme recalled the fewest idea units, a difference of statistical significance for the first-year subjects and the third-year group. This suggests that processing resources were taxed the most in the group focusing on the morpheme and least in the group that focused on comprehension alone, with focus on a lexical item and on a function word falling somewhere between the two. A post-hoc Tukey test revealed a statistically significant difference between the two meaning-focused tasks and the two form-focused tasks, supporting the claim that it is more difficult to maintain syntax in working memory.

A study that looked directly at the relation between working memory and syntactic processing is cited in Miyake and Friedman (in press). The listening span measure of 59 native Japanese university students was found to correlate with cue preference in an agent identification task (r = -.37) and a measure of syntactic comprehension (r = .52). The finding most significant to the current discussion was that high and low span learners engaged in qualitatively different strategies for interpreting English sentences with the high span taking better advantage of word order cues, and the low span group basing their decisions on animacy, a feature determined by lexical representation and semantics. This finding would tend to support a multicomponent view of working memory with a change in processing strategies dependent on the resources available. Learners with smaller spans might not be capable of holding word order cues in working memory long enough for processing. Thus, they make decisions based on lexical information within the range of their processing capacity.

To summarize the discussion on the developmental nature of working memory, we must first point out that we know very little. However, the literature under review here supports the claim that because of a restricted L2 working memory capacity at the early stages of acquisition, learners first depend on working memory processes that focus on the phonological interpretation of the input. This is reflected in the high correlations between the pseudo-word repetition task and subsequent acquisition (Service, 1992). As more lexicon is acquired, the lexicon begins informing the judgment of phonological input (Gathercole, Willis, Emslie, & Baddeley, 1992) and the learner begins relying more on semantic information for comprehension (Miyake & Friedman, in press). This constitutes a change in the processing routines used for reaching comprehension. It is only as the processing of phonological and lexical information becomes more efficient that more of the resources in working memory will be available for making syntactic judgment (Miyake & Friedman, in press; Service, 1992; VanPatten, 1990). As the interlanguage syntax becomes increasingly developed and thus increases the efficiency of working memory to make syntactic judgment, this increased ability begins to inform decisions regarding newly encountered lexical items. The move from relying heavily on lexical items to depending more on syntax requires yet another change in the use of routines in working memory. Furthermore, the use of syntax coupled with acquired vocabulary to help make guesses at the meaning of new words found in context also requires different strategies. Each of these changes in the respective processing routines suggests a multicomponential view of working memory, precisely what is being proposed here.
In conclusion, we may propose a model of working memory that is a multicomponential system whereby there are developmental differences in the importance of the respective strategies that make up the system.

The Proposed Model

It is being submitted here that there are different strategies in working memory and that the general pattern of use of these strategies changes over time. At any given time, however, all of these strategies work in cooperation and competition to derive meaning from the features in the noticed input. The "restriction of access to information" noted by Pinker (1997) at the outset of this paper is the motivation behind the competition between the various processes in working memory. There is an ongoing need to clear working memory as efficiently as possible to create space for subsequent processing. It is this drive for efficiency that causes developmental changes in the L2 working memory and it is also the process that drives the acquisition of language.

The input with which the language learner comes into contact is broken down into noticed features and elements that are unattended which are, by definition, lost. At the time the input is encountered, noticing is a function of both the internal expectations regarding what features of the input will be most meaningful (attention) and the current ability of working memory (as determined by phonological decoding ability, knowledge of lexicon, knowledge of syntax, familiarity with prosodic features of the language, etc.) to process the incoming stream of language. The working memory system is then in a race against time to derive as much information as possible out of the message before the linguistic, semantic, syntactic and prosodic coding of the message decays.

To meet the demands of this task, the learner activates the routines in working memory in a competitive manner to encourage the most efficient processing. While phonological decoding, a task that requires the matching of phonemes and sequencing of the phonemes, is least demanding on available resources and a prerequisite for further processing, it is also an approach that is furthest removed from meaning and perhaps an approach that consumes the most time. Unless required to devote resources to this strategy either due to the inability to carry out other strategies or as a result of the features in the noticed input (e.g., a key word encountered for the first time), the language processor will opt to use more meaning-focused strategies.

A lexical recognition approach allows the processing to get closer to meaning and will be favored over the phonological decoding routine where possible. It has the disadvantage of not providing complete meaning, however, and still requires the learner to activate real-world knowledge and other information in long-term memory (e.g., L1 syntax) into working memory to determine the role relationship between the recognized lexical items. This is done at the cost of consuming resources in working memory that could be otherwise devoted to the processing of subsequent input.

The recognition of syntax in the intake requires the most resources (matching of phonemes, recall of word order, interpreting prosodic signals, comprehension of clues given by bound morphemes, etc.), but, when operating efficiently, it greatly increases the speed with which the stream of noticed language can be removed from working memory and brought in the form of ideas into long-term memory, thus freeing up the valuable workspace of the mind for further processing. It will thus be the approach that is favored when the resources available in working memory are capable of processing the current stream of noticed input at this level. Where this approach fails, lower level processing strategies will be used to get as close as possible to comprehension; however, the use of these strategies to get at meaning will be more demanding, in terms of length of processing, on working memory. This competitive process will result in qualitative and quantitative changes, dependent on the depth of processing, in working memory and the interlanguage system over time.
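The competitive selection described in the preceding paragraphs can be sketched as a simple preference rule. The sketch is an illustration under stated assumptions, not an implementation of the model: the numeric resource demands and clearing times are hypothetical placeholders chosen only to preserve the ordering given in the text (syntactic recognition is most demanding but clears working memory fastest, and phonological decoding is least demanding but slowest), and the names `Strategy`, `select_strategy` and `needs_decoding` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    resource_demand: int  # hypothetical units of working memory resources
    clearing_time: int    # hypothetical time needed to clear the intake


# Ordered from most to least preferred, i.e. closest to meaning first.
PREFERENCE_ORDER = (
    Strategy("syntactic recognition", resource_demand=3, clearing_time=1),
    Strategy("lexical recognition", resource_demand=2, clearing_time=2),
    Strategy("phonological decoding", resource_demand=1, clearing_time=3),
)


def select_strategy(available_resources: int, needs_decoding: bool = False) -> Strategy:
    """Pick the most meaning-focused strategy that the available resources allow.

    needs_decoding stands for input features that force phonological decoding,
    such as a key word encountered for the first time.
    """
    if needs_decoding:
        return PREFERENCE_ORDER[-1]
    for strategy in PREFERENCE_ORDER:
        if strategy.resource_demand <= available_resources:
            return strategy
    raise ValueError("no strategy fits the available resources")


for resources in (3, 2, 1):
    chosen = select_strategy(resources)
    print(resources, chosen.name, chosen.clearing_time)
```

Under these assumed values, a learner with ample resources clears the intake in the shortest time, while a learner restricted to lower level strategies holds the same intake in working memory longer, which is the cost the text attributes to falling back on those strategies.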

A schematic representation of the model proposed here is provided below. To clarify the figure, a few points should be made. First, attention is assumed to be a vital part of working memory. Secondly, while the noticed syntax, prosodic features, lexicon and phonology are represented by a box inside of working memory, they are not in and of themselves part of the working memory construct. They are merely the intake that the various processing strategies in working memory, in conjunction with information from long-term memory, act upon to create meaning.

Evaluating the Model

While the discussion in this paper has attempted to provide evidence for the validity of second language working memory by demonstrating its independence from intelligence, L1 working memory and proficiency, more evidence will be needed to confirm this preliminary conclusion. Assuming that L2 working memory is a valid construct, this paper has gone forth and made a claim for a multicomponential system and for developmental changes in working memory over time. These two claims will have to be supported by research into the nature of L2 working memory. Specifically, this model argues for the independence of specific strategies in working memory. This will have to be empirically demonstrated through the use of a variety of L2 working memory measures. Furthermore, this model claims that there are developmental changes in the use of these strategies over time. To support such a claim, it will have to be demonstrated that the correlation between the specific strategies and comprehension of meaning changes over time with a decreasing importance placed on the phonological decoding process, and an increasing importance placed on lexicon first and then on interpretation of syntax. To carry out an investigation into the nature of L2 working memory and into developmental questions, researchers will have to use measures that specifically tap into these processes.

With a few exceptions (Berquist, 1997; Service, 1992), the SLA research to date on working memory has relied exclusively on some form of the reading span measure first developed by Daneman and Carpenter (1980). Curiously, however, no researcher in L1 or L2 has provided a reliability score for this measure. Furthermore, although the reading/listening span measure is assumed to test the ability to store and process information, several authors using the instrument have noted uncertainty about what it actually measures (Baddeley, Logie, Nimmo-Smith, & Brereton, 1985; Carpenter, Miyake, & Just, 1994). As Baddeley and his team note:

Both the strength and weakness of the working memory span is its complexity. It involves a number of subcomponents, including comprehension, the selection and operation of strategies, learning, and recall… One can argue that something analogous to working memory does appear to be an important factor, but we are left with few clues as to which of the several factors might be of crucial importance. (p. 120)

Another point to be made is that one of the basic assumptions of the reading span test may not in fact be functioning as planned. The underlying premise of the reading span test is that recall of final words in the larger groups of sentences will be more difficult. In a recent item analysis of 25 graduate students' performance on the measure, this was not found to be the case. In fact, in several cases performance on items in the 'most difficult part of the test', recall of final words in the series of five sentences, was higher than on items found earlier in the test. This is not to say that this approach to operationalizing working memory should be discarded. However, this measure should be subjected to the same scrutiny that researchers apply when examining measures of proficiency or any other instrumentation.
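The kind of item analysis referred to here can be sketched as follows. The trial data are invented for illustration and do not reproduce the 25 students' results; the function names are likewise hypothetical. The sketch computes the proportion of sentence-final words recalled at each set size and then checks the premise of the test, namely that recall should not improve as the sets grow larger.

```python
from collections import defaultdict


def recall_rate_by_set_size(trials):
    """Proportion of final words recalled at each set size.

    trials is an iterable of (set_size, words_recalled) pairs, one per trial.
    """
    recalled = defaultdict(int)
    presented = defaultdict(int)
    for set_size, words_recalled in trials:
        recalled[set_size] += words_recalled
        presented[set_size] += set_size
    return {size: recalled[size] / presented[size] for size in sorted(presented)}


def difficulty_increases(rates):
    """True if recall never rises as set size grows, as the test assumes."""
    ordered = [rates[size] for size in sorted(rates)]
    return all(earlier >= later for earlier, later in zip(ordered, ordered[1:]))


# Invented trials (assumption): two trials at each set size from two to five.
trials = [(2, 2), (2, 2), (3, 3), (3, 2), (4, 2), (4, 3), (5, 4), (5, 5)]
rates = recall_rate_by_set_size(trials)
print(rates)
print(difficulty_increases(rates))
```

In this invented data set, recall at set size five is higher than at set size four, so the check fails, which is the pattern of result that would call the premise of the measure into question.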

Of note for the current discussion, it is quite evident that the reading/listening span test alone will not serve the purposes of investigating the questions that emanate from the proposed model. Working memory measures that tap at phonological decoding of L2 phonemes, L2 word recognition and syntactical judgment will be required. Here, we may refer to the pseudo-word repetition task used by Service (1992) and Gathercole, Willis, Emslie and Baddeley (1992) for a test of phonological decoding in working memory. For a word recognition task, we can draw on the work done by Berquist (1997) who found that his orally presented word span test actually correlated more highly with performance on the TOEIC (.46) than the reading span instrument he used. For a measure of working memory and syntactic judgment, we may turn to the work of Service (1992), Miyake and Friedman (in press), and VanPatten (1990), or assessment of this working memory strategy may be examined with a more finely tuned reading span measure that draws on the different processing constraints placed on progressively more complex grammatical structures.

Once a set of measures has been decided on, these could be administered first to different groups of learners to determine if the assumed working memory processes contribute differentially to comprehension and acquisition. We could then use these measures to return to the questions posed at the outset of this paper regarding the independence of second language working memory from intelligence, L1 working memory and proficiency. It could be that the different measures may reveal varying degrees of dependence on these other constructs: intelligence, proficiency and L1 working memory. Also, we could track working memory developmentally in a cross-lagged partial correlation study like that conducted by Gathercole, Willis, Emslie, and Baddeley (1992) to determine if the various strategies contribute differentially to comprehension and proficiency across time. In this manner, the hypothesized developmental changes in the proposed multicomponential view of working memory could be empirically tested.

Finally, by exploiting this research paradigm, several of the as yet unanswered questions regarding working memory could be investigated. For example, a longitudinal approach with these measures could be used to find evidence in support of the Multidimensional Model. Also, it would be possible to determine if specific working memory processes covary with specific reading skills, a question posed by Harrington and Sawyer (1992). If the developmental view of working memory is supported, then we could also use these instruments to investigate if it is possible to accelerate the process through intervention. For example, does prolonged practice with an activity such as shadowing enhance the growth of L2 working memory? Perhaps there are also applications for these measures in studies involving speech production and acquisition through interaction. For example, at the 1998 SLRF conference, Doughty (1998) proposed that we should start looking at how recasts contribute to acquisition from a processing constraints perspective. This would presumably involve the examination of how development in working memory contributes to the ability to increase linguistic knowledge through interaction.


In conclusion, this paper has made an argument for the inclusion of L2 working memory into a processing model of second language acquisition. Through a selective review of the literature in SLA, the cognitive sciences and the neurosciences, an argument for the treatment of L2 working memory as a construct that contributes independently to the learning process has been submitted. Furthermore, it has been posited that working memory is a multicomponent system. The system has been demonstrated in psychological studies to undergo considerable growth from childhood through early adolescence, and it is posited here that there is also growth in L2 working memory. However, it is not being claimed here that this growth in L2 working memory is a result of changes in capacity. It is far more likely that growth occurs as a result of increased efficiency in the strategies carried out by working memory over time. As these strategies compete with each other to construct meaning out of received input, they lead to changes in the interlanguage and in L2 working memory itself. It is through this competitive process that acquisition of language occurs over time. This model of working memory is admittedly built on several assumptions that will have to be empirically tested through carefully designed studies using a variety of L2 working memory measures to demonstrate the independence of working memory from other constructs, to show that working memory is indeed a multicomponential system and to evaluate whether there are indeed changes within working memory over time. Through such research, we may be able to better appreciate how language learners are limited in their access to information and how this limitation changes over time.


  • Ando, J., Fukunaga, N., Kurahashi, J., Suto, T., Nakano, T., & Kage, M. (1992). A comparative study on the two EFL teaching methods: The communicative and the grammatical approach. Japanese Journal of Educational Psychology, 40, 247-256.
  • Baddeley, A. (1986). Working memory. Oxford: Clarendon Press.
  • Baddeley, A., & Hitch, G. J. (1974). Working memory. In G. Bower (Ed.), The psychology of learning and motivation:  Advances in research and theory (pp. 47-90). New York: Academic Press.
  • Baddeley, A., Logie, R., Nimmo-Smith, I., & Brereton, N. (1985). Components of fluent reading. Journal of Memory and Language, 24, 119-131.
  • Berquist, B. (1997). Individual differences in working memory span and L2 proficiency: Capacity or processing efficiency? In A. Sorace, C. Heycock, & R. Shillcock (Eds.), GALA (pp. 468-473). Edinburgh: The University of Edinburgh.
  • Berquist, B. (1997). Memory models applied to L2 comprehension. In G. Taillefer & A. K. Pugh (Eds.), Lecture a l'Universite:  Langues maternelle, seconde, et etrangere (pp. 29-44). Toulouse: Presses de l'Universite des Sciences Sociales de Toulouse.
  • Carpenter, P., Miyake, A., & Just, M. A. (1994). Working memory constraints in comprehension. Handbook of Psycholinguistics (pp. 1075-1122). New York: Academic Press.
  • Case, R. (1987). The structure and process of intellectual development. International Journal of Psychology, 22, 571-607.
  • Case, R., Kurland, D. M., & Goldberg, J. (1982). Operational efficiency and the growth of short-term memory span. Journal of Experimental Child Psychology, 33, 386-404.
  • Cohen, J. D., Perlstein, W. M., Braver, T. S., Nystrom, L. E., Noll, D. C., Jonides, J., & Smith, E. E. (1997). Temporal dynamics of brain activation during a working memory task. Nature, 386, 604-608.
  • D'Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M. (1995). The neural basis of the central executive system of working memory. Nature, 378, 279-281.
  • Daneman, M. (1991). Working memory as a predictor of verbal fluency. Journal of Psycholinguistic Research, 20(6), 445-464.
  • Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466.
  • Daneman, M., & Green, I. (1986). Individual differences in comprehending and producing words in context. Journal of Memory and Language, 25, 1-18.
  • Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A meta-analysis. Psychonomic Bulletin & Review, 3, 465-472.
  • Gathercole, S. E., & Baddeley, A. D. (1990). The role of phonological memory in vocabulary acquisition:  A study of young children learning arbitrary names of toys. British Journal of Psychology, 81, 439-454.
  • Gathercole, S. E., Willis, C. S., Emslie, H., & Baddeley, A. D. (1992). Phonological memory and vocabulary development during the early school years. Developmental Psychology, 28(5), 887-898.
  • Geva, E., & Ryan, E. B. (1993). Linguistic and cognitive correlates of academic skills in first and second languages. Language Learning, 43(1), 5-42.
  • Harrington, M. (1992). Working memory capacity as a constraint on L2 development. In R. J. Harris (Ed.), Cognitive Processing in Bilinguals. Amsterdam:  North Holland.
  • Harrington, M., & Sawyer, M. (1992). L2 working memory capacity and L2 reading skill. Studies in Second Language Acquisition, 14, 25-38.
  • Hulstijn, J., & Hulstijn, W. (1984). Grammatical errors as a function of processing constraints and explicit knowledge. Language Learning, 34, 23-43.
  • Hummel, K. M. (1998). Working memory capacity and L2 proficiency. Presentation at SLRF 1998. Honolulu, HI.
  • Ikeno, O. (1997). Processing efficiency, working memory capacity, and second language reading comprehension. Presentation at Kobe University, Kobe, Japan.
  • Jurden, F. H. (1995). Individual differences in working memory and complex cognition. Journal of Educational Psychology, 87(1), 93-102.
  • Kyllonen, P. C., & Christal, R. E. (1990). Reasoning ability is (little more than) working memory capacity?! Intelligence, 14, 389-433.
  • Larsen-Freeman, D., & Long, M. (1991). An introduction to second language acquisition. London: Longman.
  • McLaughlin, B., Rossman, T., & McLeod, B. (1983). Second language learning: An information-processing perspective. Language Learning, 33, 135-158.
  • Meisel, J., Clahsen, H., & Pienemann, M. (1981). On determining developmental stages in natural second language acquisition. Studies in Second Language Acquisition, 3, 109-135.
  • Miyake, A., & Friedman, N. P. (In Press). Individual differences in second language proficiency: Working memory as "language aptitude". In A. F. Healy & L. E. Bourne (Eds.), Foreign language learning: Psycholinguistic studies on training and retention. Mahwah, NJ: Erlbaum.
  • Miyake, A., Just, M. A., & Carpenter, P. A. (1994). Working memory constraints on the resolution of lexical ambiguity: Maintaining multiple interpretations in neutral contexts. Journal of Memory and Language, 33(2), 175-202.
  • Osaka, M., & Osaka, N. (1992). Language-independent working memory as measured by Japanese and English reading span tests. Bulletin of the Psychonomic Society, 30(4), 287-289.
  • Osaka, M., Osaka, N., & Groner, R. (1993). Language-independent working memory: Evidence from German and French reading span tests. Bulletin of the Psychonomic Society, 31(2), 117-118.
  • Paulesu, E., Frith, C. D., & Frackowiak, R. S. J. (1993). The neural correlates of the verbal component of working memory. Nature, 362, 342-345.
  • Pienemann, M., & Johnston, M. (1987). An acquisition-based procedure for second language assessment. Australian Review of Applied Linguistics, 10, 92-122.
  • Pinker, S. (1997). How the mind works. New York: W.W. Norton & Company.
  • Sagarra, N. (1998). The longitudinal role of working memory on SLA. Presentation at SLRF 1998, Honolulu, HI.
  • Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129-158.
  • Schmidt, R. (1994). Deconstructing consciousness in search of useful definitions for applied linguistics. AILA Review, 11, 11-26.
  • Service, E., & Craik, F. I. M. (1993). Differences between young and older adults in learning a foreign vocabulary. Journal of Memory and Language, 32(5), 608-623.
  • Shallice, T., Fletcher, P., Frith, C. D., Grasby, P., Frackowiak, R. S. J., & Dolan, R. J. (1994). Brain regions associated with acquisition and retrieval of verbal episodic memory. Nature, 368, 633-635.
  • Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
  • Towse, J. N., Hitch, G. J., & Hutton, U. (1998). A reevaluation of working memory capacity in children. Journal of Memory and Language, 39, 195-217.
  • Turner, M. L., & Engle, R. W. (1989). Is working memory task dependent? Journal of Memory and Language, 28, 127-154.
  • VanPatten, B. (1990). Attending to form and content in the input. Studies in Second Language Acquisition, 12, 287-301.
  • VanPatten, B., & Cadierno, T. (1993). Explicit instruction and input processing. Studies in Second Language Acquisition, 15, 225-243.