Cognitive Predictors of Child Second Language Comprehension and Syntactic Learning

This study examined the role of child cognitive abilities for procedural and declarative learning in the earliest stages of second language (L2) exposure. In the context of a computer game, 53 first language Italian monolingual children were aurally trained in a novel miniature language over 3 consecutive days. A mixed effects model analysis of the relationship between cognitive predictors and outcomes in morphosyntax measured via a grammaticality judgment test (GJT) was performed. Relative to adults trained in the same paradigm, children with higher procedural learning ability (measured via an alternate serial reaction time task) showed significantly better learning of word order, although the effect size was small. Modeling accuracy in online sentence comprehension during the game also evidenced that higher procedural learning ability was positively associated with significantly better outcomes as practice progressed. By contrast, a composite measure of verbal and visual declarative learning ability did not predict L2 outcomes in either the GJT or the online measure.

The present study is based on an analysis of data from the author’s PhD dissertation. I thank the audiences of AAAL 2018 (Chicago) and of the Child Language Symposium 2018 (Reading) for their discussion. I am grateful to Kara Morgan-Short, Michael Ullman, and Judit Kormos for comments on a previous draft, to Jarrad Lum and Kara Morgan-Short for generously sharing parts of their experimental paradigms with me, to Michael Ratajczak for discussion on aspects of the data analysis, and to the editor and three anonymous reviewers for their constructive feedback. All errors remain my own. Preparation of the manuscript was supported by an ESRC postdoctoral grant awarded to the author.


Introduction
Recent studies investigating language acquisition and processing in children have highlighted the roles of cognitive individual differences. With specific reference to long-term learning, two main strands of research have emerged. The first strand has considered declarative and procedural learning ability as behavioral correlates of declarative and procedural long-term memory (for a meta-analysis of these associations, see Hamrick, Lum, & Ullman, 2018). The second strand has focused on differences in implicit statistical learning, broadly characterized as the largely implicit ability to track and learn co-occurrence patterns from repeated exposure to sensory input in different modalities (Perruchet & Pacton, 2006; for reviews, see Kidd, Donnelly, & Christiansen, 2018, and Williams, 2020). The present study mainly considered the relationship between child second language (L2) learning and cognitive individual differences from the perspective of the declarative/procedural model of learning and memory, but relevant child studies within the implicit statistical learning framework were also considered in the literature review and in the discussion.

Background Literature
According to the dual-system representation of long-term memory (Squire & Wixted, 2011), declarative memory is involved in encoding and retrieving semantic and episodic memories and is specialized in the fast learning of associations between different stimuli or bits of information. By contrast, procedural memory (a specific type of nondeclarative memory) is involved in encoding and retrieving motor and cognitive sequences and is specialized in learning that occurs as a result of repeated exposure to stimuli over time (Lee & Tomblin, 2015). The two memory systems are largely independent of each other (i.e., can learn or process the same information in parallel) but may interact cooperatively and/or competitively depending on a range of endogenous and environmental variables (e.g., amount of training, impairment of one of the systems, specific task conditions). For example, due to its flexibility and its capacity to retain information after limited exposure, the role of declarative long-term memory is prominent in low-input conditions (at initial stages of learning), when it has been shown to inhibit processing in procedural memory (for a review, see Packard & Goodman, 2013; Ullman, 2015).
Importantly, neuropsychological studies have evidenced different developmental trajectories relative to the time at which the declarative and procedural memory systems reach full anatomical and functional maturity. In general, brain structures subserving nondeclarative memory (including procedural memory) mature earlier in life during infancy and early childhood (Nelson & Webb, 2002), although there is evidence that additional development may occur at later stages (Lee, Nopoulos, & Tomblin, 2020). By contrast, the declarative memory system matures comparatively more slowly during childhood, reaching full functional maturity only during adolescence with a peak in early adulthood (e.g., Bauer, 2007; Giedd et al., 1999). Behavioral studies have largely confirmed these developmental trajectories for declarative memory (e.g., Lum, Kidd, Davis, & Conti-Ramsden, 2010). For procedural memory, some studies have not found significant developmental differences between children of different ages (e.g.,  or between children and adults (e.g., Meulemans, Van der Linden, & Perruchet, 1998), some studies have found that procedural learning can be more robust in adults compared to children (e.g., Thomas et al., 2004), and other studies have found that procedural learning is comparatively stronger in children compared to adults (e.g., Juhász, Németh, & Janacsek, 2019).
Adopting a dual-system representation of long-term memory, the declarative/procedural model (Ullman, 2015; see also Paradis, 2009, for a slightly different formulation) is a domain-general neurocognitive model of language acquisition that posits specific roles for each memory system in the acquisition and processing of linguistic information in both first language (L1) and L2. Although linguistic information can be potentially acquired and processed by either memory system, the model posits that declarative memory has a specific role in the acquisition of lexis and semiproductive and idiosyncratic forms (e.g., irregular lexical forms), and that procedural memory is specifically relevant for acquiring rule-based patterns across morphosyntax (e.g., word order, productive inflectional morphology) and phonology. Given their developmental differences, the declarative/procedural model also predicts a greater role for procedural memory in language learning and processing in early childhood and an increased importance of declarative memory from late childhood onward as the declarative memory system reaches full anatomical and functional maturity. It should be noted that the evidence that procedural learning ability may be stronger in adults compared to children is not itself counterevidence to the declarative/procedural model. What is relevant as evidence in support of the model's predictions regarding developmental differences in L2 learning is a less prominent role of declarative memory in children compared to adults.
More generally, aptitudes for implicit procedural and declarative learning have also been identified as important predictors of L2 outcomes in the L2 acquisition literature (e.g., Bolibaugh & Foster, 2021; DeKeyser, 2000; Granena, 2013). However, most of these studies were conducted with adults or employed aptitude batteries not fully validated for use with children (Rogers, Meara, Barnett-Legh, Curry, & Davie, 2017). In the L2 acquisition literature, age-related "differences in cognitive functioning" (DeKeyser, 2000, p. 519, 2012) and children's less developed ability for declarative learning (and possibly their greater reliance on procedural learning) have also been independently suggested as crucial variables to account for adults' advantage in L2 rate of learning.
Similar to procedural learning, implicit statistical learning has often been characterized as a type of implicit long-term learning (e.g., Christiansen, 2018; Shafto, Conway, Field, & Houston, 2012) and requires repeated exposure to stimuli in order to occur. However, procedural learning as conceptualized in the declarative/procedural model is defined as relying on the procedural memory system alone (i.e., one type of nondeclarative memory mainly subserved by cortical-striatal neural areas including the basal ganglia and connected frontal cortical areas as well as the cerebellum), whereas implicit statistical learning has been associated with a more complex and distributed network of nondeclarative memory functions (Batterink, Paller, & Reber, 2019; Reber, 2013).
However, the underlying neural mechanisms that the two learning processes implicate significantly overlap. Robust associations with domain-general procedural memory areas, established for implicit visual sequence learning and probabilistic learning (e.g., Janacsek et al., 2020), have recently been evidenced also for learning based on the ability to segment input and track sequential probabilities (in this case, alongside modality-specific areas in the sensory cortex; see Batterink et al., 2019, for a review).
Because no studies to date have examined the role of long-term memory in child L2 acquisition, the present review starts with a brief presentation of adult L2 studies in the declarative/procedural-model framework followed by a presentation of child studies that have investigated how long-term memory modulates L1 outcomes. The current study aimed to elucidate the roles of declarative and procedural learning abilities in the initial stages of the acquisition of a novel language in typically developing 8- to 9-year-old children.
Some of these studies have found that the relationships between declarative/procedural learning ability and L2 outcomes are moderated by a range of variables (amount of input/training, exposure conditions, type of task, etc.) believed to mediate declarative and procedural processing. For example, a well-known variable that increasingly favors engagement of procedural memory over declarative memory is the amount of repeated exposure to stimuli (e.g., Henke, 2010; Packard & Goodman, 2013). Increasing reliance on procedural processing in these conditions has been associated with a corresponding weaker engagement of the declarative learning route (possibly due to neural inhibitory mechanisms; see, e.g., Poldrack et al., 2001; Turner, Crossley, & Ashby, 2017; Ullman, 2015). In a related study, Morgan-Short et al. (2014) found that declarative learning ability (measured by a composite of verbal pair associates and visual recognition) significantly predicted word order accuracy (as measured by an aural grammaticality judgment test [GJT]) after a single session of 12 comprehension and production game blocks, but procedural learning ability (measured by a composite of scores from the Weather Prediction Task and initial thinking time in the Tower of London Task) was the sole predictor of word order accuracy after 72 blocks (four sessions over 2 weeks).
Further, both Brill-Schuetz and Morgan-Short (2014) and Carpenter (2008) found positive relationships between procedural learning ability (measured by an alternate serial reaction time [ASRT] task and the Weather Prediction Task, respectively) and end-of-training accuracy in Brocanto2 word order after 20 and 44 game blocks, respectively. However, Pili-Moss et al. (2020), a study examining online sentence comprehension data collected but not analyzed in the Morgan-Short et al. (2014) study, found that declarative learning ability consistently predicted sentence comprehension accuracy throughout exposure (72 blocks), and did not find that procedural learning ability predicted comprehension accuracy at later stages of exposure (although after extended practice higher procedural learning ability was associated with better comprehension automatization). Finally, in a different artificial language paradigm, Hamrick (2015) reported a shift from declarative to procedural processing similar to the shift observed by Morgan-Short et al. (2014), by administering GJTs at immediate posttest and after 1 to 3 weeks of no exposure.
In naturalistic L2 acquisition, Granena (2013) also found a significant positive relationship between procedural learning ability measured by a serial reaction time (SRT) task and accuracy of nominal agreement in Chinese L2 learners of Spanish after more than 5 years of immersion. Functional magnetic resonance imaging studies have also provided evidence that adult L2 processing of inflectional morphology is related to neural activation of areas subserved by the procedural memory system (e.g., Nevat, Ullman, Eviatar, & Bitan, 2017) and that the pattern of activation becomes increasingly nativelike with higher proficiency and longer L2 immersion exposure (Pliatsikas, Johnstone, & Marinis, 2014).
Overall, at least for morphosyntax, adult behavioral and functional magnetic resonance imaging studies have found evidence of positive relationships between procedural learning ability and L2 outcomes for increasing amounts of L2 input and/or proficiency level, with some training studies evidencing a shift from early declarative processing to procedural processing as training progressed (or after periods of no exposure). In the case of Pili-Moss et al.'s (2020) study, the different findings (less evidence of a shift) might have been due to the fact that, unlike previous Brocanto2 studies, it employed an online measure of accuracy and measured accuracy in sentence comprehension rather than through specific aspects of morphosyntax.

The Relationship Between Cognitive Learning Abilities and Child First Language Outcomes
To date, child studies that have investigated the relationship between language and long-term memory abilities have focused mainly on L1 outcomes. Overall, child studies have varied not only with regard to the type of learning ability that they have considered (declarative, procedural, or implicit statistical) but also with regard to the type of measures used to index those abilities as well as the type of measures employed to assess language outcomes. In general, measures of visual and verbal declarative learning ability have been obtained in tasks presenting visual or verbal stimuli in a familiarization phase and subsequently probing children's ability to recognize or to recall the same information after a delay (e.g., Conti-Ramsden, Ullman, & Lum, 2015; Kidd, 2012).
Studies have also varied considerably with regard to the linguistic skills that they have assessed. They have specifically examined competence in syntax and morphology (e.g., Hedenius et al., 2011; Kidd, 2012; Kidd & Arciuli, 2016; Kidd & Kirjavainen, 2011) and used standardized batteries measuring morphosyntax as well as aspects of semantics or vocabulary (e.g., Conti-Ramsden et al., 2015; Conway et al., 2011; Hedenius et al., 2011; Riches & Jackson, 2018). Overall, a number of studies that have compared atypically with typically developing populations have found positive relationships between procedural learning ability (particularly when measured via visuomotor sequence learning) and the L1 outcomes of their typically developing control groups (see Hamrick et al., 2018, for a meta-analysis, but also, e.g., Lammertink et al., 2020; Spit & Rispens, 2019; or West, 2017, for studies in which this relationship was not evidenced).
To date, only a few studies have specifically investigated long-term learning ability and L1 outcomes in typically developing children. Kidd (2012) explored the relationship between L1 syntactic priming of full be passive constructions and implicit sequence learning ability measured by an SRT task in a group of 100 English-speaking children (M = 5;7 years). Employing the Word Pairs subtest from the Children's Memory Scale (Cohen, 1997) as a measure of explicit learning ability, the study found that performance on the SRT task, but not explicit learning ability, predicted priming at posttest.
In another study, Kidd and Arciuli (2016) investigated the relationship between sensitivity to the transitional probabilities of visual stimuli and comprehension of active, passive, subject relative, and object relative clauses in 68 L1 English children (M = 7;1 years). The study found that accurate performance with passive and object relative clauses was significantly positively related to statistical learning ability. In one of the few child studies that looked at morphology, Kidd and Kirjavainen (2011) examined the relationship between long-term memory measures and the acquisition of the Finnish past tense by 4- to 6-year-old native speakers (M = 5;2 years) and found that vocabulary development, but not procedural learning ability measured by an SRT task, predicted morphological attainment.
Previous research (e.g., Wonnacott, Boyd, Thomson, & Goldberg, 2012) demonstrated that children as young as 5 years of age can acquire novel word order patterns after minimal exposure to a miniature language and that they are sensitive to the statistical structure of the input. However, no child study to date has investigated the relationship between declarative and procedural learning abilities and L2 learning in children. Conducting an experimental study of this type was of interest not only because it could generally contribute to elucidating the relationship between child cognitive ability and language outcomes but also because it would shed initial light on potential differences and similarities between cognitive variables at play in child L2 acquisition versus child L1 acquisition (by comparing current findings to those from previous literature, e.g., Kidd, 2012) and between child L2 acquisition versus adult L2 acquisition (by comparing current findings to those from previous literature, e.g., Morgan-Short et al., 2014).

The Current Study
In the present study, 8- to 9-year-old L1 Italian speakers were aurally exposed to a novel miniature language modeled on Japanese in the Brocanto2 learning environment. Sentence comprehension was tracked at item level during the game practice, and learning of word order and case marking were assessed via an aural GJT administered at the end of practice (after six blocks). The study focused specifically on word order and inflectional morphology to maximize result comparability with previous research. Furthermore, it was decided to investigate the two structures separately because (a) unlike word order, there is some evidence that child learning of morphology may primarily involve variables other than implicit learning ability (e.g., Kidd & Kirjavainen, 2011) and (b) there is evidence that children may find novel inflectional morphology difficult to learn (e.g., Ferman & Karni, 2010), whereas previous miniature learning studies with children have consistently reported successful learning of novel word order (e.g., Wonnacott et al., 2012). The study addressed three research questions:
Research Question 1: To what extent do declarative and procedural learning abilities predict child learning of the word order of a novel miniature language?
Research Question 2: To what extent do declarative and procedural learning abilities predict child learning of the case marking of a novel miniature language?
Research Question 3: To what extent do declarative and procedural learning abilities predict child aural comprehension of full sentences in a novel miniature language across the game practice?
Adopting the declarative/procedural model assumption of a general role of procedural memory in the learning of rule-based grammar (Ullman, 2015), the initial hypothesis for Research Questions 1 and 2 was that procedural learning ability should be positively related to learning of both L2 word order and case marking. A relationship between procedural learning ability as measured by visuomotor sequence learning and L2 gains in syntax would also be expected if findings about L1 learning in typically developing children (e.g., Kidd, 2012) can be extended to the learning of an L2. However, because there is evidence among adults that amount of input can moderate the relationship between cognitive abilities and L2 accuracy (with the procedural memory system seeming to become more engaged with increased exposure and/or at higher proficiency levels), outcomes could depend on the extent to which the shorter exposure phase used in the current study relative to previous Brocanto2 training studies was sufficient for an effect of procedural learning ability to emerge.
Compared to Research Questions 1 and 2, Research Question 3 had a more exploratory nature. This was because it is more difficult to hypothesize what the predictions of the declarative/procedural model would be for a complex linguistic skill like online sentence comprehension, and only one previous adult study adopting the Brocanto2 training paradigm (Pili-Moss et al., 2020) had investigated long-term memory and L2 outcomes employing an online measure.

Participants
The participating primary school was a state school located in Northern Italy in a southwest suburb of Milan with a population from mixed socioeconomic backgrounds. Italian was chosen as the participants' L1 due to its morphosyntactic differences from Japanese, the natural language after which the miniature language was modeled (see the Artificial Language section). The school preselected potential participants based on their attainment in L1 literacy and on general and medical records. Based on this sample, the participants selected for the study were 53 L1 Italian monolingual typically developing children with no diagnosis of learning differences or hearing impairment, with normal or corrected to normal vision, and with at least average attainment in L1 literacy relative to their school grade. The data from 13 participants were excluded from the final analysis because these participants either did not reach a minimum score in the computer game set at above chance performance in at least three out of six blocks (two participants) or did not complete training (11 participants).
Overall, the final sample included data from 40 participants (10 females) for whom mean age at testing was 9 years and 2 months (M age = 109.5 months; SD = 7.1). Fifteen participants (four females) were in Grade 3 (M age = 101.6 months; SD = 4.3); 21 participants (three females) were in Grade 4 (M age = 112.5 months; SD = 3.6); four participants (three females) were in Grade 5 (M age = 119.0 months; SD = 0.0). The participants had all started to learn English as an L2 in the classroom from Grade 1. An age range between 8 and 9 years was selected based on evidence of learning from a pilot study (Pili-Moss, 2017) and in view of the fact that the game task was considered too complex for children any younger than 8 years old. The study was approved by the Ethics Research Committee of Lancaster University, and parental consent to participate was obtained for all participants prior to the beginning of the study.

Artificial Language
Participants were exposed to the miniature language BrocantoJ in the context of a computer board game similar to chess where the rules of the game were distinct from the rules of the language. BrocantoJ adapts vocabulary items from the Brocanto2 language (e.g., Morgan-Short et al., 2014) and follows the morphosyntax of Japanese and the phonotactics of Italian (for a comparison of the morphosyntax of BrocantoJ and Italian see Appendix S1 in the online Supporting Information). It includes 14 items: four nouns (blomi, nipo, pleca, vode; the tokens' names), two adjectives (troise, neimo; the tokens' shapes), four verbs (klino, nima, yabe, prazi; the game's moves corresponding respectively to move, capture, release, and switch), two adverbs (noika, zeima; the moves' directions), and two postpositional markers for nominative and accusative case (ri and ru, respectively). The verbs are all obligatorily transitive, that is, they always occur with an object noun phrase, except klino, which is intransitive. Examples 1 to 3 illustrate the BrocantoJ sentence types to which participants were exposed during the experiment and correspond to the game constellations in Figure 1.

Example 1 (SV)
Neimo blomi ri noika klino
Square blomi NOM vertically move
"The square blomi token moves vertically"

Example 2 (SOV)
Troise blomi ri neimo blomi ru zeima nima
Round blomi NOM square blomi ACC horizontally capture
"The round blomi piece captures the square blomi piece horizontally"

Example 3 (OV)
Neimo blomi ru zeima nima
Ø square blomi ACC horizontally capture
"It/another token captures the square blomi piece horizontally"

For details of what the participants were expected to learn as a consequence of exposure to aural BrocantoJ sentences and to the corresponding game moves, see the Outcome Measures section.

Design
The study design included three sessions on separate, consecutive days lasting about 40 to 45 minutes, 50 minutes, and 60 minutes, respectively (see Figure 2). The memory tasks were administered on different days, with the order of days counterbalanced across participants (declarative learning ability on 1 day, lasting about 40 minutes, and procedural learning ability on a separate day, lasting about 30 minutes). Except for the vocabulary training, the participants were trained and tested individually, wore headphones, and sat in a quiet room with the researcher at individual laptop computers, two at a time.

Vocabulary Training and Testing
The participants watched a video with no audio (4.38 minutes, excluding pauses) where Suzy, a cartoon character (see Appendix S2 in the online Supporting Information), presented the tokens, shapes, moves, and directions that would be encountered in the game (12 items in total). The matching vocabulary items were introduced aurally by the researcher (who shared the participants' L1) without translations and in association with a corresponding static picture (game tokens, shapes, and directions) or animation (moves), with each item introduced in isolation (not in a phrase). The postpositional case markers were not presented in the vocabulary training. The participants rehearsed the vocabulary items in pairs with the researcher and were subsequently tested individually (see Appendix S3 in the online Supporting Information for the full procedure). The participants had to reach a criterion of 100% correctly identified word/visual associations in order to proceed to the subsequent stage of the experiment (vocabulary testing was repeated at the beginning of Sessions 2 and 3 as well). If criterion was not reached, further vocabulary instruction was provided followed by a new test (with a maximum of four attempts). All participants were able to reach criterion within three attempts at each of the vocabulary tests. As the ability to learn and retrieve associations between sounds and meaningful pictures (or animations) assessed during vocabulary testing varied across the participants, this vocabulary learning measure (coded as VocLearn for analyses in R; see Appendix S9 in the Supporting Information online) was included as a covariate in all inferential analyses. A vocabulary learning score was obtained for each participant by summing the errors in the vocabulary test across the three sessions, standardizing this sum (turning it into a z-score), and multiplying the result by −1, so that higher scores corresponded to fewer errors.
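The scoring step described above can be sketched in Python as follows. This is a minimal illustration and not the author's R code: the function name `voc_learn_scores`, the data layout, the example error counts, and the use of the sample standard deviation (n − 1 denominator) are all assumptions.

```python
from statistics import mean, stdev

def voc_learn_scores(errors_by_session):
    """Return one VocLearn-style score per participant.

    errors_by_session: list of per-participant lists, each holding the number
    of vocabulary-test errors in Sessions 1-3 (hypothetical data layout).
    """
    # Sum errors across the three sessions for each participant.
    totals = [sum(sessions) for sessions in errors_by_session]
    m, sd = mean(totals), stdev(totals)  # sample SD (n - 1); an assumption
    # Standardize, then flip the sign so that fewer errors give a higher score.
    return [-(t - m) / sd for t in totals]

# Hypothetical error counts for four participants.
example = [[2, 1, 0], [0, 0, 0], [4, 2, 1], [1, 1, 0]]
scores = voc_learn_scores(example)
```

In this made-up sample, the participant with no errors receives the highest score and the participant with the most errors receives the lowest.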

Passive Exposure
After taking the vocabulary test, the participants watched a video showing a series of game moves in association with the aural BrocantoJ sentences that described them (144 aural sentence stimuli in six blocks across three sessions, 24 stimuli per block; for more information on the passive exposure stimuli, see Appendix S4 in the online Supporting Information).

Game Practice
In total, the game practice set consisted of 120 novel BrocantoJ stimuli administered in the same order to all participants (20 moves per block) in six blocks (see Figure 2). Each trial started with an on-screen game configuration accompanied by an aural sentence stimulus. The participants had to listen to the sentence and then make the move that they thought the words were describing as fast as they could, using a computer mouse. After each move, feedback appeared on screen as the words "correct" or "incorrect." The next stimulus was presented immediately afterward or after 60 seconds in case of no response. In the game practice, each correct trial corresponded to an increase of 5% of the total block score, with the count starting at 0%. A percentage correct score appeared on screen at the end of each block. As feedback was not experimentally manipulated in the present study, the extent to which it supported learning remains open to further investigation.
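The block scoring rule amounts to simple arithmetic: with 20 trials per block and 5% added per correct trial, a block score equals the percentage of correct trials. A minimal Python sketch under that reading follows; the function name `block_score` and the example responses are hypothetical.

```python
TRIALS_PER_BLOCK = 20
POINTS_PER_CORRECT = 5  # percent added per correct trial, starting from 0%

def block_score(correct_flags):
    """Percentage score for one block, given one boolean per trial."""
    if len(correct_flags) != TRIALS_PER_BLOCK:
        raise ValueError("expected 20 trials per block")
    # Each True counts as 1, so this is 5% times the number of correct trials.
    return POINTS_PER_CORRECT * sum(correct_flags)

# Hypothetical block: 13 correct moves followed by 7 incorrect ones.
flags = [True] * 13 + [False] * 7
score = block_score(flags)
```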

Outcome Measures

Online Comprehension
The participants were not aware that the computer program created a by-trial online record of their moves and accuracy. This was used as an overall measure of BrocantoJ accuracy in sentence comprehension. In game trials (see Figure 1), provided participants could efficiently detect relevant word units in the aural stream and keep them in short-term memory to allow further processing, sentence comprehension minimally required: (a) the ability to match these word units to visuals in the game constellation and (b) the ability to combine their meanings to obtain an overall sentence interpretation (the move event).
These word units included the token(s) involved in a move (in some cases they were further specified by adjectives), the type of move and the thematic functions assigned to specific tokens (e.g., which of two tokens would be released in a release move), and the direction of movement with respect to the initial position of the token(s). Each game constellation included between zero and five distractor tokens. Although trials could be solved largely by applying lexically or semantically driven strategies, it was likely that learning the morphosyntactic properties of the language (word order and case marking) increasingly supported sentence interpretation as training progressed (e.g., via learning the associations between case markers and the thematic interpretation of the associated nominal phrases).

Aural Untimed Grammaticality Judgment Test
After the end of the game practice in Session 3, the participants were told that they were now experts in the new language. As such, they would help Suzy select suitable items for a new game block by judging whether a set of novel aural sentence stimuli was similar to sets of stimuli encountered during the practice. Reference to metalinguistic concepts like (un)grammaticality or grammatical acceptability was avoided. The precise instruction given and its translation into English can be found in Appendix S5 in the Supporting Information online. The GJT was developed in E-Prime (Version 2.0.10.356; Psychology Software Tools, Inc., 2016) and administered on an ASUS X553M laptop computer. Unlike the online measure, the GJT was administered by presenting the aural sentence stimuli outside the game context.
The trial started with a fixation cross (3 seconds) after which a sound icon appeared on screen while an aural sentence stimulus was played. Immediately after the aural stimulus, the text Com'è? "How is it?" appeared on screen together with a yellow arrow pointing down at the top-right part of the keyboard where six aligned keys were used to select the judgment response (corresponding to the keys from 7 to = on a British keyboard). The keys were labeled with yellow stickers depicting six smileys (see Figure 3). After a participant had pressed one of the smiley keys, or after 7 seconds, a confidence judgment on a four-point scale was elicited (participants rated the extent to which they felt that their choice was correct). Immediately after the confidence rating was provided, or after 7 seconds, the next trial started.
The test comprised a total of 28 novel test sentences (14 ungrammatical) and four practice sentences (two ungrammatical). The 14 ungrammatical sentences matched the corresponding grammatical ones and were created by inserting violations of case assignment (six sentences) and of word order (eight sentences). The slight imbalance between case assignment and word order items was due to the need to include a full range of word order violations while keeping the number of GJT items sufficiently low to avoid participant fatigue. Proportionally matching the distribution of sentence types in the exposure and gaming sets, the GJT included 16 SOV sentences, eight OV sentences, and four SV sentences.
The test started with a practice block (four items) followed by three experimental blocks (10, 10, and eight items, respectively). Participants could take short self-managed breaks at the end of each block and ask further clarification questions immediately after the practice items. In the GJT set, vocabulary items, including case markers, were counterbalanced across word categories. All sentence stimuli (practice and experimental) contained five words (eight to nine syllables each). In the ill-formed sentences, the ungrammaticality was never triggered by the first word in the sentence. The order of the practice items was the same for each participant, but the order of the experimental blocks, as well as the order of items in each experimental block, was randomized across participants (see Appendix S5 in the online Supporting Information for a list of the GJT stimuli).
Although the participants were verbally asked to provide a binary decision (the sentence is either good enough to be selected for a new level of play [a new block] or not), they were given the possibility to express a graded judgment (via the six smileys) with the purpose of at least partially mitigating yes biases in simple binary choices that had been reported in previous studies (e.g., McDaniel & Cairns, 1996) and of providing the opportunity for the participants to express their lack of endorsement in a more nuanced way. For the purposes of the statistical analysis, GJT binary scores were obtained by coding the higher three points in the scale (i.e., the happy smileys) as sentence endorsement. There was no a priori intention to obtain ordinal data from the GJT. This would have required specific instruction to ensure that the different endorsement grades within each scale (positive and negative) were similarly interpreted across participants (for instance by providing examples for each case, something that was done for the confidence rating). Thus, given the type of instruction that was provided, forcing post hoc ordinal scoring would not have been appropriate.
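The binary coding described above can be illustrated with a minimal sketch. This is not the study's E-Prime or R code. It assumes that the six smiley keys are coded 1 (least favorable) to 6 (most favorable), that the three highest points count as endorsement, and that a response is accurate when endorsement matches the grammaticality of the sentence. The function names are hypothetical, and trials without a response within the 7-second window are not handled.

```python
# Hypothetical sketch of the binary GJT scoring (not the study's code).
# Assumption: smiley keys are coded 1 (least favorable) to 6 (most favorable).

def endorsed(rating: int) -> bool:
    """Return True when the rating falls in the top three scale points."""
    if not 1 <= rating <= 6:
        raise ValueError("rating must be between 1 and 6")
    return rating >= 4


def is_accurate(rating: int, grammatical: bool) -> bool:
    """A response is accurate when endorsement matches grammaticality."""
    return endorsed(rating) == grammatical


# Invented example responses: (rating, sentence is grammatical)
responses = [(6, True), (4, False), (2, False), (3, True)]
accuracy = [is_accurate(rating, gram) for rating, gram in responses]
print(accuracy)
```

Coding the scale this way yields one binary accuracy value per trial, which is the form required by a binomial mixed-effects model.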

Cognitive Measures

Declarative Learning Ability
The materials for the visual and verbal declarative memory tasks were administered individually and were part of a test battery normed for Italian children of primary-school age (Vicari, 2007); declarative learning ability was coded as Decl for analyses in R (see Appendix S9 in the online Supporting Information). A visual declarative memory task probed the retention of associations between a series of 15 pictures of familiar objects and their positions in a four-space grid immediately after exposure (three consecutive cycles of presentation and testing) as well as after a 15-minute delay (one test only, Figure 4). Learning was measured by assigning one point for each correctly recalled association and averaging the sum of scores of the three immediate tests and of the delayed test together. In the verbal declarative memory task, the participants listened to a short story in Italian (58 words) comprised of 28 information units and were asked to repeat it once as precisely as they could immediately afterward. A measure of verbal declarative memory was obtained in a similar manner to that used for the visual declarative memory task by assigning points for accurately recalled information units immediately and, without warning, after a 15-minute delay (see Appendix S6 in the online Supporting Information for full procedures for stimuli presentation and scoring in the two declarative memory tasks). A final composite declarative memory score was obtained by applying the norms provided to the raw scores and averaging the two components as was indicated in the battery manual.

Procedural Learning Ability
An ASRT task, created by adapting the SRT paradigm employed in a previous study, provided a measure of procedural learning ability (implicit visuomotor sequence learning). In the SRT task (e.g., Nissen & Bullemer, 1987), the participants manually respond to visual stimuli sequentially presented in one of four positions on a computer screen via a response box. In the initial blocks, the sequences follow a pattern, but in the last block the sequence is random. Procedural sequence learning is indexed by the rebound effect, that is, the extent to which reaction times increased between the last pattern block and the final random block. The ASRT task was similar, but sequence (i.e., pattern) trials (in this study 1, 2, 4, 3; see Figure 5) and random trials (r) alternated in all blocks according to a trial pattern 1 r 2 r 4 r 3 r (eight blocks, each of 80 trials administered to the participants in the same order). Consequently, learning was indexed by an increasing reduction of reaction times on sequence trials compared to random trials across blocks (e.g., Hedenius, 2013). An important advantage of the ASRT task over the SRT task is that it has been shown not to lead to the development of explicit sequence knowledge even after extended training (e.g., Hedenius, 2013; Song, Howard, & Howard, 2007). Other advantages of the ASRT task include that it allows teasing apart general motor skill learning from sequence specific learning and assessing these two types of learning continuously as opposed to only once at the end of training.
In this ASRT task, the participants were asked to react to visual stimuli (smileys) appearing on screen in one of four positions arranged in a diamond shape by pressing corresponding buttons on a game controller (see Figure 5). At the end of training, the participants were informed that the smiley sequence was not random and were asked by the researcher to guess it. None of the participants reproduced it correctly.
The task provided two measures of procedural learning ability: one based on reaction times (RT) in milliseconds and one based on (in)accuracy (see Appendix S7 in the online Supporting Information for RT cleaning procedures). For each participant, median RTs from Block 1 to Block 4 and from Block 5 to Block 8 were averaged to obtain an A and a B value, respectively; this was done separately for both the sequence and the random trial subsets. The difference between A and B (i.e., RT gain) reflected the change in reaction times from the first half to the second half of the training. Finally, RT gains from random trials were subtracted from RT gains from sequence trials, with higher positive differences indicating better sequence learning (see, e.g., Hedenius, 2013; Song et al., 2007). For the (in)accuracy measure, sequence learning was operationalized as an overall higher number of errors on random trials compared to sequence trials. For each participant the average number of errors in sequence trials was subtracted from the average number of errors in random trials, with larger positive differences indicating better sequence learning. Finally, a composite measure of procedural learning ability (coded Proc for analyses in R) was obtained by standardizing and then averaging the two components (reaction times and (in)accuracy).
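The scoring steps described above can be sketched as follows. This is an illustrative reconstruction rather than the study's own script. It assumes one median RT per block for each trial type, takes the RT gain as A minus B, averages errors over blocks, and standardizes each component with the sample standard deviation. All function names and example values are hypothetical.

```python
from statistics import mean, stdev


def rt_gain(block_medians):
    """A (mean of Blocks 1-4) minus B (mean of Blocks 5-8), in ms."""
    if len(block_medians) != 8:
        raise ValueError("expected one median RT per block (8 blocks)")
    return mean(block_medians[:4]) - mean(block_medians[4:])


def rt_learning(sequence_medians, random_medians):
    """Sequence-trial RT gain minus random-trial RT gain."""
    return rt_gain(sequence_medians) - rt_gain(random_medians)


def error_learning(sequence_errors, random_errors):
    """Mean errors on random trials minus mean errors on sequence trials."""
    return mean(random_errors) - mean(sequence_errors)


def zscores(values):
    """Standardize scores across participants (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(rt_scores, error_scores):
    """Average the standardized RT and (in)accuracy components."""
    return [(a + b) / 2 for a, b in zip(zscores(rt_scores), zscores(error_scores))]


# Invented data for one participant (median RT per block, in ms)
seq = [500, 480, 470, 450, 430, 420, 410, 400]
rnd = [500, 490, 485, 480, 475, 470, 465, 460]
print(rt_learning(seq, rnd))
```

In this sketch a larger positive value on either component indicates better sequence learning, so the two standardized components point in the same direction before they are averaged.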

Results
The analysis code and data are openly available at https://osf.io/nh4cx/

Data Analysis
General descriptive statistics were calculated, including statistics to determine whether scores for learning of BrocantoJ word order, case marking, and sentence comprehension were above chance at the group level. The inferential statistics employed binomial generalized linear mixed-effects models using the glmer function with maximum likelihood and Laplace approximation from the lme4 package (Bates, Machler, & Bolker, 2015) in R (R Core Team, 2018). All individual difference variables were standardized. The variables considered as predictors included the two cognitive predictors of interest (declarative and procedural learning ability), sentence grammaticality in the GJT analysis, and block for the analysis of the online measure. Other covariates were phonological short-term memory (see Appendix S8 in the online Supporting Information for a description), the measure of attainment in the vocabulary testing sessions, age at testing, school grade, and sex. Interactions between the main cognitive predictors and between these and sentence grammaticality and block were also investigated.
Fixed effects, including interactions, were added one at a time successively comparing nested models using the likelihood ratio test and the Akaike information criterion. Fixed effects were included in the model if the model converged and the effect statistically significantly (α = .05) improved the model's fit. Once the structure of fixed effects had been determined, random effects were explored in the same way, starting from random effects on intercepts (items, participants, and the class in which the participants belonged in school) and subsequently considering random effects on the slopes of the fixed effects. The interpretation of the effect sizes R² followed the field-specific recommendations in Plonsky and Ghanbar (2018). Inspection of the Q-Q plots of the final models indicated that the distribution of residuals was approximately normal. Summaries of model selection and the complete code and data are available (see Appendix S9 in the online Supporting Information and https://osf.io/nh4cx/, respectively).
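The logic of a single forward-selection step can be illustrated with a short sketch. The study's models were fitted in R with lme4, so the Python below only illustrates the decision rule and does not reproduce the analysis code. It assumes that the two nested models differ by exactly one parameter (1 degree of freedom) and that a term is retained when the likelihood ratio test is significant and the Akaike information criterion decreases. The log-likelihood values are invented.

```python
import math


def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2.0 * loglik


def keep_added_term(loglik_reduced, k_reduced, loglik_full, alpha=0.05):
    """Assumed rule: keep the term if the LRT is significant and AIC drops."""
    _, p_value = lrt_one_df(loglik_reduced, loglik_full)
    lower_aic = aic(loglik_full, k_reduced + 1) < aic(loglik_reduced, k_reduced)
    return p_value < alpha and lower_aic


# Hypothetical log-likelihoods, not values from the study
stat, p_value = lrt_one_df(-500.0, -497.0)
decision = keep_added_term(-500.0, 4, -497.0)
print(round(stat, 2), round(p_value, 4), decision)
```

Terms that add more than one parameter (e.g., a multilevel factor) would need a chi-square tail with the corresponding degrees of freedom, which this sketch does not cover.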

Research Question 1 and Research Question 2: Declarative and Procedural Learning Abilities as Predictors of Learning for Word Order and Case Marking
To address Research Questions 1 and 2, descriptive statistics for accuracy scores in the GJT overall and by type of sentence stimuli were calculated (see Table 1 and Figure 6). For all GJT categories in Table 1, z-scores for skewness and kurtosis were between −1.59 and .68, that is, in a range compatible with a normal distribution.
In the GJT, the chance threshold was 50% and, although performance was on average above chance overall, a breakdown of the results by type of stimuli (case ungrammatical, word order ungrammatical, and grammatical) revealed that, although performance at the group level for GJT sentences that contained a word order violation or for grammatical sentences was significantly above chance, performance for case ungrammatical sentences was below chance. For the individual difference measures, Table 2 shows the average raw scores for the components of declarative and procedural learning ability. Kolmogorov-Smirnov tests showed that the composite declarative and procedural predictors were normally distributed, D(38) = .108, p = .200 and D(36) = .122, p = .193, respectively. Table 3 provides initial Pearson's correlations of the composite (and standardized) individual difference predictors and of these predictors and the outcome measure of language comprehension and the GJT. None of the correlations was significant. To test learning of word order in the GJT, sentences with word order violations and grammatical sentences were analyzed.
Note (Tables 2 and 3). Decl = declarative memory; Proc = procedural learning ability; GJT = grammaticality judgment test.
For Research Question 1, the specification of Model 1 in R provided the best fit, accounting for about 12% of the variance. Given the low condition number of 1.10, multicollinearity between the predictors did not seem to pose an issue in this dataset. Table 4 presents the results of this model that returned a significant positive, small-sized effect of procedural learning ability, OR = 1.60, R² = .12, but the effect of declarative learning ability was nonsignificant, with a negligible effect size, OR = 1.06; R² = .00. In Model 1, accuracy (ACC) was modeled by random effects of items on intercepts (1|ITEM), random effects of the class that the participants belonged to in school on intercepts (1|CLASS), and declarative and procedural learning ability.
Model 1: ACC ∼ (1|ITEM) + (1|CLASS) + DECL + PROC
Note (Table 4). Number of observations = 747. R² = .12; marginal R² = .01. Decl = declarative memory; Proc = procedural learning ability.
For Research Question 2, the GJT outcomes indicated that the participants' ability to accurately detect case ungrammaticality was below chance at group level, hence particular caution had to be used in the interpretation of this analysis. The specification of Model 2 in R provided the best fit, accounting for about 13% of the variance. In Model 2, accuracy (ACC) was modeled by random effects of items on intercepts (1|ITEM), declarative and procedural learning ability, sentence grammaticality (GRAMM), and sex.
Model 2: ACC ∼ (1|ITEM) + DECL + PROC + GRAMM + SEX
Table 5 shows the results of this model, which had a low condition number of 1.10 and returned a significant small-sized effect of grammaticality (judgment on grammatical sentences was significantly more accurate than on ungrammatical sentences), OR = 2.56, R² = .06, a significant but negligible effect for sex (males were significantly more accurate than females), OR = 1.50, R² = .01, and nonsignificant effects of negligible size for both declarative and procedural learning ability, OR = 0.95, R² = .00 and OR = 1.21, R² = .01, respectively.
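To make the reported odds ratios easier to read, the sketch below converts an odds ratio into a change in predicted accuracy. It is an illustration only. It assumes a logistic link, a standardized predictor (so that one unit equals one standard deviation), and a baseline accuracy of 50% chosen purely for convenience. It ignores the random effects in the fitted models, and the function names are hypothetical.

```python
import math


def odds_ratio_to_logit(odds_ratio):
    """Logistic regression coefficient (log-odds) implied by an odds ratio."""
    return math.log(odds_ratio)


def predicted_probability(baseline_prob, odds_ratio, delta=1.0):
    """Accuracy after a shift of `delta` predictor units from a baseline."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio ** delta
    return new_odds / (1.0 + new_odds)


# OR = 1.60 is the procedural learning ability effect reported for Model 1;
# the 50% baseline is an assumption made for illustration.
prob = predicted_probability(0.50, 1.60)
print(round(odds_ratio_to_logit(1.60), 3), round(prob, 3))
```

Under these assumptions, an odds ratio of 1.60 corresponds to a rise from 50% to about 62% accuracy for a participant one standard deviation above the mean on the predictor.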
A follow-up analysis conducted only with data from participants who had performed above chance on case sensitivity in the GJT returned results comparable to those represented in Table 5 for Model 2 with the exception that sentence grammaticality ceased to be a significant predictor (the outcome of the model is included in Appendix S9 in the online Supporting Information).
Note (Table 5). Number of observations = 678. R² = .13; marginal R² = .04. Decl = declarative memory; Proc = procedural learning ability.

Research Question 3: Declarative and Procedural Learning Abilities as Predictors of Sentence Comprehension During Practice
Table 6 shows the percentage accuracy scores in the comprehension task overall and by block, and Figure 7 shows the density distribution of the overall accuracy scores (z-skewness = 1.93; z-kurtosis = −.117). Following the practice of previous studies that adopted the same game paradigm (e.g., Morgan-Short et al., 2014; Pili-Moss, 2017), the level of chance performance in the game was defined for each trial as the ratio of all possible correct moves to all possible moves that could be made by the game token that had the function of subject in the sentence (Morgan-Short, 2007, p. 143) and set at 14% accuracy. The comprehension scores indicated that, on average, performance was above chance overall as well as in single blocks. The specification of Model 3 in R provided the best fit, with a condition number of 1.30 and accounted for about 23% of the variance (see Table 7). In Model 3, accuracy (ACC) was modeled by random effects of items on intercepts (1|ITEM), random effects of participants on intercepts (1|PART), declarative learning ability, the inverse of the errors in the vocabulary testing across the three sessions (VocLearn), and a two-way interaction between procedural learning ability and block (PROC * BLOCK).
Model 3: ACC ∼ (1|ITEM) + (1|PART) + DECL + VocLearn + PROC * BLOCK
Note (Table 7). Number of observations = 4,270. R² = .23; marginal R² = .05. Decl = declarative memory; Proc = procedural learning ability; VocLearn = vocabulary learning. * = standardized coefficient.
Figure 8. Procedural Learning Ability × Block interaction at early, middle, and later stages of training (a density plot of the procedural learning ability variable is given above the interaction graph).
The model returned small-sized significant positive effects for vocabulary learning, OR = 1.49, R² = .12, for block, OR = 1.28, R² = .10, and for a Procedural Learning Ability × Block interaction,³ OR = 1.09, R² = .01 (see Figure 8). These effects showed that (a) the ability to learn and retain vocabulary as assessed in the vocabulary tests was significant in determining overall accuracy in sentence comprehension, (b) sentence comprehension accuracy significantly improved with practice, and (c) better procedural learners showed significantly more increase in learning as the participants progressed through the blocks.

Discussion
The present study was the first training study to investigate the relationship between declarative and procedural learning abilities and child learning of a novel miniature language with natural language characteristics. I discuss the findings for each of the three main research questions.

Cognitive Predictors of Child Learning of L2 Word Order
The first research question asked to what extent cognitive abilities that depend on long-term memory would predict child learning of word order as measured by an aural GJT administered after six language practice blocks (three sessions). Although the effect size was small, the analysis found that procedural learning ability (measured by visuomotor sequence learning), but not declarative learning ability, was a significant predictor of child L2 learning of word order. Because all test trials were novel items, the findings showed that, on average, the participants with higher procedural learning ability were statistically significantly better at generalizing to new sentences the word order properties of input to which they had been aurally exposed.
These results extend previous L1 findings that have shown that procedural and implicit statistical learning ability, but not declarative learning ability, predict syntactic development in children (Kidd, 2012; Kidd & Arciuli, 2016). They are also broadly in line with the evidence of positive relationships between visuomotor sequence learning and the L1 grammar abilities of typically developing children that has been found in some specific language impairment studies (e.g., Hedenius et al., 2011). Taken together, these findings could suggest that procedural learning ability may play a key role in child syntax acquisition across L1 and L2. The finding that implicit visual sequence learning can predict aural learning of L2 word order in children also corroborates the existence of cross-modal effects in procedural and statistical learning ability, at least when this is measured via visuomotor sequence learning (Conway, Bauernschmidt, Huang, & Pisoni, 2010; Kidd, 2012).
More generally, finding a significant procedural learning ability effect in children on a measure indexing learning of word order, and finding that no significant effect for declarative learning ability emerged, is broadly compatible with the predictions of the declarative/procedural model (Ullman, 2015) in at least two ways. First, because procedural learning ability is expected to be specifically involved in the acquisition of rule-based grammar, including word order regularities. Second, because declarative learning ability is expected to have a less prominent role in children (e.g., compared to older learners) due to the shape of its developmental trajectory (as also proposed by DeKeyser, 2012).
A further point concerns the timing of the cognitive ability effects. The GJT results showed that the participants with higher procedural learning ability developed significant sensitivity to word order after only six game blocks (120 items), a relatively short practice period. Although the present study did not include an adult group, this pattern contrasts with the findings of previous adult studies conducted in the same or very similar experimental paradigms, where significant effects of procedural learning ability emerged behaviorally only after more extended practice (20+ blocks), and no effects were found in GJTs administered after shorter training periods (e.g., 12 blocks in Morgan-Short et al., 2014). Overall, these preliminary comparisons suggest that children might engage procedural cognitive resources substantially earlier in the language learning process compared to adults. An early engagement of procedural memory resources could be facilitated if declarative learning ability is generally less robust in children, allowing for a possible early transition from declarative to procedural processing even in conditions of low input and/or reduced practice (perhaps due to reduced neural inhibition effects; see, e.g., Packard & Goodman, 2013; Poldrack et al., 2001; Ullman, 2015).
An anonymous reviewer suggested the possibility that the significant correlation found between procedural learning ability and GJT outcomes may be driven by similarities between the ASRT and the GJT tasks. In the present experimental paradigm this seems unlikely at least for two reasons. First, the ASRT task required processing of and fast motor reaction to visual information, but the GJT required processing aural stimuli. Second, even though there were similarities between the tasks in that both were computerized and required participants to respond by pressing keys or buttons, the GJT gave participants up to 7 seconds to respond, whereas the ASRT required reaction to visual stimuli within 500 milliseconds from onset. The GJT's longer response window would have leveled out any potential individual differences in motor control that might have strengthened the parity between the tasks and increased the correlation between their results. Also, if the correlation depended on task similarities in the visuomotor domain, one would have perhaps expected a stronger and more consistent relationship between ASRT performance and the scores during practice whilst playing the online gaming task that, unlike the GJT, required extensive visual processing.

Cognitive Predictors of Child Learning of L2 Case Marking
The second research question asked to what extent cognitive abilities predict the learning of L2 case marking. Against the general predictions of the declarative/procedural model, the analysis found that procedural learning ability was not significantly related to accurate performance on case marking. In this respect the results align with previous findings by Kidd and Kirjavainen (2011). Another interesting finding was that accuracy on case was significantly moderated by sex (male participants were significantly more accurate than female participants in detecting case ungrammaticality). However, the sizes of the two groups were not numerically balanced, which could have affected the result.
It is important to reiterate that the results of the analysis of the case marking scores need to be treated with particular caution due to the fact that the participants' performance on detecting ungrammatical case was at chance. This finding is in fact particularly important in a study that investigates amount of learning and language-related cognitive abilities in a correlational design. Reduced learning of morphology may have a number of explanations: the limited exposure to the language; the lack of phonological or semantic saliency of the postpositional markers; cross-linguistic differences between the participants' L1 (with no case markers on nouns) and BrocantoJ (with postpositional case markers); their functional (or communicative) redundancy in most of the exposure instances; or a combination of these variables (for a discussion of variables contributing to learning difficulty of L2 morphology, see DeKeyser, 2005). Another element that likely contributed to the lack of semantic saliency of the case markers is that they were not presented during vocabulary training and that their meaning (function) had to be induced during exposure.

Cognitive Predictors of Child L2 Sentence Comprehension
The third research question was more exploratory and asked to what extent declarative and procedural learning abilities can predict real-time aural comprehension of full sentences during a game task. On average, accurate comprehension significantly improved across practice and was most strongly predicted by the vocabulary testing scores. This is understandable because robust learning of the vocabulary items would have allowed accurate and fast identification of relevant elements in the game constellation and more efficient semantic processing at sentence level.
Similar to the results for the GJT, declarative learning ability did not significantly predict online comprehension. However, whereas the apparent absence of this relationship was expected for the GJT, it was somewhat unexpected for sentence comprehension. This is because sentence comprehension during game playing requires extensive semantic and visuospatial processing, and it has been suggested that the declarative memory system plays a key role in visuospatial perception and memory (e.g., Barense, Gaffan, & Graham, 2007). Indeed, Pili-Moss et al. (2020), the only other Brocanto2 study that has assessed comprehension accuracy online, found that declarative learning ability (indexed by a measure comparable to that used to index declarative memory in the present study) did consistently significantly predict sentence comprehension across 72 practice blocks, albeit among adult participants.
Overall, procedural learning ability did not predict accurate comprehension in this dataset. However, there was a small-sized but significant positive interaction of procedural learning ability with block, indicating that, as practice progressed, the participants with higher procedural learning ability were more likely to interpret BrocantoJ sentences correctly and thus perform correct moves. An interesting question in this respect is how exactly visuomotor sequence learning supported the development of sentence comprehension more as practice progressed (see Figure 8). Because it was known that visuomotor sequence learning predicted sensitivity to the word order properties of the language by the end of the game practice, one possibility is that the learning of word order mediated the relationship between procedural learning ability and comprehension.
Implicit knowledge of word order has been related to the ability to predict the next word in an aural sentence stimulus based on the distributional properties of the word in the input (Conway et al., 2010). In the game task, a developing sensitivity to word order could have facilitated comprehension and made sentence processing and semantic integration more efficient by supporting online predictions about specific incoming words (e.g., pleca, zeima, ru, etc.) or their semantic properties (i.e., whether the next word would be a referential expression [token], a property of a referent [adjective], a motion event [verb], etc.). However, the data do not suggest a direct relationship between emerging sensitivity to word order and comprehension because a positive correlation between GJT and comprehension scores was weak (see Table 3).⁴ An alternative may be that the observation that procedural learning ability accounted for comprehension reflected a more general proceduralization/optimization of semantic integration routines that was independent of word order.

Summary of Key Findings
In summary, the overall pattern of results in the present study suggests that the procedural memory system can be engaged relatively early during exposure to a new language in children, and this can support learning of a novel word order, and can progressively facilitate the comprehension of full sentences in a meaningful gaming environment. For both learning of word order and sentence comprehension, this contrasts with evidence from previous studies employing the same paradigm that indicated that procedural learning ability predicted adult L2 outcomes only after extended practice. Finally, although knowledge of word order would generally be expected to contribute to online comprehension, this relationship was not supported in the present study where sensitivity to word order was measured via a GJT.

Limitations and Future Directions
The present study has a number of limitations that should be addressed in future research. First, although the learning of word order and language comprehension was robust, at least in the context of this 3-day experiment, the amount of exposure was not sufficient to guarantee the learning of case marking. Insufficient exposure may also have played a role in the magnitude of effect sizes. The effect size of procedural learning ability in the GJT dataset was small (OR = 1.60; 95% CI [1.17, 2.18]), whereas in their meta-analysis, Hamrick et al. (2018) reported medium effect sizes (mean weighted r = .269, p = .043) for relationships between procedural learning ability and ability to process L1 grammar in children. A possible explanation for this difference in findings could be that, unlike the L1 studies included in the meta-analysis, the present study was a short training study on a new language, and it is possible that exposure was too limited for a more robust relationship between procedural learning ability and learning to emerge (compared to the massive exposure that L1 learners have). Overall, in order to better clarify the relationship between L2 learning of morphosyntax and cognitive ability, future training studies with children should plan longer exposure and practice phases.
A second consideration concerns the study's power to detect an effect that was at least of small magnitude. Particularly for the mixed model analysis of the GJT, it is likely that the study was underpowered because the number of items was low. The comprehension analysis, on the other hand, drew on numbers of items and participants that were more similar to previous Brocanto studies that had found at least small-sized effects for the cognitive variables of interest. Future studies should aim to have more participants and adopt designs that allow the administration of a larger number of test items whilst also limiting participant fatigue effects. Further, due to the comparatively higher number of male participants, the final sample was skewed in terms of sex. Although the study did not set out to analyze this variable, sex is known to modulate long-term memory engagement (Ullman, 2015), and more balanced groups should be included in future studies.
In the present study declarative learning, indexed by the ability to recollect verbal information and visuospatial associations, did not predict L2 outcomes. However, effects may exist for other types of declarative memory (e.g., aural or visual recognition, cross-modal association, etc.). Similarly, future research might examine the extent to which additional individual differences such as working memory (see Janacsek & Nemeth, 2013) or attention (see West, 2017) moderate the relationship between declarative/procedural learning ability and L2 outcomes. Child L2 studies could also consider a broader range of implicit learning measures and, for example, explore the extent to which language learning outcomes are predicted by measures of statistical learning ability (e.g., the ability to track transitional probabilities in sequentially presented input). In order to avoid confounding effects of declarative learning on these measures, it is important that studies employ tasks for which robust relationships with nondeclarative and, more specifically, procedural memory brain areas (e.g., basal ganglia) have been established (see Janacsek et al., 2020). Finally, studies could include child and adult groups to examine the effects of age and cognitive ability in L2 learning more directly.

Conclusion
The present study analyzed the role of declarative and procedural learning abilities in the first 3 days of aural exposure to a novel miniature language in 8- to 9-year-old children. Although the effect sizes were small in this dataset, the study found that procedural learning ability predicted the participants' learning of L2 word order and, increasingly during practice, overall sentence comprehension. As such, its results corroborate and extend previous findings in cognitive psychology relating child L1 outcomes to procedural and implicit statistical learning ability (e.g., Kidd, 2012; Kidd & Arciuli, 2016).
Overall, the findings also provide initial experimental evidence compatible with predictions made by recent behavioral and neurocognitive models of L2 acquisition (e.g., DeKeyser, 2012; Ullman, 2015), according to which, in child L2 learning, one should expect a more limited reliance on declarative learning and a greater reliance on procedural learning ultimately due to differences in the maturational trajectories of declarative and procedural cognitive abilities.

Final revised version accepted 20 January 2021

Notes
1 Not all authors agree on conflating implicit and statistical learning, and some highlight a possible additional role of declarative learning in statistical learning (e.g., Batterink, Paller, & Reber, 2019).
2 The main analyses reported here investigated exclusively the interactions of the main predictors of interest with time (block, session). However, an exploratory model (see Appendix S9 in the Supporting Information online) found that the interaction of declarative learning ability and vocabulary learning ability was positive and significant.
3 A Declarative Learning Ability × Block interaction was tested during the model derivation and not found to significantly improve its fit, hence it was not included.
4 Further analyses also confirmed that the word order GJT scores did not predict comprehension, including when the relationship was moderated by block.

Open Research Badges
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data are available at https://osf.io/nh4cx/.