How to get really smart: Modeling retest and training effects in ability testing using computer-generated figural matrix items

Research output: Journal contributions › Journal articles › Research › peer-review


The interpretation of retest scores is problematic because they are potentially affected by measurement and predictive bias, which impact construct validity, and because their size differs as a function of various factors. This paper investigates the construct stability of scores on a figural matrices test and models retest effects at the level of the individual test taker as a function of covariates (simple retest vs. training, use of identical vs. parallel retest forms, and general mental ability). A total of N=189 subjects took two tests of matrix items that were automatically generated according to a strict construction rationale. Between test administrations, participants in the intervention groups received training, while controls did not. The Rasch model fit the data at both time points, but there was a lack of item difficulty parameter invariance across time. Training increased test performance beyond simple retesting, but there was no large difference between the identical and parallel retest forms at the individual level. Individuals varied greatly in how they profited from retest experience, training, and the use of identical vs. parallel retest forms. The results suggest that even with carefully designed tasks, it is problematic to directly compare scores from initial tests and retests. Test administrators should emphasize learning potential instead of state level assessment, and inter-individual differences with regard to test experience should be taken into account when interpreting test results.
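For orientation, the Rasch model mentioned in the abstract is conventionally written as shown in this sketch. The notation is an assumption for illustration and is not taken from the article: \theta_p denotes the ability of person p, \beta_i the difficulty of item i, and the superscript (t) the time point.

```latex
% Standard Rasch (one-parameter logistic) model: probability that
% person p solves item i, given ability \theta_p and difficulty \beta_i.
\[
  P(X_{pi} = 1 \mid \theta_p, \beta_i)
    = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}
\]
% Item difficulty parameter invariance across time (assumed notation):
% the difficulty of an item is the same at both administrations.
\[
  \beta_i^{(1)} = \beta_i^{(2)} \quad \text{for every item } i
\]
```

Read this way, the reported lack of invariance would mean that the second equality does not hold for at least some items, which is one reason why scores from the initial test and the retest are hard to compare directly.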
Original language: English
Issue number: 4
Pages (from-to): 233-243
Number of pages: 11
Publication status: Published - 07.2011