Performance decline in low-stakes educational assessments: different mixture modeling approaches
Standard
in: Large-Scale Assessments in Education, Vol. 5, No. 1, Article 15, 01.12.2017.
RIS
TY - JOUR
T1 - Performance decline in low-stakes educational assessments
T2 - different mixture modeling approaches
AU - List, Marit K.
AU - Robitzsch, Alexander
AU - Lüdtke, Oliver
AU - Köller, Olaf
AU - Nagy, Gabriel
N1 - Publisher Copyright: © 2017, The Author(s).
PY - 2017/12/1
Y1 - 2017/12/1
AB - Background: In low-stakes educational assessments, test takers might show a performance decline (PD) on end-of-test items. PD is a concern in educational assessments, especially when groups of students are to be compared on the proficiency variable, because item responses gathered in the groups could be differently affected by PD. To account for PD, mixture item response theory (IRT) models have been proposed in the literature. Methods: In this article, multigroup extensions of three existing mixture models that assess PD are compared. The models were applied to the mathematics test in a large-scale study targeting school track differences in proficiency. Results: Despite the differences in the specification of PD, all three models showed rather similar item parameter estimates that were, however, different from the estimates given by a standard two-parameter IRT model. In addition, all models indicated that the amount of PD differed between tracks, in that school track differences in proficiency were slightly reduced when PD was accounted for. Nevertheless, the models gave different estimates of the proportion of students showing PD, and differed somewhat from each other in the adjustment of proficiency scores for PD. Conclusions: Multigroup mixture models can be used to study how PD interacts with proficiency and other variables to provide a better understanding of the mechanisms behind PD. Differences between the presented models with regard to their assumptions about the relationship between PD and item responses are discussed.
KW - Aberrant response behavior
KW - Educational assessments
KW - Group comparisons
KW - Mixture IRT models
KW - Performance decline
KW - Empirical education research
KW - Educational science
UR - http://www.scopus.com/inward/record.url?scp=85065146506&partnerID=8YFLogxK
DO - 10.1186/s40536-017-0049-3
M3 - Journal articles
AN - SCOPUS:85065146506
VL - 5
JO - Large-Scale Assessments in Education
JF - Large-Scale Assessments in Education
SN - 2196-0739
IS - 1
M1 - 15
ER -
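
For readers unfamiliar with the model family named in the abstract, the following is a generic sketch, not the article's exact specification. In a standard two-parameter logistic (2PL) IRT model, the probability that test taker i answers item j correctly is

\[
  P(X_{ij} = 1 \mid \theta_i) \;=\; \frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}},
\]

with proficiency \theta_i, item discrimination a_j, and item difficulty b_j. Mixture IRT models for performance decline typically add a latent class structure on top of this, for example a two-class mixture

\[
  P(X_{ij} = 1) \;=\; \pi_1 \, P_{\mathrm{2PL}}(X_{ij} = 1 \mid \theta_i) \;+\; \pi_2 \, P_{\mathrm{PD}}(X_{ij} = 1 \mid \theta_i),
\]

where \pi_1 and \pi_2 = 1 - \pi_1 are the class proportions, the first class responds attentively throughout the test, and the second class's success probabilities P_PD are attenuated on end-of-test items. The three models compared in the article differ precisely in how P_PD is specified; the equations above are only meant to convey the shared structure.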