Joint Item Response Models for Manual and Automatic Scores on Open-Ended Test Items

Research output: Journal contributions, Journal articles, Research, peer-review

Authors

Test items using open-ended response formats can increase an instrument’s construct validity. However, traditionally, their application in educational testing requires human coders to score the responses. Manual scoring not only increases operational costs but also prohibits the use of evidence from open-ended items to inform routing decisions in adaptive designs. Using machine learning and natural language processing, automatic scoring provides classifiers that can instantly assign scores to text responses. Although optimized for agreement with manual scores, automatic scoring is not perfectly accurate and introduces an additional source of error into the response process, leading to a misspecification of the measurement model used with the manual score. We propose two joint models for manual and automatic scores of automatically scored open-ended items. Our models extend a given model from Item Response Theory for the manual scores by a component for the automatic scores, accounting for classification errors. The models were evaluated using data from the Programme for International Student Assessment (2012) and simulated data, demonstrating their capacity to mitigate the impact of classification errors on ability estimation compared to a baseline that disregards classification errors.
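The idea of extending a measurement model for the manual score with a component for the automatic score can be illustrated with a minimal sketch. This is not the authors' specification: it assumes a Rasch (1PL) model as the "given" IRT model for a dichotomous manual score, and it assumes the classifier's errors are described by a fixed sensitivity and specificity that do not depend on ability. All function and parameter names (`rasch_prob`, `auto_prob`, `simulate_pair`, `sensitivity`, `specificity`) are hypothetical and chosen for illustration only.

```python
import math
import random


def rasch_prob(theta, difficulty):
    """P(manual score = 1 | theta) under a Rasch (1PL) model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def auto_prob(theta, difficulty, sensitivity, specificity):
    """P(automatic score = 1 | theta), marginalizing over the manual score.

    sensitivity = P(automatic = 1 | manual = 1)
    specificity = P(automatic = 0 | manual = 0)
    Both are illustrative assumptions, treated as constant across persons.
    """
    p_manual = rasch_prob(theta, difficulty)
    return sensitivity * p_manual + (1.0 - specificity) * (1.0 - p_manual)


def simulate_pair(theta, difficulty, sensitivity, specificity, rng):
    """Draw one (manual, automatic) score pair for a single person and item."""
    manual = 1 if rng.random() < rasch_prob(theta, difficulty) else 0
    p_auto = sensitivity if manual == 1 else 1.0 - specificity
    automatic = 1 if rng.random() < p_auto else 0
    return manual, automatic


if __name__ == "__main__":
    rng = random.Random(2025)
    for theta in (-2.0, 0.0, 2.0):
        p_m = rasch_prob(theta, 0.0)
        p_a = auto_prob(theta, 0.0, sensitivity=0.9, specificity=0.8)
        print(f"theta={theta:+.1f}  P(manual=1)={p_m:.3f}  P(auto=1)={p_a:.3f}")
    print(simulate_pair(0.0, 0.0, 0.9, 0.8, rng))
```

Under these assumptions, the response curve of the automatic score is a compressed version of the manual-score curve: it is bounded below by `1 - specificity` and above by `sensitivity`. Fitting the manual-score model directly to automatic scores would therefore be misspecified, which is the problem the abstract describes.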

Original language: English
Journal: Psychometrika
ISSN: 0033-3123
DOIs
Publication status: Accepted/In press - 2025

Bibliographical note

Publisher Copyright:
© 2025 Cambridge University Press. All rights reserved.

Research areas

  • automatic scoring, item response modeling, large-scale assessment
  • Informatics
