Automated scoring in the era of artificial intelligence: An empirical study with Turkish essays

Research output: Journal contributions › Journal articles › Research › peer-review

Authors

Automated scoring (AS) has gained significant attention as a tool to enhance the efficiency and reliability of assessment processes. Yet its application in under-represented languages, such as Turkish, remains limited. This study addresses this gap by empirically evaluating AS for Turkish using a zero-shot approach with a rubric, powered by OpenAI's GPT-4o. A dataset of 590 essays written by learners of Turkish as a second language was scored by professional human raters and by an artificial intelligence (AI) model integrated via a custom-built interface. The scoring rubric, grounded in the Common European Framework of Reference for Languages, assessed six dimensions of writing quality. Results revealed a strong alignment between human and AI scores, with a Quadratic Weighted Kappa of 0.72, a Pearson correlation of 0.73, and an overlap measure of 83.5%. Analysis of rater effects showed minimal influence on score discrepancies, though factors such as experience and gender exhibited modest effects. These findings demonstrate the potential of AI-driven scoring in Turkish and offer valuable insights, such as possible sources of disagreement between human and AI scores, for broader implementation in under-represented languages. Because the conclusions rest on a specific writing task with a single human rater, future research should explore diverse inputs and multiple raters.
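The two agreement statistics named in the abstract can be illustrated with a minimal sketch. It assumes integer scores on a shared scale; the function names and the toy score lists are hypothetical and are not taken from the study's data or code. The "overlap measure" is left out because the abstract does not define it.

```python
from math import sqrt


def quadratic_weighted_kappa(human, ai, min_score, max_score):
    """Quadratic Weighted Kappa between two lists of integer scores."""
    n = max_score - min_score + 1
    total = len(human)
    # Observed confusion matrix: rows = human score, columns = AI score.
    observed = [[0] * n for _ in range(n)]
    for h, a in zip(human, ai):
        observed[h - min_score][a - min_score] += 1
    hist_h = [sum(row) for row in observed]
    hist_a = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    num = 0.0
    den = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement weight
            expected = hist_h[i] * hist_a[j] / total  # agreement expected by chance
            num += weight * observed[i][j]
            den += weight * expected
    if den == 0:
        raise ValueError("kappa is undefined when both raters use a single score")
    return 1.0 - num / den


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Illustrative scores only (hypothetical, not from the 590-essay dataset).
human_scores = [1, 2, 3, 4, 5, 3]
ai_scores = [1, 2, 3, 4, 4, 3]
qwk = quadratic_weighted_kappa(human_scores, ai_scores, 1, 5)
r = pearson(human_scores, ai_scores)
print(f"QWK = {qwk:.3f}, Pearson r = {r:.3f}")
```

Both statistics equal 1 for identical score lists and fall toward 0 (or below) as the AI and human scores diverge; the quadratic weights penalise large gaps more heavily than near misses.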

Original language: English
Article number: 103784
Journal: System
Volume: 133
Number of pages: 12
ISSN: 0346-251X
DOIs
Publication status: Published - 10.2025

Bibliographical note

Publisher Copyright:
© 2025 The Authors

    Research areas

  • Automated scoring, Large language models, Multilevel models, Rater reliability, Turkish essays, Zero-shot with rubric
  • Educational science
