Automated scoring in the era of artificial intelligence - Leuphana Universität Lüneburg

Automated scoring in the era of artificial intelligence: An empirical study with Turkish essays

Research output: Journal contributions › Scientific review articles › Research

Authors

Burak Aydın
Tarık Kışla
Nursel Tan Elmas
Okan Bulut

Automated scoring (AS) has gained significant attention as a tool to enhance the efficiency and reliability of assessment processes. Yet its application in under-represented languages, such as Turkish, remains limited. This study addresses this gap by empirically evaluating AS for Turkish using a zero-shot approach with a rubric, powered by OpenAI's GPT-4o. A dataset of 590 essays written by learners of Turkish as a second language was scored by professional human raters and by an artificial intelligence (AI) model integrated via a custom-built interface. The scoring rubric, grounded in the Common European Framework of Reference for Languages, assessed six dimensions of writing quality. Results revealed a strong alignment between human and AI scores, with a Quadratic Weighted Kappa of 0.72, a Pearson correlation of 0.73, and an overlap measure of 83.5%. Analysis of rater effects showed minimal influence on score discrepancies, though factors such as experience and gender exhibited modest effects. These findings demonstrate the potential of AI-driven scoring in Turkish and offer insights for broader implementation in under-represented languages, including possible sources of disagreement between human and AI scores. Because the conclusions are drawn from a specific writing task with a single human rater, future research should explore diverse inputs and multiple raters.
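
Two of the agreement statistics named in the abstract, Quadratic Weighted Kappa and Pearson correlation, can be illustrated with a minimal sketch. The function names, the 1-to-5 score range, and the toy score lists below are assumptions made for illustration only; they are not the study's data, rubric scale, or code. The overlap measure is left out because the abstract does not define it.

```python
from collections import Counter
from math import sqrt


def quadratic_weighted_kappa(a, b, min_score, max_score):
    """Quadratic Weighted Kappa between two equal-length lists of integer scores."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    k = max_score - min_score + 1  # number of score categories
    if k < 2:
        raise ValueError("at least two score categories are required")
    n = len(a)
    observed = Counter(zip(a, b))  # joint counts of (score_a, score_b)
    hist_a, hist_b = Counter(a), Counter(b)  # marginal counts per rater
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            w = ((i - j) ** 2) / ((k - 1) ** 2)  # quadratic penalty
            num += w * observed[(i, j)] / n
            den += w * hist_a[i] * hist_b[j] / (n * n)
    if den == 0:
        raise ValueError("kappa is undefined when both raters give one constant score")
    return 1.0 - num / den


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_a * var_b)


# Invented toy scores on an assumed 1-5 scale (not from the study).
human_scores = [1, 2, 3, 4, 5, 3]
ai_scores = [1, 2, 3, 5, 4, 3]
qwk = quadratic_weighted_kappa(human_scores, ai_scores, 1, 5)
r = pearson_r(human_scores, ai_scores)
```

Both statistics equal 1 for identical score lists. The quadratic weights penalize a disagreement of several scale points more heavily than an adjacent one, which is why this kappa variant is commonly used for ordinal essay scores.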

Original language	English
Article number	103784
Journal	System
Volume	133
Number of pages	12
ISSN	0346-251X
DOIs	https://doi.org/10.1016/j.system.2025.103784
Publication status	Published - 10.2025

Bibliographical note

Publisher Copyright:
© 2025 The Authors

Research areas

Automated scoring, Large language models, Multilevel models, Rater reliability, Turkish essays, Zero-shot with rubric
Educational science

ASJC Scopus Subject Areas

Language and Linguistics
Education
Linguistics and Language



