CEFR-based Short Answer Grading

title CEFR-based Short Answer Grading
subtitle A corpus of short answers written by learners of English and graded with CEFR levels
creator(s) Anaïs Tack, Thomas François, Sophie Roekhaut, Cédrick Fairon
research center(s) Centre de traitement automatique du langage
description The project through which the corpus was collected addresses the task of automatically assessing the written proficiency level of non-native (L2) learners of English. Drawing on previous research on automated L2 writing assessment following the Common European Framework of Reference for Languages (CEFR), we investigate the possibilities and difficulties of deriving the CEFR level from short answers to open-ended questions, a task that has so far received little attention. Our study has two aims: to examine the difficulties involved in both human and automated CEFR-based grading of short answers. First, we compiled a learner corpus of short answers graded with CEFR levels by three certified Cambridge examiners. Next, we used the corpus to develop a soft-voting system for the automated CEFR-based grading of short answers.
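The soft-voting approach mentioned above can be illustrated with a minimal sketch: several classifiers each predict a probability distribution over CEFR levels, and the system averages these distributions and outputs the level with the highest mean probability. The features, labels, and component classifiers below are illustrative placeholders, not the authors' actual system.

```python
# Minimal soft-voting sketch for CEFR-level classification.
# All data and model choices here are hypothetical examples.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Toy feature vectors (standing in for, e.g., lexical and syntactic
# features of 60 short answers), labelled with three CEFR levels.
X = rng.normal(size=(60, 4))
y = np.repeat(["A2", "B1", "B2"], 20)

# Soft voting: average each classifier's predicted class
# probabilities, then pick the level with the highest mean.
clf = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    voting="soft",
)
clf.fit(X, y)

probs = clf.predict_proba(X[:1])  # averaged probabilities per level
pred = clf.predict(X[:1])         # level with the highest mean probability
```

Soft voting is preferred over hard (majority) voting here because averaging probabilities lets a confident classifier outweigh an uncertain one, which matters when the component models disagree on adjacent CEFR levels.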
type(s) written, learners
language(s) English
format(s) Extensible Markup Language (.xml)
corpus size 712 texts
date 2017
keywords Common European Framework of Reference, CEFR, automated grading, expert grading, short answers, open questions, English as a foreign language, writing proficiency
smallest annotation unit text
distribution format(s) ZIP archive
description languages English
contact Anaïs Tack <anais.tack@uclouvain.be>; Cédrick Fairon <cedrick.fairon@uclouvain.be>
corpus reference

Tack, A., François, T., Roekhaut, S., & Fairon, C. (2017). Human and automated CEFR-based grading of short answers. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (pp. 169-179).