title | CEFR-based Short Answer Grading |
subtitle | A corpus of short answers written by learners of English and graded with CEFR levels |
creator(s) | Anaïs Tack, Thomas François, Sophie Roekhaut, Cédrick Fairon |
research center(s) | Centre de traitement automatique du langage |
description | The project through which the corpus was collected addresses the task of automatically assessing the written proficiency level of non-native (L2) learners of English. Drawing on previous research on automated L2 writing assessment based on the Common European Framework of Reference for Languages (CEFR), we investigate the possibilities and difficulties of deriving the CEFR level from short answers to open-ended questions, a task that has received little attention to date. The objective of our study is twofold: to examine the difficulties involved in both human and automated CEFR-based grading of short answers. First, we compiled a learner corpus of short answers graded with CEFR levels by three certified Cambridge examiners. Next, we used the corpus to develop a soft-voting system for the automated CEFR-based grading of short answers (a brief code sketch of such a system follows this record). |
type(s) | written, learners |
language(s) | English |
format(s) | Extensible Markup Language (.xml) |
corpus size | 712 texts |
date | 2017 |
keywords | Common European Framework of Reference, CEFR, automated grading, expert grading, short answers, open questions, English as a foreign language, writing proficiency |
smallest annotation unit | text |
distribution format(s) | ZIP archive |
description languages | English |
contact | Anaïs Tack <anais.tack@uclouvain.be>; Cédrick Fairon <cedrick.fairon@uclouvain.be> |
corpus reference | Tack, A., François, T., Roekhaut, S., & Fairon, C. (2017). Human and automated CEFR-based grading of short answers. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (pp. 169–179). Association for Computational Linguistics. |
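
The description above mentions a soft-voting ensemble for CEFR-based grading. The following is a minimal sketch of what such a setup could look like in scikit-learn; the TF-IDF n-gram features and the two component classifiers (logistic regression, random forest) are illustrative assumptions for this example, not the features or models actually used in the system of Tack et al. (2017).

```python
# A minimal, illustrative soft-voting sketch in scikit-learn. The TF-IDF
# features and the two component classifiers below are assumptions chosen
# for the example, NOT the actual configuration of Tack et al. (2017).
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: short answers paired with CEFR levels (hypothetical).
answers = [
    "I likes go cinema with my friend.",
    "Yesterday I went to the market and bought some vegetables.",
    "Having weighed both options carefully, I would argue that remote work "
    "offers greater flexibility at the cost of informal collaboration.",
]
levels = ["A2", "B1", "C1"]

# Soft voting: each classifier outputs class probabilities, the ensemble
# averages them, and the level with the highest mean probability wins.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    VotingClassifier(
        estimators=[
            ("logreg", LogisticRegression(max_iter=1000)),
            ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
        ],
        voting="soft",
    ),
)
model.fit(answers, levels)
print(model.predict(["She go to school every day."]))  # e.g. ['A2']
```

Unlike hard (majority) voting over predicted labels, soft voting averages the per-class probability distributions of the component classifiers, which lets a confident classifier outweigh uncertain ones when assigning a CEFR level.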