Automated Assessment of Non-Native Learner Essays: Investigating the Role of Linguistic Features



Sowmya Vajjala
Iowa State University, Ames, IA 50011, USA

© International Artificial Intelligence in Education Society 2017

Abstract

Automatic essay scoring (AES) refers to the process of scoring free text responses to given prompts, considering human grader scores as the gold standard. Writing such essays is an essential component of many language and aptitude exams. Hence, AES has become an active and established area of research, and many proprietary systems are used in real-life applications today. However, not much is known about which specific linguistic features are useful for prediction and how consistent this is across datasets. This article addresses that by exploring the role of various linguistic features in automatic essay scoring using two publicly available datasets of non-native English essays written in test-taking scenarios. The linguistic properties are modeled by encoding lexical, syntactic, discourse, and error types of learner language in the feature set. Predictive models are then developed using these features on both datasets, and the most predictive features are compared. While the results show that the feature set yields good predictive models with both datasets, the question "what are the most predictive features?" has a different answer for each dataset.

Keywords Automated writing assessment · Essay scoring · Natural language processing · Text analysis · Linguistic features · Student modeling

Introduction

People learn a foreign language for several reasons, such as living in a new country or studying in a foreign language. In many cases, they also take exams that certify their language proficiency based on some standardized scale.

Automated Essay Scoring (AES) refers to the process of automatically predicting the grade for a free-form essay written by a learner in response to a prompt. This is commonly viewed as one way to assess the writing proficiency of learners, typically non-native speakers of a language. Producing such an essay is also a component of many high-stakes exams such as the GRE, TOEFL, and GMAT. Several AES systems are already being used in real-world applications alongside human graders. Beyond such high-stakes testing scenarios, AES could also be useful in placement testing at language teaching institutes, to suggest a language class at the appropriate level for a learner. Owing to this relevance to different language assessment scenarios, AES has been widely studied by educational technology researchers. While most of the published research on AES has been on proprietary systems, the recent availability of publicly accessible learner corpora has facilitated comparable and replicable research on second language (L2) proficiency assessment (Yannakoudakis et al. 2011; Nedungadi and Raj 2014). One non-test-taking application of automatic analysis of writing is in providing real-time feedback.
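To make the task concrete, the Python sketch below shows one minimal way an essay scoring model of this kind could be assembled: each essay is reduced to a handful of shallow surface features, and a linear regression is fit against human grader scores, which serve as the gold standard. The feature set, the toy essays and scores, and the use of scikit-learn's LinearRegression are illustrative assumptions made for this sketch only; they are not the features or models evaluated in this article.

# A minimal, hypothetical sketch (not the system described in this article):
# each essay is reduced to a few shallow surface features, and a linear
# regression is fit against human grader scores (the gold standard).
import numpy as np
from sklearn.linear_model import LinearRegression

FEATURE_NAMES = ["token_count", "avg_sentence_length", "type_token_ratio"]

def shallow_features(essay):
    """Compute a small vector of surface features for one essay."""
    tokens = essay.split()
    sentences = [s for s in essay.split(".") if s.strip()]
    n_tokens = max(len(tokens), 1)
    return [
        len(tokens),                                  # essay length in tokens
        n_tokens / max(len(sentences), 1),            # average sentence length
        len({t.lower() for t in tokens}) / n_tokens,  # lexical diversity
    ]

# Toy essays paired with invented human grader scores, for illustration only.
essays = [
    "Learning a new language takes time. Regular practice and feedback help.",
    "I like study. Study is good. Good good study.",
]
scores = [4.0, 2.0]

X = np.array([shallow_features(e) for e in essays])
y = np.array(scores)

model = LinearRegression().fit(X, y)
print(model.predict(X))                       # predicted scores for the training essays
print(dict(zip(FEATURE_NAMES, model.coef_)))  # crude look at which features carry weight

The coefficient printout hints at how one might ask which features are most predictive; that question, posed over a much richer feature set spanning lexical, syntactic, discourse, and error properties, is what the rest of this article investigates.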