Automated Essay Scoring and the Deep Learning Black Box: How Are Rubric Scores Determined?
Vivekanandan S. Kumar¹ & David Boulanger¹
© International Artificial Intelligence in Education Society 2020
Abstract This article investigates the feasibility of using automated scoring methods to evaluate the quality of student-written essays. In 2012, Kaggle hosted an Automated Student Assessment Prize contest to find effective solutions to automated testing and grading. This article: a) analyzes the datasets from the contest, which contained hand-graded essays, to measure their suitability for developing competent automated grading tools;

Dedication Prof Jim Greer, along with Prof Gord McCalla, supervised my (Vive Kumar’s) doctoral research at the ARIES lab, University of Saskatchewan, in the late 90s. In those days, the pursuit of autonomous AIED was at its most frenzied. Jim was one of the first to realize the need for the continued existence of an umbilical cord even after the birth of a machine intelligence from its human creators. ARIES later formalized it as ‘human-in-the-loop’, where humans co-create knowledge by cooperating, at various degrees of aggregation and abstraction, with an autonomous learning machine. Jim’s vision was a companionship, where every piece of data, knowledge, advice, decision, and policy that was in play would require an equal say from both the machine and its human creator. The human might convince the machine, or the machine might explain away its reasoning for something to exist in that world of companions. Jim and I had several thoroughly enjoyable conversations about the centrality of humans in a machine-supplemented world and vice-versa. We even had one during a friendly faculty-student baseball game, as he differentiated between a baseball catcher and a cricket wicketkeeper, on a beautiful spring day, at the best university campus in North America. Jim argued for the continued existence of the cord, as a precursor to building a notion of trust between the two entities. That was Jim, seeding his ideas in our minds, no matter the place or the situation.
About a decade later, Jim was on the advisory board of the Faculty of Science and Technology at Athabasca University, where I had joined as a faculty member in 2008. Normally, he would attend the board meetings via teleconference, but for one such meeting he was there in Edmonton, Alberta, in person. For some reason, he took me aside during the lunch break for a chat. He said he was looking deeply into analytics and urged me to pursue the low-hanging fruit of learning analytics! He wondered about the feasibility of doing analytics with small data while not ignoring the compelling need for the AIED community to push the data boundary toward big data. We joked about the luxury of our research colleagues in Physics, Astronomy and Biology working with truly big exabyte datasets in subatomic data, astronomy data and genomic datasets, respectively. We talked about ways in which AIED researchers could find a way to collect live educational da