Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!
Giora Alexandron1 · Lisa Y. Yoo2 · José A. Ruipérez-Valiente2 · Sunbok Lee3 · David E. Pritchard2
© International Artificial Intelligence in Education Society 2019
Abstract

The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the issue of the reliability of results that are based on MOOC data, which is typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that learning analytics in MOOCs can be significantly biased by users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts, because of their numbers and aberrant behavior. We identify these users, denoted fake learners, using dedicated algorithms. The methodology for measuring the bias caused by fake learners' activity combines the ideas of Replication Research and Sensitivity Analysis: we replicate two highly-cited learning analytics studies with and without the fake learners' data and compare the results. While the results of one study were relatively stable against fake learners, in the other, removing the fake learners' data significantly changed the results. These findings raise concerns regarding the reliability of learning analytics in MOOCs, and highlight the need to develop more robust, generalizable, and verifiable research methods.

Keywords Learning Analytics · MOOCs · Replication research · Sensitivity analysis · Fake learners
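To make the with/without comparison concrete, the following is a minimal Python (pandas) sketch of the sensitivity-analysis idea; the event-log layout, the column names user_id and score, and the helper sensitivity_check are illustrative assumptions, not the paper's actual pipeline.

import pandas as pd

def sensitivity_check(events: pd.DataFrame, fake_ids: set, analysis):
    """Rerun an analysis with and without fake learners and report both.

    `analysis` is any function mapping an event log to a statistic,
    e.g. a correlation or a model coefficient from a replicated study.
    `fake_ids` is assumed to come from the dedicated detection algorithms.
    """
    with_fakes = analysis(events)
    without_fakes = analysis(events[~events["user_id"].isin(fake_ids)])
    return {"with_fake_learners": with_fakes,
            "without_fake_learners": without_fakes}

# Illustrative usage: how much does the mean score shift when
# fake learners are removed from the log?
# sensitivity_check(log, fake_ids, lambda df: df["score"].mean())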
Giora Alexandron
[email protected]
Extended author information available on the last page of the article.
Preface: The Beginning of this Research

During 2015 we were working on recommendation algorithms in MOOCs. However, we ran into a strange phenomenon: the most successful learners seemed to have very little interest in the course materials (explanation pages, videos) and concentrated mainly on solving assessment items. As a result, the recommendation algorithms sometimes recommended skipping resources that we thought should be very useful. The hypothesis that these were learners who already knew the material (e.g., Physics teachers) did not match the demographic data that we had on these users.

One day, we received a strange email from one of the users. The user complained about a certain question, claiming that a response that had been accepted as correct a week earlier was now rejected by the system as incorrect. Since it was a parameterized question (randomized per user), we suspected that the user had viewed it from two different accounts. Connecting this with the strange pattern of users who achieved high performance without using the resources, we realized that we were dealing with users who exploit multiple accounts to harvest correct answers.
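As an illustration of this clue, the following Python (pandas) sketch flags submissions whose value matches the randomized answer key of a different account on the same item. The table layout, the column names, and the helper flag_cross_account_answers are hypothetical; this is not the detection algorithm developed in the paper.

import pandas as pd

def flag_cross_account_answers(subs: pd.DataFrame,
                               assigned: pd.DataFrame) -> pd.DataFrame:
    """Flag submissions whose value equals the answer assigned to a
    different account for the same parameterized item.

    subs:     one row per submission      (user_id, item_id, answer)
    assigned: per-account randomized key  (user_id, item_id, answer)
    """
    # For each submitted value, find which account's key it matches.
    matches = subs.merge(
        assigned.rename(columns={"user_id": "key_owner"}),
        on=["item_id", "answer"],
        how="inner",
    )
    # Keep only submissions that match another account's key, the
    # pattern that the complaining user's email hinted at. (Chance
    # collisions between keys would also show up here, so a real
    # detector would need further filtering.)
    return matches[matches["user_id"] != matches["key_owner"]]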