Predicting Student Performance from Combined Data Sources


Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova and Petr Knoth

Abstract This chapter will explore the use of predictive modeling methods for identifying students who will benefit most from tutor interventions. This is a growing area of research and is especially useful in distance learning where tutors and students do not meet face to face. The methods discussed will include decision-tree classification, support vector machine (SVM), general unary hypotheses automaton (GUHA), Bayesian networks, and linear and logistic regression. These methods have been trialed through building and testing predictive models using data from several Open University (OU) modules. The Open University offers a good test-bed for this work, as it is one of the largest distance learning institutions in Europe. The chapter will discuss how the predictive capacity of the different sources of data changes as the course progresses. It will also highlight the importance of understanding how a student’s pattern of behavior changes during the course.

Keywords Predictive modeling, Student outcome, Education, Virtual learning environment

Abbreviations
ANOVA  Analysis of variance
CMS  Course management system
CS  Course signals
GUHA  General unary hypotheses automaton
MOOC  Massive open online course
OU  Open University
SVM  Support vector machine
TMA  Tutor marked assessment
VLE  Virtual learning environment

A. Wolff (corresponding author), Z. Zdrahal, D. Herrmannova, P. Knoth
Knowledge Media Institute, The Open University, Milton Keynes, MK7 6AA, UK

A. Peña-Ayala (ed.), Educational Data Mining, Studies in Computational Intelligence 524, DOI: 10.1007/978-3-319-02738-8_7, Springer International Publishing Switzerland 2014

7.1 Introduction

Predicting student performance in time to intervene, so that performance improves and dropout or failure is reduced, benefits both students and teaching institutions. In traditional classroom learning, tutors use a range of information sources to judge whom to help, including their personal interactions with the students. In distance education, where students interact with learning materials on a virtual learning environment (VLE), machine-learning methods can be applied to combined sources of student data to predict which students will benefit most from an intervention, allowing tutors to better judge whom to assist. Whilst VLEs have been used to deliver course materials for quite some time, their use for large-scale delivery is a recent phenomenon. Previous course statistics have focused largely on providing data for a whole course, after completion, using only demographic data and historical analysis. For example, Kabra and Bichkar [1] use decision trees to predict failing engineering students, using past performance as th
