Likely to stop? Predicting Stopout in Massive Open Online Courses

Understanding why students stop out will help in understanding how students learn in MOOCs. In this report, part of a three-unit compendium, we describe how we build accurate predictive models of MOOC student stopout. We document a scalable stopout prediction methodology, end to end, from raw source data to model analysis. We attempted to predict stopout for the Fall 2012 offering of 6.002x. This involved the meticulous, crowd-sourced engineering of over 25 predictive features extracted for thousands of students, the creation of temporal and non-temporal data representations for use in predictive modeling, the derivation of over 10,000 models with a variety of state-of-the-art machine learning techniques, and the analysis of feature importance by examining over 70,000 models. We found that stopout prediction is a tractable problem. Our models achieved an AUC (area under the receiver operating characteristic curve) as high as 0.95 (and generally 0.88) when predicting one week in advance. Even on more difficult prediction problems, such as predicting stopout at the end of the course with only one week's data, the models attained AUCs of 0.7.
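To make the AUC figures above concrete, the following is a minimal sketch of how such a score can be computed, assuming binary labels (1 = the student stopped out) and a model score per student. The function name `roc_auc` and the toy data are hypothetical illustrations and are not taken from the report; the sketch uses the standard interpretation of AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = student stopped out, score = predicted probability.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the 0.7 reported for the harder prediction problems still indicates useful signal. The pairwise loop is quadratic in the number of students; a rank-based computation would be preferable at scale.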
