Paper Information - Towards Feature Engineering at Scale for Data from Massive Open Online Courses

Towards Feature Engineering at Scale for Data from Massive Open Online Courses

We examine the process of engineering features for developing models that improve our understanding of learners' online behavior in MOOCs. Because feature engineering relies so heavily on human insight, we argue that extra effort should be made to engage the crowd for feature proposals and even their operationalization. We show two approaches where we have started to engage the crowd. We also show how features can be evaluated for their relevance to predictive accuracy. When we examined crowd-sourced features in the context of predicting stopout, they were not only nuanced but also considered more than one mode of interaction between the learner and the platform, as well as the learner's relative performance. We were able to identify different influential features for stopout prediction depending on which of four cohorts a learner belonged to, where cohorts are defined by level of engagement with the course discussion forum or wiki. This report is part of a compendium which considers different aspects of MOOC data science and stopout prediction.

[1] Lise Getoor, et al. Learning Latent Engagement Patterns of Students in Online Courses, 2014, AAAI.

[2] Chris Piech, et al. Deconstructing disengagement: analyzing learner subpopulations in massive open online courses, 2013, LAK '13.

[3] Franck Dernoncourt, et al. MOOCdb: Developing Standards and Systems to Support MOOC Data Science, 2014, ArXiv.

[4] Carolyn Penstein Rosé, et al. “Turn on, Tune in, Drop out”: Anticipating student dropouts in Massive Open Online Courses, 2013.

[5] Girish Balakrishnan, et al. Predicting Student Retention in Massive Open Online Courses using Hidden Markov Models, 2013.

[6] Sherif A. Halawa, et al. Dropout Prediction in MOOCs using Learner Activity Features, 2014.

[7] Lise Getoor, et al. Modeling Learner Engagement in MOOCs using Probabilistic Soft Logic, 2013.