Because MOOCs bring big data to the forefront, they confront learning science with technology challenges. We describe an agenda for developing technology that enables MOOC analytics. Such an agenda needs to address efficiently the detailed, low-level, high-volume nature of MOOC data. It also needs to help exploit the data's capacity to reveal, in detail, how students behave and how learning takes place. The agenda we chart starts with data standardization. It identifies crowdsourcing as a means to speed up the analysis of forum data and the predictive analytics of student behavior. It also points to open-source platforms that allow software to be shared and visualization analytics to be discussed.

Massive Open Online Courses (MOOCs) are college courses offered on the Internet. Lectures are conveyed by videos, textbooks are digitized, and problem sets, quizzes, and practice questions are web-based. Students communicate with one another and with faculty via discussion forums. Grading, albeit constrained by somewhat restrictive assessment design, is automated.

The popularity of MOOCs has made a high volume of learner data available for analytic purposes. Some MOOC data is just like data that comes from the classroom: teaching material, student demographics and background data, enrollment information, assessment scores, and grades. But MOOCs and classrooms differ in important ways in how behavioral data is collected and what is observable. The platform unobtrusively records every mouse click, every use of the video player controls, and every submission to the platform, such as the selection of a problem solution choice, the composition of a solution, or text entered into a forum discussion. The level of behavioral detail recorded in a MOOC vastly surpasses that recorded in conventional settings. Very directly, this data can provide a count of problem attempts and video replays.
It can reveal how long a student stayed on a textbook page or the presence of very short, quick patterns of resource consultation. It can inform an individualized or aggregated portrait of how a student solves problems or accesses resources. It presents opportunities to identify and compare different cohorts of students in significant quantities, thus enabling us to personalize how content is delivered. It allows us to study learner activities not exclusive to problem-solving, such as forum interactions and video-watching habits (Thille et al., 2014). It also facilitates predictive analytics based on modeling and machine learning. This data also contains large samples. Large sample sizes enable us to …
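Two of the derived measures mentioned above, per-student problem attempt counts and textbook page dwell times, can be computed directly from a clickstream log. The following is a minimal sketch, assuming a simplified, hypothetical event schema of `(student_id, timestamp_seconds, event_type, resource_id)` tuples; real MOOC platforms emit far richer records (MOOCdb, for instance, standardizes such logs into relational form).

```python
from collections import defaultdict

# Hypothetical simplified clickstream: each event is
# (student_id, timestamp_seconds, event_type, resource_id).
events = [
    ("s1", 0,   "page_view",       "ch1"),
    ("s1", 40,  "problem_attempt", "p1"),
    ("s1", 90,  "problem_attempt", "p1"),
    ("s1", 120, "page_view",       "ch2"),
    ("s2", 10,  "video_play",      "v1"),
    ("s2", 55,  "problem_attempt", "p1"),
]

def attempt_counts(events):
    """Count problem attempts per (student, problem) pair."""
    counts = defaultdict(int)
    for student, _, etype, resource in events:
        if etype == "problem_attempt":
            counts[(student, resource)] += 1
    return dict(counts)

def page_dwell_times(events):
    """Approximate dwell time on a textbook page as the gap until the
    student's next recorded event; a student's final event gets none."""
    by_student = defaultdict(list)
    for ev in sorted(events, key=lambda e: (e[0], e[1])):
        by_student[ev[0]].append(ev)
    dwell = {}
    for student, evs in by_student.items():
        for cur, nxt in zip(evs, evs[1:]):
            if cur[2] == "page_view":
                dwell[(student, cur[3])] = nxt[1] - cur[1]
    return dwell
```

The dwell-time heuristic (time until the next event) is deliberately naive: it cannot distinguish reading from idling, which is one reason the fine-grained, low-level nature of this data demands careful analytic technology.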
[1] Candace Thille et al., "The Future of Data-Enriched Assessment," 2014.
[2] Kalyan Veeramachaneni et al., "Towards Feature Engineering at Scale for Data from Massive Open Online Courses," arXiv, 2014.
[3] Justin Reich et al., "Evaluating Geographic Data in MOOCs," 2013.
[4] Franck Dernoncourt et al., "MOOCdb: Developing Standards and Systems to Support MOOC Data Science," arXiv, 2014.
[5] Carolyn Penstein Rosé et al., "'Turn on, Tune in, Drop out': Anticipating Student Dropouts in Massive Open Online Courses," 2013.
[6] Antonio Torralba et al., "LabelMe: A Database and Web-Based Tool for Image Annotation," International Journal of Computer Vision, 2008.
[7] Kalyan Veeramachaneni et al., "Likely to Stop? Predicting Stopout in Massive Open Online Courses," arXiv, 2014.
[8] David E. Pritchard et al., "Studying Learning in the Worldwide Classroom: Research into edX's First MOOC," 2013.
[9] Carolyn Penstein Rosé et al., "Social Factors That Contribute to Attrition in MOOCs," L@S, 2014.
[10] Nabeel Gillani et al., "Learner Communications in Massively Open Online Courses," 2013.
[11] Rebecca Eynon et al., "Communication Patterns in Massively Open Online Courses," Internet and Higher Education, 2014.
[12] Justin Reich et al., "HarvardX and MITx: The First Year of Open Online Courses, Fall 2012-Summer 2013," 2014.