Predicting Students' Retention of Facts from Feedback during Study

Robert Lindsey (robert.lindsey@colorado.edu)
Department of Computer Science, 430 UCB, University of Colorado, Boulder, CO 80309 USA

Owen Lewis (owen.lewis@colorado.edu)
Department of Applied Mathematics, 526 UCB, University of Colorado, Boulder, CO 80309 USA

Harold Pashler (hpashler@ucsd.edu)
Department of Psychology, 0109, University of California, San Diego, La Jolla, CA 92093 USA

Michael Mozer (mozer@colorado.edu)
Department of Computer Science, 430 UCB, University of Colorado, Boulder, CO 80309 USA

Abstract

Testing students as they study a set of facts is known to enhance their learning (Roediger & Karpicke, 2006). Testing also provides tutoring software with potentially valuable information regarding the extent to which a student has mastered study material. This information, consisting of recall accuracies and response latencies, can in principle be used by tutoring software to provide students with individualized instruction by allocating a student’s time to the facts whose further study it predicts would provide the greatest benefit. In this paper, we propose and evaluate several algorithms that tackle the benefit-prediction aspect of this goal. Each algorithm is tasked with calculating the likelihood that a student will recall facts in the future given the recall accuracies and response latencies observed in the past. The disparate algorithms we tried, which range from logistic regression to a Bayesian extension of the ACT-R declarative memory module, all proved to be roughly equivalent in their predictive power. Our modeling work demonstrates that, although response latency is predictive of future test performance, it yields no predictive power beyond that which is held in response accuracy.

Keywords: intelligent tutoring, ACT-R, Bayesian inference, fact learning

Introduction

An effective way to teach facts is to test students while they are studying (Roediger & Karpicke, 2006).
For example, if a student is learning the meanings of foreign words, an appropriately designed tutoring system would display a foreign word, ask the student to guess the English translation, and then provide the correct answer. In this work, we consider the case where students undergo several rounds of this type of study. By convention, we refer to the group of rounds as a study session. At the end of a study session, students have had several encounters with each item being studied.

In addition to promoting robust learning, testing students during study provides valuable information that can in principle be used to infer a student’s current and future state of memory for the material. By using a student’s performance during study to predict recall at a subsequent test, informed decisions can be made about the degree to which individual facts would benefit from further study. In this paper, we explore algorithms to predict a student’s future recall performance on specific facts using both the accuracy of the student’s responses during study and their response latencies—the time it took to produce the responses. In principle, other information is available as well, such as the nature of errors made and the student’s willingness to guess a response. However, we restrict ourselves to accuracy and latency data because such data are independent of the domain and the study question format. Thus, we expect that algorithms that base their predictions on accuracy and latency data will be applicable to many domains.

Predicting future recall accuracy from observations during study can be posed as a machine learning problem. Given a group of students for whom we have made observations, we divide the students into “training” and “test” groups. The training group is used to build predictive models whose performance is later evaluated using the test group. We developed several predictive models and describe them later in this paper.
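To make the train-then-evaluate setup concrete, the following sketch fits a logistic-regression predictor of the kind mentioned above. The features (mean study accuracy and log response latency per student–fact pair), the synthetic data, and all parameter values are invented for illustration; they are not the paper’s actual features or fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "training group": one row per (student, fact) pair.
n = 200
acc = rng.uniform(0, 1, n)        # fraction correct during study
lat = rng.uniform(0.5, 3.0, n)    # mean response latency in seconds
# Synthetic test outcome: higher accuracy and faster responses -> more recall.
p_true = 1 / (1 + np.exp(-(3 * acc - np.log(lat) - 1)))
y = (rng.uniform(0, 1, n) < p_true).astype(float)

# Design matrix: intercept, accuracy, log latency.
X = np.column_stack([np.ones(n), acc, np.log(lat)])

# Fit logistic regression by batch gradient descent on the log loss.
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n

# Predict recall probability for a held-out (student, fact) pair:
# an accurate responder (90% correct) with 0.8 s mean latency.
x_new = np.array([1.0, 0.9, np.log(0.8)])
p_new = 1 / (1 + np.exp(-x_new @ w))
print(p_new)
```

In a real evaluation, `w` would be fit on the training students and `p_new` computed for every (student, fact) pair in the held-out test group.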
Of particular interest is a method we call Bayesian ACT-R (BACT-R). It is based on the declarative memory module of the ACT-R cognitive architecture (Anderson, Byrne, Douglass, Lebiere, & Qin, 2004). The module has equations that interrelate response latency during study, accuracy during study, the time periods separating study sessions from one another and from the test, and the probability of a correct answer at test. However, these equations have a large number of free parameters, which makes it challenging to use the model in a truly predictive manner. BACT-R uses Bayesian techniques to infer a distribution over the free parameters, which makes it possible to use the ACT-R equations to predict future recall.

This paper is organized as follows: first, we describe the experiment from which we obtained accuracy and latency data for a group of students studying paired associates. Next, we describe BACT-R and three other models we built to predict student recall in the experiment. Finally, we evaluate and discuss the performance of the algorithms.
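The ACT-R declarative-memory equations that BACT-R builds on can be sketched as follows. These are the standard base-level activation, retrieval-probability, and retrieval-latency equations; the parameter values (decay d, threshold tau, noise s, latency factor F) are illustrative defaults, not values fitted to the experiment described later.

```python
import math

def activation(ages, d=0.5):
    """Base-level activation from the ages (in seconds) of past encounters
    with a fact: m = ln(sum over encounters of age^-d)."""
    return math.log(sum(t ** -d for t in ages))

def p_recall(m, tau=0.0, s=0.5):
    """Probability of recalling a fact with activation m, given retrieval
    threshold tau and activation noise s."""
    return 1 / (1 + math.exp(-(m - tau) / s))

def latency(m, F=1.0):
    """Predicted retrieval latency (seconds) for activation m."""
    return F * math.exp(-m)

# A fact last encountered 60 s, 600 s, and 3600 s ago:
m = activation([60, 600, 3600])
print(p_recall(m), latency(m))
```

The coupling visible here is what BACT-R exploits: a single latent activation drives both the observed study latency and the probability of correct recall at test, so observed accuracies and latencies jointly constrain the posterior over the free parameters.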