Optimizing Instructional Policies

Psychologists are interested in developing instructional policies that boost student learning. An instructional policy specifies the manner and content of instruction. For example, in the domain of concept learning, a policy might specify the nature of exemplars chosen over a training sequence. Traditional psychological studies compare several hand-selected policies, e.g., contrasting a policy that selects only difficult-to-classify exemplars with a policy that gradually progresses over the training sequence from easy exemplars to more difficult ones (known as fading). We propose an alternative to the traditional methodology in which we define a parameterized space of policies and search this space to identify the optimal policy. For example, in concept learning, policies might be described by a fading function that specifies exemplar difficulty over time. We propose an experimental technique for searching policy spaces using Gaussian process surrogate-based optimization and a generative model of student performance. Instead of evaluating a few experimental conditions, each with many human subjects, as the traditional methodology does, our technique evaluates many experimental conditions, each with a few subjects. Even though individual subjects provide only a noisy estimate of the population mean, the optimization method allows us to determine the shape of the policy space and to identify the global optimum, and it is as efficient in its subject budget as a traditional A-B comparison. We evaluate the method via two behavioral studies, and suggest that the method has broad applicability to optimization problems involving humans outside the educational arena.
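The search procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: the policy space is one-dimensional (a single parameter in [0, 1]), each simulated subject returns a score with Gaussian noise rather than being modeled by a generative model of student performance, the kernel hyperparameters are fixed by hand, and the next condition is chosen with an upper-confidence-bound rule. The names `simulated_subject`, `optimize_policy`, and the peak at 0.6 are hypothetical and exist only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of policy values."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var=0.25):
    """Posterior mean and variance of a zero-mean GP at the points in x_grid."""
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_grid)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_grid, x_grid).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

def simulated_subject(policy, rng):
    """Hypothetical stand-in for one human subject: a noisy score whose
    population mean peaks at policy = 0.6 (an assumption for illustration)."""
    true_mean = np.exp(-((policy - 0.6) ** 2) / 0.05)
    return true_mean + rng.normal(scale=0.5)

def optimize_policy(n_subjects=60, n_initial=3, seed=0):
    """Test one policy per subject, choosing each new policy from the GP surrogate."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 101)
    xs = list(rng.uniform(0.0, 1.0, size=n_initial))
    ys = [simulated_subject(x, rng) for x in xs]
    for _ in range(n_subjects - n_initial):
        mean, var = gp_posterior(np.array(xs), np.array(ys), grid)
        ucb = mean + 2.0 * np.sqrt(var)  # favor promising or unexplored policies
        x_next = grid[np.argmax(ucb)]
        xs.append(x_next)
        ys.append(simulated_subject(x_next, rng))
    mean, _ = gp_posterior(np.array(xs), np.array(ys), grid)
    return grid[np.argmax(mean)], np.array(xs), np.array(ys)

if __name__ == "__main__":
    best, xs, ys = optimize_policy()
    print("estimated best policy:", best)
    print("distinct conditions tested:", len(np.unique(xs)))
```

Each simulated subject contributes a single noisy observation, yet the surrogate pools information across neighboring policies, which is the sense in which many conditions with few subjects each can still reveal the shape of the policy space.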
