Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator

We show that reverse-mode AD (Automatic Differentiation), a generalized gradient-calculation operator, can be incorporated as a first-class function in an augmented lambda calculus, and therefore into a functional-programming language. Closure is achieved, in that the new operator can be applied to any expression in the augmented language, yielding an expression in that language. This requires the resolution of two major technical issues: (a) how to transform nested lambda expressions, including those with free-variable references, and (b) how to support self-application of the AD machinery. AD transformations preserve certain complexity properties, among them that the reverse phase of the reverse-mode AD transformation of a function has the same temporal complexity as the original untransformed function. First-class unrestricted AD operators increase the expressive power available to the numeric programmer, and may have significant practical implications for the construction of numeric software that is robust, modular, concise, correct, and efficient.
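The idea of reverse-mode AD as an ordinary higher-order function can be illustrated with a minimal Python sketch. It is not the transformation developed in the paper: it handles only unary scalar functions, and it does not address free variables, nested lambda expressions, or self-application of the operator. The names `lift`, `compose`, and `grad` are hypothetical helpers introduced here for illustration. Each lifted function returns its value together with a backpropagator, a closure that maps a sensitivity of the output to a sensitivity of the input.

```python
import math

def lift(f, df):
    """Pair a unary primitive f with its derivative df.

    The result maps x to (f(x), backpropagator), where the backpropagator
    converts an output sensitivity y_bar into an input sensitivity.
    (Hypothetical helper, not from the paper.)
    """
    def lifted(x):
        y = f(x)
        return y, lambda y_bar: y_bar * df(x)
    return lifted

def compose(g, f):
    """Lifted version of g after f.

    The forward phase runs f and then g; the reverse phase runs the
    backpropagators in the opposite order, one step per forward step.
    """
    def composed(x):
        y, f_back = f(x)
        z, g_back = g(y)
        return z, lambda z_bar: f_back(g_back(z_bar))
    return composed

def grad(f_lifted):
    """Derivative operator: takes a lifted function, returns a plain function."""
    def df(x):
        _, back = f_lifted(x)
        return back(1.0)  # seed the reverse phase with d(output)/d(output) = 1
    return df

sin_ = lift(math.sin, math.cos)
square = lift(lambda x: x * x, lambda x: 2.0 * x)

h = compose(sin_, square)  # h(x) = sin(x^2)
dh = grad(h)               # dh(x) = 2 x cos(x^2)
```

In this sketch `grad` is a value like any other function, so it can be passed around and applied wherever a function is expected. The reverse phase performs one multiplication per primitive executed in the forward phase, which gives a small-scale view of the complexity property stated above.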
