Perturbation Confusion and Referential Transparency: Correct Functional Implementation of Forward-Mode AD

It is tempting to incorporate differentiation operators into functional-programming languages. Making them first-class citizens, however, is an enterprise fraught with danger. We discuss a potential problem with forward-mode AD common to many AD systems, including all attempts to integrate a forward-mode AD operator into Haskell. In particular, we show how these implementations fail to preserve referential transparency, and can compute grossly incorrect results when the differentiation operator is applied to a function that itself uses that operator. The underlying cause of this problem is perturbation confusion, a failure to distinguish between distinct perturbations introduced by distinct invocations of the differentiation operator. We then discuss how perturbation confusion can be avoided.
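
To make the failure mode concrete, the following is a minimal sketch of perturbation confusion, written in Python rather than Haskell purely for brevity. It is an illustration under stated assumptions, not the implementation discussed in this paper: the names `Dual`, `d`, `nested`, and the `fresh` flag are hypothetical, and tagging each invocation of the operator with a fresh integer is only one possible way to keep perturbations distinct. The test expression is the derivative with respect to x of x * (d/dy (x + y) at y = 1), evaluated at x = 1, whose true value is 1.

```python
import itertools

SHARED_TAG = 1                      # single tag reused by the naive operator
_fresh_tags = itertools.count(2)    # distinct tags for the tagged operator


class Dual:
    """primal + tangent * eps_tag, where eps_tag squared is zero."""

    def __init__(self, tag, primal, tangent):
        self.tag, self.primal, self.tangent = tag, primal, tangent

    def __add__(self, other):
        return add(self, other)

    __radd__ = __add__              # addition is commutative here

    def __mul__(self, other):
        return mul(self, other)

    __rmul__ = __mul__              # multiplication is commutative here


def _tag(v):
    # Plain numbers carry no perturbation; they get the sentinel tag 0.
    return v.tag if isinstance(v, Dual) else 0


def _split(v, tag):
    # View v as (primal, tangent) with respect to one specific perturbation.
    if isinstance(v, Dual) and v.tag == tag:
        return v.primal, v.tangent
    return v, 0.0                   # constant with respect to this tag


def add(a, b):
    tag = max(_tag(a), _tag(b))
    if tag == 0:
        return a + b
    (ap, at), (bp, bt) = _split(a, tag), _split(b, tag)
    return Dual(tag, add(ap, bp), add(at, bt))


def mul(a, b):
    tag = max(_tag(a), _tag(b))
    if tag == 0:
        return a * b
    (ap, at), (bp, bt) = _split(a, tag), _split(b, tag)
    # Product rule: (ap + at*eps) * (bp + bt*eps) = ap*bp + (ap*bt + at*bp)*eps
    return Dual(tag, mul(ap, bp), add(mul(ap, bt), mul(at, bp)))


def d(f, x, fresh=True):
    """Derivative of f at x. With fresh=False every call shares one tag."""
    tag = next(_fresh_tags) if fresh else SHARED_TAG
    y = f(Dual(tag, x, 1.0))
    # Extract only the tangent that belongs to this invocation's perturbation.
    return y.tangent if isinstance(y, Dual) and y.tag == tag else 0.0


def nested(deriv):
    # d/dx [ x * (d/dy (x + y) at y = 1) ] at x = 1; the true value is 1.
    return deriv(lambda x: x * deriv(lambda y: x + y, 1.0), 1.0)


naive = nested(lambda f, x: d(f, x, fresh=False))   # confuses the two perturbations
tagged = nested(lambda f, x: d(f, x, fresh=True))   # keeps them distinct
print(naive, tagged)                                 # prints: 2.0 1.0
```

With a shared tag, the inner call cannot tell the perturbation of x apart from its own perturbation of y, so it reports 2 for d/dy (x + y) and the outer result becomes 2 instead of 1. With fresh tags, the perturbation of x is treated as a constant inside the inner call, and the expected result of 1 is recovered for this example.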