A Menu of Designs for Reinforcement Learning Over Time

This chapter contains sections titled: Introduction and Overview, A Simple Two-Component Adaptive Critic Design, HDP and Dynamic Programming, Alternative Ways to Figure 3.2 in Adapting the Action Network, Alternatives to HDP in Adapting the Critic Network, Some Topics for Further Research, Equations and Code for Implementation, References.