Adaptive Critic Design with ESN Critic for Bioprocess Optimization

We propose an on-line action-dependent heuristic dynamic programming approach for adaptive control that uses a recurrent neural network, the echo state network (ESN), as the critic network within the adaptive critic design (ACD) framework. The approach is applied to the optimization of a complex nonlinear process for the production of a biodegradable polymer referred to as PHB. The on-line procedure, which trains the critic and optimizes the process simultaneously, is tested both with and without measurement noise. In both cases the procedure increased productivity while properly training the adaptive critic network.
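To make the idea of an ESN critic concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed random reservoir whose only trained part is the linear readout, a critic input made of the process state and the action (the action-dependent form), and a temporal-difference update of the readout toward `utility + gamma * J_next`. The class name `ESNCritic`, the toy scalar process in `demo`, and all parameter values are hypothetical; the PHB process model and the action update are not reproduced here.

```python
import math
import random


class ESNCritic:
    """Echo state network critic: fixed random reservoir, trainable linear readout."""

    def __init__(self, n_in, n_res=20, scale=0.5, seed=0):
        rng = random.Random(seed)
        self.n_res = n_res
        # Fixed input weights (never trained).
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_res)]
        # Fixed reservoir weights; each row is rescaled so its absolute sum
        # equals `scale` < 1, a simple sufficient condition for fading memory.
        raw = [[rng.uniform(-1.0, 1.0) for _ in range(n_res)]
               for _ in range(n_res)]
        self.w_res = [[scale * v / sum(abs(a) for a in row) for v in row]
                      for row in raw]
        # Trainable readout over the concatenation [input, reservoir state].
        self.w_out = [0.0] * (n_in + n_res)
        self.x = [0.0] * n_res

    def value(self, z):
        """Critic output J for an extended state z = [input, reservoir state]."""
        return sum(w * zi for w, zi in zip(self.w_out, z))

    def step(self, u):
        """Advance the reservoir with input u = [state..., action...]."""
        self.x = [
            math.tanh(sum(a * b for a, b in zip(self.w_in[i], u))
                      + sum(a * b for a, b in zip(self.w_res[i], self.x)))
            for i in range(self.n_res)
        ]
        z = list(u) + self.x
        return z, self.value(z)

    def td_update(self, z_prev, j_prev, utility, j_now, gamma=0.9, lr=0.05):
        """One gradient step on the readout using the temporal-difference error."""
        err = j_prev - (utility + gamma * j_now)
        self.w_out = [w - lr * err * zi for w, zi in zip(self.w_out, z_prev)]
        return err


def demo(steps=200):
    """On-line critic training on a hypothetical scalar process (not the PHB model)."""
    critic = ESNCritic(n_in=2)
    s, errors = 1.0, []
    z_prev, j_prev = critic.step([s, 0.0])
    for k in range(steps):
        a = 0.2 * math.sin(0.1 * k)   # hypothetical action sequence
        s = 0.8 * s + a               # hypothetical process dynamics
        utility = s * s               # hypothetical local cost
        z_now, j_now = critic.step([s, a])
        errors.append(critic.td_update(z_prev, j_prev, utility, j_now))
        # Re-evaluate the stored estimate with the updated readout.
        z_prev, j_prev = z_now, critic.value(z_now)
    return errors
```

Calling `demo()` returns the sequence of temporal-difference errors, which can be inspected to see how the readout adapts. Training only the readout is what keeps on-line training of a recurrent critic cheap in this sketch, since no gradient has to be propagated through the reservoir.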
