Realization of an Adaptive Memetic Algorithm Using Differential Evolution and Q-Learning: A Case Study in Multirobot Path Planning

Memetic algorithms (MAs) are population-based meta-heuristic search algorithms that combine the benefits of natural and cultural evolution. An adaptive MA (AMA) adaptively selects memes (units of cultural transmission) from a meme pool to improve the cultural characteristics of individual members of a population-based search algorithm. This paper presents a novel approach to designing an AMA that combines differential evolution (DE) for global search with Q-learning for local refinement. Four variants of DE, including the currently best self-adaptive DE algorithm, are used to study the relative performance of the proposed AMA with respect to runtime, number of cost function evaluations, and accuracy (the offset in cost function from the theoretical optimum after the algorithm terminates). Computer simulations on a well-known set of 25 benchmark functions reveal that incorporating Q-learning into one popular and one outstanding variant of DE makes the corresponding algorithms more efficient in both runtime and accuracy. The performance of the proposed AMA has also been studied on a real-time multirobot path-planning problem. Experimental results obtained in both simulation and on real robots indicate that the path-planning scheme based on the proposed algorithm outperforms the real-coded genetic algorithm, particle swarm optimization, and DE, including its currently best version, with respect to two standard metrics defined in the literature.
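The division of labor described above, DE for global search and Q-learning for choosing a meme during local refinement, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the states, actions, or rewards, so everything below is an assumption made for illustration. In particular, the meme pool is taken to be a set of Gaussian step sizes, the state is the fitness-rank quartile of a member, the reward is +1 for an improving local move and -0.1 otherwise, and `sphere`, `adaptive_memetic_de`, and all parameter values are hypothetical.

```python
import random

def sphere(x):
    """Toy cost function (assumed for illustration); minimum 0 at the origin."""
    return sum(v * v for v in x)

def adaptive_memetic_de(cost, dim=5, pop_size=20, generations=100,
                        bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    F, CR = 0.8, 0.9                    # assumed DE scale factor and crossover rate
    memes = [0.5, 0.1, 0.01]            # hypothetical meme pool: local step sizes
    n_states = 4                        # assumed state: fitness-rank quartile
    alpha, gamma, eps = 0.3, 0.8, 0.1   # assumed Q-learning parameters
    Q = [[0.0] * len(memes) for _ in range(n_states)]

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [cost(p) for p in pop]
    history = [min(fit)]

    def state_of(i):
        # Rank of member i = number of members with strictly lower cost.
        rank = sum(1 for f in fit if f < fit[i])
        return rank * n_states // pop_size

    for _ in range(generations):
        # Global search: one generation of DE/rand/1/bin with greedy selection.
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    trial.append(min(hi, max(lo, v)))
                else:
                    trial.append(pop[i][d])
            f_trial = cost(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial

        # Local refinement: epsilon-greedy Q-learning selects a meme per member.
        for i in range(pop_size):
            s = state_of(i)
            if rng.random() < eps:
                a_idx = rng.randrange(len(memes))
            else:
                a_idx = max(range(len(memes)), key=Q[s].__getitem__)
            step = memes[a_idx]
            cand = [min(hi, max(lo, v + rng.gauss(0.0, step))) for v in pop[i]]
            f_cand = cost(cand)
            improved = f_cand < fit[i]
            if improved:
                pop[i], fit[i] = cand, f_cand
            reward = 1.0 if improved else -0.1
            s_next = state_of(i)
            # Standard one-step Q-learning update.
            Q[s][a_idx] += alpha * (reward + gamma * max(Q[s_next]) - Q[s][a_idx])

        history.append(min(fit))

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], history, Q
```

Calling `adaptive_memetic_de(sphere)` returns the best vector found, its cost, the best cost per generation, and the learned Q-table. Because both the DE step and the local step accept a candidate only when it does not worsen the member's cost, the best cost per generation never increases; how quickly it falls depends on the assumed parameters.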
