Scaling Adaptive Agent-Based Reactive Job-Shop Scheduling to Large-Scale Problems

Most approaches to job-shop scheduling problems assume complete task knowledge and search for a centralized solution. In this work, we adopt an alternative view of scheduling problems in which each resource is equipped with an adaptive agent that, independently of the other agents, makes job dispatching decisions based on its local view of the plant and employs reinforcement learning to improve its dispatching strategy. We delineate the extensions necessary to render this learning approach applicable to job-shop scheduling problems at current standards of difficulty, and we present the results of a corresponding empirical evaluation.
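The core idea of a resource-local, learning dispatcher can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a tabular Q-learning agent (the paper relies on neural function approximation), a hypothetical discretized `state` summarizing the jobs waiting at one resource, and an action that picks which queued job to process next.

```python
import random

class DispatchAgent:
    """Sketch of a resource-local dispatching agent (hypothetical names).

    State:  a hashable summary of the jobs currently queued at this resource.
    Action: the index of the queued job to dispatch next.
    Each agent learns independently from its locally observed rewards.
    """

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.q = {}             # maps (state, action) -> estimated value
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        self.rng = random.Random(seed)

    def select(self, state, n_actions):
        """Epsilon-greedy choice among the jobs currently in the queue."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(n_actions)
        values = [self.q.get((state, a), 0.0) for a in range(n_actions)]
        return max(range(n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state, next_n_actions):
        """Standard Q-learning backup on one locally observed transition."""
        best_next = max(
            (self.q.get((next_state, a), 0.0) for a in range(next_n_actions)),
            default=0.0,
        )
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old
        )

# Toy usage: in a fixed queue state "s" with two waiting jobs, dispatching
# job 1 yields reward 1.0 and job 0 yields 0.0; the greedy policy learns
# to prefer job 1.
agent = DispatchAgent(epsilon=0.0)
for _ in range(50):
    agent.update("s", 1, 1.0, "s", 2)
    agent.update("s", 0, 0.0, "s", 2)
```

Each machine in the plant would run its own such agent; the only coordination is implicit, through the shared plant dynamics, which is what makes the decentralized setting partially observable from any single agent's perspective.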
