A Neural Reinforcement Learning Approach to Learn Local Dispatching Policies in Production Scheduling

Finding optimal solutions for job shop scheduling problems requires high computational effort, especially under consideration of uncertainty and frequent replanning. In contrast to computational solutions, domain experts are often able to derive good local dispatching heuristics by looking at typical problem instances. They can be efficiently applied by looking at few relevant features. However, these rules are usually not optimal, especially in complex decision situations. Here we describe an approach that tries to combine both worlds. A neural network based agent autonomously optimizes its local dispatching policy with respect to a global optimization goal, defined for the overall plant. On two benchmark scheduling problems, we show both learning and generalization abilities of the proposed approach.