Inventory management in supply chains: a reinforcement learning approach

Abstract: A major issue in supply chain inventory management is coordinating the inventory policies adopted by different supply chain actors, such as suppliers, manufacturers, and distributors, so as to smooth material flow and minimize costs while responsively meeting customer demand. This paper presents an approach to managing inventory decisions at all stages of the supply chain in an integrated manner. It determines an inventory ordering policy aimed at optimizing the performance of the whole supply chain. The approach combines three techniques: (i) Markov decision processes (MDPs), (ii) an artificial intelligence algorithm for solving MDPs, and (iii) simulation modeling, on which the algorithm is based. In particular, the inventory problem is modeled as an MDP, and a reinforcement learning (RL) algorithm is used to determine a near-optimal inventory policy under an average reward criterion. RL is a simulation-based stochastic technique that proves particularly efficient when the MDP is large.
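To make the idea of simulation-based, average-reward RL concrete, the following is a minimal sketch on a toy single-stage inventory problem. It is not the paper's model or algorithm: the cost parameters, the uniform demand distribution, the capacity limits, and the function names `step` and `r_learning` are all assumptions introduced here for illustration, and the update rule is a generic tabular R-learning-style scheme chosen because it targets the average reward criterion.

```python
import random

# All constants below are hypothetical values chosen for illustration only.
MAX_STOCK = 10        # storage capacity
MAX_ORDER = 5         # largest order allowed per period
HOLD_COST = 1.0       # cost per unit held at the end of a period
SHORT_COST = 5.0      # cost per unit of unmet demand
ORDER_COST = 2.0      # cost per unit ordered


def step(stock, order, rng):
    """Simulate one period: receive the order, then serve random demand."""
    available = min(stock + order, MAX_STOCK)
    demand = rng.randint(0, 4)            # assumed uniform demand on 0..4
    sold = min(available, demand)
    next_stock = available - sold
    cost = (ORDER_COST * order + HOLD_COST * next_stock
            + SHORT_COST * (demand - sold))
    return next_stock, -cost              # reward is the negative cost


def r_learning(periods=200_000, alpha=0.05, beta=0.01, epsilon=0.1, seed=0):
    """Tabular average-reward learning driven only by simulated transitions."""
    rng = random.Random(seed)
    # q[s][a]: relative value of ordering a units when the stock level is s
    q = [[0.0] * (MAX_ORDER + 1) for _ in range(MAX_STOCK + 1)]
    rho = 0.0                             # running estimate of average reward
    stock = 0
    for _ in range(periods):
        greedy = max(range(MAX_ORDER + 1), key=lambda a: q[stock][a])
        explore = rng.random() < epsilon
        action = rng.randint(0, MAX_ORDER) if explore else greedy
        next_stock, reward = step(stock, action, rng)
        best_next = max(q[next_stock])
        # Relative-value update: the reward is measured against rho
        q[stock][action] += alpha * (reward - rho + best_next - q[stock][action])
        # Update the average-reward estimate only on non-exploratory steps
        if not explore:
            rho += beta * (reward + best_next - max(q[stock]) - rho)
        stock = next_stock
    policy = [max(range(MAX_ORDER + 1), key=lambda a: q[s][a])
              for s in range(MAX_STOCK + 1)]
    return policy, rho
```

Calling `policy, rho = r_learning()` returns an order quantity for each stock level and an estimate of the long-run average reward, so `-rho` approximates the average cost per period under the learned policy. The learner never sees transition probabilities; it only samples `step`, which is what makes this family of methods attractive when the state space is too large to enumerate. A multi-stage supply chain would need a much larger state description than the single stock level used here.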
