Model-Based and Learning-Based Decision Making in Incomplete Information Cournot Games: A State Estimation Approach

In an incomplete information game, a central challenge is for each agent to exploit the information available to it when making decisions. This paper proposes and compares two decision-making methods, a model-based and a learning-based bidding strategy, for repeated Cournot competition among generators in a day-ahead electricity market. The sum of the rivals' offered quantities (SROQ) is taken as the agent's state, and its value is estimated with an adaptive expectation method. In the model-based approach, the convergence of the agents' strategies to the Nash equilibrium point is studied in two different cases. In the learning-based approach, the bidding strategy is learned by combining state estimation with reinforcement learning: given the estimated state (SROQ), a fuzzy Q-learning algorithm learns the optimal decision. A case study on the three-bus benchmark Cournot model examines the convergence of the generators' bids to the Nash-Cournot equilibrium.
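
As a rough illustration of the model-based idea, the sketch below pairs an adaptive expectation estimate of the SROQ with a Cournot best response. It is not the paper's formulation: the linear inverse demand P = a - b*Q, the constant marginal costs, the smoothing factor `alpha`, and all numeric values are assumptions chosen for illustration, and the network constraints of the three-bus benchmark are ignored. All function names are hypothetical.

```python
def adaptive_expectation(prev_estimate, observed_sroq, alpha):
    """Move the SROQ estimate a fraction alpha toward the latest observation."""
    return prev_estimate + alpha * (observed_sroq - prev_estimate)


def best_response(sroq_estimate, a, b, c):
    """Profit-maximizing quantity against an estimated SROQ.

    Assumes inverse demand P = a - b*Q and constant marginal cost c,
    so profit is q * (a - b*(q + SROQ) - c); quantity is clipped at zero.
    """
    return max(0.0, (a - c - b * sroq_estimate) / (2.0 * b))


def nash_quantities(a, b, costs):
    """Closed-form Nash-Cournot quantities for the assumed linear model
    (valid when every resulting quantity is positive)."""
    n = len(costs)
    total_cost = sum(costs)
    return [(a - (n + 1) * c + total_cost) / (b * (n + 1)) for c in costs]


def simulate(a=100.0, b=1.0, costs=(10.0, 10.0, 10.0), alpha=0.5, rounds=200):
    """Repeated game: each agent bids its best response to its own SROQ
    estimate, then updates that estimate from the rivals' realized total."""
    n = len(costs)
    estimates = [0.0] * n
    quantities = [0.0] * n
    for _ in range(rounds):
        quantities = [best_response(estimates[i], a, b, costs[i]) for i in range(n)]
        total = sum(quantities)
        estimates = [
            adaptive_expectation(estimates[i], total - quantities[i], alpha)
            for i in range(n)
        ]
    return quantities


if __name__ == "__main__":
    print("simulated:", simulate())
    print("Nash:     ", nash_quantities(100.0, 1.0, (10.0, 10.0, 10.0)))
```

Under these assumed parameters the simulated bids settle at the closed-form equilibrium. With three symmetric agents and `alpha = 1` (naive expectations) the same loop oscillates instead of converging, which suggests why the choice of estimator matters for the convergence question studied in the paper.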
