Learning Strategies in Decentralized Matching Markets under Uncertain Preferences

We study two-sided decentralized matching markets in which participants have uncertain preferences. We present a statistical model for learning these preferences; the model incorporates an uncertain state and the competition among participants on one side of the market. We derive an optimal strategy that maximizes an agent's expected payoff, and we calibrate the uncertain state by taking opportunity costs into account. We discuss the sense in which the matching induced by the proposed strategy has a stability property. We also prove a fairness property: under the proposed strategy, there is no justified envy. Numerical results demonstrate improved payoff, stability, and fairness compared with alternative methods.
