Shie Mannor | Balázs Szörényi | Róbert Busa-Fekete | Paul Weng
[1] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[2] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[3] Jonny Dambrowski. Review on Methods of State-of-Charge Estimation with Viewpoint to the Modern LiFePO4/Li4Ti5O12 Lithium-Ion Systems , 2013 .
[4] Wlodzimierz Ogryczak,et al. On solving linear programs with the ordered weighted averaging objective , 2003, Eur. J. Oper. Res..
[5] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[6] Rong Jin,et al. Stochastic Convex Optimization with Multiple Objectives , 2013, NIPS.
[7] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[8] William H Press,et al. Bandit solutions provide unified ethical models for randomized clinical trials and comparative effectiveness research , 2009, Proceedings of the National Academy of Sciences.
[9] John A. Weymark,et al. Generalized Gini Inequality Indices , 2001 .
[10] Ann Nowé,et al. Designing multi-objective multi-armed bandits algorithms: A study , 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).
[11] Peter L. Bartlett,et al. Blackwell Approachability and No-Regret Learning are Equivalent , 2010, COLT.
[12] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res..
[13] Shai Shalev-Shwartz,et al. Online Learning and Online Convex Optimization , 2012, Found. Trends Mach. Learn..
[14] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[15] Zoltán Gábor, Zsolt Kalmár, Csaba Szepesvári. Multi-criteria Reinforcement Learning , 1998 .
[16] Valerie H. Johnson,et al. Battery performance models in ADVISOR , 2002 .
[17] Ronald R. Yager,et al. On ordered weighted averaging aggregation operators in multicriteria decision-making , 1988 .
[18] H. Dalton. The Measurement of the Inequality of Incomes , 1920 .
[19] Patrice Perny,et al. A Compromise Programming Approach to multiobjective Markov Decision Processes , 2011, Int. J. Inf. Technol. Decis. Mak..
[20] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[21] Aleksandrs Slivkins,et al. Contextual Bandits with Similarity Information , 2009, COLT.
[22] Roger A. Dougal,et al. Dynamic lithium-ion battery model for system simulation , 2002 .
[23] Patrice Perny,et al. On Minimizing Ordered Weighted Regrets in Multiobjective Markov Decision Processes , 2011, ADT.
[24] Shie Mannor,et al. Approachability in unknown games: Online learning meets multi-objective optimization , 2014, COLT.
[25] Ralph E. Steuer,et al. An interactive weighted Tchebycheff procedure for multiple objective programming , 1983, Math. Program..
[26] Gábor J. Székely,et al. When is a weighted average of ordered sample elements a maximum likelihood estimator of the location parameter , 1989 .
[27] Moshe Tennenholtz,et al. Sequential decision making with vector outcomes , 2014, ITCS.
[28] John N. Tsitsiklis,et al. Online Learning with Sample Path Constraints , 2009, J. Mach. Learn. Res..