Bruno Castro da Silva | Andrew G. Barto | Philip S. Thomas | Emma Brunskill
[1] H. Kahn,et al. Methods of Reducing Sample Size in Monte Carlo Computations , 1953, Oper. Res..
[2] A. Charnes,et al. Chance-Constrained Programming , 1959 .
[3] C. Jackson Grayson,et al. Decisions Under Uncertainty: Drilling Decisions by Oil and Gas Operators. , 1961 .
[4] J. Hammersley. Monte Carlo Methods for Solving Multivariable Problems , 1960 .
[5] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[6] R. Jagannathan,et al. Chance-Constrained Programming with Joint Constraints , 1974, Oper. Res..
[7] D. Rubin,et al. Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .
[8] M. Houck. A Chance Constrained Optimization Model for reservoir design and operation , 1979 .
[9] B. Efron. Better Bootstrap Confidence Intervals , 1987 .
[10] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1989, NIPS.
[11] C. Watkins. Learning from delayed rewards , 1989 .
[12] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[13] M. Rabin. Incorporating Fairness into Game Theory and Economics , 1993, The American Economic Review.
[14] Howard B. Demuth,et al. Neural Network Toolbox for use with Matlab , 1995 .
[15] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[16] William Nick Street,et al. Breast Cancer Diagnosis and Prognosis Via Linear Programming , 1995, Oper. Res..
[17] S. Sastry,et al. Hybrid Control in Air Traffic Management Systems , 1995 .
[18] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[19] John M. Wilson,et al. Introduction to Stochastic Programming , 1998, J. Oper. Res. Soc..
[20] E. Fehr. A Theory of Fairness, Competition and Cooperation , 1998 .
[21] Doina Precup,et al. Eligibility Traces for Off-Policy Policy Evaluation , 2000, ICML.
[22] Darinka Dentcheva,et al. Concavity and efficient points of discrete distributions in probabilistic programming , 2000, Math. Program..
[23] Sham M. Kakade,et al. Optimizing Average Reward Using Discounted Rewards , 2001, COLT/EuroCOLT.
[24] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[25] Datta N. Godbole,et al. Addressing Multiobjective Control: Safety and Performance through Constrained Optimization , 2001, HSCC.
[26] Andrew G. Barto,et al. Lyapunov Design for Safe Reinforcement Learning , 2003, J. Mach. Learn. Res..
[27] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[28] A. Folsom,et al. Coronary heart disease risk prediction in the Atherosclerosis Risk in Communities (ARIC) study. , 2003, Journal of clinical epidemiology.
[29] J. Pankow,et al. Prediction of coronary heart disease in middle-aged adults with diabetes. , 2003, Diabetes care.
[30] A. Messac,et al. The normalized normal constraint method for generating the Pareto frontier , 2003 .
[31] D. Bertsimas,et al. Shortfall as a risk measure: properties, optimization and applications , 2004 .
[32] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[33] Tom Fawcett,et al. Robust Classification for Imprecise Environments , 2000, Machine Learning.
[34] Gajendra P.S. Raghava,et al. Prediction of CTL epitopes using QM, SVM and ANN techniques. , 2004, Vaccine.
[35] Mala Htun. From "Racial Democracy" to Affirmative Action: Changing State Policy on Race in Brazil , 2004, Latin American Research Review.
[36] Benjamin Van Roy,et al. On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming , 2004, Math. Oper. Res..
[37] Alexandre M. Bayen,et al. A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games , 2005, IEEE Transactions on Automatic Control.
[38] Giuseppe Carlo Calafiore,et al. Uncertain convex programs: randomized solutions and confidence levels , 2005, Math. Program..
[39] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[40] Sridhar Mahadevan,et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results , 2005, Machine Learning.
[41] Mark Kotanchek,et al. Pareto-Front Exploitation in Symbolic Regression , 2005 .
[42] Armin Falk,et al. A Theory of Reciprocity , 2001, Games Econ. Behav..
[43] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[44] Alexander Shapiro,et al. Convex Approximations of Chance Constrained Programs , 2006, SIAM J. Optim..
[45] Kalyanmoy Deb,et al. Introducing Robustness in Multi-Objective Optimization , 2006, Evolutionary Computation.
[46] Garud Iyengar,et al. Ambiguous chance constrained problems and robust optimization , 2006, Math. Program..
[47] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[48] Andy Liaw,et al. Classification and Regression by randomForest , 2007 .
[49] P. Massart,et al. Concentration inequalities and model selection , 2007 .
[50] R. Jibson. Regression models for estimating coseismic landslide displacement , 2007 .
[51] Steffen Udluft,et al. Safe exploration for reinforcement learning , 2008, ESANN.
[52] Jan Peters,et al. Learning motor primitives for robotics , 2009, 2009 IEEE International Conference on Robotics and Automation.
[53] Laurent El Ghaoui,et al. Robust Optimization , 2021, ICORES.
[54] Toon Calders,et al. Classifying without discriminating , 2009, 2009 2nd International Conference on Computer, Control and Communication.
[55] Larry D. Pyeatt,et al. Reinforcement Learning for Closed-Loop Propofol Anesthesia: A Human Volunteer Study , 2010, IAAI.
[56] Toon Calders,et al. Three naive Bayes approaches for discrimination-free classification , 2010, Data Mining and Knowledge Discovery.
[57] Itay Gurvich,et al. Staffing Call Centers with Uncertain Demand Forecasts: A Chance-Constrained Optimization Approach , 2010, Manag. Sci..
[58] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[59] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[60] Stefan Schaal,et al. A Generalized Path Integral Control Approach to Reinforcement Learning , 2010, J. Mach. Learn. Res..
[61] Jun Sakuma,et al. Fairness-aware Learning through Regularization Approach , 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.
[62] Shie Mannor,et al. Probabilistic Goal Markov Decision Processes , 2011, IJCAI.
[63] Franco Turini,et al. k-NN as an implementation of situation testing for discrimination discovery and prevention , 2011, KDD.
[64] Arkadi Nemirovski,et al. On safe tractable approximations of chance constraints , 2012, Eur. J. Oper. Res..
[65] N. Shah,et al. Increased Mortality of Patients With Diabetes Reporting Severe Hypoglycemia , 2012, Diabetes Care.
[66] Nuno C. Martins,et al. Control Design for Markov Chains under Safety Constraints: A Convex Approach , 2012, ArXiv.
[67] G. B. Dantzig,et al. The Generalized Simplex Method for Minimizing a Linear Form under Linear Inequality Restraints , 2012 .
[68] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[69] Toniann Pitassi,et al. Fairness through awareness , 2011, ITCS '12.
[70] Olivier Sigaud,et al. Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning , 2012 .
[71] Qianfan Wang,et al. A chance-constrained two-stage stochastic program for unit commitment with uncertain wind power output , 2012, 2012 IEEE Power and Energy Society General Meeting.
[72] Scott Kuindersma,et al. Variational Bayesian Optimization for Runtime Risk-Sensitive Control , 2012, Robotics: Science and Systems.
[73] Thomas G. Dietterich,et al. Allowing a wildfire to burn: estimating the effect on future fire suppression costs , 2013 .
[74] Latanya Sweeney,et al. Discrimination in online ad delivery , 2013, CACM.
[75] Joaquin Quiñonero Candela,et al. Counterfactual reasoning and learning systems: the example of computational advertising , 2013, J. Mach. Learn. Res..
[76] Sridhar Mahadevan,et al. Projected Natural Actor-Critic , 2013, NIPS.
[77] Suchi Saria,et al. A $3 Trillion Challenge to Computational Scientists: Transforming Healthcare Delivery , 2014, IEEE Intelligent Systems.
[78] Jaime F. Fisac,et al. Reachability-based safe learning with Gaussian processes , 2014, 53rd IEEE Conference on Decision and Control.
[79] C. Cobelli,et al. The UVA/PADOVA Type 1 Diabetes Simulator , 2014, Journal of diabetes science and technology.
[80] Thorsten Dickhaus,et al. Simultaneous Statistical Inference , 2014, Springer Berlin Heidelberg.
[81] Meysam Bastani,et al. Model-Free Intelligent Diabetes Management Using Machine Learning , 2014 .
[82] Nick Bostrom,et al. Superintelligence: Paths, Dangers, Strategies , 2014 .
[83] Mohammad Ghavamzadeh,et al. Algorithms for CVaR Optimization in MDPs , 2014, NIPS.
[84] Philip S. Thomas,et al. Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees , 2015, IJCAI.
[85] Carlos Eduardo Scheidegger,et al. Certifying and Removing Disparate Impact , 2014, KDD.
[86] Philip S. Thomas,et al. High Confidence Policy Improvement , 2015, ICML.
[87] Shie Mannor,et al. Optimizing the CVaR via Sampling , 2014, AAAI.
[88] Shlomo Zilberstein. Building Strong Semi-Autonomous Systems , 2015, AAAI.
[89] Sean A. Munson,et al. Unequal Representation and Gender Stereotypes in Image Search Results for Occupations , 2015, CHI.
[90] Michael Carl Tschantz,et al. Automated Experiments on Ad Privacy Settings , 2014, Proc. Priv. Enhancing Technol..
[91] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[92] Marcello Restelli,et al. Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation , 2014, AAAI.
[93] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[94] Philip S. Thomas,et al. High-Confidence Off-Policy Evaluation , 2015, AAAI.
[95] Lihong Li,et al. Doubly Robust Off-policy Evaluation for Reinforcement Learning , 2015, ArXiv.
[96] Philip S. Thomas,et al. Safe Reinforcement Learning , 2015 .
[97] Arya Irani,et al. Utilizing negative policy information to accelerate reinforcement learning , 2015 .
[98] András Prékopa,et al. On Probabilistic Constrained Programming , 2015 .
[99] Suresh Venkatasubramanian,et al. Auditing Black-box Models by Obscuring Features , 2016, ArXiv.
[100] Nan Jiang,et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning , 2015, ICML.
[101] Stuart Russell. Should We Fear Supersmart Robots? , 2016, Scientific American.
[102] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[103] Marek Petrik,et al. Safe Policy Improvement by Minimizing Robust Baseline Regret , 2016, NIPS.
[104] Aaron Roth,et al. Fairness in Learning: Classic and Contextual Bandits , 2016, NIPS.
[105] Nathan Srebro,et al. Equality of Opportunity in Supervised Learning , 2016, NIPS.
[106] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[107] Philip S. Thomas,et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning , 2016, ICML.
[108] Benjamin Fish,et al. A Confidence-Based Approach for Balancing Fairness and Accuracy , 2016, SDM.
[109] Philip S. Thomas,et al. Importance Sampling with Unequal Support , 2016, AAAI.
[110] Yair Zick,et al. Algorithmic Transparency via Quantitative Input Influence , 2017 .
[111] Andreas Krause,et al. Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics , 2016, Machine Learning.