On Ensuring that Intelligent Machines Are Well-Behaved

Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve super-human performance on various tasks. Ensuring that they are well-behaved---that they do not, for example, cause harm to humans or act in a racist or sexist way---is therefore not a hypothetical problem to be dealt with in the future, but a pressing one that we address here. We propose a new framework for designing machine learning algorithms that simplifies the problem of specifying and regulating undesirable behaviors. To show the viability of this new framework, we use it to create new machine learning algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in our experiments. This framework simplifies the safe and responsible application of machine learning.
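
To make the shape of such a framework concrete, the sketch below shows a minimal training procedure of this kind for a binary classifier: a candidate model is trained on one split of the data, and it is returned only if a held-out safety test certifies, via Hoeffding confidence intervals, that an undesirable behavior (here, a gap in false-negative rates between two groups) is at most a user-chosen tolerance epsilon with probability at least 1 - delta; otherwise the procedure reports that no solution was found. The function names, the particular fairness constraint, the use of logistic regression for candidate selection, and the Hoeffding bound are illustrative assumptions, not the exact construction used in our experiments.

```python
# Minimal sketch of a high-confidence constrained training procedure
# (illustrative assumptions only; see the lead-in paragraph above).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

NO_SOLUTION_FOUND = None  # returned when the safety test cannot certify the constraint


def hoeffding_interval(samples, delta):
    """Two-sided Hoeffding confidence interval for the mean of [0, 1]-valued
    samples, holding with probability at least 1 - delta."""
    n = len(samples)
    half_width = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    mean = float(np.mean(samples))
    return mean - half_width, mean + half_width


def train_with_safety_test(X, y, group, epsilon=0.1, delta=0.05):
    """Return a classifier only if, with probability at least 1 - delta, the gap
    in false-negative rates between the two groups is at most epsilon; otherwise
    return NO_SOLUTION_FOUND."""
    # Candidate selection and the safety test use disjoint splits of the data.
    X_c, X_s, y_c, y_s, _, g_s = train_test_split(
        X, y, group, test_size=0.4, random_state=0)

    # Candidate selection (simplified here to ordinary training).
    candidate = LogisticRegression(max_iter=1000).fit(X_c, y_c)

    # Safety test: per-example false-negative indicators for each group.
    preds = candidate.predict(X_s)
    fn_a = (preds[(g_s == 0) & (y_s == 1)] == 0).astype(float)
    fn_b = (preds[(g_s == 1) & (y_s == 1)] == 0).astype(float)
    if len(fn_a) == 0 or len(fn_b) == 0:
        return NO_SOLUTION_FOUND  # not enough data to certify anything

    # Union bound: each interval holds with probability 1 - delta / 2, so with
    # probability at least 1 - delta the true gap is below the certified value.
    lo_a, hi_a = hoeffding_interval(fn_a, delta / 2)
    lo_b, hi_b = hoeffding_interval(fn_b, delta / 2)
    certified_gap = max(hi_a - lo_b, hi_b - lo_a)
    return candidate if certified_gap <= epsilon else NO_SOLUTION_FOUND
```

Returning a designated "no solution found" value, rather than a best-effort model, is the key interface choice in this sketch: it places the burden of certifying the behavioral constraint on the algorithm rather than on its user.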
