Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning

Machine learning (ML) models are now routinely deployed in domains ranging from criminal justice to healthcare. With this newfound ubiquity, ML has moved beyond academia and grown into an engineering discipline. In response, interpretability tools have been designed to help data scientists and machine learning practitioners better understand how ML models work. However, there has been little evaluation of the extent to which these tools achieve this goal. We study data scientists' use of two existing interpretability tools: the InterpretML implementation of generalized additive models (GAMs) and the SHAP Python package. We conduct a contextual inquiry (N=11) and a survey (N=197) of data scientists to observe how they use interpretability tools to uncover common issues that arise when building and evaluating ML models. Our results indicate that data scientists over-trust and misuse interpretability tools. Furthermore, few of our participants were able to accurately describe the visualizations output by these tools. We highlight qualitative themes in data scientists' mental models of interpretability tools. We conclude with implications for researchers and tool designers, and contextualize our findings in the social science literature.
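To make the two tools concrete, the following is a minimal, hypothetical sketch of how a data scientist might produce the kinds of explanations discussed above. The dataset, models, and parameters are illustrative stand-ins (scikit-learn's breast cancer data and a gradient-boosted classifier) and are not drawn from the study itself; the calls follow the public interpret and shap package APIs.

```python
# Illustrative sketch only: stand-in data and models, not the study's setup.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# InterpretML: a glass-box GAM (Explainable Boosting Machine) whose global
# explanation shows a per-feature shape function and importance score.
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

ebm = ExplainableBoostingClassifier(random_state=0)
ebm.fit(X_train, y_train)
show(ebm.explain_global())  # opens an interactive dashboard in a notebook/browser

# SHAP: post-hoc feature attributions for a black-box model, aggregated into
# a summary plot of per-feature SHAP value distributions.
import shap

gbc = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
explainer = shap.TreeExplainer(gbc)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
```

The outputs of these two calls, GAM shape plots from InterpretML and a SHAP summary plot, are the kinds of visualizations the abstract refers to when it notes that few participants could accurately describe what the tools output.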
