Understanding the Usability Challenges of Machine Learning in High-Stakes Decision Making

Fig. 1. The Sibyl “Case-Specific Details” interface. Sibyl was designed to gather concrete feedback on possible methods of addressing the ML usability challenges faced by child welfare screeners. The Case-Specific Details interface is one of the ML augmentation approaches evaluated in this paper; it shows how particular factors contribute to predictions made by ML models about child welfare. (Note that we refer to features as “factors,” as this language is more familiar to our users.) Labeled elements are as follows: 1) The risk score for the case (1-20). 2) Categories for each factor, such as demographics (DG) or referral history. 3) A short description of each factor. 4) The value of numeric or categorical factors. 5) The contribution of each factor (the table can be sorted in ascending or descending order of contribution). 6) UI components for searching by factor name or filtering by category, enabled when the full factor list is shown. 7) A button for switching between a view that shows only the top 10 most contributing factors and one that shows all factors. 8) A button for switching between a single-table view and a side-by-side view, which splits factors that increase and decrease risk. 9) A sidebar that includes other explanation types, as described in this paper.
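The caption does not specify how Sibyl computes factor contributions, but a minimal sketch of the kind of per-factor attribution such a table displays is given below, assuming a tree-based risk model and SHAP-style attributions. The model, training data, factor names, and example case are hypothetical placeholders, not Sibyl's actual implementation.

```python
# Sketch only: illustrates per-factor contributions like those shown in the
# Case-Specific Details table. All data and names here are hypothetical.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical training data: each row is a referral, each column a "factor".
rng = np.random.default_rng(0)
factor_names = [f"factor_{i}" for i in range(20)]
X = pd.DataFrame(rng.normal(size=(500, 20)), columns=factor_names)
y = rng.integers(1, 21, size=500)  # stand-in for a 1-20 risk score

model = GradientBoostingRegressor().fit(X, y)

# Explain a single case (one row), as the interface does.
case = X.iloc[[0]]
explainer = shap.TreeExplainer(model)
contributions = explainer.shap_values(case)[0]  # signed, one value per factor

# Build a table mirroring elements (4), (5), and (7): factor value, signed
# contribution (positive values increase risk, negative decrease it), sorted
# by magnitude of contribution.
table = (
    pd.DataFrame({
        "factor": factor_names,
        "value": case.iloc[0].values,
        "contribution": contributions,
    })
    .sort_values("contribution", key=np.abs, ascending=False)
)
print(table.head(10))  # "top 10 most contributing factors" view
```

Because the contributions are signed, the same table can be split into the side-by-side view of element (8), separating factors that increase risk from those that decrease it.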
