Reliable Agnostic Learning

In many applications, erroneous predictions of one type are far more harmful than those of another. In spam detection, for example, false positives (legitimate mail flagged as spam) are particularly costly; in medical diagnosis, abstaining from a prediction may be preferable to making an incorrect one. In this paper we consider different types of reliable classifiers suited to such situations. We formalize the notion of reliable classifiers and study their properties in the spirit of agnostic learning (Haussler, 1992; Kearns, Schapire, and Sellie, 1994), a PAC-like model in which no assumption is made about the function being learned. We then give two algorithms for reliable agnostic learning under natural distributions. The first uses membership queries to reliably learn DNF formulas with no false positives. The second reliably learns halfspaces from random examples with no false positives or false negatives, although the classifier sometimes abstains from making a prediction.
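To make these notions concrete, here is a small illustrative sketch (ours, not the paper's; the function and the toy cautious_threshold rule are hypothetical). It computes the empirical quantities a reliable learner is judged by when a hypothesis may predict +1, predict -1, or abstain: roughly speaking, a positive reliable learner wants a hypothesis whose false-positive rate is near zero and whose false-negative rate is competitive with the best zero-false-positive classifier in the class, while a fully reliable learner drives both error rates toward zero and is judged instead on how often it abstains.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

# Labels are +1 / -1; a prediction of 0 means "abstain".
Label = int
Prediction = int


def reliability_stats(
    classify: Callable[[Sequence[float]], Prediction],
    sample: Iterable[Tuple[Sequence[float], Label]],
) -> Dict[str, float]:
    """Empirical false-positive, false-negative, and abstention rates of a
    (possibly abstaining) classifier on a labeled sample."""
    n = fp = fn = abstain = 0
    for x, y in sample:
        n += 1
        pred = classify(x)
        if pred == 0:
            abstain += 1
        elif pred == +1 and y == -1:
            fp += 1  # predicted positive on a negative example
        elif pred == -1 and y == +1:
            fn += 1  # predicted negative on a positive example
    n = max(n, 1)  # avoid division by zero on an empty sample
    return {
        "false_positive_rate": fp / n,
        "false_negative_rate": fn / n,
        "abstention_rate": abstain / n,
    }


if __name__ == "__main__":
    # Toy hypothetical rule: predict +1 only on strong evidence,
    # abstain in the uncertain middle band, otherwise predict -1.
    def cautious_threshold(x: Sequence[float]) -> Prediction:
        score = sum(x)
        if score > 1.0:
            return +1
        if score < -1.0:
            return -1
        return 0  # abstain

    data = [((2.0, 0.5), +1), ((-3.0, 0.0), -1), ((0.2, 0.1), +1)]
    print(reliability_stats(cautious_threshold, data))
```

Encoding "abstain" as the integer 0 keeps the interface a single integer-valued prediction; any other sentinel value would work just as well.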

[1] Charles Elkan. The Foundations of Cost-Sensitive Learning. IJCAI, 2001.

[2] Peter A. Flach et al. ROC Analysis in Artificial Intelligence: 1st International Workshop (ROCAI-2004), Valencia, Spain, August 22, 2004.

[3] Tadeusz Pietraszek et al. On the use of ROC analysis for the optimization of abstaining classifiers. Machine Learning, 2007.

[4] Peter A. Flach et al. Delegating classifiers. ICML, 2004.

[5] John Langford et al. Cost-sensitive learning by cost-proportionate example weighting. ICDM, 2003.

[6] Adam Tauman Kalai et al. Potential-Based Agnostic Boosting. NIPS, 2009.

[7] Eyal Kushilevitz et al. Learning Decision Trees Using the Fourier Spectrum. SIAM Journal on Computing, 1993.

[8] José Hernández-Orallo et al. Cautious Classifiers. ROCAI, 2004.

[9] Robert D. Nowak et al. A Neyman-Pearson approach to statistical learning. IEEE Transactions on Information Theory, 2005.

[10] Michael J. Kearns, Robert E. Schapire, and Linda Sellie. Toward efficient agnostic learning. COLT, 1992.

[11] Oded Goldreich and Leonid A. Levin. A hard-core predicate for all one-way functions. STOC, 1989.

[12] Pedro M. Domingos. MetaCost: a general method for making classifiers cost-sensitive. KDD, 1999.

[13] Jeffrey C. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. FOCS, 1994.

[14] Jeffrey C. Jackson. An Efficient Membership-Query Algorithm for Learning DNF with Respect to the Uniform Distribution. Journal of Computer and System Sciences, 1997.

[15] Vitaly Feldman et al. Distribution-Specific Agnostic Boosting. ICS, 2009.

[16] J. Neyman and E. S. Pearson. On the Problem of the Most Efficient Tests of Statistical Hypotheses. Philosophical Transactions of the Royal Society of London, Series A, 1933.

[17] Rocco A. Servedio et al. Agnostically Learning Halfspaces. FOCS, 2005.

[18] David Haussler. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications. Information and Computation, 1992.

[19] Adam Tauman Kalai et al. Agnostically learning decision trees. STOC, 2008.

[20] Leslie G. Valiant. A theory of the learnable. STOC, 1984.