Customized Training with an Application to Mass Spectrometric Imaging of Cancer Tissue

We introduce a simple, interpretable strategy for making predictions on test data when the features of the test data are available at the time of model fitting. Our proposal, customized training, clusters the data to find training points close to each test point and then fits an ℓ1-regularized model (lasso) separately in each training cluster. This approach combines the local adaptivity of k-nearest neighbors with the interpretability of the lasso. Although we use the lasso for the model fitting, any supervised learning method can be applied to the customized training sets. We apply the method to a mass-spectrometric imaging data set from an ongoing collaboration in gastric cancer detection, where it demonstrates the power and interpretability of the technique. Our idea is simple but potentially useful in situations where the data have some underlying structure.
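The procedure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm is used, so k-means is swapped in here as an assumption, and the function names (`kmeans`, `fit_lasso`, `customized_training`), the penalty parameter `lam`, and the fallback to the global training mean for clusters without training points are all hypothetical choices made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def nearest(p):
        return min(range(k), key=lambda c: sq_dist(p, centers[c]))

    for _ in range(n_iter):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p) for p in points]


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered features
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in Xc]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # constant feature: coefficient stays at zero
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(v * r for v, r in zip(col, resid)) / n + scale * beta[j]
            new_beta = soft_threshold(rho, lam) / scale
            resid = [r - v * (new_beta - beta[j]) for v, r in zip(col, resid)]
            beta[j] = new_beta
    intercept = y_mean - sum(m * b for m, b in zip(x_mean, beta))
    return intercept, beta


def customized_training(X_train, y_train, X_test, k, lam):
    """Cluster train and test features jointly, then fit one lasso per cluster."""
    labels = kmeans(X_train + X_test, k)
    train_labels = labels[:len(X_train)]
    test_labels = labels[len(X_train):]
    models = {}
    for c in set(test_labels):
        idx = [i for i, lab in enumerate(train_labels) if lab == c]
        if idx:
            models[c] = fit_lasso([X_train[i] for i in idx],
                                  [y_train[i] for i in idx], lam)
    # Assumed fallback: global training mean when a cluster has no training points.
    global_mean = sum(y_train) / len(y_train)
    preds = []
    for x, c in zip(X_test, test_labels):
        if c in models:
            b0, beta = models[c]
            preds.append(b0 + sum(xj * bj for xj, bj in zip(x, beta)))
        else:
            preds.append(global_mean)
    return preds
```

Calling `customized_training(X_train, y_train, X_test, k=2, lam=0.01)` with features given as lists of lists returns one prediction per test point. Because each cluster gets its own sparse coefficient vector, the selected features can differ from cluster to cluster, which is the source of the interpretability the abstract refers to.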
