On Localized Discrepancy for Domain Adaptation

We propose discrepancy-based generalization theories for unsupervised domain adaptation. Previous theories introduced distribution discrepancies defined as the supremum over the complete hypothesis space, which may contain hypotheses that unnecessarily inflate the risk bound. This paper studies localized discrepancies, defined on the hypothesis space after localization. First, we show that these discrepancies have desirable properties: they can be significantly smaller than the previous discrepancies, and their values change when the two domains are exchanged, so they can reveal asymmetric transfer difficulties. Next, we derive improved generalization bounds with these discrepancies and show that the discrepancies can influence the rate of the sample complexity. Finally, we further extend the localized discrepancies to achieve super transfer and derive generalization bounds that can be even more sample-efficient on the source domain.
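
To make the contrast concrete, here is a minimal sketch of the quantities involved, written in standard notation; the symbols below ($\mathcal{H}$, $\mathcal{H}_r$, the loss $\ell$, source distribution $P$, target distribution $Q$) are our own illustration, and the localized definition shown is one plausible formalization consistent with the abstract rather than the paper's exact definition. The classical discrepancy distance takes a supremum over the complete hypothesis space:

\[
\mathrm{disc}_{\mathcal{H}}(P, Q) = \sup_{h, h' \in \mathcal{H}} \left| \mathbb{E}_{x \sim P}\!\left[\ell\big(h(x), h'(x)\big)\right] - \mathbb{E}_{x \sim Q}\!\left[\ell\big(h(x), h'(x)\big)\right] \right|.
\]

A localized version would instead restrict the supremum to hypotheses with small source risk, e.g.

\[
\mathcal{H}_r = \left\{\, h \in \mathcal{H} : \operatorname{err}_P(h) \le r \,\right\}, \qquad
\mathrm{disc}_{\mathcal{H}_r}(P, Q) = \sup_{h, h' \in \mathcal{H}_r} \left| \mathbb{E}_{P}\!\left[\ell(h, h')\right] - \mathbb{E}_{Q}\!\left[\ell(h, h')\right] \right|,
\]

where $\operatorname{err}_P(h)$ denotes the risk of $h$ on the labeled source domain. Two of the properties noted above can be read off directly from this form: since $\mathcal{H}_r \subseteq \mathcal{H}$, the localized quantity never exceeds the classical one and can be much smaller; and because $\mathcal{H}_r$ is defined relative to one fixed domain, exchanging $P$ and $Q$ generally changes its value, which is what allows it to reflect asymmetric transfer difficulties.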
