A Dozen Tricks with Multitask Learning

Multitask learning is an inductive transfer method that improves generalization accuracy on a main task by using the information contained in the training signals of other related tasks. It does this by learning the extra tasks in parallel with the main task while using a shared representation, so that what is learned for each task can help the other tasks be learned better. This chapter describes a dozen opportunities for applying multitask learning to real problems. At the end of the chapter we also make several suggestions for how to get the most out of multitask learning on real-world problems.
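
To make the idea of a shared representation concrete, here is a minimal sketch in plain Python, assuming a backpropagation net with one hidden layer that every task reads from and one linear output unit per task. The function name `train_mtl`, the toy data, and all hyperparameters are illustrative assumptions and are not taken from the chapter. The sketch only shows the mechanics of training a main task and an extra task in parallel; it does not demonstrate that the extra task improves generalization on the main task.

```python
import math
import random


def train_mtl(inputs, targets, n_hidden=4, lr=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer net whose hidden layer is shared by all tasks.

    inputs:  list of feature lists.
    targets: list of per-task target lists (column 0 is the main task,
             the remaining columns are extra tasks).
    Returns (predict_fn, per-epoch mean loss history).
    """
    rng = random.Random(seed)
    n_in, n_out = len(inputs[0]), len(targets[0])
    # Shared input-to-hidden weights (the last entry of each row is a bias).
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    # One linear output unit per task, all reading the same hidden layer.
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]

    def forward(x):
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
        hb = h + [1.0]
        y = [sum(w * v for w, v in zip(row, hb)) for row in w_out]
        return h, y

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = [yk - tk for yk, tk in zip(y, t)]
            total += 0.5 * sum(e * e for e in err)
            # The hidden-layer error signal sums contributions from every
            # task; this is where the extra tasks shape the shared features.
            delta_h = [(1.0 - h[j] ** 2)
                       * sum(err[k] * w_out[k][j] for k in range(n_out))
                       for j in range(n_hidden)]
            hb = h + [1.0]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_out[k][j] -= lr * err[k] * hb[j]
            xb = list(x) + [1.0]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hid[j][i] -= lr * delta_h[j] * xb[i]
        history.append(total / len(inputs))

    return (lambda x: forward(x)[1]), history


# Toy data (an assumption for illustration): main task x0 + x1,
# extra task x0 - x1, on a 5 x 5 grid of inputs.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
inputs = [[a, b] for a in grid for b in grid]
targets = [[a + b, a - b] for a, b in inputs]

predict, history = train_mtl(inputs, targets)
main_pred, extra_pred = predict([0.5, -0.5])
```

Only the first output, `main_pred`, is needed at prediction time; the extra output exists to supply an additional training signal to the shared hidden layer and can be ignored afterwards.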
