Relative Loss Bounds for Temporal-Difference Learning
[1] T. A. A. Broadbent,et al. Survey of Applicable Mathematics , 1970, Mathematical Gazette.
[2] David Haussler,et al. On the Complexity of Iterated Shuffle , 1984, J. Comput. Syst. Sci..
[3] Hagit Attiya,et al. Computing on an anonymous ring , 1988, JACM.
[4] Bernard Widrow,et al. Adaptive Signal Processing , 1985 .
[5] David Haussler,et al. Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension , 1986, STOC '86.
[6] S. Thomas Alexander,et al. Adaptive Signal Processing , 1986, Texts and Monographs in Computer Science.
[7] Manfred K. Warmuth,et al. Membership for Growing Context-Sensitive Grammars is Polynomial , 1986, J. Comput. Syst. Sci..
[8] Leonard Pitt,et al. Reductions among prediction problems: on the difficulty of predicting automata , 1988, [1988] Proceedings. Structure in Complexity Theory Third Annual Conference.
[9] David Haussler,et al. Predicting {0,1}-functions on randomly drawn points , 1988, COLT '88.
[10] David Haussler,et al. Equivalence of models for polynomial learnability , 1988, COLT '88.
[11] William H. Press,et al. Book-Review - Numerical Recipes in Pascal - the Art of Scientific Computing , 1989 .
[12] Manfred K. Warmuth. Towards Representation Independence in PAC Learning , 1989, AII.
[13] Richard J. Anderson,et al. Parallel Approximation Algorithms for Bin Packing , 1988, Inf. Comput..
[14] Manfred K. Warmuth,et al. Learning integer lattices , 1990, COLT '90.
[15] B. Bollobás. Linear analysis : an introductory course , 1990 .
[16] Philip M. Long,et al. Composite geometric concepts and polynomial predictability , 1990, COLT '90.
[17] Naoki Abe,et al. Polynomial learnability of probabilistic concepts with respect to the Kullback-Leibler divergence , 1991, COLT '91.
[18] Dean Phillips Foster. Prediction in the Worst Case , 1991 .
[19] Philip M. Long,et al. On-line learning of linear functions , 1991, STOC '91.
[20] Manfred K. Warmuth,et al. Some weak learning results , 1992, COLT '92.
[21] David Haussler,et al. How to use expert advice , 1993, STOC.
[22] Manfred K. Warmuth,et al. Gap Theorems for Distributed Computation , 1993, SIAM J. Comput..
[23] Leonard Pitt,et al. The minimum consistent DFA problem cannot be approximated within any polynomial , 1993, JACM.
[24] David Haussler,et al. The Probably Approximately Correct (PAC) and Other Learning Models , 1993 .
[25] Manfred K. Warmuth,et al. Using experts for predicting continuous outcomes , 1994, European Conference on Computational Learning Theory.
[26] Philip M. Long,et al. Worst-case quadratic loss bounds for a generalization of the Widrow-Hoff rule , 1993, COLT '93.
[27] Philip M. Long,et al. WORST-CASE QUADRATIC LOSS BOUNDS FOR ON-LINE PREDICTION OF LINEAR FUNCTIONS BY GRADIENT DESCENT , 1993 .
[28] Manfred K. Warmuth,et al. The Distributed Bit Complexity of the Ring: From the Anonymous to the Non-anonymous Case , 1989, Inf. Comput..
[29] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[30] David Haussler,et al. Tight worst-case loss bounds for predicting with expert advice , 1994, EuroCOLT.
[31] Peter Auer,et al. Exponentially many local minima for single neurons , 1995, NIPS.
[32] Manfred K. Warmuth,et al. Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.
[33] Amara Lynn Graps,et al. An introduction to wavelets , 1995 .
[34] Manfred K. Warmuth,et al. On Weak Learning , 1995, J. Comput. Syst. Sci..
[35] Manfred K. Warmuth,et al. The perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant , 1995, COLT '95.
[36] Philip M. Long,et al. Worst-case quadratic loss bounds for prediction using linear functions and gradient descent , 1996, IEEE Trans. Neural Networks.
[37] Yoram Singer,et al. On-Line Portfolio Selection Using Multiplicative Updates , 1998, ICML.
[38] Stephen Kwek,et al. Learning of depth two neural networks with constant fan-in at the hidden nodes (extended abstract) , 1996, COLT '96.
[39] Yoram Singer,et al. Training Algorithms for Hidden Markov Models using Entropy Based Distance Functions , 1996, NIPS.
[40] Vladimir Vovk,et al. Competitive On-line Linear Regression , 1997, NIPS.
[41] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[42] Yoram Singer,et al. Batch and On-Line Parameter Estimation of Gaussian Mixtures Based on the Joint Entropy , 1998, NIPS.
[43] Claudio Gentile,et al. Linear Hinge Loss and Average Margin , 1998, NIPS.
[44] Alexander Gammerman,et al. Ridge Regression Learning Algorithm in Dual Variables , 1998, ICML.
[45] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[46] Mark Herbster,et al. Tracking the best regressor , 1998, COLT' 98.
[47] Manfred K. Warmuth,et al. Efficient Learning With Virtual Threshold Gates , 1995, Inf. Comput..
[48] Manfred K. Warmuth,et al. Predicting nearly as well as the best pruning of a planar decision graph , 2002, Theor. Comput. Sci..
[49] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[50] Manfred K. Warmuth,et al. Averaging Expert Predictions , 1999, EuroCOLT.
[51] Jürgen Forster,et al. On Relative Loss Bounds in Generalized Linear Regression , 1999, FCT.
[52] Manfred K. Warmuth,et al. Boosting as entropy projection , 1999, COLT '99.
[53] Manfred K. Warmuth,et al. Direct and Indirect Algorithms for On-line Learning of Disjunctions , 1999, EuroCOLT.
[54] Gunnar Rätsch,et al. Barrier Boosting , 2000, COLT.
[55] Manfred K. Warmuth,et al. The Last-Step Minimax Algorithm , 2000, ALT.
[56] E. Takimoto,et al. The Minimax Strategy for Gaussian Density Estimation , 2000 .
[57] Manfred K. Warmuth,et al. Tracking a Small Set of Experts by Mixing Past Posteriors , 2003, J. Mach. Learn. Res..
[58] Gunnar Rätsch,et al. Active Learning in the Drug Discovery Process , 2001, NIPS.
[59] Manfred K. Warmuth. Compressing to VC Dimension Many Points , 2003, COLT.
[60] W. Kester. Fast Fourier Transforms , 2003 .
[61] Peter Auer,et al. Tracking the Best Disjunction , 1998, Machine Learning.
[62] Manfred K. Warmuth,et al. Relative Loss Bounds for Multidimensional Regression Problems , 1997, Machine Learning.
[63] Mark Herbster,et al. Tracking the Best Expert , 1995, Machine Learning.
[64] Manfred K. Warmuth,et al. Learning Binary Relations Using Weighted Majority Voting , 1995, Machine Learning.
[65] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[66] Manfred K. Warmuth,et al. Learning nested differences of intersection-closed concept classes , 2004, Machine Learning.
[67] Manfred K. Warmuth,et al. Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions , 1999, Machine Learning.
[68] Manfred K. Warmuth. The Optimal PAC Algorithm , 2004, COLT.
[69] Manfred K. Warmuth,et al. On the Worst-Case Analysis of Temporal-Difference Learning Algorithms , 2005, Machine Learning.
[70] Naoki Abe,et al. On the computational complexity of approximating distributions by probabilistic automata , 1990, Machine Learning.
[71] Yoram Singer,et al. A Comparison of New and Old Algorithms for a Mixture Estimation Problem , 1995, COLT '95.
[72] Nicolò Cesa-Bianchi,et al. On-line Prediction and Conversion Strategies , 1994, Machine Learning.
[73] Manfred K. Warmuth,et al. Optimum Follow the Leader Algorithm , 2005, COLT.
[74] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[75] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[76] Manfred K. Warmuth. Can Entropic Regularization Be Replaced by Squared Euclidean Distance Plus Additional Linear Constraints , 2006, COLT.