Paper information - Optimistic Optimization of Deterministic Functions

We consider the global optimization of a deterministic function f over a semi-metric space, given a finite budget of n evaluations. The function f is assumed to be locally smooth (around one of its global maxima) with respect to a semi-metric ℓ. We describe two algorithms based on optimistic exploration that use a hierarchical partitioning of the space at all scales. The first contribution is an algorithm, DOO, that requires knowledge of ℓ. We report a finite-sample performance bound in terms of a measure of the quantity of near-optimal states. We then define a second algorithm, SOO, which does not require knowledge of the semi-metric ℓ under which f is smooth, and whose performance is almost as good as that of DOO optimally fitted.
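The optimistic exploration described in the abstract can be illustrated with a minimal sketch in the spirit of DOO, under assumptions that are not taken from the abstract: the search space is the interval [0, 1], each cell is split into three equal parts, and the known semi-metric is summarized by a user-supplied function `delta(h)` that bounds how much f can exceed its center value inside a cell at depth h. The names `doo_maximize`, `delta`, and `n_evals` are hypothetical and chosen only for this example.

```python
import heapq


def doo_maximize(f, delta, n_evals, lo=0.0, hi=1.0):
    """Sketch of optimistic optimization on [lo, hi] with ternary splits.

    f       : deterministic function to maximize
    delta   : delta(h) is an assumed upper bound on f(x) - f(center) for any
              x in a depth-h cell (derived from the known semi-metric)
    n_evals : total budget of function evaluations
    """
    center = (lo + hi) / 2.0
    value = f(center)
    evals = 1
    best_x, best_value = center, value

    # heapq is a min-heap, so store negated b-values to pop the largest.
    # The counter breaks ties so tuples never compare beyond position 1.
    counter = 0
    leaves = [(-(value + delta(0)), counter, 0, lo, hi, value)]

    # Each expansion costs two new evaluations (the middle child reuses
    # the parent's center value).
    while evals + 2 <= n_evals:
        _, _, depth, a, b, mid_value = heapq.heappop(leaves)
        third = (b - a) / 3.0
        for k in range(3):
            child_a = a + k * third
            child_b = a + (k + 1) * third
            if k == 1:
                child_value = mid_value  # same center as the parent cell
            else:
                child_center = (child_a + child_b) / 2.0
                child_value = f(child_center)
                evals += 1
                if child_value > best_value:
                    best_x, best_value = child_center, child_value
            counter += 1
            b_value = child_value + delta(depth + 1)  # optimistic bound
            heapq.heappush(
                leaves,
                (-b_value, counter, depth + 1, child_a, child_b, child_value),
            )
    return best_x, best_value


if __name__ == "__main__":
    # Illustrative target: 1-Lipschitz with its maximum at x = 0.3.
    # A depth-h cell has half-width 0.5 * 3**-h, which bounds the gain.
    x, fx = doo_maximize(
        lambda x: 1.0 - abs(x - 0.3),
        lambda h: 0.5 * 3.0 ** (-h),
        n_evals=101,
    )
    print(x, fx)
```

At every step the sketch expands the leaf with the largest optimistic bound (center value plus `delta(depth)`) and returns the best point evaluated so far. It covers only the case where the smoothness bound is known; SOO, which works without that knowledge, is not shown.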

Rémi Munos
