Batch Bayesian Optimization via Local Penalization

The popularity of Bayesian optimization methods for the efficient exploration of parameter spaces has led to a series of papers applying Gaussian processes as surrogates in the optimization of functions. However, most proposed approaches only allow the exploration of the parameter space to occur sequentially. Often, it is desirable to simultaneously propose batches of parameter values to explore. This is particularly the case when large parallel processing facilities are available; these facilities could be computational or physical facets of the process being optimized. For example, in biological experiments many experimental setups allow several samples to be processed simultaneously. Batch methods, however, require modeling of the interaction between the evaluations in the batch, which can be expensive in complex scenarios. We investigate a simple heuristic based on an estimate of the Lipschitz constant that captures the most important aspect of this interaction (i.e. local repulsion) at negligible computational overhead. The resulting algorithm compares well, in running time, with much more elaborate alternatives. The approach assumes that the function of interest, $f$, is Lipschitz continuous. A wrapper loop around the acquisition function is used to collect batches of points of a given size while minimizing the non-parallelizable computational effort. The speed-up of our method with respect to previous approaches is significant in a set of computationally expensive experiments.
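
As a rough illustration of the idea, the Python sketch below (NumPy/SciPy assumed; the names acquisition, posterior and candidates are hypothetical placeholders for a user-supplied acquisition function, GP predictive posterior and candidate grid) greedily builds a batch by multiplying the acquisition function by a local penalizer centred at each point already selected. The penalizer down-weights the region that the Lipschitz bound rules out around a pending evaluation; L denotes the Lipschitz estimate and M the best function value observed so far. This is a minimal sketch of the local-penalization idea, not the reference implementation.

    import numpy as np
    from scipy.special import erfc

    def local_penalizer(x, xj, mu_j, sigma_j, L, M):
        # Probability that x lies outside the exclusion ball of radius
        # (M - f(xj)) / L around the pending point xj, with f(xj) modelled
        # by the GP predictive distribution N(mu_j, sigma_j^2).
        z = (L * np.linalg.norm(x - xj, axis=-1) - M + mu_j) / (np.sqrt(2.0) * sigma_j)
        return 0.5 * erfc(-z)

    def select_batch(acquisition, posterior, candidates, batch_size, L, M):
        # Greedy loop: each new batch point maximizes the acquisition
        # multiplied by the penalizers of the points already collected,
        # so no new GP fit is needed within the batch.
        penalty = np.ones(len(candidates))
        batch = []
        for _ in range(batch_size):
            scores = acquisition(candidates) * penalty
            xj = candidates[np.argmax(scores)]
            batch.append(xj)
            mu_j, sigma_j = posterior(xj)  # GP predictive mean / std at xj
            penalty *= local_penalizer(candidates, xj, mu_j, sigma_j, L, M)
        return np.array(batch)

In practice the product of penalizers would typically be handled on the log scale for numerical stability, and the Lipschitz constant L can be estimated, for instance, from the largest gradient norm of the GP posterior mean.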
