Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies

When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. Common strategies for this task include geometric or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations that are not selected. We prove that finding the configuration that maximizes mutual information is NP-complete. Exploiting the submodularity of mutual information, we then describe a polynomial-time approximation algorithm whose solution is within (1 - 1/e) of the optimum. We also show how submodularity can be used to obtain online bounds and to design branch-and-bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups, and to find placements that are robust against node failures and uncertainties in the model. These extensions again carry rigorous theoretical approximation guarantees, which likewise exploit the submodularity of the objective function. We demonstrate the advantages of optimizing mutual information in an extensive empirical study on two real-world data sets.
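
For concreteness, the sketch below (Python with NumPy) illustrates the greedy selection rule the abstract describes: at each step, add the location y whose marginal mutual-information gain H(y | A) - H(y | V \ (A ∪ {y})) is largest, which for Gaussians reduces to comparing a ratio of conditional variances. This is an illustrative assumption-laden sketch, not the paper's implementation: the function names (greedy_mi_placement, conditional_variance), the toy squared-exponential kernel, and the naive O(n³)-per-evaluation inner loop are all ours, and the lazy-evaluation and local-kernel speedups the paper develops are omitted.

```python
import numpy as np

def conditional_variance(K, y, cond):
    """Posterior variance of location y given observations at indices `cond`,
    for a zero-mean GP with covariance matrix K:
        sigma^2_{y|cond} = K[y,y] - K[y,cond] K[cond,cond]^{-1} K[cond,y]."""
    if len(cond) == 0:
        return K[y, y]
    k_yc = K[y, cond]                     # cross-covariances with the conditioning set
    K_cc = K[np.ix_(cond, cond)]          # covariance of the conditioning set
    return K[y, y] - k_yc @ np.linalg.solve(K_cc, k_yc)

def greedy_mi_placement(K, k):
    """Greedily pick k of the n candidate locations to (approximately)
    maximize the mutual information between selected and unselected
    locations. The MI gain of adding y to the current set A is half the
    log-ratio of sigma^2_{y|A} to sigma^2_{y|V\(A∪{y})}, so it suffices
    to compare the variance ratios directly."""
    n = K.shape[0]
    A = []
    for _ in range(k):
        best_y, best_ratio = None, -np.inf
        for y in range(n):
            if y in A:
                continue
            rest = [v for v in range(n) if v != y and v not in A]
            ratio = conditional_variance(K, y, A) / conditional_variance(K, y, rest)
            if ratio > best_ratio:
                best_y, best_ratio = y, ratio
        A.append(best_y)
    return A

# Toy usage: 30 random points in the unit square, squared-exponential kernel.
rng = np.random.default_rng(0)
X = rng.random((30, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / (2 * 0.25 ** 2)) + 1e-6 * np.eye(30)  # jitter keeps K well-conditioned
print(greedy_mi_placement(K, 5))
```

By submodularity of the mutual-information objective, this greedy procedure is the one that attains the (1 - 1/e) approximation guarantee mentioned above; the paper's lazy-evaluation variant avoids recomputing most of the gains at each iteration.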
