Gauss meets Canadian traveler: shortest-path problems with correlated natural dynamics

In a variety of real-world problems, from robot navigation to logistics, agents face the challenge of path optimization on a graph with unknown edge costs. These settings can generally be formalized as Canadian Traveler Problems (CTPs) [13]. Although in many applications the edge costs have dependencies resulting from world dynamics, CTPs with such structure have received considerably less attention than those with independent edge costs, largely because the dependence structure is often problem-specific and difficult to state compactly. Yet in a wide variety of navigation tasks, spatial correlations between edge traversal costs are governed by natural phenomena such as winds, congestion, or ocean currents, which are conveniently described by a well-understood machine learning model, the Gaussian Process (GP). In this article, we propose a synthesis of CTPs and GPs, the Gaussian Traveler Problem (GTP). In a GTP, an agent observes the cost of each graph edge as it traverses that edge, and uses the observed costs to adjust its belief over the other edges via Gaussian Process updates. Examples of GTP instances include aircraft, traffic, and vessel navigation, to name just a few. Computing optimal agent behavior for a GTP turns out to be equivalent to solving a Partially Observable Markov Decision Process (POMDP) with a continuous observation space. We present an approximate algorithm for solving GTPs with efficient machine-learning and decision-making components, whose design is influenced by the challenges of real-world problems. Although computing an optimal policy is intractable, our experiments in an aircraft navigation scenario with real wind data demonstrate that our framework can significantly improve upon state-of-the-art techniques for planning airplane routes.
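The belief update described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a squared-exponential kernel over hypothetical edge midpoints, made-up prior means, and a simple "replan on posterior mean costs" rule using Dijkstra's algorithm in place of the approximate POMDP solver. The names `EdgeCostBelief`, `rbf`, and `shortest_path` are introduced here for illustration only.

```python
import heapq
import math


def rbf(p, q, variance=4.0, length=1.0):
    """Squared-exponential kernel between two 2-D points (assumed kernel)."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return variance * math.exp(-0.5 * d2 / length ** 2)


class EdgeCostBelief:
    """Joint Gaussian belief over edge costs, updated one observation at a time."""

    def __init__(self, midpoints, prior_mean, noise=1e-6):
        self.edges = list(midpoints)
        self.index = {e: i for i, e in enumerate(self.edges)}
        self.mean = [prior_mean[e] for e in self.edges]
        self.cov = [[rbf(midpoints[a], midpoints[b]) for b in self.edges]
                    for a in self.edges]
        self.noise = noise

    def observe(self, edge, cost):
        """Condition the belief on one observed edge cost (rank-one Gaussian update)."""
        o = self.index[edge]
        col = [row[o] for row in self.cov]
        denom = col[o] + self.noise
        resid = cost - self.mean[o]
        n = len(self.edges)
        self.mean = [m + c * resid / denom for m, c in zip(self.mean, col)]
        self.cov = [[self.cov[i][j] - col[i] * col[j] / denom for j in range(n)]
                    for i in range(n)]


def shortest_path(belief, source, target):
    """Dijkstra on posterior mean costs; assumes target is reachable from source."""
    adj = {}
    for (u, v), m in zip(belief.edges, belief.mean):
        w = max(m, 1e-9)  # clamp: Dijkstra needs non-negative weights
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical instance: a northern route (via n) near the first edge,
# and a southern route (via u) far from it.
midpoints = {("s", "m"): (0.0, 0.0), ("m", "n"): (1.0, 0.5), ("n", "t"): (2.0, 0.5),
             ("m", "u"): (1.0, -3.0), ("u", "t"): (2.0, -3.0)}
prior = {("s", "m"): 5.0, ("m", "n"): 5.0, ("n", "t"): 5.0,
         ("m", "u"): 5.5, ("u", "t"): 5.5}

belief = EdgeCostBelief(midpoints, prior)
plan_before = shortest_path(belief, "s", "t")   # north is cheaper a priori
belief.observe(("s", "m"), 9.0)                 # traversing s-m reveals a high cost
plan_after = shortest_path(belief, "m", "t")    # replan from the new position
```

With these made-up numbers, the high cost observed on the first edge raises the expected cost of the nearby northern edges much more than that of the distant southern ones, so the replanned route switches to the south. Replanning greedily on mean costs ignores the value of future observations, which is the part the POMDP formulation is meant to capture.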

[1] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[2] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.

[3] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.

[4] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.

[5] Blai Bonet, et al. Solving POMDPs: RTDP-Bel vs. Point-based Algorithms, 2009, IJCAI.

[6] Tad McGeer, et al. Passive Dynamic Walking, 1990, Int. J. Robotics Res.

[7] Solomon Eyal Shimony, et al. Canadian traveler problem with remote sensing, 2009, IJCAI.

[8] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.

[9] Pasquale Pace, et al. Low multipath antennas for GNSS-based attitude determination systems applied to high-altitude platforms, 2008.

[10] Trevor Darrell, et al. Gaussian Processes for Object Categorization, 2010, International Journal of Computer Vision.

[11] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.

[12] Alan Olsen. Pond-Hindsight: Applying Hindsight Optimization to Partially-Observable Markov Decision Processes, 2011.

[13] Mihalis Yannakakis, et al. Shortest Paths Without a Map, 1989, Theor. Comput. Sci.

[14] Ying Sun, et al. Gaussian Processes for Short-Term Traffic Volume Forecasting, 2010.

[15] Geoffrey A. Hollinger, et al. Towards Improved Prediction of Ocean Processes Using Statistical Machine Learning, 2012, RSS.

[16] John N. Tsitsiklis, et al. Stochastic shortest path problems with recourse, 1996, Networks.

[17] Xiaoqian Jiang, et al. Adaptive Gaussian Process for Short-Term Wind Speed Forecasting, 2010, ECAI.

[18] David Hsu, et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces, 2008, Robotics: Science and Systems.

[19] J. Tsitsiklis, et al. Stochastic shortest path problems with recourse, 1996.