Learning to act in uncertain environments

The problem of decision making in an uncertain environment arises in many diverse contexts: deciding whether to keep a hard drive spinning in a net-book; choosing which advertisement to post to a Web site visitor; choosing how many newspapers to order so as to maximize profits; or choosing a route to recommend to a driver given limited and possibly out-of-date information about traffic conditions. All are sequential decision problems, since earlier decisions affect subsequent performance; all require adaptive approaches, since they involve significant uncertainty. The key issue in effectively solving problems like these is known as the exploration/exploitation trade-off: If I am at a cross-roads, when should I go in the most advantageous direction among those that I have already explored, and when should I strike out in a new direction, in the hopes I will discover something better?