TRAINING ACTION SELECTION NEURAL NETWORKS USING LOOK-AHEAD SEARCH