General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States