Reinforcement Learning From State and Temporal Differences