Adaptive importance sampling with automatic model selection in reward weighted regression (Neurocomputing)