On the momentum term in gradient descent learning algorithms