How can the human visual system process a natural scene in under 150 ms? Experiments and neural network models