SGD and Weight Decay Provably Induce a Low-Rank Bias in Neural Networks