On Computing the KL Divergence for Bayesian Neural Networks