A little while ago, you might have read about batch normalization being the coolest thing since ReLUs. Things have since moved on, but it’s still worth covering because it has been adopted in most networks today. The original paper is pretty dense, and it’s packed with good ideas. So this blog post is devoted to explaining the more confusing parts of batch normalization. What follows are a few concepts that you may find interesting, or may not have fully understood, when reading Ioffe and Szegedy’s paper. We hope it’s helpful.

Via Eric Feuilleaubois