ML Paper Challenge Day 23 — Layer Normalisation
Day 23: 2020.05.04
Paper: Layer Normalisation
Category: Model/Deep Learning/Technique (Layer Normalisation)
Layer Normalisation
Background
- Batch normalisation requires running averages of the summed-input statistics.
- However, the summed inputs to the recurrent neurons in a recurrent neural network (RNN) often vary with the length of the sequence, so applying batch normalisation to RNNs appears to require different statistics for different time-steps.
-> Not really feasible to apply to recurrent neural networks.
-> The effect of batch normalisation also depends on the mini-batch size, so it cannot be applied to online learning tasks or to extremely large distributed models where the mini-batches have to be small (illustrated in the sketch after this list).
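A minimal NumPy sketch (illustrative, not from the paper) of this dependence: batch normalisation takes its mean and variance over the batch axis, so the normalised output of a given example changes when the mini-batch around it changes.

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Statistics are computed over the batch axis (axis 0):
    # one mean/variance per feature, shared by every example in the batch.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

features = rng.normal(size=(32, 4))      # batch of 32 examples, 4 features each
full = batch_norm(features)[0]           # first example, normalised within a batch of 32
tiny = batch_norm(features[:2])[0]       # same example, normalised within a batch of 2
print(full)  # differs from `tiny`: the output depends on the rest of the mini-batch
print(tiny)
```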
How
- transpose batch normalisation into layer normalisation by directly computing…
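For reference, a minimal sketch of the layer-normalisation statistics described in the paper: the mean and variance are computed over the hidden units of a single example, so the result is independent of the batch size and of the other examples. The gain and bias are assumed here to be simple learnable per-unit vectors.

```python
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # Statistics are computed per example over the feature axis (axis -1):
    # each example is normalised using only its own summed inputs.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias

hidden = np.random.default_rng(1).normal(size=(3, 8))  # 3 examples, 8 hidden units
gain, bias = np.ones(8), np.zeros(8)                   # learnable per-unit parameters

out = layer_norm(hidden, gain, bias)
# Example 0 is normalised identically whether it is processed alone or in a batch,
# which is what makes the technique usable for RNNs and online learning.
print(np.allclose(out[0], layer_norm(hidden[:1], gain, bias)[0]))  # True
```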