ML Paper Challenge Day 18 — Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
Apr 30, 2020
Day 18: 2020.04.29
Paper: Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
Category: Model/Deep Learning/Speech Recognition
Model Architecture
Input: log-spectrograms of power-normalised audio clips, calculated on 20 ms windows
Output: at each time step, a distribution over the alphabet of each language (plus a blank symbol for CTC)
Inference: CTC models paired with a language model trained on a bigger corpus of text
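To make the input and output ends of this pipeline concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation. The function names, the 16 kHz sample rate, the 10 ms stride, the Hann window, and unit-RMS scaling as the form of power normalisation are all assumptions chosen for illustration. The decoder is a plain greedy (best-path) CTC decode and leaves out the language model and beam search that the paper uses at inference.

```python
import numpy as np

# Hypothetical English alphabet: index 0 is the CTC blank, then a-z, space, apostrophe.
ALPHABET = ["_"] + list("abcdefghijklmnopqrstuvwxyz '")


def log_spectrogram(signal, sample_rate=16000, window_ms=20, stride_ms=10, eps=1e-10):
    """Return a (frames, frequency bins) log power spectrogram.

    Assumptions: 16 kHz audio, 10 ms stride, Hann window, unit-RMS normalisation.
    Only the 20 ms window length comes from the notes above.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Power normalisation, assumed here to mean scaling the clip to unit RMS.
    rms = np.sqrt(np.mean(signal ** 2))
    signal = signal / (rms + eps)

    win = int(sample_rate * window_ms / 1000)  # samples per window
    hop = int(sample_rate * stride_ms / 1000)  # samples between window starts
    if len(signal) < win:
        raise ValueError("clip is shorter than one window")

    n_frames = 1 + (len(signal) - win) // hop
    window = np.hanning(win)
    frames = np.stack(
        [signal[i * hop: i * hop + win] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps)


def ctc_greedy_decode(probs, alphabet=ALPHABET, blank=0):
    """Best-path CTC decode: argmax per frame, merge repeats, drop blanks."""
    best = np.argmax(probs, axis=1)
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)
```

As a usage example, a one-second clip at the assumed 16 kHz gives 320-sample windows, so `log_spectrogram(clip)` returns one row per window and 161 frequency bins per row. `ctc_greedy_decode` expects a (time steps, alphabet size) array of per-frame character probabilities, such as the softmax output of the network.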
Batch Normalisation for Deep RNNs
Objective: To keep networks trainable with gradient descent as their size and depth increase