ML Paper Challenge Day 19 — Achieving Human Parity in Conversational Speech Recognition | by Chun-kit Ho | Medium

Chun-kit Ho
Apr 30, 2020
Papers with Code - Achieving Human Parity in Conversational Speech Recognition
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard…
paperswithcode.com
Day 19: 2020.04.30
Paper: Achieving Human Parity in Conversational Speech Recognition
Category: Model/Deep Learning/Speech Recognition
Model Architecture
Achieved by combining multiple models!
3 CNNs
VGG architecture
uses small (3x3) filters, is deeper, and applies up to five convolutional layers before pooling
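To see why stacking small filters before pooling is attractive, here is a rough sketch of the arithmetic. It assumes stride-1 convolutions with no dilation and a constant channel count, which the summary above does not spell out, and the function names are made up for illustration.

```python
def stacked_conv_receptive_field(num_layers, kernel=3):
    """Receptive field along one axis of stride-1 convolutions stacked without pooling."""
    # Each extra layer widens the field by (kernel - 1) input positions.
    return 1 + num_layers * (kernel - 1)


def weights_per_channel_pair(num_layers, kernel=3):
    """Filter weights per input/output channel pair, ignoring biases."""
    return num_layers * kernel * kernel


for n in range(1, 6):
    field = stacked_conv_receptive_field(n)
    stacked = weights_per_channel_pair(n)
    single = field * field  # one large filter covering the same area
    print(f"{n} x (3x3): sees {field}x{field}, {stacked} weights vs {single} for a single filter")
```

Under these assumptions, five 3x3 layers cover the same 11x11 area as one 11x11 filter with far fewer weights, and they apply a nonlinearity after every layer.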
ResNet architecture
adds highway connections, i.e. a linear transform of each layer’s input to the layer’s output
The only difference noted in the paper is that Batch Normalisation is applied before computing ReLU activations.
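A minimal sketch of that idea, assuming a fully connected layer instead of a convolution and a batch normalisation without learned scale or shift (both are simplifications of mine, not the paper's setup). `residual_layer`, `W` and `W_skip` are hypothetical names.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def batch_norm(batch, eps=1e-5):
    """Normalise every feature to zero mean and unit variance across the batch."""
    n, dim = len(batch), len(batch[0])
    means = [sum(x[j] for x in batch) / n for j in range(dim)]
    variances = [sum((x[j] - means[j]) ** 2 for x in batch) / n for j in range(dim)]
    return [[(x[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dim)]
            for x in batch]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_layer(batch, W, W_skip):
    """Main branch: linear -> batch norm -> ReLU. Skip branch: linear transform of the input."""
    pre_activation = [matvec(W, x) for x in batch]
    activated = [relu(x) for x in batch_norm(pre_activation)]  # BN comes before ReLU
    skip = [matvec(W_skip, x) for x in batch]
    # The layer output is the sum of both branches.
    return [[a + s for a, s in zip(a_row, s_row)] for a_row, s_row in zip(activated, skip)]


batch = [[1.0, -2.0], [3.0, 0.5]]
W = [[0.5, -1.0], [1.5, 0.25]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(residual_layer(batch, W, identity))
```

With an identity `W_skip` this reduces to the familiar `F(x) + x` residual form; a general matrix gives the "linear transform of the input" variant described above.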
LACE (layer-wise context expansion with attention) model
a TDNN variant in which each higher layer is a weighted sum of nonlinear transformations of a window of lower layer frames 
-> each higher layer exploits broader context than lower…
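The context-expansion part can be sketched in a few lines. This is only an illustration: the window offsets, the identity transforms, the mixing weights and the zero padding at utterance edges are all assumptions of mine, and the attention part of LACE is left out entirely.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def context_layer(frames, transforms, mix_weights, offsets=(-1, 0, 1)):
    """output[t] = sum over k of mix_weights[k] * relu(transforms[k] applied to frames[t + offsets[k]])."""
    dim_out = len(transforms[0])
    out = []
    for t in range(len(frames)):
        acc = [0.0] * dim_out
        for W, a, k in zip(transforms, mix_weights, offsets):
            if 0 <= t + k < len(frames):  # frames outside the utterance contribute nothing
                h = relu(matvec(W, frames[t + k]))
                acc = [s + a * v for s, v in zip(acc, h)]
        out.append(acc)
    return out


def active_frames(frames):
    """Indices of frames that contain any non-zero value."""
    return [t for t, f in enumerate(frames) if any(v != 0.0 for v in f)]


identity = [[1.0, 0.0], [0.0, 1.0]]
transforms = [identity, identity, identity]
mix = [0.25, 0.5, 0.25]

frames = [[0.0, 0.0] for _ in range(7)]
frames[3] = [1.0, 1.0]  # a single non-zero input frame

layer1 = context_layer(frames, transforms, mix)
layer2 = context_layer(layer1, transforms, mix)
print(active_frames(layer1), active_frames(layer2))
```

The single input frame reaches three frames after one layer and five after two, which is the same as saying each frame in the second layer draws on a wider span of input than a frame in the first.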