
Day 29–30: 2020.05.10–11
Paper: Adam: A Method for Stochastic Optimization
Category: Model/Optimization

Adam

  • Straightforward to implement
  • Computationally efficient
  • Low memory requirements
  • Invariant to diagonal rescaling of the gradients
  • Well suited for problems that are large in terms of data and/or parameters
  • Appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients
  • Hyper-parameters have intuitive interpretations and typically require little tuning
  • Works well in practice and compares favorably to other stochastic optimization methods
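
For reference, below is a minimal sketch of a single Adam update step, following Algorithm 1 of the paper (with the paper's default hyper-parameters α=0.001, β1=0.9, β2=0.999, ε=1e-8). NumPy and the function interface here are my own choices for illustration, not code from the paper.

```python
import numpy as np

def adam_update(theta, grad, m, v, t,
                alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (sketch of Algorithm 1 in the paper).

    theta: current parameters; grad: gradient of the objective at theta;
    m, v: running first/second moment estimates; t: step count (1-based).
    """
    m = beta1 * m + (1 - beta1) * grad             # biased first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2        # biased second moment estimate
    m_hat = m / (1 - beta1 ** t)                   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                   # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Because the step size is scaled element-wise by the second-moment estimate, multiplying any coordinate of the gradient by a constant leaves the update essentially unchanged, which is what the "invariant to diagonal rescaling" property refers to.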

AdaMax

  • A variant of Adam based on the infinity norm
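
AdaMax replaces Adam's exponentially weighted L2 norm of past gradients with an exponentially weighted infinity norm, which removes the need for bias correction of the second moment and for the ε term. Below is a minimal sketch following Algorithm 2 of the paper (default α=0.002, β1=0.9, β2=0.999); as above, NumPy and the function signature are assumptions for illustration only.

```python
import numpy as np

def adamax_update(theta, grad, m, u, t,
                  alpha=0.002, beta1=0.9, beta2=0.999):
    """One AdaMax step (sketch of Algorithm 2 in the paper)."""
    m = beta1 * m + (1 - beta1) * grad            # first moment, as in Adam
    u = np.maximum(beta2 * u, np.abs(grad))       # infinity-norm-weighted second moment
    theta = theta - (alpha / (1 - beta1 ** t)) * m / u
    return theta, m, u
```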

Written by Chun-kit Ho

Cloud Architect @ EY | Full-stack Software Engineer | Social Innovation | Certified Professional Solutions Architect in AWS & GCP
