Outlines
- 2023-11-12: MLPs at the PFLOP scale: Scaling MLPs: A Tale of Inductive Bias
- 2023-10-22: BatchNorm and Loss Landscape: How Does Batch Normalization Help Optimization?
- 2023-10-08: The Layer Normalization paper: Layer Normalization
- 2023-09-17: Sharp minima considered not so harmful: Sharp Minima Can Generalize For Deep Nets
- 2023-09-10: The AMSGRAD paper: On the Convergence of Adam and Beyond
- 2023-08-19: The AdamW paper: Decoupled Weight Decay Regularization
- 2023-07-30: Mini-batch Bayesian learning: Bayesian Learning via Stochastic Gradient Langevin Dynamics
- 2023-07-23: Batch Size vs. Learning Rate: Don’t Decay the Learning Rate, Increase the Batch Size
- 2023-06-25: Loss landscape paper: Visualizing the Loss Landscape of Neural Nets
- 2023-04-29: Suhail’s outline (Discriminative): A Case Against the Goto Statement
- 2023-04-29: Chris’ outline (Generative): A Case Against the Goto Statement