areas: deep-learning
- Cramming: Training a Language Model on a Single GPU in One Day — 2022, to-read
- Denoising Diffusion Probabilistic Models — 2020, replicated
- Group Normalization — 2018, deep-read
- Attention Is All You Need — 2017, deep-read
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks — 2016, deep-read
- Layer Normalization — 2016, deep-read
- Unsupervised Domain Adaptation by Backpropagation — 2015, deep-read
- Neural Machine Translation by Jointly Learning to Align and Translate — 2015, deep-read
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift — 2015, deep-read
- Adam: A Method for Stochastic Optimization — 2015, deep-read