status: deep-read
- LLaMA: Open and Efficient Foundation Language Models — 2023, deep-read
- Training language models to follow instructions with human feedback — 2022, deep-read
- LoRA: Low-Rank Adaptation of Large Language Models — 2022, deep-read
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models — 2022, deep-read
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows — 2021, deep-read
- How to Train State-Of-The-Art Models Using TorchVision’s Latest Primitives — 2021, deep-read
- Language Models are Few-Shot Learners — 2020, deep-read
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer — 2020, deep-read
- Language Models are Unsupervised Multitask Learners — 2019, deep-read
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding — 2019, deep-read
- Improving Language Understanding by Generative Pre-Training — 2018, deep-read
- Group Normalization — 2018, deep-read
- Deep contextualized word representations — 2018, deep-read
- Density estimation using Real NVP — 2017, deep-read
- Attention Is All You Need — 2017, deep-read
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks — 2016, deep-read
- Layer Normalization — 2016, deep-read
- Unsupervised Domain Adaptation by Backpropagation — 2015, deep-read
- Neural Machine Translation by Jointly Learning to Align and Translate — 2015, deep-read
- Deep Residual Learning for Image Recognition — 2015, deep-read
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift — 2015, deep-read
- Adam: A Method for Stochastic Optimization — 2015, deep-read
- Sequence to Sequence Learning with Neural Networks — 2014, deep-read
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting — 2014, deep-read
- Efficient Estimation of Word Representations in Vector Space — 2013, deep-read