- Improving Language Understanding by Generative Pre-Training [code] [blog]
- Language Models are Unsupervised Multitask Learners [code] [blog]
- Multi-Task Deep Neural Networks for Natural Language Understanding
- Language GANs Falling Short
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [tutorial1][tutorial2][tutorial3][code]
- Get To The Point: Summarization with Pointer-Generator Networks
- Semi-supervised Sequence Learning
- Pointer Networks
- QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
- Universal Transformers
- Dual Learning for Machine Translation
- A Survey of the Usages of Deep Learning in Natural Language Processing
- Language Generation with Recurrent Generative Adversarial Networks without Pre-training
- Convolutional Sequence to Sequence Learning
- Transformer-XL: Language Modeling with Longer-Term Dependency
- A Primer on Neural Network Models for Natural Language Processing
- Text Understanding from Scratch
- ELMo: Deep contextualized word representations [allennlp]
- Adaptive Input Representations for Neural Language Modeling
- Toward Controlled Generation of Text
- Unsupervised Natural Language Generation with Denoising Autoencoders
- Adversarial Generation of Natural Language
- Deep Reinforcement Learning for Dialogue Generation
- A Unified Architecture for Natural Language Processing [classic]
- Reasoning about Entailment with Neural Attention
- Learning End-to-End Goal-Oriented Dialog
- Neural Approaches to Conversational AI
- Machine Learning for Dialog State Tracking: A Review
- Neural Machine Translation of Rare Words with Subword Units
- Improving Neural Machine Translation Models with Monolingual Data
- On Tree-Based Neural Sentence Modeling
- Word Translation Without Parallel Data
- Language as a latent variable: Discrete generative models for sentence compression
- Deep Learning in Neural Networks: An Overview
- Paraphrase Generation with Deep Reinforcement Learning
- Spectral Normalization for Generative Adversarial Networks
- MaskGAN: Better Text Generation via Filling in the ______
- A Note on the Inception Score
- Self-Attention Generative Adversarial Networks [SAGAN]
- SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient [SeqGAN]
- Wasserstein GAN [WGAN]
- The relativistic discriminator: a key element missing from standard GAN [blog1][blog2][code][Relativistic GANs]
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [InfoGAN]
- Conditional Generative Adversarial Nets [CGAN]
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (CycleGAN)
- Autoencoding beyond pixels using a learned similarity metric [VAEGAN]
- Auto-Encoding Variational Bayes (VAE)
- BEGAN: Boundary Equilibrium Generative Adversarial Networks [BEGAN]
- Unsupervised Image-to-Image Translation Networks (UNIT) [code]
- Multimodal Unsupervised Image-to-Image Translation (MUNIT) [code]
- Coupled Generative Adversarial Networks [code][CoGAN]
- Diverse Image-to-Image Translation via Disentangled Representations (MUNIT's Bro)
- Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data [MUNIT's Bro]
- XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings (MUNIT's Bro)
- Toward Multimodal Image-to-Image Translation [code] [BicycleGAN]
- Recycle-GAN: Unsupervised Video Retargeting
- Adversarial Autoencoders
- Wasserstein Auto-Encoders
- Large Scale GAN Training for High Fidelity Natural Image Synthesis
- Local Image-to-Image Translation via Pixel-wise Highway Adaptive Instance Normalization
- Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
- Bayesian GAN
- Progressive Growing of GANs for Improved Quality, Stability, and Variation [code]
- Adversarially Learned Inference
- Improved Variational Inference with Inverse Autoregressive Flow
- CyCADA: Cycle-Consistent Adversarial Domain Adaptation
- Disentangling by Factorising
- Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
- Generating a Fusion Image: One’s Identity and Another’s Shape
- Glow: Generative Flow with Invertible 1×1 Convolutions [blog] [code]
- Are GANs Created Equal? A Large-Scale Study
- Adversarially Regularized Autoencoders
- Generalized Denoising Auto-Encoders as Generative Models
- Adversarial examples for generative models
- GAN Q-learning [code]
- AdaGAN: Boosting Generative Models
- Unrolled Generative Adversarial Networks [code]
- Do GANs actually learn the distribution? An empirical study
- StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [code]
- CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms [code]
- Variational Inference: A Review for Statisticians
- Variational Inference with Normalizing Flows
- Generative Adversarial Imitation Learning
- CariGANs: Unpaired Photo-to-Caricature Translation [website]
- A Style-Based Generator Architecture for Generative Adversarial Networks
- CartoonGAN: Generative Adversarial Networks for Photo Cartoonization
- Towards Principled Methods for Training Generative Adversarial Networks
- Improved Training of Wasserstein GANs
- Improved Techniques for Training GANs
- Differentiable Learning-to-Normalize via Switchable Normalization
- Layer Normalization
- Instance Normalization: The Missing Ingredient for Fast Stylization
- Group Normalization
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
- On the Convergence of Adam and Beyond
- An overview of gradient descent optimization algorithms
- Taskonomy: Disentangling Task Transfer
- Neural Ordinary Differential Equations
- Squeeze-and-Excitation Networks
- Pixel Recurrent Neural Networks
- WaveNet: A Generative Model for Raw Audio [Blog]
- Parallel WaveNet: Fast High-Fidelity Speech Synthesis
- Tacotron: Towards End-to-End Speech Synthesis
- Tacotron series
- Deep Voice: Real-time Neural Text-to-Speech
- Deep Voice 2: Multi-Speaker Neural Text-to-Speech
- Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
- Neural Voice Cloning with a Few Samples
- Dynamic Routing Between Capsules
- Deep Learning Techniques for Music Generation – A Survey
- Understanding deep learning requires rethinking generalization
- Neural Turing Machines [Blog1][Blog2]
- Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation
- Opening the black box of deep learning
- Opening the Black Box of Deep Neural Networks via Information
- Everybody Dance Now [Blog]
- Mask R-CNN [blog]
- Real-time Object Detection with YOLO, YOLOv2 and now YOLOv3
- Object Recognition: [part1][part2][part3]
- Reinforcement Learning:
- Generative Models:
- Glow: Better Reversible Generative Models
- Reinforcement Learning with Prediction-Based Rewards
- A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [article]
- Best Papers ACL 2018 / Best Paper Honourable Mentions
- NLP is fun
- Gumbel softmax
- DeepStack, AlphaZero, TRPO
- Generalized Language Models
- The Artificial Intelligence Wiki
- A Knowledge Base for the FB Group Artificial Intelligence and Deep Learning
- SOTA Links
- Papers with Code