Know a great resource on Transformers? Send it to us!

Tags: Neural Networks (MLPs), Layer Normalization, Residual Connections


Attention Is All You Need

  • Original Paper introducing transformers


Transformers and Self-Attention

  • Ashish Vaswani and Anna Huang, Stanford University, Winter 2019

Transformers Lecture

  • Videos on self-attention, the model itself, and a few famous Transformer models

Attention and Transformer Networks

  • By Pascal Poupart, Professor at the University of Waterloo

The Transformer for Language Understanding

  • A code-based lecture by Rachel Thomas of fast.ai


Transformer Neural Networks

  • Explains transformers and compares them to RNNs and LSTMs

Attention Is All You Need

  • Walks through and explains the original paper

Generative Adversarial Networks (Paper Explained)

  • A walkthrough of the original GAN paper

An Illustrated Guide to Transformers

  • Visual-based walkthrough of the Transformer Model


The Illustrated Transformer

  • A blog post explaining Transformers with visuals

Attention? Attention!

  • A detailed walkthrough of attention mechanisms before explaining Transformers
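
The core mechanism these walkthroughs cover, scaled dot-product attention, fits in a few lines. Below is a minimal NumPy sketch; the function name and the toy shapes are my own, not taken from any of the linked resources:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n_queries, n_keys)
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                # weighted sum of values

# Tiny example: 3 queries/keys/values, each of dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Each output row is a convex combination of the value rows, with weights determined by query-key similarity; the 1/sqrt(d_k) scaling keeps the softmax from saturating as dimensionality grows.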

The Annotated Transformer

  • A blog post explaining Transformers step by step with PyTorch code

Transformers from Scratch

  • An explanation of modern transformers without some of the historical baggage

What Are Transformer Models?

  • Explaining Transformers in Q&A format

The Transformer Family

  • A detailed walkthrough of different transformers proposed after the original

Code Examples

Tensorflow Transformer Implementation Example

  • TensorFlow tutorial on a Transformer model for translating Portuguese text to English

Text Classification with Transformer

  • A Keras tutorial implementing a transformer block

Sequence-to-Sequence Modeling with Transformers

  • A Transformer model tutorial in PyTorch



PyTorch Transformer API

  • PyTorch API for a Transformer model
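
As a quick orientation before reading the docs, here is a minimal sketch of `torch.nn.Transformer`; the hyperparameters and tensor sizes are illustrative choices of mine, not values from the documentation:

```python
import torch
import torch.nn as nn

# Small illustrative model (hyperparameters chosen arbitrarily)
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       dim_feedforward=64)

# By default nn.Transformer expects (sequence length, batch, d_model)
src = torch.rand(10, 2, 32)  # encoder input: source sequence
tgt = torch.rand(7, 2, 32)   # decoder input: target sequence
out = model(src, tgt)        # decoder output, same shape as tgt
print(out.shape)  # torch.Size([7, 2, 32])
```

Note that `nn.Transformer` provides only the encoder-decoder stack; embeddings, positional encoding, and the output projection must be added around it, as the tutorials above demonstrate.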


Hugging Face Transformers

  • An API for state-of-the-art natural language processing tasks in PyTorch and TensorFlow

  • Paper describing the API

  • GitHub repository

Happy Transformer

  • An API built on top of Hugging Face for state-of-the-art NLP models