Sparse Transformers

Know a great resource on Sparse Transformers? Send it to us at contactbackprop@gmail.com!

Tags: Transformers

Papers

Generating Long Sequences with Sparse Transformers

  • Paper introducing Sparse Transformers

Posts

Generative Modeling with Sparse Transformers

  • Blog post by OpenAI about Sparse Transformers


Sparse Attention Matrix Factorization (Sparse Transformers)

  • A quick look at Sparse Transformers by Lilian Weng

Code Examples

Sparse Attention

  • Sparse attention primitives released by OpenAI alongside their paper


Sparse Transformer

  • An implementation of Sparse Transformers


Distribution Augmentation for Generative Modeling

  • Sparse Transformer code that achieved state-of-the-art performance on CIFAR-10

  • Paper corresponding to this repo

APIs

Blocksparse Package

  • A package containing TensorFlow ops and corresponding GPU kernels for block-sparse matrix multiplication, used to implement Sparse Transformers
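
To make "block-sparse matrix multiplication" concrete, here is a minimal NumPy sketch of the idea. It is not the Blocksparse package's API and does not use its GPU kernels. The function name `block_sparse_matmul`, the dictionary-of-blocks storage, and the `layout` argument are all assumptions chosen for illustration. The weight matrix is kept only as its nonzero square blocks, and the product skips every block the layout marks as zero.

```python
import numpy as np


def block_sparse_matmul(x, blocks, layout, block_size):
    """Compute x @ W where W is stored only as its nonzero blocks.

    x:          (batch, n_in) dense input
    blocks:     dict mapping (row_block, col_block) to a
                (block_size, block_size) array (hypothetical storage format)
    layout:     (n_in // block_size, n_out // block_size) 0/1 array
                marking which blocks of W are nonzero
    block_size: side length of each square block
    """
    n_col_blocks = layout.shape[1]
    out = np.zeros((x.shape[0], n_col_blocks * block_size), dtype=x.dtype)
    # Visit only the blocks marked nonzero; zero blocks cost nothing.
    for r, c in zip(*np.nonzero(layout)):
        r, c = int(r), int(c)
        rows = slice(r * block_size, (r + 1) * block_size)
        cols = slice(c * block_size, (c + 1) * block_size)
        out[:, cols] += x[:, rows] @ blocks[(r, c)]
    return out


# Small example: a 2 x 3 grid of 4 x 4 blocks, three of them nonzero.
rng = np.random.default_rng(0)
block_size = 4
layout = np.array([[1, 0, 1],
                   [0, 1, 0]])
blocks = {
    (int(r), int(c)): rng.standard_normal((block_size, block_size))
    for r, c in zip(*np.nonzero(layout))
}
x = rng.standard_normal((2, layout.shape[0] * block_size))
y = block_sparse_matmul(x, blocks, layout, block_size)
```

The result matches a dense multiplication against the same matrix with zeros filled in. The saving comes from never touching the zero blocks, which is what dedicated GPU kernels exploit at scale.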
