Pinned repositories
- attention_with_linear_biases: Code for the ALiBi method for transformer language models (ICLR 2022); a minimal sketch of the bias computation appears after this list.
- shortformer: Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith, and Mike Lewis.
- YouMayNotNeedAttention: Code for the Eager Translation Model from the paper "You May Not Need Attention".
- sandwich_transformer: Code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer Models by Reordering their Sublayers.
- UsingTheOutputEmbedding: Code for the EACL paper "Using the Output Embedding to Improve Language Models" by Ofir Press and Lior Wolf.
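The attention_with_linear_biases repository implements ALiBi, which replaces positional embeddings by adding a distance-proportional linear bias to attention scores, with a head-specific slope drawn from a geometric sequence. The sketch below is only an illustration of that idea, not the repository's actual code; the function name `alibi_bias` and the assumption that the number of heads is a power of two are mine.

```python
import torch

def alibi_bias(seq_len: int, num_heads: int) -> torch.Tensor:
    """Illustrative ALiBi-style bias (sketch, not the repo's implementation).

    For head h, the bias added to the attention score between query position i
    and key position j (j <= i, causal setting) is -m_h * (i - j), where m_h is
    a head-specific slope from a geometric sequence.
    """
    # Slopes from the ALiBi paper's geometric sequence, assuming num_heads is a
    # power of two: 2^(-8/n), 2^(-16/n), ...
    slopes = torch.tensor([2 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    # Relative distances i - j, keeping only past positions for the causal mask.
    pos = torch.arange(seq_len)
    dist = (pos.view(-1, 1) - pos.view(1, -1)).clamp(min=0)   # (seq_len, seq_len)
    # Broadcast to one bias matrix per head: (num_heads, seq_len, seq_len).
    return -slopes.view(-1, 1, 1) * dist

# The bias is added to the query-key scores before the softmax, e.g.:
# scores = q @ k.transpose(-2, -1) / d_head ** 0.5 + alibi_bias(seq_len, num_heads)
```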