Keras implementation of the Compressive Transformer (with Multihead Attention) by Rae et al.
[Work in progress.]
As specified in https://arxiv.org/pdf/1911.05507.pdf, and further described in the accompanying DeepMind blog post: https://deepmind.com/blog/article/A_new_model_and_dataset_for_long-range_memory.
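For orientation, below is a minimal sketch of the memory-compression step that gives the model its name: old memories are squashed by a compression rate `c` before being appended to a secondary, compressive memory. The Conv1D variant shown is one of several compression functions evaluated in the paper; the class name and arguments are illustrative and are not identifiers from this repository.

```python
import tensorflow as tf
from tensorflow import keras

class Conv1DCompression(keras.layers.Layer):
    """Sketch of the paper's Conv1D compression function (names are illustrative)."""

    def __init__(self, d_model: int, compression_rate: int = 3, **kwargs):
        super().__init__(**kwargs)
        # Stride == kernel size, so a memory of length L is compressed to roughly L // c.
        self.conv = keras.layers.Conv1D(
            filters=d_model, kernel_size=compression_rate, strides=compression_rate
        )

    def call(self, old_memories):
        # old_memories: (batch, mem_len, d_model) -> (batch, mem_len // c, d_model)
        return self.conv(old_memories)

# Example: compress 512 memory slots down to 170 with c = 3.
compressed = Conv1DCompression(d_model=512)(tf.zeros((2, 512, 512)))
```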
As usual, it is strongly suggested to create a virtual environment of your liking before installing the dependencies:
# using Anaconda:
conda create --name compressive-transformer python=3.8
conda activate compressive-transformer
The required packages can then be installed by running
make install
Training can then be started with
python ct.py train
Runtime options - for tokenization, model settings, etc. - are configured in ct/config/default.py. omegaconf is used for configuration management.
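For illustration, here is a minimal sketch of how defaults might be defined and overridden with omegaconf; the configuration keys shown are assumptions for the example, not the actual contents of ct/config/default.py.

```python
from omegaconf import OmegaConf

# Hypothetical defaults; the real keys live in ct/config/default.py.
defaults = OmegaConf.create(
    {
        "tokenizer": {"vocab_size": 32000},
        "model": {"d_model": 512, "num_heads": 8, "compression_rate": 3},
    }
)

# Merge dotlist-style command-line overrides (key=value) on top of the
# defaults; later configs win on conflicting keys.
cfg = OmegaConf.merge(defaults, OmegaConf.from_cli())
print(OmegaConf.to_yaml(cfg))
print(cfg.model.d_model)  # attribute-style access
```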
Simple documentation of the code, together with some additional examples, can be found in docs/build/index.html.