⚡ Boost inference speed of T5 models by 5x and reduce model size by 3x.
python
nlp
fast
translation
deep-learning
inference
pytorch
transformer
question-answering
quantization
onnx
t5
onnxruntime
fastt5
quantized-onnx-models
inference-speed
Updated Apr 24, 2023 - Python