v1.13.0: ONNX weight deduplication, ONNX export and ORT extension
Deduplicate Embedding / LM head weight in the ONNX export
This release works around a bug in the PyTorch ONNX export (pytorch/pytorch#108342) that duplicates the shared Embedding / LM head weight in the exported model. For small enough models, this results in up to a 50% decrease in ONNX serialized model size.
- Fix PyTorch tied weights being duplicated in the exported ONNX models by @fxmarty in #1326
- Fix initializer detection for weight deduplication by @fxmarty in #1333
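The idea behind the deduplication pass can be sketched in plain Python: initializers holding byte-identical tensor data are merged under one name and every graph reference is remapped to it, so the shared weight is serialized only once. This is an illustrative stand-in (dicts of raw bytes), not Optimum's actual implementation, which operates on ONNX TensorProto initializers.

```python
import hashlib

def deduplicate_initializers(initializers, node_inputs):
    """Merge initializers with identical raw data and remap node inputs.

    initializers: dict mapping initializer name -> raw tensor bytes
    node_inputs: list of initializer names referenced by graph nodes
    Returns (deduplicated initializers, remapped input names).
    Illustrative sketch only; real ONNX graphs store TensorProtos.
    """
    canonical = {}  # content digest -> first name seen with that data
    rename = {}     # duplicate name -> canonical name
    deduped = {}
    for name, data in initializers.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in canonical:
            rename[name] = canonical[digest]  # drop the duplicate copy
        else:
            canonical[digest] = name
            deduped[name] = data
    remapped = [rename.get(n, n) for n in node_inputs]
    return deduped, remapped

# Tied embedding / LM head: the same weight stored twice under two names.
weights = {"embed.weight": b"\x01\x02\x03", "lm_head.weight": b"\x01\x02\x03"}
inits, inputs = deduplicate_initializers(
    weights, ["embed.weight", "lm_head.weight"]
)
print(len(inits), inputs)  # 1 ['embed.weight', 'embed.weight']
```

For a model whose only large weights are the tied embedding and LM head, dropping the second copy is what yields the up-to-50% size reduction mentioned above.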
Extended ONNX Runtime support
The ONNX Runtime integration now supports the Pix2Struct and MPT architectures, Donut now supports IO Binding, and Encoder-Decoder models are supported as well.
- Pix2Struct onnxruntime support by @krathul in #1296
- Add MPT onnx and ORT support by @jiqing-feng in #1161
- Donut iobinding by @IlyasMoutawwakil in #1209
- Add encoder decoder model by @mht-sharma in #851
Extended ONNX export: MPT, TIMM models, Encoder-Decoder
Additionally, the SAM model is now exported by default as two files, vision_encoder.onnx and prompt_encoder_mask_decoder.onnx.
- Add MPT onnx and ORT support by @jiqing-feng in #1161
- Adds ONNX Export Support for Timm Models by @mht-sharma in #965
- Add encoder decoder model by @mht-sharma in #851
- Fix SAM ONNX export requirements with transformers 4.32, export vision encoder separately by @fxmarty in #1301
BetterTransformer supports Falcon
- [BetterTransformer] Add falcon to BetterTransformer by @younesbelkada in #1343
Major bugfix: ability to set GPTQ Exllama kernel maximum length in the transformers integration
The function exllama_set_max_input_length from auto-gptq can now be used with Transformers GPTQ models.
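Conceptually, the exllama kernels preallocate scratch buffers sized for a fixed maximum input length, and the helper rebuilds those buffers so longer prompts fit. The toy sketch below illustrates that pattern in plain Python; it is not auto-gptq's implementation, and KernelState is a made-up stand-in.

```python
class KernelState:
    """Toy stand-in for a kernel with a preallocated, fixed-size workspace."""
    def __init__(self, max_input_length: int):
        self.max_input_length = max_input_length
        self.buffer = bytearray(max_input_length)  # preallocated workspace

def set_max_input_length(state: KernelState, max_input_length: int) -> KernelState:
    """Rebuild the state with a workspace sized for the new maximum length,
    mirroring the reallocation that exllama_set_max_input_length performs
    on a GPTQ model (illustrative only)."""
    return KernelState(max_input_length)

state = KernelState(2048)          # default maximum input length
state = set_max_input_length(state, 4096)  # grow before feeding a long prompt
print(state.max_input_length)      # 4096
```

With a real Transformers GPTQ model, the helper is called on the loaded model itself; see the auto-gptq documentation for the exact signature.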
Other changes and bugfixes
- Update version to 1.12.1.dev0 following release by @fxmarty in #1312
- Improve BetterTransformer backward compatibility by @fxmarty in #1314
- fix typo in log message by @AAnirudh07 in #1322
- Support customize dtype for dummy generators by @JingyaHuang in #1307
- Fix opset custom onnx export by @mht-sharma in #1331
- Replace mpt to ernie custom export by @mht-sharma in #1332
- send both negative prompt embeds to ORT SDXL by @ssube in #1339
- add vae image processor by @echarlaix in #1219
- add negative prompt test by @echarlaix in #1347
- Add GPT BigCode to the BT documentation by @fxmarty in #1356
- Add text2text-generation-with-past test for encoder-decoder model by @mht-sharma in #1338
- Fix sentence transformer export by @mht-sharma in #1366
New Contributors
- @krathul made their first contribution in #1296
- @AAnirudh07 made their first contribution in #1322
- @jiqing-feng made their first contribution in #1161
- @ssube made their first contribution in #1339
Full Changelog: v1.12.0...v1.13.0