v1.12.0: AutoGPTQ integration, extended BetterTransformer support
AutoGPTQ integration
Part of the AutoGPTQ library has been integrated into Optimum, along with utilities that ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization
- Add GPTQ Quantization by @SunMarc in #1216
- Fix GPTQ doc by @regisss in #1267
- Add AutoGPTQ benchmark by @fxmarty in #1292
- Fix gptq params by @SunMarc in #1284
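The workflow exposed by the integration can be sketched as follows. This is a minimal sketch based on the linked usage guide, assuming the `optimum.gptq.GPTQQuantizer` API; the helper names, the `packed_weight_bytes` utility, and the default arguments are illustrative, not part of the release:

```python
# Sketch of the GPTQ quantization workflow in Optimum. Running the
# quantizer requires the optimum, auto-gptq and transformers packages
# plus a CUDA device; helper names below are illustrative.

def packed_weight_bytes(n_params: int, bits: int) -> int:
    """Approximate storage for n_params weights packed at `bits` bits each."""
    return n_params * bits // 8

def quantize_gptq(model_id: str, save_dir: str, bits: int = 4) -> None:
    # Imports kept local so the sketch can be defined without GPU dependencies.
    from optimum.gptq import GPTQQuantizer
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    # Calibrate on the "c4" dataset and pack the weights to `bits` bits.
    quantizer = GPTQQuantizer(bits=bits, dataset="c4", model_seqlen=2048)
    quantized_model = quantizer.quantize_model(model, tokenizer)
    quantizer.save(quantized_model, save_dir)

# 4-bit packing stores roughly a quarter of the bytes of fp16 weights.
fp16_bytes = packed_weight_bytes(1_000_000, 16)  # 2_000_000
int4_bytes = packed_weight_bytes(1_000_000, 4)   #   500_000
```

For example, `quantize_gptq("facebook/opt-125m", "opt-125m-gptq")` would produce a 4-bit checkpoint in `opt-125m-gptq` (model id illustrative).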
Extended BetterTransformer support
BetterTransformer now supports the BLOOM and GPT-BigCode architectures.
- Add BetterTransformer support for BLOOM by @baskrahmer in #1221
- Support gpt_bigcode in bettertransformer by @fxmarty in #1252
- Fix BetterTransformer starcoder init by @fxmarty in #1254
- Fix BT starcoder fp16 by @fxmarty in #1255
- SDPA dispatches to flash for MQA by @fxmarty in #1259
- Check output_attentions is False in BetterTransformer by @fxmarty in #1306
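Conversion for the newly supported architectures follows the usual BetterTransformer pattern. A minimal sketch, assuming the documented `BetterTransformer.transform` entry point; the wrapper function and any model id passed to it are illustrative:

```python
# Sketch: converting a BLOOM or GPT-BigCode checkpoint to BetterTransformer.
# Requires the optimum, transformers and torch packages; the wrapper name
# is illustrative.

def to_bettertransformer(model_id: str):
    # Imports kept local so the sketch can be defined without heavy dependencies.
    import torch
    from optimum.bettertransformer import BetterTransformer
    from transformers import AutoModelForCausalLM

    # fp16 lets SDPA dispatch to the flash-attention kernel, including for
    # multi-query attention (MQA) models such as GPT-BigCode (StarCoder).
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

    # Note: BetterTransformer requires output_attentions=False at runtime.
    return BetterTransformer.transform(model)
```

For example, `model = to_bettertransformer("bigscience/bloom-560m")` would return a converted model ready for inference (model id illustrative).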
Other changes and bugfixes
- Update bug report template by @fxmarty in #1266
- Fix ORTModule uses fp32 model issue by @jingyanwangms in #1264
- Fix build PR doc workflow by @fxmarty in #1270
- Avoid triggering stop job on label by @fxmarty in #1274
- Update version following 1.11.1 patch by @fxmarty in #1275
- Fix fp16 ONNX detection for decoder models by @fxmarty in #1276
- Update version following 1.11.2 patch by @regisss in #1291
- Pin tensorflow<=2.12.1 by @fxmarty in #1305
- ONNX: disable text-generation models for sequence classification & fixes for transformers 4.32 by @fxmarty in #1308
- Fix staging tests following transformers 4.32 release by @fxmarty in #1309
- More fixes following transformers 4.32 release by @fxmarty in #1311
New Contributors
- @SunMarc made their first contribution in #1216
- @jingyanwangms made their first contribution in #1264
Full Changelog: v1.11.2...v1.12.0