Commit e62d8bd

Support Transformers 4.43 (#856)
* install from pr
* updates
* fix
* update TRANSFORMERS_MAX_VERSION
* fix sdpa in training
* fix whisper
* fix
* whisper calibration checks
* fix OVTrainerTextClassificationTrainingTest's expected fake quantize
* fix OVCLIExportTestCase's expected_int4
* update min ci transformers version to 4.37
* fix OVQuantizerTest's expected fake quantize
* reorder_cache
* fix expected compressed matmuls
* fix test_exporters_cli_int4_with_local_model_and_default_config
* fix qwen custom modeling test
* fix failing ipex tests
* fix ipex
* fix the last ipex failing test_compare_with_and_without_past_key_values
* use minimal prepare_inputs_for_generation in OVModelForSpeechSeq2Seq
* keeping compatibility with transformers 4.36
* keep support of whisper using WhisperGenerationMixin.generate and dummy model fix
* trigger
* fix
* device property
* standardize .device and ._device attributes/properties
* fix
* fix
* revert

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>

* use falcon
* torch.device property always cpu
* style
* resolve conflicts
* decoder_attention_mask for older versions
* optimum main
* limit inc transformers version
* fix pipeline missing dtype
* add dtype for seq to seq models
* pass phi beam search test and skip internlm2
* fix for internlm2

---------

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
1 parent c284187 commit e62d8bd

18 files changed, +348 -558 lines changed

.github/workflows/test_ipex.yml

Lines changed: 24 additions & 18 deletions
@@ -20,23 +20,29 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: [3.8, 3.9]
-        transformers-version: [4.39.0, 4.41.2]
-        os: [ubuntu-latest]
+        python-version: [3.9]
+        transformers-version: ["4.39.0", "4.43.*"]
+        ipex-version: ["2.2.0", "2.3.*"]
+        include:
+          - python-version: 3.8
+            transformers-version: 4.39.0
+            ipex-version: 2.2.0
 
-    runs-on: ${{ matrix.os }}
+    runs-on: ubuntu-latest
     steps:
-    - uses: actions/checkout@v2
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
-      with:
-        python-version: ${{ matrix.python-version }}
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install torch torchaudio torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-        pip install .[ipex,tests]
-        pip install transformers==${{ matrix.transformers-version }}
-    - name: Test with Pytest
-      run: |
-        pytest tests/ipex/
+      - uses: actions/checkout@v2
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install torch==${{ matrix.ipex-version }} --extra-index-url https://download.pytorch.org/whl/cpu
+          pip install intel_extension_for_pytorch==${{ matrix.ipex-version }}
+          pip install Pillow parameterized
+          pip install transformers[testing]==${{ matrix.transformers-version }}
+          pip install .[ipex]
+      - name: Test with Pytest
+        run: |
+          pytest tests/ipex/
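
The new IPEX matrix crosses one Python version with two Transformers and two IPEX versions, then adds one legacy job through `include`. The sketch below lists the resulting jobs. It assumes GitHub Actions turns an `include` entry into a standalone job when the entry would overwrite an original matrix value in every base combination. The `expand_matrix` helper is hypothetical and does not model `include` entries that merge into existing combinations.

```python
from itertools import product


def expand_matrix(axes, include):
    """Return the job configurations for a simplified matrix.

    Hypothetical helper: it only handles `include` entries that conflict with
    every base combination, which are added as standalone jobs.
    """
    keys = list(axes)
    jobs = [dict(zip(keys, values)) for values in product(*(axes[k] for k in keys))]
    for entry in include:
        # An entry conflicts with a job when it would change one of that job's values.
        conflicts_everywhere = all(
            any(key in job and job[key] != value for key, value in entry.items())
            for job in jobs
        )
        if conflicts_everywhere:
            jobs.append(dict(entry))
    return jobs


axes = {
    "python-version": ["3.9"],
    "transformers-version": ["4.39.0", "4.43.*"],
    "ipex-version": ["2.2.0", "2.3.*"],
}
include = [
    {"python-version": "3.8", "transformers-version": "4.39.0", "ipex-version": "2.2.0"}
]

jobs = expand_matrix(axes, include)
for job in jobs:
    print(job)
```

Under these assumptions the workflow runs four Python 3.9 jobs plus one Python 3.8 job pinned to the oldest supported Transformers and IPEX versions.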

.github/workflows/test_openvino.yml

Lines changed: 29 additions & 28 deletions
@@ -21,36 +21,37 @@ jobs:
       fail-fast: false
       matrix:
         python-version: ["3.8", "3.12"]
-        transformers-version: ["4.36.0", "4.42.*"]
+        transformers-version: ["4.36.0", "4.43.*"]
         os: [ubuntu-latest]
 
     runs-on: ${{ matrix.os }}
     steps:
-    - uses: actions/checkout@v4
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v5
-      with:
-        python-version: ${{ matrix.python-version }}
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        # install PyTorch CPU version to avoid installing CUDA packages on GitHub runner without GPU
-        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
-        pip install transformers==${{ matrix.transformers-version }}
-        pip install .[openvino,openvino-tokenizers,tests,diffusers] onnxruntime
-    - name: Test with Pytest
-      env:
-        HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
-      run: |
-        pytest tests/openvino/ --ignore tests/openvino/test_modeling_basic.py --durations=0
-    - name: Test basic
-      run: |
-        pip uninstall -y nncf
-        pytest tests/openvino/test_modeling_basic.py
-    - name: Test openvino-nightly
-      run: |
-        pip uninstall -y openvino
-        pip install openvino-nightly
-        python -c "from optimum.intel import OVModelForCausalLM; OVModelForCausalLM.from_pretrained('hf-internal-testing/tiny-random-gpt2', export=True, compile=False)"
-        optimum-cli export openvino -m hf-internal-testing/tiny-random-gpt2 gpt2-ov
+      - uses: actions/checkout@v4
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
 
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          # install PyTorch CPU version to avoid installing CUDA packages on GitHub runner without GPU
+          pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+          pip install .[openvino,openvino-tokenizers,tests,diffusers] onnxruntime
+          pip install transformers==${{ matrix.transformers-version }}
+
+      - name: Test with Pytest
+        env:
+          HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
+        run: |
+          pytest tests/openvino/ --ignore tests/openvino/test_modeling_basic.py --durations=0
+      - name: Test basic
+        run: |
+          pip uninstall -y nncf
+          pytest tests/openvino/test_modeling_basic.py
+      - name: Test openvino-nightly
+        run: |
+          pip uninstall -y openvino
+          pip install openvino-nightly
+          python -c "from optimum.intel import OVModelForCausalLM; OVModelForCausalLM.from_pretrained('hf-internal-testing/tiny-random-gpt2', export=True, compile=False)"
+          optimum-cli export openvino -m hf-internal-testing/tiny-random-gpt2 gpt2-ov

.github/workflows/test_openvino_basic.yml

Lines changed: 35 additions & 29 deletions
@@ -3,7 +3,7 @@ name: OpenVINO - Basic Test
 on:
   workflow_dispatch:
   schedule:
-    - cron: '41 1 * * *' # run every day at 1:41
+    - cron: "41 1 * * *" # run every day at 1:41
   push:
     branches:
       - v*-release
@@ -23,36 +23,42 @@ jobs:
         # Testing lower and upper bound of supported Python versions
         # This also ensures that the test fails if dependencies break for Python 3.7
         python-version: ["3.8", "3.12"]
-        optimum: ['optimum', 'git+https://github.com/huggingface/optimum.git']
         os: ["ubuntu-22.04", "windows-latest"]
+        transformers-version: ["4.43.*"]
+        include:
+          - python-version: "3.12"
+            os: "ubuntu-22.04"
+            transformers-version: "4.36.0"
 
     runs-on: ${{ matrix.os }}
 
     steps:
-    - uses: actions/checkout@v4
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v5
-      with:
-        python-version: ${{ matrix.python-version }}
-
-    - name: Install dependencies
-      run: |
-        # Install openvino manually to prevent dependency conflicts when .[openvino] pins
-        # optimum or transformers to a specific version
-        # Install PyTorch CPU to prevent unnecessary downloading/installing of CUDA packages
-        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
-        pip install .[tests] openvino ${{ matrix.optimum}}
-
-    - name: Pip freeze
-      run: pip freeze
-
-    - name: Test with Pytest
-      run: |
-        pytest tests/openvino/test_modeling_basic.py
-
-    - name: Slow tests
-      run: |
-        pip install nncf
-        pytest tests/openvino -s -m "run_slow" --durations=0
-      env:
-        RUN_SLOW: 1
+      - uses: actions/checkout@v4
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          # Install PyTorch CPU to prevent unnecessary downloading/installing of CUDA packages
+          pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+          # Install openvino manually to prevent dependency conflicts when .[openvino] pins
+          # optimum or transformers to a specific version
+          pip install .[tests] openvino
+          pip install transformers==${{ matrix.transformers-version }}
+
+      - name: Pip freeze
+        run: pip freeze
+
+      - name: Test with Pytest
+        run: |
+          pytest tests/openvino/test_modeling_basic.py
+
+      - name: Slow tests
+        run: |
+          pip install nncf
+          pytest tests/openvino -s -m "run_slow" --durations=0
+        env:
+          RUN_SLOW: 1

optimum/exporters/ipex/model_patcher.py

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@
 
 # Please also update in the setup.py and .github/workflows/test_ipex.yml if you change the transformers version
 _TRANSFORMERS_MIN_VERSION = "4.39.0"
-_TRANSFORMERS_MAX_VERSION = "4.41.2"
+_TRANSFORMERS_MAX_VERSION = "4.43.99"
 
 _IPEX_EXPORTED_ARCH = ("LlamaForCausalLM",)
 _IPEX_EXPORTED_TASK = ("text-generation",)
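
Raising the upper bound to `4.43.99` admits any realistic 4.43 patch release while still excluding 4.44. The sketch below shows that effect with a plain tuple comparison. The `parse_version` and `is_supported` helpers are hypothetical, and the comparison the library itself performs is not shown in this diff.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' release string into a tuple of ints (no pre-release support)."""
    return tuple(int(part) for part in text.split("."))


_TRANSFORMERS_MIN_VERSION = "4.39.0"
_TRANSFORMERS_MAX_VERSION = "4.43.99"


def is_supported(installed):
    """Hypothetical check: True when `installed` lies inside the inclusive bounds."""
    low = parse_version(_TRANSFORMERS_MIN_VERSION)
    high = parse_version(_TRANSFORMERS_MAX_VERSION)
    return low <= parse_version(installed) <= high


for candidate in ("4.38.2", "4.39.0", "4.43.3", "4.44.0"):
    print(candidate, is_supported(candidate))
```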

optimum/intel/ipex/modeling_base.py

Lines changed: 4 additions & 2 deletions
@@ -475,9 +475,11 @@ def __init__(
             self._reorder_cache = _ipex_reorder_cache
         else:
             # Check if _reorder_cache is a static method
-            if isinstance(self.model_cls.__dict__["_reorder_cache"], staticmethod):
+            if "_reorder_cache" in self.model_cls.__dict__ and isinstance(
+                self.model_cls.__dict__["_reorder_cache"], staticmethod
+            ):
                 self._reorder_cache = self.model_cls._reorder_cache
-            else:
+            elif "_reorder_cache" in self.model_cls.__dict__:
                 self._reorder_cache = self.model_cls._reorder_cache.__get__(self)
 
         if is_transformers_version(">=", "4.38.0") and model_type in {"llama", "phi", "persimmon"}:
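
The old lookup indexed `model_cls.__dict__["_reorder_cache"]` directly, which raises `KeyError` when the class does not define the method itself. The sketch below reproduces the guarded logic on toy classes. `Wrapper` and the three model classes are hypothetical stand-ins, not the library's classes.

```python
class WithStatic:
    @staticmethod
    def _reorder_cache(past, beam_idx):
        return ("static", past, beam_idx)


class WithInstanceMethod:
    def _reorder_cache(self, past, beam_idx):
        return ("bound", type(self).__name__, past, beam_idx)


class WithoutMethod:
    pass


class Wrapper:
    """Hypothetical stand-in for the wrapper that borrows `_reorder_cache`."""

    def __init__(self, model_cls):
        attr = model_cls.__dict__.get("_reorder_cache")
        if isinstance(attr, staticmethod):
            # Static methods need no instance, so they are reused directly.
            self._reorder_cache = model_cls._reorder_cache
        elif attr is not None:
            # Plain functions are bound to this wrapper through the descriptor protocol.
            self._reorder_cache = model_cls._reorder_cache.__get__(self)
        # Otherwise the class does not define the method and nothing is assigned.


print(Wrapper(WithStatic)._reorder_cache(1, 2))
print(Wrapper(WithInstanceMethod)._reorder_cache(1, 2))
print(hasattr(Wrapper(WithoutMethod), "_reorder_cache"))
```

Inspecting `__dict__` rather than using `getattr` matters here: attribute access on the class unwraps a `staticmethod` into a plain function, so the `isinstance` check would never succeed.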

optimum/intel/openvino/modeling.py

Lines changed: 0 additions & 1 deletion
@@ -129,7 +129,6 @@ def __init__(self, model: openvino.runtime.Model, config: transformers.Pretraine
         # Avoid warnings when creating a transformers pipeline
         AutoConfig.register(self.base_model_prefix, AutoConfig)
         self.auto_model_class.register(AutoConfig, self.__class__)
-        self.device = torch.device("cpu")
 
     def to(self, device: str):
         """

optimum/intel/openvino/modeling_base.py

Lines changed: 34 additions & 1 deletion
@@ -20,6 +20,7 @@
 from typing import Dict, Optional, Union
 
 import openvino
+import torch
 from huggingface_hub import hf_hub_download
 from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE
 from openvino import Core, convert_model
@@ -34,7 +35,7 @@
 from ...exporters.openvino import export, main_export
 from ..utils.import_utils import is_nncf_available
 from .configuration import OVConfig, OVDynamicQuantizationConfig, OVWeightQuantizationConfig
-from .utils import ONNX_WEIGHTS_NAME, OV_XML_FILE_NAME, _print_compiled_model_properties
+from .utils import ONNX_WEIGHTS_NAME, OV_TO_PT_TYPE, OV_XML_FILE_NAME, _print_compiled_model_properties
 
 
 core = Core()
@@ -77,16 +78,27 @@ def __init__(
             model = self._reshape(model, -1, -1, height, width)
 
         input_names = {}
+        input_dtypes = {}
         for idx, key in enumerate(model.inputs):
             names = tuple(key.get_names())
             input_names[next((name for name in names if "/" not in name), names[0])] = idx
+            input_dtypes[
+                next((name for name in names if "/" not in name), names[0])
+            ] = key.get_element_type().get_type_name()
         self.input_names = input_names
+        self.input_dtypes = input_dtypes
 
         output_names = {}
+        output_dtypes = {}
         for idx, key in enumerate(model.outputs):
             names = tuple(key.get_names())
             output_names[next((name for name in names if "/" not in name), names[0])] = idx
+            output_dtypes[
+                next((name for name in names if "/" not in name), names[0])
+            ] = key.get_element_type().get_type_name()
+
         self.output_names = output_names
+        self.output_dtypes = output_dtypes
 
         self.model = model
         self.request = None
@@ -103,6 +115,27 @@ def __init__(
         if enable_compilation:
             self.compile()
 
+    @property
+    def device(self) -> torch.device:
+        """
+        `torch.device`: The device on which the module is (for torch compatibility).
+        """
+        return torch.device("cpu")
+
+    @property
+    def dtype(self) -> Optional[torch.dtype]:
+        for dtype in self.input_dtypes.values():
+            torch_dtype = OV_TO_PT_TYPE.get(dtype)
+            if torch_dtype.is_floating_point:
+                return torch_dtype
+
+        for dtype in self.output_dtypes.values():
+            torch_dtype = OV_TO_PT_TYPE.get(dtype)
+            if torch_dtype.is_floating_point:
+                return torch_dtype
+
+        return None
+
     @staticmethod
     def load_model(
         file_name: Union[str, Path],
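
The hunks above record a dtype name for every model input and output, then report the first floating-point dtype found, inputs before outputs. The `dtype` property calls `.is_floating_point` on the result of `OV_TO_PT_TYPE.get(dtype)`, which would raise `AttributeError` for any type name missing from that table. Whether that can happen depends on the table's contents, which this diff does not show. The sketch below mirrors the lookup in plain Python and adds such a guard. `FAKE_OV_TO_PT_TYPE` is a hypothetical stand-in that maps type names to `(dtype name, is_floating_point)` pairs instead of real `torch` dtypes.

```python
def pick_name(names):
    """Prefer the first name without '/', falling back to the first name."""
    return next((name for name in names if "/" not in name), names[0])


# Hypothetical stand-in for OV_TO_PT_TYPE (the real table maps to torch dtypes).
FAKE_OV_TO_PT_TYPE = {
    "f32": ("float32", True),
    "f16": ("float16", True),
    "i64": ("int64", False),
}


def first_floating_dtype(input_dtypes, output_dtypes):
    """Return the first floating-point dtype name among inputs, then outputs."""
    for dtypes in (input_dtypes, output_dtypes):
        for ov_type in dtypes.values():
            entry = FAKE_OV_TO_PT_TYPE.get(ov_type)
            # Guard against type names that are missing from the table.
            if entry is not None and entry[1]:
                return entry[0]
    return None


print(pick_name(("/model/embed/input", "input_ids")))
print(first_floating_dtype({"input_ids": "i64", "mask": "boolean"}, {"logits": "f32"}))
```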

optimum/intel/openvino/modeling_base_seq2seq.py

Lines changed: 2 additions & 0 deletions
@@ -350,6 +350,8 @@ def _reshape(self, model: openvino.runtime.Model, batch_size: int, sequence_leng
             shapes[inputs][0] = batch_size if not is_decoder else -1
             if inputs.get_any_name().startswith("past_key_values"):
                 shapes[inputs][2] = -1
+            elif inputs.get_any_name().startswith("cache_position"):
+                shapes[inputs][0] = sequence_length
             elif is_decoder and not inputs.get_any_name().startswith("encoder"):
                 shapes[inputs][1] = -1
             else:

optimum/intel/openvino/modeling_diffusion.py

Lines changed: 12 additions & 11 deletions
@@ -25,6 +25,7 @@
 import numpy as np
 import openvino
 import PIL
+import torch
 from diffusers import (
     DDIMScheduler,
     LMSDiscreteScheduler,
@@ -420,10 +421,6 @@ def to(self, device: str):
 
         return self
 
-    @property
-    def device(self) -> str:
-        return self._device.lower()
-
     @property
     def height(self) -> int:
         height = self.unet.model.inputs[0].get_partial_shape()[2]
@@ -629,21 +626,25 @@ def _compile(self):
         if (
             "CACHE_DIR" not in self.ov_config.keys()
             and not str(self._model_dir).startswith(gettempdir())
-            and "gpu" in self.device.lower()
+            and "GPU" in self._device
         ):
             self.ov_config["CACHE_DIR"] = os.path.join(self._model_dir, self._model_name, "model_cache")
 
-        logger.info(f"Compiling the {self._model_name} to {self.device} ...")
-        self.request = core.compile_model(self.model, self.device, self.ov_config)
+        logger.info(f"Compiling the {self._model_name} to {self._device} ...")
+        self.request = core.compile_model(self.model, self._device, self.ov_config)
         # OPENVINO_LOG_LEVEL can be found in https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_AUTO_debugging.html
         if "OPENVINO_LOG_LEVEL" in os.environ and int(os.environ["OPENVINO_LOG_LEVEL"]) > 2:
-            logger.info(f"{self.device} SUPPORTED_PROPERTIES:")
+            logger.info(f"{self._device} SUPPORTED_PROPERTIES:")
             _print_compiled_model_properties(self.request)
 
     @property
-    def device(self):
+    def _device(self) -> str:
         return self.parent_model._device
 
+    @property
+    def device(self) -> torch.device:
+        return self.parent_model.device
+
 
 class OVModelTextEncoder(OVModelPart):
     def __init__(
@@ -715,7 +716,7 @@ def __call__(self, latent_sample: np.ndarray):
         return list(outputs.values())
 
     def _compile(self):
-        if "GPU" in self.device:
+        if "GPU" in self._device:
             self.ov_config.update({"INFERENCE_PRECISION_HINT": "f32"})
         super()._compile()
 
@@ -736,7 +737,7 @@ def __call__(self, sample: np.ndarray):
         return list(outputs.values())
 
     def _compile(self):
-        if "GPU" in self.device:
+        if "GPU" in self._device:
             self.ov_config.update({"INFERENCE_PRECISION_HINT": "f32"})
         super()._compile()
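
After this change each model part exposes two views of the device: the private `_device` string used for OpenVINO compilation decisions, and the public `device` property kept for torch compatibility. The sketch below illustrates the delegation on toy classes. `ParentPipeline` and `ModelPart` are hypothetical, the plain string `"cpu"` stands in for `torch.device("cpu")`, and upper-casing the device name in the parent is an assumption inferred from the `"GPU" in self._device` checks.

```python
class ParentPipeline:
    """Hypothetical parent holding both views of the device."""

    def __init__(self, ov_device):
        # Assumption: the OpenVINO device name is stored upper-cased, e.g. "GPU".
        self._device = ov_device.upper()

    @property
    def device(self):
        # Stand-in for torch.device("cpu"); the diff returns a real torch.device.
        return "cpu"


class ModelPart:
    """Hypothetical model part that delegates both properties to its parent."""

    def __init__(self, parent_model):
        self.parent_model = parent_model
        self.ov_config = {}

    @property
    def _device(self):
        return self.parent_model._device

    @property
    def device(self):
        return self.parent_model.device

    def compile_options(self):
        # Compilation decisions read the private OpenVINO device string.
        if "GPU" in self._device:
            self.ov_config["INFERENCE_PRECISION_HINT"] = "f32"
        return self.ov_config


part = ModelPart(ParentPipeline("gpu"))
print(part._device, part.device, part.compile_options())
```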
