Support AWQ models #1049

Merged: 34 commits, Dec 23, 2024
The diff below shows changes from 1 commit.

Commits
64f64b0
Support AWQ models
mvafin Dec 4, 2024
86d9328
Add tests
mvafin Dec 5, 2024
decbcc2
Add dependencies
mvafin Dec 5, 2024
9fb1da4
Fix tests
mvafin Dec 11, 2024
04d0cf9
enable awq export only if ov support it
eaidova Dec 17, 2024
b51cdee
Merge pull request #1 from eaidova/ea/awq_fix
eaidova Dec 17, 2024
df97004
fix style (#2)
eaidova Dec 17, 2024
cf2fc8b
disable awq and gptq install for old torch (#3)
eaidova Dec 17, 2024
ae8c7db
Merge branch 'main' into mvafin/support_awq
eaidova Dec 17, 2024
f0f7a72
separate common quant models patching and gptq (#4)
eaidova Dec 18, 2024
ab6ac99
disable windows install (#5)
eaidova Dec 18, 2024
ff66f43
skip logits check for quantized models (#6)
eaidova Dec 19, 2024
3b73f17
Merge branch 'main' into mvafin/support_awq
eaidova Dec 19, 2024
e8be988
fix test after rebase
eaidova Dec 19, 2024
5d8bcb7
fix testing condition for 2024.6 and unpatch in case if failed
eaidova Dec 19, 2024
cf3aad4
Fix qwen2-vl tests (#1084)
nikita-savelyevv Dec 19, 2024
106a5b7
Skip private model loading test for external contributors (#1082)
echarlaix Dec 19, 2024
cda4908
Fix reshaping unet if timestep is 0d tensor (#1083)
eaidova Dec 19, 2024
8ef3997
Disable kv cache compression for fp vlm (#1080)
eaidova Dec 19, 2024
8e5573f
Support AWQ models
mvafin Dec 4, 2024
0d7f4bf
Add tests
mvafin Dec 5, 2024
ae544af
Add dependencies
mvafin Dec 5, 2024
013081c
Fix tests
mvafin Dec 11, 2024
b7cd49d
enable awq export only if ov support it
eaidova Dec 17, 2024
da3bd88
fix style (#2)
eaidova Dec 17, 2024
0a0c7aa
disable awq and gptq install for old torch (#3)
eaidova Dec 17, 2024
55dad0c
separate common quant models patching and gptq (#4)
eaidova Dec 18, 2024
c05aaf0
disable windows install (#5)
eaidova Dec 18, 2024
40cd57f
skip logits check for quantized models (#6)
eaidova Dec 19, 2024
9ddc5a8
fix test after rebase
eaidova Dec 19, 2024
a241a7d
fix testing condition for 2024.6 and unpatch in case if failed
eaidova Dec 19, 2024
b0e4860
add necessary packages in test_openvino_full
eaidova Dec 20, 2024
630d36a
Merge branch 'mvafin/support_awq' of https://github.com/mvafin/optimu…
eaidova Dec 20, 2024
7607f45
fix code style after rebase (#7)
eaidova Dec 20, 2024
4 changes: 1 addition & 3 deletions optimum/exporters/openvino/__main__.py
@@ -242,7 +242,7 @@ def main_export(
         trust_remote_code=trust_remote_code,
     )
     quantization_config = getattr(config, "quantization_config", None)
-    do_gptq_patching = quantization_config and quantization_config["quant_method"] == "gptq"
+    do_gptq_patching = quantization_config and quantization_config["quant_method"] in ["gptq", "awq"]
     model_type = config.model_type.replace("_", "-")
     if model_type not in TasksManager._SUPPORTED_MODEL_TYPE:
         custom_architecture = True
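
The change above extends the existing GPTQ code path to AWQ: the exporter reads the quantization_config block that transformers stores in a model's config.json and enables patching when quant_method is either "gptq" or "awq". A minimal sketch of that detection, assuming an AWQ checkpoint on the Hugging Face Hub (the model id is only an example, not taken from the PR):

    from transformers import AutoConfig

    # Any AWQ-quantized checkpoint works here; the id is illustrative.
    config = AutoConfig.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.2-AWQ")

    # config.json of such checkpoints typically carries e.g.
    #   "quantization_config": {"quant_method": "awq", "bits": 4, ...}
    quantization_config = getattr(config, "quantization_config", None)
    do_patching = bool(quantization_config and quantization_config["quant_method"] in ["gptq", "awq"])
    print(do_patching)  # True for GPTQ- and AWQ-quantized checkpoints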
@@ -291,7 +291,6 @@ def main_export(
     if (
         dtype is None
         and framework == "pt"
-        and not do_gptq_patching
         and (
             task.startswith("text-generation")
             or getattr(config, "model_type", None) in MULTI_MODAL_TEXT_GENERATION_MODELS
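
With "and not do_gptq_patching" removed, automatic torch_dtype selection now applies to GPTQ/AWQ checkpoints as well, instead of being skipped for them. A simplified sketch of the resulting gate; MULTI_MODAL_TEXT_GENERATION_MODELS below is a stand-in with assumed values, not optimum's real list:

    # Stand-in for the exporter's constant; the real membership may differ.
    MULTI_MODAL_TEXT_GENERATION_MODELS = {"llava", "qwen2-vl"}

    def should_infer_dtype(dtype, framework: str, task: str, model_type: str) -> bool:
        # After this PR, quantized models go through the same dtype
        # inference as any other text-generation checkpoint.
        return (
            dtype is None
            and framework == "pt"
            and (
                task.startswith("text-generation")
                or model_type in MULTI_MODAL_TEXT_GENERATION_MODELS
            )
        )

    print(should_infer_dtype(None, "pt", "text-generation-with-past", "llama"))  # True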
@@ -311,7 +310,6 @@
         loading_kwargs["torch_dtype"] = dtype
     # Patch the modules to export of GPTQ models w/o GPU
     if do_gptq_patching:
-        torch.set_default_dtype(torch.float32)
         orig_cuda_check = torch.cuda.is_available
         torch.cuda.is_available = lambda: True
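
The "if do_gptq_patching:" branch works around CUDA-only checks in the GPTQ/AWQ libraries: it saves the real torch.cuda.is_available and swaps in a stub so quantized weights can be loaded and exported on a CPU-only machine, and the PR stops forcing the default dtype to float32 at this point. A minimal sketch of that monkey-patch pattern; run_export is a hypothetical placeholder, not the exporter's actual function:

    import torch

    def run_export() -> None:
        # Placeholder for the CUDA-gated work; the real exporter loads
        # the quantized model and converts it to OpenVINO here.
        print("CUDA visible to the patched check:", torch.cuda.is_available())

    orig_cuda_check = torch.cuda.is_available
    torch.cuda.is_available = lambda: True  # pretend a GPU is present
    try:
        run_export()
    finally:
        torch.cuda.is_available = orig_cuda_check  # always restore the real check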
