
Commit

Merge branch 'nm/fp8_impl' of https://github.com/KodiaqQ/optimum-intel into nm/fp8_impl
KodiaqQ committed Jan 8, 2025
2 parents 3174ef0 + 710f50a commit 022908a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/openvino/export.mdx
@@ -165,7 +165,7 @@ Models larger than 1 billion parameters are exported to the OpenVINO format with
</Tip>


-Besides weight-only quantization, you can also apply full model quantization, including activations, by setting `--quant-mode` to the preferred precision. This will quantize both the weights and activations of Linear, Convolutional, and some other layers to the selected mode. Currently this is only supported for speech-to-text models. Please see the example below.
+Besides weight-only quantization, you can also apply full model quantization, including activations, by setting `--quant-mode` to the preferred precision. This will quantize both the weights and activations of Linear, Convolutional, and some other layers to the selected mode. Please see the example below.

```bash
optimum-cli export openvino -m openai/whisper-large-v3-turbo --quant-mode int8 --dataset librispeech --num-samples 32 --smooth-quant-alpha 0.9 ./whisper-large-v3-turbo
```
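For contrast, the weight-only path mentioned above is driven by the `--weight-format` option rather than `--quant-mode`. A minimal sketch of such an export follows; the output directory name is illustrative:

```bash
# Weight-only quantization: weights are compressed to int8 while
# activations stay in floating point, so no calibration dataset is needed.
# The output directory name is illustrative.
optimum-cli export openvino -m openai/whisper-large-v3-turbo --weight-format int8 ./whisper-large-v3-turbo-int8
```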
