move to export section
echarlaix committed Jun 25, 2024
1 parent af16d86 commit fd53502
Showing 2 changed files with 8 additions and 14 deletions.
13 changes: 8 additions & 5 deletions docs/source/openvino/export.mdx
@@ -94,8 +94,6 @@ Optional arguments:
Do not add converted tokenizer and detokenizer OpenVINO models.
```

### Quantization

You can also apply fp16, 8-bit or 4-bit weight-only quantization to the Linear, Convolutional and Embedding layers when exporting your model by setting `--weight-format` to `fp16`, `int8` or `int4` respectively:

```bash
Expand All @@ -111,17 +109,20 @@ Models larger than 1 billion parameters are exported to the OpenVINO format with

</Tip>
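
For instance, a 4-bit weight-only export of a causal language model could look like the following sketch; the model name and output directory are placeholders, the `--weight-format` flag is the point here:

```bash
# Illustrative: export gpt2 with 4-bit weight-only quantization applied during conversion
optimum-cli export openvino --model gpt2 --weight-format int4 ov_model
```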

Once the model is exported, you can [load your OpenVINO model](inference) by replacing the `AutoModelForXxx` class with the corresponding `OVModelForXxx` class.

## When loading your model

You can also load your PyTorch checkpoint and convert it to the OpenVINO format on the fly by setting `export=True` when loading your model.

To save the resulting model, you can use the `save_pretrained()` method, which saves both the BIN and XML files describing the graph. It is also useful to save the tokenizer to the same directory, so that it can easily be loaded together with the model.

```python
from transformers import AutoTokenizer
from optimum.intel import OVModelForCausalLM

# Setting export=True converts the PyTorch checkpoint to the OpenVINO format on the fly
model = OVModelForCausalLM.from_pretrained("gpt2", export=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

save_directory = "ov_model"
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)
```

## After loading your model
Expand All @@ -133,3 +134,5 @@ from optimum.exporters.openvino import export_from_model
model = AutoModelForCausalLM.from_pretrained("gpt2")
export_from_model(model, output="ov_model", task="text-generation-with-past")
```
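
A self-contained variant of the snippet above, with the import and model name assumed from the surrounding examples:

```python
from transformers import AutoModelForCausalLM
from optimum.exporters.openvino import export_from_model

# Load the PyTorch checkpoint first, then export it to the OpenVINO format
model = AutoModelForCausalLM.from_pretrained("gpt2")
export_from_model(model, output="ov_model", task="text-generation-with-past")
```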

Once the model is exported, you can [load your OpenVINO model](inference) by replacing the `AutoModelForXxx` class with the corresponding `OVModelForXxx` class.
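
For example, assuming the model and tokenizer were saved to `ov_model` as shown above, a text-generation model can be loaded and run as follows (the prompt is illustrative):

```python
from transformers import AutoTokenizer, pipeline
from optimum.intel import OVModelForCausalLM

# Load the exported OpenVINO model instead of the original PyTorch checkpoint
model = OVModelForCausalLM.from_pretrained("ov_model")
tokenizer = AutoTokenizer.from_pretrained("ov_model")

# The usual transformers pipeline API works on top of the OpenVINO model
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Hello, my name is")[0]["generated_text"])
```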
9 changes: 0 additions & 9 deletions docs/source/openvino/inference.mdx
@@ -30,15 +30,6 @@ Once [your model was exported](export), you can load it by replacing the `AutoModelForXxx` class with the corresponding `OVModelForXxx` class.

See the [reference documentation](reference) for more information about parameters, and examples for different tasks.

To save the resulting model, you can use the `save_pretrained()` method, which saves both the BIN and XML files describing the graph. It is also useful to save the tokenizer to the same directory, so that it can easily be loaded together with the model.

```python
# Save your model
save_directory = "openvino_distilbert"
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)
```

As shown in the table below, each task is associated with a class that automatically loads your model.

| Task | Auto Class |
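
As a sketch of this pattern for another task (the class exists in `optimum.intel`, but the model name is illustrative and not taken from the table), a sequence classification checkpoint is loaded through `OVModelForSequenceClassification`:

```python
from transformers import AutoTokenizer
from optimum.intel import OVModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
# export=True converts the PyTorch checkpoint to OpenVINO on the fly
model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("This is a sample sentence to classify.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits)
```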
