We provide step-by-step examples that demonstrate how to use the various features of Model Navigator.
For the sake of readability and accessibility, we use a simple torch.nn.Linear
model as the example.
These examples illustrate how to optimize, test, and deploy the model on
PyTriton and the Triton Inference Server.
- Optimize model
- Optimize model and verify model
- Optimize model and save package
- Load and optimize package
- Optimize and serve model on PyTriton
- Optimize and serve model on Triton Inference Server
- Optimize model and use for offline inference
- Optimize PyTorch QAT model
- Custom configuration for optimize
- Inplace Optimize of single model
- Inplace Optimize of models pipeline
Inside the example/models directory you can find ready-to-use example models in various frameworks:
Python:

PyTorch:
- BART (Inplace Optimize)
- BERT
- Linear Model
- ResNet50
- ResNet50 (Inplace Optimize)
- Stable Diffusion (Inplace Optimize)
- Whisper (Inplace Optimize)
TensorFlow:

JAX:

ONNX: