update 224e9d3
GHA committed Sep 6, 2024
0 parents commit 1192aca
Showing 279 changed files with 39,728 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: ea9a007e6fce8dc20bdca6e53623d139
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/accuracy-inference-time-comparison.png
Binary file added _images/class-flow.png
Binary file added _images/classification-metrics-comparison.png
Binary file added _images/confusion-matrix.png
Binary file added _images/confusion_matrix.png
Binary file added _images/cpu_memory_usage.png
Binary file added _images/cpu_usage.png
Binary file added _images/inference_time.png
Binary file added _images/kenning-zephyr-runtime-tflite.png
Binary file added _images/kenning-zephyr-runtime-tvm.png
Binary file added _images/kenninglogo.png
Binary file added _images/pipeline-manager-kenningflow-example.png
Binary file added _images/pipeline-manager-visualisation.png
Binary file added _images/pruning-nni-classification-comparison.png
Binary file added _images/pruning-nni-gpu-mem-comparison.png
Binary file added _images/pruning-nni-gpu-usage-comparison.png
Binary file added _images/pruning-tf-classification-comparison.png
Binary file added _images/report-mosaic.png
Binary file added _images/utilization-comparison.png
486 changes: 486 additions & 0 deletions _sources/cmd-usage.md.txt

Large diffs are not rendered by default.

91 changes: 91 additions & 0 deletions _sources/dl-deployment-stack.md.txt
@@ -0,0 +1,91 @@
# Deep Learning deployment stack

This chapter lists and describes typical actions performed on deep learning models before deployment on target devices.

## From training to deployment

A deep learning application deployed on IoT devices usually goes through the following process:

* a dataset is prepared for a deep learning process,
* evaluation metrics are specified based on a given dataset and outputs,
* data in the dataset is analyzed and data loaders that perform the preprocessing are implemented,
* the deep learning model is either designed from scratch or a baseline is selected from a wide selection of existing pre-trained models for a given deep learning application (classification, detection, semantic segmentation, instance segmentation, etc.) and adjusted to a particular use case,
* a loss function and a learning algorithm are specified along with the deep learning model,
* the model is trained, evaluated and improved,
* the model is compiled to a representation that is applicable to a given target,
* the model is executed on a target device.

## Dataset preparation

If a model is not available or it is trained for a different use case, the model needs to be trained or re-trained.

Each model requires a dataset - a set of sample inputs (audio signals, images, video sequences, OCT images, other sensor data) and, usually, also the expected outputs (association to a class or classes, object location, object mask, input description).
Datasets are usually split into the following categories:

* training dataset - the largest subset that is used to train a model,
* validation dataset - a relatively small set that is used to verify model performance after each training epoch (the metrics and loss function values show if any overfitting occurred during the training process),
* test dataset - the subset that acts as the final evaluation of a trained model.

It is required that the test dataset and the training dataset are mutually exclusive, so that the evaluation results are not biased in any way.
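
The split described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `split_dataset`, the 80/10/10 proportions and the fixed seed are assumptions made for the example, not values prescribed by this chapter.

```python
import random


def split_dataset(num_samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle sample indices and split them into three disjoint subsets."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    num_test = int(num_samples * test_fraction)
    num_val = int(num_samples * val_fraction)
    test = indices[:num_test]
    val = indices[num_test:num_test + num_val]
    train = indices[num_test + num_val:]  # the largest subset
    return train, val, test


train, val, test = split_dataset(1000)

# No sample index appears in more than one subset.
assert not set(train) & set(test)
assert not set(train) & set(val)
assert not set(val) & set(test)
```

Splitting shuffled indices instead of the samples themselves keeps the sketch independent of the data type. Real datasets often need extra care, for example keeping all recordings of one speaker in the same subset.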

Datasets can be either designed from scratch or found in e.g.:

* [Kaggle datasets](https://www.kaggle.com),
* [Google Dataset Search](https://datasetsearch.research.google.com),
* [Dataset list](https://www.datasetlist.com),
* Universities' pages,
* [Open Images Dataset](https://storage.googleapis.com/openimages/web/index.html),
* [Common Voice Dataset](https://commonvoice.mozilla.org/en).

## Model preparation and training

Currently, the most popular approach is to find an existing model that fits a given problem and perform transfer learning to adapt the model to the requirements.
In transfer learning, the existing model's final layers are slightly modified to adapt to a new problem. These updated final layers of the model are trained using the training dataset.
Finally, some additional layers are unfrozen and the training is performed on a larger number of parameters at a very small learning rate - this process is called fine-tuning.

Transfer learning provides a better starting point for the training process, makes it possible to train a correctly performing model with smaller datasets and reduces the time required to train a model.
The intuition behind this is that there are multiple common features between various objects in real-life environments, and the features learned from one deep learning scenario can be then reused in another scenario.

Once a model is selected, it requires adequate data input preprocessing in order to perform valid training.
The input data should be normalized and resized to fit input tensor requirements.
In case of the training dataset, especially if it is quite small, applying reasonable data augmentations like random brightness, contrast, cropping, jitters or rotations can significantly improve the training process and prevent the network from overfitting.
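
A minimal sketch of the normalization and augmentation steps mentioned above, using plain Python lists in place of image tensors. The helper names, the `mean`/`std` values and the brightness range are hypothetical, and resizing is omitted for brevity.

```python
import random


def normalize(pixels, mean, std):
    """Scale 8-bit pixel values to [0, 1], then standardize them."""
    return [((p / 255.0) - mean) / std for p in pixels]


def random_brightness(pixels, max_delta, rng):
    """Shift all pixels by one random offset and clip to the 8-bit range."""
    delta = rng.randint(-max_delta, max_delta)
    return [min(255, max(0, p + delta)) for p in pixels]


rng = random.Random(0)
image = [0, 64, 128, 255]  # a tiny flattened grayscale "image"

# Augmentation is applied to training samples only,
# normalization is applied to every model input.
augmented = random_brightness(image, max_delta=32, rng=rng)
model_input = normalize(augmented, mean=0.5, std=0.25)
```

In practice, the same normalization has to be applied during training, evaluation and inference on the target device, while augmentations stay limited to the training dataset.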

In the end, a proper training procedure needs to be specified.
This step includes:

* loss function specification for the model (some weight regularization can be specified along with the loss function to reduce the chance of overfitting),
* optimizer specification (like Adam, Adagrad).
This involves setting hyperparameters properly or adding schedules and automated routines to set those hyperparameters (e.g. scheduling the learning rate value, or using LR-Finder to set the proper learning rate for the scenario),
* number of epochs specification or scheduling, e.g. early stopping can be introduced,
* providing routines for quality metrics measurements,
* providing routines for saving intermediate models during training (periodically, or the best model according to a particular quality measure).
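
Two of the items above, early stopping and keeping the best model, can be sketched without any framework. In this hypothetical example, the list of validation losses stands in for values measured after each epoch of a real training loop, and the `patience` parameter is an assumption of the sketch.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # A real training loop would save the model checkpoint here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch  # stop early
    return best_epoch, len(val_losses) - 1


best_epoch, stop_epoch = train_with_early_stopping(
    [0.9, 0.6, 0.5, 0.55, 0.58, 0.4]
)
print(best_epoch, stop_epoch)  # → 2 4
```

Training stops at epoch 4 because the loss did not improve for two consecutive epochs, so the later value 0.4 is never reached. A larger `patience` trades longer training for robustness against such temporary plateaus.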

## Model optimization

A successfully trained model may require some optimizations in order to run on given IoT hardware.
The optimizations may involve precision of weights, computational representation, or model structure.

Models are usually trained with FP32 precision or mixed precision (FP32 + FP16, depending on the operator).
Some targets, on the other hand, may significantly benefit from changing the precision from FP32 to FP16, INT8 or INT4.
The optimizations here are straightforward for the FP16 precision, but the integer-based quantizations require dataset calibration to reduce precision without a significant loss in a model's quality.
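
To illustrate why integer quantization needs calibration data, here is a minimal sketch of affine INT8 quantization in plain Python. It assumes a simple min/max calibration over a handful of sample values. Real toolchains typically calibrate per tensor or per channel and may use other range estimators, so the function names and the scheme itself are assumptions of this example.

```python
def calibrate(samples, qmin=-128, qmax=127):
    """Derive an affine INT8 scale and zero point from calibration data."""
    lo = min(min(samples), 0.0)  # the representable range must contain 0.0
    hi = max(max(samples), 0.0)
    if hi == lo:
        raise ValueError("calibration data must not be all zeros")
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an integer in [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return min(qmax, max(qmin, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


calibration_samples = [-1.0, -0.25, 0.0, 0.5, 3.0]
scale, zero_point = calibrate(calibration_samples)
quantized = [quantize(x, scale, zero_point) for x in calibration_samples]
restored = [dequantize(q, scale, zero_point) for q in quantized]
```

Values inside the calibrated range are restored with an error of at most one quantization step (`scale`). Values outside of it are clipped, which is why calibration data should be representative of the inputs seen after deployment.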

Other optimizations change the computational representation of the model by e.g. layer fusion or specialized operators for convolutions of a particular shape, among others.

Finally, there are algorithmic optimizations that change the entire model structure, such as weight pruning, conditional computation and model distillation (where the current model acts as a teacher that is supposed to improve the quality of a much smaller model).
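
As an illustration of one of these techniques, the sketch below shows unstructured magnitude-based pruning of a flat list of weights. This is only one common pruning variant, chosen here for brevity. The function name and the `sparsity` parameter are assumptions of the example.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices of the weights, ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # → [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

Zeroed weights only reduce model size or inference time when the runtime or the target hardware can exploit sparsity. In practice, pruning is usually followed by a short fine-tuning phase.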

If these model optimizations are applied, the optimized models should be evaluated using the same metrics as the original model.
This is required in order to find any drops in quality.
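
A minimal sketch of such a comparison, assuming a classification task evaluated with accuracy. The predictions, labels and the acceptable drop threshold are made-up values used only for illustration.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


def quality_drop(original_preds, optimized_preds, labels):
    """Positive values mean the optimized model performs worse."""
    return accuracy(original_preds, labels) - accuracy(optimized_preds, labels)


# The same test dataset and metric are used for both models.
labels = [0, 1, 1, 2, 2, 0, 1, 2]
original_preds = [0, 1, 1, 2, 2, 0, 1, 1]   # 7 of 8 correct
optimized_preds = [0, 1, 1, 2, 0, 0, 1, 1]  # 6 of 8 correct

drop = quality_drop(original_preds, optimized_preds, labels)
max_acceptable_drop = 0.05  # hypothetical project-specific threshold
print(drop, drop <= max_acceptable_drop)  # → 0.125 False
```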

## Model compilation and deployment

Deep learning compilers can transform model representation to:

* source code in a different programming language, e.g. [Halide](https://halide-lang.org), C, C++, Java, that can be later used on a given target,
* machine code utilizing available hardware accelerators with e.g. OpenGL, OpenCL, CUDA, TensorRT, ROCm libraries,
* an FPGA bitstream,
* other targets.

Those compiled models are optimized to perform as efficiently as possible on given target hardware.

In the final step, the models are deployed on a hardware device.
227 changes: 227 additions & 0 deletions _sources/gallery/displaying-information-example.md.txt
@@ -0,0 +1,227 @@
# Displaying information about available classes

The Kenning project provides several scripts for accessing information about classes (such as [](dataset-api), [](modelwrapper-api), [](optimizer-api)).

Below, we provide an overview of means to display this information.

First, make sure that Kenning is installed:
```bash
pip install "kenning @ git+https://github.com/antmicro/kenning.git"
```

## Kenning list

`kenning list` lists all available classes, grouping them by the base class (group of modules).

The script can be executed as follows:

```bash
kenning list
```

This will return a list similar to the one below:

```
Optimizers (in kenning.optimizers):

kenning.optimizers.onnx.ONNXCompiler
kenning.optimizers.tensorflow_optimizers.TensorFlowOptimizer
kenning.optimizers.tvm.TVMCompiler
kenning.optimizers.iree.IREECompiler
kenning.optimizers.tensorflow_pruning.TensorFlowPruningOptimizer
kenning.optimizers.tensorflow_clustering.TensorFlowClusteringOptimizer
kenning.optimizers.nni_pruning.NNIPruningOptimizer
kenning.optimizers.tflite.TFLiteCompiler
kenning.optimizers.model_inserter.ModelInserter

Datasets (in kenning.datasets):

kenning.datasets.random_dataset.RandomizedClassificationDataset
kenning.datasets.coco_dataset.COCODataset2017
kenning.datasets.open_images_dataset.OpenImagesDatasetV6
kenning.datasets.helpers.detection_and_segmentation.ObjectDetectionSegmentationDataset
kenning.datasets.magic_wand_dataset.MagicWandDataset
kenning.datasets.common_voice_dataset.CommonVoiceDataset
kenning.datasets.pet_dataset.PetDataset
kenning.datasets.random_dataset.RandomizedDetectionSegmentationDataset
kenning.datasets.imagenet_dataset.ImageNetDataset
kenning.datasets.visual_wake_words_dataset.VisualWakeWordsDataset

Modelwrappers (in kenning.modelwrappers):

kenning.modelwrappers.instance_segmentation.pytorch_coco.PyTorchCOCOMaskRCNN
kenning.modelwrappers.object_detection.darknet_coco.TVMDarknetCOCOYOLOV3
kenning.modelwrappers.instance_segmentation.yolact.YOLACTWithPostprocessing
kenning.modelwrappers.classification.tensorflow_imagenet.TensorFlowImageNet
kenning.modelwrappers.instance_segmentation.yolact.YOLACTWrapper
kenning.modelwrappers.object_detection.yolo_wrapper.YOLOWrapper
kenning.modelwrappers.frameworks.tensorflow.TensorFlowWrapper
kenning.modelwrappers.classification.tflite_magic_wand.MagicWandModelWrapper
kenning.modelwrappers.classification.tflite_person_detection.PersonDetectionModelWrapper
kenning.modelwrappers.instance_segmentation.yolact.YOLACT
kenning.modelwrappers.frameworks.pytorch.PyTorchWrapper
kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2
kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4
kenning.modelwrappers.classification.pytorch_pet_dataset.PyTorchPetDatasetMobileNetV2

...

```

The output of the command can be limited by providing one or more positional arguments representing module groups:
* `optimizers`,
* `runners`,
* `dataproviders`,
* `datasets`,
* `modelwrappers`,
* `onnxconversions`,
* `outputcollectors`,
* `runtimes`.

For example, the following command lists only the available runtimes:

```bash
kenning list runtimes
```

This will return a list similar to the one below:

```
Runtimes (in kenning.runtimes):

kenning.runtimes.iree.IREERuntime
kenning.runtimes.tflite.TFLiteRuntime
kenning.runtimes.pytorch.PyTorchRuntime
kenning.runtimes.tvm.TVMRuntime
kenning.runtimes.onnx.ONNXRuntime
kenning.runtimes.renode.RenodeRuntime
```

More verbose information is available with `-v` and `-vv` flags. They will display dependencies, descriptions and other information for each class.

## Kenning info

`kenning info` displays more detailed information about a particular class.
This information is especially useful when creating JSON scenario configurations.
The command displays the following:

* docstrings
* dependencies, along with information on availability in the current Python environment
* supported input and output formats
* argument structure used in JSON

Let's consider a scenario where we want to compose a Kenning flow utilizing a YOLOv4 ModelWrapper.
Execute the following command:

```bash
kenning info kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4
```

This will display all the necessary information about the class:

```
Class: ONNXYOLOV4

Input/output specification:
* input
    * shape: (1, 3, keyparams['width'], keyparams['height'])
    * dtype: float32
* output
    * shape: (1, 255, (keyparams['width'] // (8 * (2 ** 0))), (keyparams['height'] // (8 * (2 ** 0))))
    * dtype: float32
* output.3
    * shape: (1, 255, (keyparams['width'] // (8 * (2 ** 1))), (keyparams['height'] // (8 * (2 ** 1))))
    * dtype: float32
* output.7
    * shape: (1, 255, (keyparams['width'] // (8 * (2 ** 2))), (keyparams['height'] // (8 * (2 ** 2))))
    * dtype: float32
* detection_output
    * type: List[DetectObject]

Dependencies:
* torch
* numpy
* onnx
* torch.nn.functional

Arguments specification:
* classes
    * argparse_name: --classes
    * convert-type: builtins.str
    * type
        * string
    * description: File containing Open Images class IDs and class names in CSV format to use (can be generated using kenning.scenarios.open_images_classes_extractor) or class type
    * default: coco
* model_path
    * argparse_name: --model-path
    * convert-type: kenning.utils.resource_manager.ResourceURI
    * type
        * string
    * description: Path to the model
    * required: True
```

```{note}
By default, the command only performs static code analysis.
For example, some values in the input/output specification are expressions because the command did not import or evaluate any values.
This is done to allow for missing dependencies.
```
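
The argument names printed by `kenning info` map onto the parameters of a JSON scenario. The sketch below builds such a fragment with Python's `json` module. It is an illustration under assumptions: the `model_wrapper` key and the `type`/`parameters` layout are assumed here and should be checked against the Kenning documentation on JSON scenarios, and the model path is only an example value.

```python
import json

# Argument names and the default value are taken from the
# "Arguments specification" section printed by `kenning info`.
model_wrapper = {
    "type": "kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4",
    "parameters": {
        # Required argument; the path is an example value.
        "model_path": "kenning:///models/detection/yolov4.onnx",
        # Optional argument, shown with its default value.
        "classes": "coco",
    },
}

# Assumption: the key name and the type/parameters layout follow
# the Kenning JSON scenario format.
scenario_fragment = {"model_wrapper": model_wrapper}
print(json.dumps(scenario_fragment, indent=4))
```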

### Loading a class with arguments

To gain access to more detailed information, the `--load-class-with-args` argument can be used.
Provided that all dependencies are satisfied, the script will load the verified module to collect more detailed information about available settings.

In the example above, the ONNXYOLOV4 configuration specifies that the `model_path` argument is required.
No warning message was displayed, which means that all dependencies are available.

To load a class with arguments, run this command:

```bash
kenning info kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4 \
--load-class-with-args \
--model-path kenning:///models/detection/yolov4.onnx
```

```
Class: ONNXYOLOV4

Input/output specification:
* input
    * shape: (1, 3, 608, 608)
    * dtype: float32
* output
    * shape: (1, 255, 76, 76)
    * dtype: float32
* output.3
    * shape: (1, 255, 38, 38)
    * dtype: float32
* output.7
    * shape: (1, 255, 19, 19)
    * dtype: float32
* detection_output
    * type: List[DetectObject]

Dependencies:
* onnx
* numpy
* torch.nn.functional
* torch

Arguments specification:
* classes
    * argparse_name: --classes
    * convert-type: builtins.str
    * type
        * string
    * description: File containing Open Images class IDs and class names in CSV format to use (can be generated using kenning.scenarios.open_images_classes_extractor) or class type
    * default: coco
* model_path
    * argparse_name: --model-path
    * convert-type: kenning.utils.resource_manager.ResourceURI
    * type
        * string
    * description: Path to the model
    * required: True
```
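
The concrete shapes in this output follow from the expressions printed by the static analysis once the model parameters are known. The short sketch below evaluates those expressions in plain Python, taking `width` and `height` of 608 from the input shape reported above. The `keyparams` dictionary is recreated by hand here and is not read from Kenning.

```python
keyparams = {"width": 608, "height": 608}

# Expressions copied from the static `kenning info` output,
# one per model output (exponents 0, 1 and 2).
output_shapes = [
    (
        1,
        255,
        keyparams["width"] // (8 * (2 ** i)),
        keyparams["height"] // (8 * (2 ** i)),
    )
    for i in range(3)
]
print(output_shapes)
# → [(1, 255, 76, 76), (1, 255, 38, 38), (1, 255, 19, 19)]
```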
