Commit 8c95bbc: update d6c7b4d
GHA committed Jan 18, 2024
Showing 273 changed files with 37,418 additions and 0 deletions.

4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 3626347df4535d9c382d62268fe93b89
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/accuracy-inference-time-comparison.png
Binary file added _images/class-flow.png
Binary file added _images/classification-metrics-comparison.png
Binary file added _images/confusion-matrix.png
Binary file added _images/confusion_matrix.png
Binary file added _images/cpu_memory_usage.png
Binary file added _images/cpu_usage.png
Binary file added _images/inference_time.png
Binary file added _images/kenninglogo.png
Binary file added _images/pipeline-manager-kenningflow-example.png
Binary file added _images/pipeline-manager-visualisation.png
Binary file added _images/pruning-nni-classification-comparison.png
Binary file added _images/pruning-nni-gpu-mem-comparison.png
Binary file added _images/pruning-nni-gpu-usage-comparison.png
Binary file added _images/pruning-tf-classification-comparison.png
Binary file added _images/report-mosaic.png
Binary file added _images/utilization-comparison.png
486 changes: 486 additions & 0 deletions _sources/cmd-usage.md.txt

Large diffs are not rendered by default.

91 changes: 91 additions & 0 deletions _sources/dl-deployment-stack.md.txt
@@ -0,0 +1,91 @@
# Deep Learning deployment stack

This chapter lists and describes typical actions performed on deep learning models before deployment on target devices.

## From training to deployment

A deep learning application deployed on IoT devices usually goes through the following process:

* a dataset is prepared for a deep learning process,
* evaluation metrics are specified based on a given dataset and outputs,
* the data in the dataset is analyzed and data loaders that perform the necessary preprocessing are implemented,
* the deep learning model is either designed from scratch, or a baseline is selected from a wide range of existing pre-trained models for the given application (classification, detection, semantic segmentation, instance segmentation, etc.) and adjusted to the particular use case,
* a loss function and a learning algorithm are specified along with the deep learning model,
* the model is trained, evaluated and improved,
* the model is compiled to a representation that is applicable to a given target,
* the model is executed on a target device.

## Dataset preparation

If a model is not available, or the available model was trained for a different use case, it needs to be trained or re-trained.

Each model requires a dataset - a set of sample inputs (audio signals, images, video sequences, OCT images, data from other sensors) and, usually, the corresponding outputs (association with a class or classes, object location, object mask, input description).
Datasets are usually split into the following categories:

* training dataset - the largest subset that is used to train a model,
* validation dataset - a relatively small set that is used to verify model performance after each training epoch (the metrics and loss function values show if any overfitting occurred during the training process),
* test dataset - the subset that acts as the final evaluation of a trained model.

It is required that the test dataset and the training dataset are mutually exclusive, so that the evaluation results are not biased in any way.
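As a quick sketch of producing such a split (a minimal example in plain Python; the 80/10/10 ratios and the `samples` argument are illustrative assumptions, not a fixed rule):

```python
import random

def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle and split samples into mutually exclusive subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # held out for final evaluation only
    return train, val, test
```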

Datasets can be either designed from scratch or found in e.g.:

* [Kaggle datasets](https://www.kaggle.com),
* [Google Dataset Search](https://datasetsearch.research.google.com),
* [Dataset list](https://www.datasetlist.com),
* Universities' pages,
* [Open Images Dataset](https://storage.googleapis.com/openimages/web/index.html),
* [Common Voice Dataset](https://commonvoice.mozilla.org/en).

## Model preparation and training

Currently, the most popular approach is to find an existing model that fits a given problem and perform transfer learning to adapt the model to the requirements.
In transfer learning, the existing model's final layers are slightly modified to adapt to a new problem. These updated final layers of the model are trained using the training dataset.
Finally, some additional layers are unfrozen and the training is performed on a larger number of parameters at a very small learning rate - this process is called fine-tuning.

Transfer learning provides a better starting point for the training process, makes it possible to train a well-performing model with smaller datasets, and reduces the time required to train a model.
The intuition behind this is that various objects in real-life environments share multiple common features, and the features learned in one deep learning scenario can then be reused in another.
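As an illustration, a minimal transfer-learning sketch in PyTorch (assuming torchvision is available; the MobileNetV2 backbone, the 37-class head and the number of unfrozen layers are arbitrary choices for this example):

```python
import torch.nn as nn
import torchvision

# Start from a backbone pre-trained on ImageNet (torchvision shown for illustration).
model = torchvision.models.mobilenet_v2(weights="DEFAULT")

# Freeze the feature extractor so that only the new head is trained at first.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final classifier layer to match the new problem
# (37 classes is an arbitrary choice for this example).
model.classifier[1] = nn.Linear(model.last_channel, 37)

# Fine-tuning: unfreeze some of the deeper feature layers and continue
# training with a very small learning rate.
for param in model.features[-3:].parameters():
    param.requires_grad = True
```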

Once a model is selected, it requires adequate data input preprocessing in order to perform valid training.
The input data should be normalized and resized to fit input tensor requirements.
In the case of the training dataset, especially if it is quite small, applying reasonable data augmentations like random brightness, contrast, cropping, jitter or rotations can significantly improve the training process and prevent the network from overfitting.
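A sketch of such a preprocessing and augmentation setup, using torchvision transforms (the crop sizes, jitter strengths and normalization statistics are common ImageNet-style defaults, not requirements):

```python
from torchvision import transforms

# Augmentations apply to the training set only; validation and test data
# get deterministic resizing plus the same normalization.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),                     # random cropping
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # brightness/contrast jitter
    transforms.RandomRotation(15),                         # random rotations
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],       # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
eval_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```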

In the end, a proper training procedure needs to be specified.
This step includes (a sketch combining these elements follows the list):

* loss function specification for the model; weight regularization can be specified along with the loss function to reduce the chance of overfitting,
* optimizer specification (like Adam, Adagrad); this involves setting hyperparameters properly, or adding schedules and automated routines to set them (e.g. scheduling the learning rate value, or using LR-Finder to find a proper learning rate for the scenario),
* specification or scheduling of the number of epochs, e.g. introducing early stopping,
* routines for measuring quality metrics,
* routines for saving intermediate models during training (periodically, or the best model according to a particular quality measure).
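A minimal sketch combining these elements, continuing the PyTorch examples above (`train_one_epoch` and `evaluate` are hypothetical helpers, and the learning rate, patience and epoch counts are arbitrary):

```python
import torch

criterion = torch.nn.CrossEntropyLoss()                    # loss function
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-4,
    weight_decay=1e-5,                                     # weight regularization
)
# Reduce the learning rate when the validation loss stops improving.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3)

best_val_loss = float("inf")
epochs_without_improvement = 0
for epoch in range(100):
    train_one_epoch(model, criterion, optimizer)           # hypothetical helper
    val_loss = evaluate(model, criterion)                  # hypothetical helper
    scheduler.step(val_loss)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_without_improvement = 0
        torch.save(model.state_dict(), "best-model.pth")   # keep the best model
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= 10:               # early stopping
            break
```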

## Model optimization

A successfully trained model may require some optimizations in order to run on given IoT hardware.
The optimizations may involve precision of weights, computational representation, or model structure.

Models are usually trained with FP32 precision or mixed precision (FP32 + FP16, depending on the operator).
Some targets, on the other hand, may significantly benefit from changing the precision from FP32 to FP16, INT8 or INT4.
The optimization is straightforward for FP16 precision, but integer-based quantization requires calibration on a dataset to reduce precision without a significant loss in model quality.
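As one concrete example, TensorFlow Lite performs post-training INT8 quantization using a representative dataset for calibration (a sketch; the model path and the `calibration_samples` iterable are assumptions for illustration):

```python
import tensorflow as tf

def representative_dataset():
    # Yield a few hundred preprocessed input samples so that the converter
    # can calibrate activation ranges for integer quantization.
    for sample in calibration_samples:  # hypothetical iterable of input tensors
        yield [sample]

converter = tf.lite.TFLiteConverter.from_saved_model("trained-model/")  # assumed path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer quantization of weights and activations.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_model = converter.convert()
```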

Other optimizations change the computational representation of the model by e.g. layer fusion or specialized operators for convolutions of a particular shape, among others.

In the end, there are algorithmic optimizations that change the entire model structure, like weight pruning, conditional computation, or knowledge distillation (where the current model acts as a teacher used to improve the quality of a much smaller student model).
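For instance, magnitude-based weight pruning can be sketched with PyTorch's built-in pruning utilities (the 50% sparsity level and the `model` from the earlier sketches are illustrative):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Zero out the 50% of weights with the smallest L1 magnitude in every
# convolutional and linear layer; the sparsity level is illustrative.
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # fold the pruning mask into the weights
```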

If these model optimizations are applied, the optimized models should be evaluated using the same metrics as the original model.
This is required in order to find any drops in quality.

## Model compilation and deployment

Deep learning compilers can transform model representation to:

* a source code for a different programming language, e.g. [Halide](https://halide-lang.org), C, C++, Java, that can be later used on a given target,
* machine code utilizing available hardware accelerators with e.g. OpenGL, OpenCL, CUDA, TensorRT or ROCm libraries,
* an FPGA bitstream,
* other targets.

Those compiled models are optimized to perform as efficiently as possible on given target hardware.
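As a sketch of this step, Apache TVM can compile an ONNX model into a shared library for a given target (the file names, input name and input shape are assumptions):

```python
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("model.onnx")              # assumed file name
shape_dict = {"input": (1, 3, 224, 224)}          # assumed input name and shape
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Compile for the host CPU; the target string selects the hardware backend.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

lib.export_library("model.so")                    # shared library ready for deployment
```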

In the final step, the models are deployed on a hardware device.
227 changes: 227 additions & 0 deletions _sources/gallery/displaying-information-example.md.txt
@@ -0,0 +1,227 @@
# Displaying information about available classes

The Kenning project provides several scripts for accessing information about classes (such as [](dataset-api), [](modelwrapper-api), [](optimizer-api)).

Below, we provide an overview of means to display this information.

First, make sure that Kenning is installed:
```bash
pip install "kenning @ git+https://github.com/antmicro/kenning.git"
```

## Kenning list

`kenning list` lists all available classes, grouping them by the base class (group of modules).

The script can be executed as follows:

```bash
kenning list
```

This will return a list similar to the one below:

```
Optimizers (in kenning.optimizers):

kenning.optimizers.onnx.ONNXCompiler
kenning.optimizers.tensorflow_optimizers.TensorFlowOptimizer
kenning.optimizers.tvm.TVMCompiler
kenning.optimizers.iree.IREECompiler
kenning.optimizers.tensorflow_pruning.TensorFlowPruningOptimizer
kenning.optimizers.tensorflow_clustering.TensorFlowClusteringOptimizer
kenning.optimizers.nni_pruning.NNIPruningOptimizer
kenning.optimizers.tflite.TFLiteCompiler
kenning.optimizers.model_inserter.ModelInserter

Datasets (in kenning.datasets):

kenning.datasets.random_dataset.RandomizedClassificationDataset
kenning.datasets.coco_dataset.COCODataset2017
kenning.datasets.open_images_dataset.OpenImagesDatasetV6
kenning.datasets.helpers.detection_and_segmentation.ObjectDetectionSegmentationDataset
kenning.datasets.magic_wand_dataset.MagicWandDataset
kenning.datasets.common_voice_dataset.CommonVoiceDataset
kenning.datasets.pet_dataset.PetDataset
kenning.datasets.random_dataset.RandomizedDetectionSegmentationDataset
kenning.datasets.imagenet_dataset.ImageNetDataset
kenning.datasets.visual_wake_words_dataset.VisualWakeWordsDataset

Modelwrappers (in kenning.modelwrappers):

kenning.modelwrappers.instance_segmentation.pytorch_coco.PyTorchCOCOMaskRCNN
kenning.modelwrappers.object_detection.darknet_coco.TVMDarknetCOCOYOLOV3
kenning.modelwrappers.instance_segmentation.yolact.YOLACTWithPostprocessing
kenning.modelwrappers.classification.tensorflow_imagenet.TensorFlowImageNet
kenning.modelwrappers.instance_segmentation.yolact.YOLACTWrapper
kenning.modelwrappers.object_detection.yolo_wrapper.YOLOWrapper
kenning.modelwrappers.frameworks.tensorflow.TensorFlowWrapper
kenning.modelwrappers.classification.tflite_magic_wand.MagicWandModelWrapper
kenning.modelwrappers.classification.tflite_person_detection.PersonDetectionModelWrapper
kenning.modelwrappers.instance_segmentation.yolact.YOLACT
kenning.modelwrappers.frameworks.pytorch.PyTorchWrapper
kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2
kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4
kenning.modelwrappers.classification.pytorch_pet_dataset.PyTorchPetDatasetMobileNetV2

...

```

The output of the command can be limited by providing one or more positional arguments representing module groups:
* `optimizers`,
* `runners`,
* `dataproviders`,
* `datasets`,
* `modelwrappers`,
* `onnxconversions`,
* `outputcollectors`,
* `runtimes`.

The command can also be used to list available runtimes:

```bash
kenning list runtimes
```

This will return a list similar to the one below:

```
Runtimes (in kenning.runtimes):

kenning.runtimes.iree.IREERuntime
kenning.runtimes.tflite.TFLiteRuntime
kenning.runtimes.pytorch.PyTorchRuntime
kenning.runtimes.tvm.TVMRuntime
kenning.runtimes.onnx.ONNXRuntime
kenning.runtimes.renode.RenodeRuntime
```

More verbose information is available with the `-v` and `-vv` flags, which display dependencies, descriptions and other information for each class.

## Kenning info

`kenning info` displays more detailed information about a particular class.
This information is especially useful when creating JSON scenario configurations.
The command displays the following:

* docstrings,
* dependencies, along with information on availability in the current Python environment,
* supported input and output formats,
* argument structure used in JSON.

Let's consider a scenario where we want to compose a Kenning flow utilizing a YOLOv4 ModelWrapper.
Execute the following command:

```bash
kenning info kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4
```

This will display all the necessary information about the class:

```
Class: ONNXYOLOV4

Input/output specification:
* input
* shape: (1, 3, keyparams['width'], keyparams['height'])
* dtype: float32
* output
* shape: (1, 255, (keyparams['width'] // (8 * (2 ** 0))), (keyparams['height'] // (8 * (2 ** 0))))
* dtype: float32
* output.3
* shape: (1, 255, (keyparams['width'] // (8 * (2 ** 1))), (keyparams['height'] // (8 * (2 ** 1))))
* dtype: float32
* output.7
* shape: (1, 255, (keyparams['width'] // (8 * (2 ** 2))), (keyparams['height'] // (8 * (2 ** 2))))
* dtype: float32
* detection_output
* type: List[DetectObject]

Dependencies:
* torch
* numpy
* onnx
* torch.nn.functional

Arguments specification:
* classes
* argparse_name: --classes
* convert-type: builtins.str
* type
* string
* description: File containing Open Images class IDs and class names in CSV format to use (can be generated using kenning.scenarios.open_images_classes_extractor) or class type
* default: coco
* model_path
* argparse_name: --model-path
* convert-type: kenning.utils.resource_manager.ResourceURI
* type
* string
* description: Path to the model
* required: True
```

```{note}
By default, the command only performs static code analysis.
For example, some values in the input/output specification are left as unevaluated expressions because the command does not import the module or evaluate any values.
This allows the command to work even when dependencies are missing.
```

### Loading a class with arguments

To gain access to more detailed information, the `--load-class-with-args` argument can be used.
Provided that all dependencies are satisfied, the script will load the verified module to collect more detailed information about available settings.

In the example above, the ONNXYOLOV4 specification shows that the `model_path` argument is required.
Since no warning message appears, all dependencies are available.

To load a class with arguments, run this command:

```bash
kenning info kenning.modelwrappers.object_detection.yolov4.ONNXYOLOV4 \
--load-class-with-args \
--model-path kenning:///models/detection/yolov4.onnx
```

```
Class: ONNXYOLOV4

Input/output specification:
* input
* shape: (1, 3, 608, 608)
* dtype: float32
* output
* shape: (1, 255, 76, 76)
* dtype: float32
* output.3
* shape: (1, 255, 38, 38)
* dtype: float32
* output.7
* shape: (1, 255, 19, 19)
* dtype: float32
* detection_output
* type: List[DetectObject]

Dependencies:
* onnx
* numpy
* torch.nn.functional
* torch

Arguments specification:
* classes
* argparse_name: --classes
* convert-type: builtins.str
* type
* string
* description: File containing Open Images class IDs and class names in CSV format to use (can be generated using kenning.scenarios.open_images_classes_extractor) or class type
* default: coco
* model_path
* argparse_name: --model-path
* convert-type: kenning.utils.resource_manager.ResourceURI
* type
* string
* description: Path to the model
* required: True
```

59 changes: 59 additions & 0 deletions _sources/gallery/pipeline-manager-example.md.txt
@@ -0,0 +1,59 @@
# Visualizing Kenning data flows with Pipeline Manager

[Pipeline Manager](https://github.com/antmicro/kenning-pipeline-manager) is a GUI tool that helps visualize and edit data flows.

This chapter describes how to set up Pipeline Manager and use it with Kenning graphs.

Pipeline Manager is application-agnostic and does not assume any properties of the application it is working with.
Kenning, however, implements a Pipeline Manager client which provides tools for creating complex Kenning pipelines and flows, while also allowing for running and saving these configurations directly from Pipeline Manager's editor.

![](img/pipeline-manager-visualisation.png)

## Installing Pipeline Manager

Kenning requires extra dependencies in order to run the Pipeline Manager integration.
To install them, run:

```bash
pip install "kenning[pipeline_manager] @ git+https://github.com/antmicro/kenning.git"
```

## Running Pipeline Manager with Kenning

Start the Pipeline Manager client with:

```bash timeout=10
kenning visual-editor --file-path measurements.json --workspace-dir ./workspace
```

The `--file-path` option specifies where the results of model benchmarking or the runtime data will be stored.

For runtime data, the following arguments are available:

* `--spec-type` - type of Kenning scenario to be run, can be either `pipeline` (for [optimization and deployment pipeline](../json-scenarios)) or `flow` (for creating [runtime scenarios](../kenning-flow)).
`pipeline` is the default type.
* `--host` - Pipeline Manager server address, default: `127.0.0.1`
* `--port` - Pipeline Manager server port, default: `9000`
* `--verbosity` - log verbosity

## Using Pipeline Manager

In its default configuration, the web application is available at `http://127.0.0.1:5000/`.

![](./img/pipeline-manager-kenningflow-example.png)

Below, you can find an example Pipeline Manager workflow:

* `Load File` - option available from the drop-down menu on the top left, loads a JSON configuration describing a Kenning scenario.

For instance, `scripts/jsonconfigs/sample-tflite-pipeline.json`, available in Kenning, is a basic configuration for a [Kenning example use case for benchmarking using a native framework](./tflite_tvm.md#benchmarking-a-model-using-a-native-framework).

* Graph editing - adding or removing nodes, editing connections, node options, etc.
* `Validate` - checks the scenario and reports whether it is valid.

For example, it will return an error when two optimizers in a chain are incompatible with each other.

* `Run` - creates and runs the optimization pipeline or [Kenning runtime flow](../kenning-flow).
* `Save file` - saves the current JSON scenario to a specified path.

More information about how to work with Pipeline Manager is available in the [Pipeline Manager documentation](https://antmicro.github.io/kenning-pipeline-manager/introduction.html).