ML-EXray is an open-source research library for monitoring and debugging ML execution on edge devices. It provides visibility into layer-level details of ML execution and helps developers analyze and debug cloud-to-edge deployment issues. It includes a suite of instrumentation APIs for ML execution logging and an end-to-end deployment validation library. Users and app developers can catch complicated deployment issues by writing just a few lines of instrumentation and assertion code.
For more details, please see our MLSys paper: https://arxiv.org/abs/2111.04779
- Instrument your mobile apps using the ML-EXray APIs to produce EXray logs (green) from your edge devices.
- Play back the same data through a reference pipeline (often extracted from the training pipeline) to produce the reference EXray logs (blue).
- ML-EXray compares the edge logs with the reference logs and reports potential issues such as accuracy degradation, per-layer output drift, and user-defined assertion results.
- Assertion functions: users can define custom debugging assertions to validate suspected issues, such as input channel arrangement, orientation, and normalization range (a sketch follows this list).
- Log elements: users can instrument the app at any point in the pipeline to log custom variables for different debugging purposes (e.g. intermediate preprocessing values, shown as red dots).
- Reference pipelines: users can provide alternative pipelines as references, such as a previously successful deployment pipeline on a different device.
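As a concrete illustration of an assertion function, here is a minimal Python sketch; the function name and log-access pattern below are hypothetical, not ML-EXray's actual API (see the C++/Java/Python API pages for the real interfaces):

import numpy as np

# Hypothetical sketch of a user-defined assertion: flag inputs whose value
# range suggests a wrong normalization (e.g. raw [0, 255] pixels where the
# model expects [-1, 1]). Names are illustrative, not ML-EXray's actual API.
def assert_normalization_range(input_tensor, expected=(-1.0, 1.0), tol=0.1):
    lo, hi = float(np.min(input_tensor)), float(np.max(input_tensor))
    ok = (lo >= expected[0] - tol) and (hi <= expected[1] + tol)
    return ok, f"input range [{lo:.2f}, {hi:.2f}], expected ~{expected}"

# Example: run the assertion on a logged model input.
logged_input = np.random.rand(1, 224, 224, 3) * 255.0  # mis-normalized input
ok, msg = assert_normalization_range(logged_input)
if not ok:
    print("assertion failed:", msg)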
ML-EXray provides a suite of instrumentation and analysis APIs in C++, Java, and Python, corresponding to the TensorFlow Lite APIs. This documentation focuses on examples of using the tool to debug edge ML deployment issues. For complete API documentation, please refer to each individual page (C++ API, Java API, Python API).
We use an image classifier app as an example to illustrate how ML-EXray catches potential issues and bugs in an edge ML pipeline. To this end, we provide the following:
- An Android image classifier app using various TFLite ML models
- A correct reference pipeline for these models in Python
- Models and their quantized versions, as well as conversion scripts
- Scripts to analyze and reproduce the results in the ML-EXray paper
- Try the app and get the logs
- Debug model quantization issues (Colab)
- Visualize the impact of preprocessing bugs
- Evaluate per-layer latency on heterogeneous hardware
The ML-EXray instrumentation APIs generate EXray logs to help users debug potential deployment issues. For example, in each run the Android app generates a trace of logs for each inference (the same holds for the Java and Python APIs). Currently, users can use ML-EXray to log the following information:
- Input/output: model input/output, per-layer input/output, and custom intermediate values (e.g. pre-processing input/output)
- Performance metrics: end-to-end latency, per-layer latency, and memory usage
- Peripheral sensors (Android only): orientation, inertial motion, lighting, etc.
ML-EXray also includes scripts to parse the logs and perform meaningful analysis.
Download all traces:
We provide example traces from our experiments under the data folder for demo purposes.
Dataset | Trace |
---|---|
Imagenet2012_1 | download (2.72G) |
Imagenet2012_100 | download (2.82G) |
The folder structure is as follows:
trace_[DatasetName]
- [ModelName]
- [Config]_[PreprocessOption]
For example
trace_imagenet2012_100
- MobileNetV1
- Cloud
- CloudQuant
- Edge_BGR
- Edge_Bilinear
- EdgeQuant_[0,1]
- MobileNetV2
- Cloud
- CloudQuant
- Edge
- EdgeRefOp
- EdgeQuant_Bilinear
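Because the layout is uniform, traces can be enumerated programmatically. Below is a minimal Python sketch, assuming the traces were extracted under the data folder as above (the root path is illustrative):

import os

# Sketch: enumerate trace folders following the documented
# trace_[DatasetName]/[ModelName]/[Config]_[PreprocessOption] layout.
trace_root = "data/trace_imagenet2012_100"  # illustrative path
for model_name in sorted(os.listdir(trace_root)):
    model_dir = os.path.join(trace_root, model_name)
    if not os.path.isdir(model_dir):
        continue
    configs = sorted(os.listdir(model_dir))
    print(f"{model_name}: {', '.join(configs)}")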
Install the binaries on your edge devices. We provide binaries for four architectures: arm, arm64, x86, and x86_64.
Download all binaries:
Kernels | arm64 | arm | x86_64 | x86 |
---|---|---|---|---|
Optimized OpResolver | arm64 | arm | x86_64 | x86 |
Reference OpResolver | arm64 | arm | x86_64 | x86 |
The binary comes with 8 models in both floating-point and quantized versions. To test your own models, please build from source and put your model in the examples/android/TFliteCppApp/assets folder.
Models included in the app can be downloaded here
Model | Float | Int8 (Quantized) |
---|---|---|
MobileNet v1 | MobileNetV1_imagenet_224 | MobileNetV1_imagenet_quant_224 |
MobileNet v2 | MobileNetV2_imagenet_224 | MobileNetV2_imagenet_quant_224 |
MobileNet v3 Large | MobileNetV3_Large_imagenet_224 | MobileNetV3_Large_imagenet_quant_224 |
MobileNet v3 Small | MobileNetV3_Small_imagenet_224 | MobileNetV3_Small_imagenet_quant_224 |
Inception v3 | InceptionV3_imagenet_299 | InceptionV3_imagenet_quant_299 |
Densenet 121 | DenseNet121_imagenet_224 | DenseNet121_imagenet_quant_224 |
Resnet50 v2 | ResNet50V2_imagenet_224 | ResNet50V2_imagenet_quant_224 |
Downloaded models must be placed in models/ConvertedModels/[ModelName]/...tflite. For example:
models
|--ConvertedModels
| |--MobilenetV1
| | |--MobilenetV1_imagenet_224.tflite
| | |--MobilenetV1_imagenet_quant_224.tflite
| |--MobilenetV2
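A quick sanity check that the downloaded files sit where the build expects them (a sketch following the layout above; extend the dictionary with the other models from the table):

import os

# Sketch: verify downloaded models follow the
# models/ConvertedModels/[ModelName]/...tflite layout shown above.
expected = {
    "MobilenetV1": ["MobilenetV1_imagenet_224.tflite",
                    "MobilenetV1_imagenet_quant_224.tflite"],
}
for model, files in expected.items():
    for f in files:
        path = os.path.join("models", "ConvertedModels", model, f)
        print(("OK      " if os.path.exists(path) else "MISSING ") + path)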
Configure the app's running options:
- Hardware: CPU, GPU, NNAPI
- Logging level: None, IO, Embedding, Per-layer
- Resizing options: interpolation, area-averaging
- Channel arrangement: RGB, BGR
- Normalization range: [0.0,1.0], [-1.0,1.0]
- Rotation and number of threads
The cloud connection is currently disabled for public testing.
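These preprocessing options are exactly the knobs that commonly go wrong in deployment. As a plain-NumPy illustration (not ML-EXray code) of what the two normalization ranges and a channel swap do to a raw uint8 image:

import numpy as np

# Illustration only (plain NumPy, not ML-EXray code): the app's
# preprocessing options applied to a raw uint8 image.
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

norm_0_1 = img.astype(np.float32) / 255.0          # range [0.0, 1.0]
norm_m1_1 = img.astype(np.float32) / 127.5 - 1.0   # range [-1.0, 1.0]
bgr = img[:, :, ::-1]                              # RGB -> BGR channel swap

# A model trained with one convention but fed data prepared with another
# degrades silently -- the class of bug ML-EXray is built to catch.
print(norm_0_1.min(), norm_0_1.max(), norm_m1_1.min(), norm_m1_1.max())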
ML-EXray is built with Bazel and compiled in Android Studio with the Bazel plugin. When compiling, make sure to specify the Bazel flag matching the device architecture, e.g. --config=android_x86_64, --config=android_arm64, --config=android_x86, or --config=android_arm.
- Android Studio Arctic Fox 2020.3.1
- Android Studio Bazel plugin
- Android SDK 11.0
- Android SDK Build-tools 30.0.3
- Android NDK 21.4.7075529
- Java runtime environment (OpenJDK JRE)
- JDK (default-jdk)
- Android Emulator 30.8.4 (requires qemu_kvm)
- Load the project in Android Studio as a Bazel project using Bazel import
- Download the Android SDK and NDK from the SDK Manager in Android Studio
- Specify the SDK and NDK paths in the WORKSPACE file:
android_ndk_repository(
name = "androidndk",
path = "<path-to-android-root>/Sdk/ndk/21.4.7075529",
)
android_sdk_repository(
name = "androidsdk",
path = "<path-to-android-root>/Sdk",
)
- Compile the Bazel target //examples/android/TFliteCppApp:edgemlinsight with the architecture flag, e.g. --config=android_arm64,
- or equivalently use the Bazel command line as follows:
bazel build //examples/android/TFliteCppApp:edgemlinsight
--config=android_arm64
- Download an Android image from the emulator manager in Android Studio
- Create an emulated device using the image
- Install qemu_kvm for virtualization
- Make sure the user has ownership of /dev/kvm; otherwise the emulator will be terminated:
sudo chown $USER /dev/kvm
- Compile the app with the --config flag matching the emulator architecture, e.g. --config=android_x86_64
- Run the app on the emulator device (AVD)
To evaluate classification accuracy from EXray logs, please refer to the following script:
python3 scripts/analysis/TopKAccuracy.py
We provide example traces (in the ./data folder) of the same model running in different configurations (quantized vs. unquantized, BGR vs. RGB input, normalization range of [0,1] vs. [-1,1]). The script evaluates and compares the accuracies of these configurations.
To evaluate your model with different configurations on your own dataset:
- Load your benchmark dataset (e.g. data/0_data/imagenet2012_1) onto your Android device under /sdcard/edgeml/
- Choose your model and run the app
- Use adb to pull the logs from the phone
- Evaluate the accuracy against your labels (the sketch after this list shows the top-k computation conceptually)
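Conceptually, the accuracy evaluation boils down to a top-k computation like the sketch below (assuming per-image prediction scores and integer ground-truth labels; this is illustrative, not the actual code of TopKAccuracy.py):

import numpy as np

# Sketch of top-k accuracy: `scores` is an (N, num_classes) array of model
# outputs; `labels` holds the ground-truth class index per image.
def top_k_accuracy(scores, labels, k=5):
    top_k = np.argsort(scores, axis=1)[:, -k:]   # k highest-scoring classes
    hits = [labels[i] in top_k[i] for i in range(len(labels))]
    return float(np.mean(hits))

scores = np.random.rand(100, 1000)             # stand-in for logged outputs
labels = np.random.randint(0, 1000, size=100)  # stand-in for ground truth
print("top-1:", top_k_accuracy(scores, labels, k=1))
print("top-5:", top_k_accuracy(scores, labels, k=5))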
- Load your image dataset (following the format of the image data folder, e.g. data/0_data/imagenet2012_1) onto your Android device under /sdcard/edgeml/
- In the app, choose the local data source and select the image folder you just loaded
- Run any model in both its quantized and floating-point versions
- Run the analysis script to compare the logs
The following scripts reproduce the results for MobileNet v2 and v3. Each compares the quantized model's per-layer output against a correct cloud reference pipeline. Note that the quantized model can be run with either optimized kernels or reference kernels; see the paper for more details.
# For Mobilenet v2
python3 scripts/analysis/mobilenetv2_quant_refop.py
# For Mobilenet v3
python3 scripts/analysis/mobilenetv3_quant_refop.py
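At its core, the comparison reduces to a per-layer error metric between the two runs. The sketch below conveys the idea (assuming paired per-layer output tensors from the edge and reference logs; the metric and names are illustrative, not the scripts' actual code):

import numpy as np

# Sketch: per-layer output drift between an edge run and a reference run,
# assuming `edge_layers` / `ref_layers` map layer names to output tensors.
def per_layer_rmse(edge_layers, ref_layers):
    drift = {}
    for name, ref_out in ref_layers.items():
        edge_out = edge_layers[name].astype(np.float32)
        drift[name] = float(np.sqrt(np.mean((edge_out - ref_out) ** 2)))
    return drift

# Toy example with one layer; real logs would supply many layers.
ref_layers = {"conv1": np.random.rand(1, 112, 112, 32).astype(np.float32)}
edge_layers = {"conv1": ref_layers["conv1"]
               + np.random.normal(0, 0.01, (1, 112, 112, 32)).astype(np.float32)}
print(per_layer_rmse(edge_layers, ref_layers))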
For a full tutorial, please check the Colab notebook.
To play back the same dataset using the Python API, please check the Python API documentation.
You can use the same scripts above to visualize the per-layer differences between each run.
We build on top of the TFLite benchmark tool to evaluate per-layer latency, categorized by layer type. This information is logged in summary.log in each EXray trace.
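Aggregating the per-layer numbers by layer type is a simple group-by, sketched below (assuming the records have already been parsed into (layer_type, latency_ms) pairs; the parsing itself depends on the summary.log format, which is not shown here):

from collections import defaultdict

# Sketch: total per-layer latency grouped by layer type, assuming records
# parsed from summary.log into (layer_type, latency_ms) pairs (values here
# are made up for illustration).
records = [("CONV_2D", 1.8), ("CONV_2D", 2.1),
           ("DEPTHWISE_CONV_2D", 0.9), ("AVERAGE_POOL_2D", 0.2)]

totals = defaultdict(float)
for layer_type, latency_ms in records:
    totals[layer_type] += latency_ms
for layer_type, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{layer_type}: {total:.2f} ms")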
Please inspect the EXray logs for end-to-end latency with and without logging enabled. To disable logging, pass no_logging_=True as the argument to EdgeMLMonitor_Native.
If ML-EXray helps your edge ML deployment, please cite:
@inproceedings{mlexray,
author = {Qiu, Hang and Vavelidou, Ioanna and Li, Jian and Pergament, Evgenya and Warden, Pete and Chinchali, Sandeep and Asgar, Zain and Katti, Sachin},
booktitle = {Proceedings of Machine Learning and Systems},
editor = {D. Marculescu and Y. Chi and C. Wu},
pages = {337--351},
title = {ML-EXray: Visibility into ML Deployment on the Edge},
url = {https://proceedings.mlsys.org/paper/2022/file/76dc611d6ebaafc66cc0879c71b5db5c-Paper.pdf},
volume = {4},
year = {2022}
}