Tengine Lite with VeriSilicon TIM-VX User Manual

1. Brief

TIM-VX is a software integration module provided by VeriSilicon to facilitate the deployment of neural networks on OpenVX-enabled ML accelerators.

Tengine Lite integrates with VeriSilicon's TIM-VX library to run CNN inference on the Khadas VIM3 (Amlogic A311D).

2. How to Build

Currently, the following steps are supported only on the Khadas VIM3 and the x86_64 simulator.

Download the TIM-VX source code

$ git clone https://github.com/VeriSilicon/TIM-VX.git

Download Tengine Lite

$ git clone https://github.com/OAID/Tengine.git tengine-lite
$ cd tengine-lite

2.1 Prepare for x86_64 simulator platform

non-cross-compilation

$ cd <tengine-lite-root-dir>
$ mkdir -p ./3rdparty/tim-vx/lib/x86_64
$ mkdir -p ./3rdparty/tim-vx/include
$ cp -rf ../TIM-VX/include/*    ./3rdparty/tim-vx/include/
$ cp -rf ../TIM-VX/src    ./source/device/tim-vx/
$ cp -rf ../TIM-VX/prebuilt-sdk/x86_64_linux/include/*    ./3rdparty/tim-vx/include/
$ cp -rf ../TIM-VX/prebuilt-sdk/x86_64_linux/lib/*    ./3rdparty/tim-vx/lib/x86_64/
$ rm ./source/device/tim-vx/src/tim/vx/*_test.cc
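
Before building, it can help to confirm that the copy steps above produced the expected layout. The following is a minimal sketch, not part of Tengine: the function name check_timvx_layout is hypothetical, and it only checks the directories and the removed *_test.cc files named in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Tengine or TIM-VX).
# Usage: check_timvx_layout <tengine-lite-root-dir> <arch>
# Returns 0 when the directories created by the preparation steps exist
# and no *_test.cc files are left under src/tim/vx.
check_timvx_layout() {
    root=$1
    arch=$2
    status=0
    for dir in \
        "$root/3rdparty/tim-vx/include" \
        "$root/3rdparty/tim-vx/lib/$arch" \
        "$root/source/device/tim-vx/src"
    do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done
    # The manual deletes the unit tests; report any that are still present.
    # An unmatched glob stays literal, so the -e test filters it out.
    for f in "$root"/source/device/tim-vx/src/tim/vx/*_test.cc; do
        if [ -e "$f" ]; then
            echo "leftover test source: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the simulator build, it could be called as check_timvx_layout <tengine-lite-root-dir> x86_64; for the VIM3 preparation, the second argument would be aarch64.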

Build Tengine

$ export LD_LIBRARY_PATH=<tengine-lite-root-dir>/3rdparty/tim-vx/lib/x86_64

$ cd <tengine-lite-root-dir>
$ mkdir build && cd build
$ cmake -DTENGINE_ENABLE_TIM_VX=ON ..
$ make -j4

2.2 Prepare for Khadas VIM3 platform

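Download the prebuilt SDK for the VIM3. Run these commands in the parent directory of tengine-lite, because the copy steps below refer to ../prebuild-sdk-a311d: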

$ wget -c https://github.com/VeriSilicon/TIM-VX/releases/download/v1.1.28/aarch64_A311D_D312513_A294074_R311680_T312233_O312045.tgz
$ tar zxvf aarch64_A311D_D312513_A294074_R311680_T312233_O312045.tgz
$ mv aarch64_A311D_D312513_A294074_R311680_T312233_O312045 prebuild-sdk-a311d

$ cd <tengine-lite-root-dir>
$ mkdir -p ./3rdparty/tim-vx/lib/aarch64
$ mkdir -p ./3rdparty/tim-vx/include
$ cp -rf ../TIM-VX/include/*    ./3rdparty/tim-vx/include/
$ cp -rf ../TIM-VX/src    ./source/device/tim-vx/
$ cp -rf ../prebuild-sdk-a311d/include/*    ./3rdparty/tim-vx/include/
$ cp -rf ../prebuild-sdk-a311d/lib/*    ./3rdparty/tim-vx/lib/aarch64/
$ rm ./source/device/tim-vx/src/tim/vx/*_test.cc

2.2.1 cross-compilation

The toolchain file is located in <tengine-lite-root-dir>/toolchains.

$ export LD_LIBRARY_PATH=<tengine-lite-root-dir>/3rdparty/tim-vx/lib/aarch64

$ cd <tengine-lite-root-dir>
$ mkdir build && cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/aarch64-linux-gnu.toolchain.cmake -DTENGINE_ENABLE_TIM_VX=ON ..
$ make -j4

2.2.2 non-cross-compilation

Check the galcore version:

$ sudo dmesg | grep Galcore

If the Galcore version is lower than 6.4.3.p0.286725, reload the kernel module (root privileges required):

$ rmmod galcore
$ insmod galcore.ko
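
The version comparison above can be scripted. The sketch below assumes GNU sort, whose -V option performs a version sort; the helper name version_lt and the sample installed value are hypothetical, and the real version still has to be read from the dmesg output by hand.

```shell
#!/bin/sh
# version_lt A B: succeed when version A sorts strictly before version B.
# Assumes GNU sort (the -V "version sort" option).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

required="6.4.3.p0.286725"   # minimum version stated in this manual
installed="6.4.0.229426"     # hypothetical example; substitute the dmesg value

if version_lt "$installed" "$required"; then
    echo "galcore $installed is older than $required: reload galcore.ko"
else
    echo "galcore $installed is new enough"
fi
```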

Check for libOpenVX.so*:

$ sudo find / -name "libOpenVX.so*"

If the libOpenVX.so version in /usr/lib is lower than libOpenVX.so.1.3.0, back up the old libraries and install the ones from the prebuilt SDK:

$ cd <tengine-lite-root-dir>
$ mkdir -p Backup
$ mv /usr/lib/libOpenVX.so* ./Backup
$ cp -rf ../prebuild-sdk-a311d/lib/libOpenVX.so* /usr/lib
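
To decide whether the replacement above is needed, the version suffix can be read from the library file names. This is a rough sketch under two assumptions: the installed files follow the libOpenVX.so.<version> naming used in this manual, and GNU sort is available for its -V option. The function names are hypothetical.

```shell
#!/bin/sh
# openvx_version PATH: print the version suffix of a libOpenVX file name,
# e.g. "1.3.0" for /usr/lib/libOpenVX.so.1.3.0. Prints nothing otherwise.
openvx_version() {
    name=${1##*/}                      # strip the directory part
    case $name in
        libOpenVX.so.*) printf '%s\n' "${name#libOpenVX.so.}" ;;
    esac
}

# openvx_is_older PATH MIN: succeed when PATH carries a version below MIN.
openvx_is_older() {
    ver=$(openvx_version "$1")
    [ -n "$ver" ] && [ "$ver" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$ver" "$2" | sort -V | head -n 1)" = "$ver" ]
}

# Report every versioned library file in /usr/lib that is older than 1.3.0.
for lib in /usr/lib/libOpenVX.so.*; do
    [ -e "$lib" ] || continue          # glob matched nothing
    [ -L "$lib" ] && continue          # skip symlinks such as a short soname link
    if openvx_is_older "$lib" 1.3.0; then
        echo "outdated: $lib"
    fi
done
```

Symbolic links are skipped because a short link name would otherwise compare as older than the file it points to.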

Build Tengine

$ cd <tengine-lite-root-dir>
$ mkdir build && cd build
$ cmake -DTENGINE_ENABLE_TIM_VX=ON ..
$ make -j4

2.3 Prepare for NXP platform

non-cross-compilation

$ cd <tengine-lite-root-dir>
$ mkdir build && cd build
$ cmake -DTENGINE_ENABLE_TIM_VX=ON ..
$ make -j4

3. Demo

3.1 Dependent libraries

build-tim-vx-arm64/install/lib/
└── libtengine-lite.so

On the Khadas VIM3, these libraries must replace the existing copies in /lib/.

3.2 Set uint8 Inference mode

The TIM-VX library requires a uint8 network model, so set the precision option to TENGINE_MODE_UINT8:

/* set runtime options */
struct options opt;
opt.num_thread = num_thread;
opt.cluster = TENGINE_CLUSTER_ALL;
opt.precision = TENGINE_MODE_UINT8;
opt.affinity = 0;

3.3 Result

[khadas@Khadas tengine-lite]# ./example/tm_classification_timvx -m squeezenet_uint8.tmfile -i cat.jpg -r 1 -s 0.017,0.017,0.017 -r 10
Tengine plugin allocator TIMVX is registered.
Image height not specified, use default 227
Image width not specified, use default  227
Mean value not specified, use default   104.0, 116.7, 122.7
tengine-lite library version: 1.2-dev
TIM-VX prerun.

model file : squeezenet_uint8.tmfile
image file : cat.jpg
img_h, img_w, scale[3], mean[3] : 227 227 , 0.017 0.017 0.017, 104.0 116.7 122.7
Repeat 10 times, thread 1, avg time 2.95 ms, max_time 3.42 ms, min_time 2.76 ms
--------------------------------------
34.786182, 278
33.942883, 287
33.732056, 280
32.045452, 277
30.780502, 282

4. Support list

Vendor	Devices
Amlogic	A311D, S905D3
NXP	i.MX 8M Plus
JLQ	JA310
X86-64	Simulator

5. The uint8 quantization model

The TIM-VX NPU backend requires a uint8 tmfile as its input model file. To convert a float32 tmfile to uint8, use the following tools:

Tengine Post Training Quantization Tools
Download the uint8 quant tool
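
As background, uint8 models commonly use affine quantization: q = clamp(round(x / scale) + zero_point, 0, 255). The sketch below illustrates that general scheme with awk. It is not the Tengine quantization tool, and it is an assumption that the tool follows the same form; the function names and the sample scale and zero point are made up for illustration.

```shell
#!/bin/sh
# quantize_u8 X SCALE ZERO_POINT: print the uint8 code for float X under
# affine quantization. Hypothetical helper for illustration only.
quantize_u8() {
    awk -v x="$1" -v s="$2" -v z="$3" 'BEGIN {
        v = x / s + z                                  # position on the uint8 grid
        q = (v >= 0) ? int(v + 0.5) : -int(-v + 0.5)   # round half away from zero
        if (q < 0)   q = 0                             # clamp to the uint8 range
        if (q > 255) q = 255
        print q
    }'
}

# dequantize_u8 Q SCALE ZERO_POINT: map a uint8 code back to a float.
dequantize_u8() {
    awk -v q="$1" -v s="$2" -v z="$3" 'BEGIN { print (q - z) * s }'
}

quantize_u8 0.5 0.02 128      # prints 153
quantize_u8 10 0.02 128       # prints 255 (clamped)
dequantize_u8 153 0.02 128    # prints 0.5
```

Values outside the representable range are clamped, which is why a good calibration data set matters when choosing the scale and zero point.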
