v0.2-pre-apache-incubation
NOTE: This release predates Apache incubation.
This release comes with complete TOPI support for the NNVM compiler, which allows compilation of end-to-end workloads. We also made major improvements in supporting new backends: ROCm for AMD GPUs, and ARM GPUs. Check out the previous blog posts that describe these major improvements in detail!
- Backend support
- Support LLVM mainline (4.0, 5.0, 6.0)
- Support ROCm stack for AMD GPUs
- More robust OpenCL support for ARM GPUs
- Android RPC runtime
- Multi-threading optimization for ARM
- multi-threaded depthwise
- multi-threaded conv2d
- New schedule primitives
- storage_align for shared memory alignment
- double_buffer
- UnrollLoop: more robust version of loop unrolling that counts the maximum number of steps that can be unrolled.
- Full set of TOPI operators
- Introduce tvm.target to better specify target options for compilation.
- broadcast/reduction operators
- pooling and global pooling
- Generic target support for topi
- schedule with external libraries
- End-to-end deep learning pipelines for CPU, GPU, ARM GPU
- Tutorials
- How to load a compiled module in any language runtime
- How to use the Java runtime
- Contrib library: MIOpen, cuDNN
- Ongoing items that contain functioning pieces
- WebGL backend
- C++ compiler support
- MPS DNN
- low-bit support; introduced popcount