Releases · intel/caffe
Caffe_v1.0.7
- Features
- [Multi-node] Support weight gradient compression for better scaling efficiency of models with large FC layers, such as VGG (see the compression sketch below)
- [Multi-node] Integrate LARS (Layer-wise Adaptive Rate Scaling) and apply it to AlexNet with BatchNorm layers at a 32K global batch size (see the LARS sketch below)
- Enable pinning internal threads, e.g., the data loader thread, to cores for more stable training performance
- Merge the GitHub pull request adding Flow LRCN support
- Support label smoothing regularization (an idea from Inception-v3; see the sketch below)
- Bug fixes
- [Multi-node] Fix the learning rate message and start the first iteration from zero on multi-node, consistent with single-node behavior
- Bug fixes on single node
- Misc
- Upgrade MKLML to 2018.0.1.20171007 and MLSL to V2
- Enhance installation and benchmarking scripts
- Update the optimized models
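The notes do not spell out the compression scheme used for FC weight gradients. As one illustration of the general idea, here is a top-k sparsification sketch in numpy: only the largest-magnitude entries of a large gradient tensor are exchanged, which is what makes layers like VGG's FC layers cheaper to communicate. The top-k approach and the `ratio` value are assumptions for illustration, not Intel Caffe's actual algorithm.

```python
import numpy as np

def topk_compress(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` of gradient entries and
    ship (indices, values) instead of the dense tensor. Illustrative only;
    the actual Intel Caffe compression scheme is not documented here."""
    k = max(1, int(grad.size * ratio))
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def topk_decompress(idx, vals, shape):
    """Rebuild a dense gradient with zeros everywhere else."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = vals
    return flat.reshape(shape)
```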
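LARS gives each layer its own local learning rate, proportional to the ratio of its weight norm to its gradient norm, which keeps update magnitudes balanced across layers at very large global batch sizes. A minimal sketch of one LARS step, following the usual formulation (You et al.); the trust coefficient `eta`, weight decay `wd`, and `momentum` values are illustrative defaults, not Intel Caffe's settings.

```python
import numpy as np

def lars_step(w, grad, v, base_lr, eta=0.001, wd=0.0005, momentum=0.9):
    """One LARS update for a single layer's weights w with momentum buffer v."""
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(grad)
    # Per-layer trust ratio: ||w|| / (||grad|| + wd * ||w||), scaled by eta.
    local_lr = eta * w_norm / (g_norm + wd * w_norm + 1e-12)
    v[:] = momentum * v + base_lr * local_lr * (grad + wd * w)
    w -= v
    return w
```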
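Label smoothing, as proposed in the Inception-v3 paper, replaces the one-hot target with a mixture of the one-hot and uniform distributions, discouraging over-confident logits. A minimal numpy sketch; `eps=0.1` is the value used in that paper.

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Map integer labels to smoothed targets: (1 - eps) on the true class
    plus eps / num_classes spread uniformly over all classes."""
    targets = np.full((len(labels), num_classes), eps / num_classes)
    targets[np.arange(len(labels)), labels] += 1.0 - eps
    return targets

# Example: the row for label 2 becomes [0.02, 0.02, 0.92, 0.02, 0.02].
print(smooth_labels(np.array([0, 2, 4]), num_classes=5))
```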
Caffe_v1.0.6
- Support DCGAN and Faster R-CNN
- Upgrade MKL-DNN version to v0.11
- Enable in-place batch normalization (see the sketch below)
- Enhance scripts for installation and benchmarking
Known issues:
- MKL-DNN compilation failure on Ubuntu 16.04
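In-place batch normalization writes the normalized result back into the input blob rather than allocating a separate output blob, saving activation memory. A minimal numpy sketch of the inference-time computation under that constraint; the buffer reuse here stands in for Caffe's in-place blob sharing and is not the library's actual code path.

```python
import numpy as np

def batchnorm_inplace(x, mean, var, gamma, beta, eps=1e-5):
    """Channel-wise batch norm on x of shape (N, C, H, W), reusing x's
    storage for the output instead of allocating a new buffer."""
    scale = gamma / np.sqrt(var + eps)     # per-channel scale
    shift = beta - mean * scale            # per-channel shift
    x *= scale.reshape(1, -1, 1, 1)        # in-place multiply
    x += shift.reshape(1, -1, 1, 1)        # in-place add
    return x
```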
Caffe_v1.0.5
- Switch the default engine to MKL-DNN
- Support Faster R-CNN under the MKL2017 engine
- Support asynchronous SGD (experimental feature; see the sketch below)
- Refine the model zoo for multi-node training
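In asynchronous SGD, workers push gradients and pull weights without a global barrier, so fast workers never wait on slow ones, at the cost of applying slightly stale gradients. A minimal parameter-server-style sketch using Python threads and a toy quadratic loss; this illustrates the general technique, not Intel Caffe's MLSL-based implementation.

```python
import threading
import numpy as np

class ParamServer:
    """Holds the shared weights and applies gradients as they arrive."""
    def __init__(self, dim, lr=0.01):
        self.w = np.zeros(dim)
        self.lr = lr
        self.lock = threading.Lock()

    def push_pull(self, grad):
        with self.lock:
            self.w -= self.lr * grad   # apply a possibly stale gradient
            return self.w.copy()       # hand back the current weights

def worker(server, steps=200):
    rng = np.random.default_rng()
    w = server.w.copy()
    for _ in range(steps):
        x = rng.normal(size=w.shape)   # stand-in for a minibatch
        grad = (w @ x - 1.0) * x       # gradient of 0.5 * (w.x - 1)^2
        w = server.push_pull(grad)     # no barrier between workers

server = ParamServer(dim=8)
threads = [threading.Thread(target=worker, args=(server,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```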
Caffe_v1.0.4a
- Improve the build and installation experience with a single script
- Provide best-performance configurations and an easy-to-use script for measuring performance
Caffe_v1.0.4
- Improve multi-node training performance significantly
- Support batch normalization statistics for multi-node training with large batch sizes
- Support computation fusion in the stochastic gradient descent update (see the fused-update sketch below)
- Add warm-up iterations before performance measurement (see the timing sketch below)
- Add initial multi-node training scripts
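Fusing the SGD update means applying weight decay, the momentum update, and the weight update in a single pass over each parameter buffer instead of three separate vectorized passes, cutting memory traffic. A minimal sketch contrasting the two forms; in the real implementation the fused loop would be a vectorized C++ kernel, and the hyperparameter defaults are illustrative.

```python
import numpy as np

def sgd_update_unfused(w, grad, v, lr=0.01, momentum=0.9, wd=0.0005):
    """Three passes over the buffers, each reading/writing full arrays."""
    g = grad + wd * w                # pass 1: weight decay
    v[:] = momentum * v + lr * g     # pass 2: momentum
    w -= v                           # pass 3: weight update

def sgd_update_fused(w, grad, v, lr=0.01, momentum=0.9, wd=0.0005):
    """One pass: each element is decayed, accumulated, and applied while
    it is still at hand (shown as a scalar loop for clarity)."""
    for i in range(w.size):
        g = grad[i] + wd * w[i]
        v[i] = momentum * v[i] + lr * g
        w[i] -= v[i]
```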
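Warm-up iterations let caches, allocator pools, and primitive setup reach steady state before timing begins, so reported throughput is not dragged down by first-iteration costs. A minimal timing sketch; `run_iteration` is a hypothetical placeholder for one training step.

```python
import time

def benchmark(run_iteration, warmup=10, iters=100):
    """Run `warmup` untimed iterations, then time `iters` more."""
    for _ in range(warmup):
        run_iteration()              # untimed: primes caches and buffers
    start = time.perf_counter()
    for _ in range(iters):
        run_iteration()
    elapsed = time.perf_counter() - start
    return iters / elapsed           # steady-state iterations per second
```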
Caffe_v1.0.3a
This release includes:
- Upgrade MKL-DNN to golden release with QFMA support
- Construct model zoo for multi-node training on Intel Caffe
Known issues:
- SSD may have problems with mklml_lnx_2018.0.20170720
- Scoring performance drops somewhat on topologies such as resnet and googlenet_v2
Caffe_v1.0.3
This release includes:
- Support large batch size multi-node training (experimental feature; see the learning rate sketch below)
  ResNet-50: up to 256 nodes with an 8K mini-batch
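Large-batch recipes like the 8K ResNet-50 run typically scale the learning rate linearly with the global batch size and ramp it up over a warm-up period; this is the common practice for large-batch SGD, not necessarily the exact schedule these scripts use. A minimal sketch with illustrative values:

```python
def lr_at(iteration, base_lr=0.1, ref_batch=256, global_batch=8192,
          warmup_iters=2000):
    """Linear scaling rule with gradual warm-up: ramp from base_lr to
    base_lr * (global_batch / ref_batch), then hold the scaled rate."""
    target_lr = base_lr * global_batch / ref_batch
    if iteration < warmup_iters:
        return base_lr + (target_lr - base_lr) * iteration / warmup_iters
    return target_lr
```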
Caffe_v1.0.2
This release includes:
- Support convolution and ReLU fusion in the training forward path (4% ~ 5% performance improvement on KNM; see the sketch below)
- Reach performance parity between MKL2017 and MKL-DNN after the MKL-DNN upgrade
- Support scale jittering data augmentation (see the sketch below)
- Support input data type for multi-node training
- Support parallel compilation for MKL-DNN, a 5x compile-time speedup
- Fix bugs in MKL-DNN integration and multi-node training
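Fusing convolution and ReLU applies the activation to each output element as it is produced, instead of writing the convolution result to memory and reading it back for a separate ReLU pass. A minimal single-channel numpy sketch of the fused semantics; the actual speedup comes from the fused MKL-DNN primitive, not from code like this.

```python
import numpy as np

def conv2d_relu_fused(x, k):
    """Valid 2-D convolution with ReLU clamped into the output store."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            acc = np.sum(x[i:i + kh, j:j + kw] * k)
            out[i, j] = acc if acc > 0.0 else 0.0   # ReLU fused into the store
    return out
```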
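Scale jittering resizes each training image so its shorter side lands at a random scale before the fixed-size crop, exposing the network to multi-scale views of the data. A minimal sketch using Pillow; the [256, 480] range follows the common VGG-style recipe and is an assumption, not necessarily this release's default.

```python
import random
from PIL import Image

def scale_jitter_crop(img, crop=224, smin=256, smax=480):
    """Resize so the shorter side equals a random S in [smin, smax],
    then take a random crop x crop patch."""
    s = random.randint(smin, smax)
    w, h = img.size
    if w < h:
        img = img.resize((s, round(h * s / w)))
    else:
        img = img.resize((round(w * s / h), s))
    w, h = img.size
    x = random.randint(0, w - crop)
    y = random.randint(0, h - crop)
    return img.crop((x, y, x + crop, y + crop))
```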
Caffe_v1.0.1
This release includes:
- Support configurable convolution algorithms (Winograd and direct) under MKL-DNN (see the Winograd sketch below)
- Support fusing the batch normalization and scale layers into the convolution layer (see the folding sketch below)
- Improve performance via an MKL-DNN upgrade with AVX-512 optimizations
- Support multi-node training with model parallelism (experimental feature)
- Support deconvolution layer under MKL2017
- Integrate MLSL deployment in Makefile
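Winograd convolution trades multiplications for additions: F(2,3) produces two outputs of a 3-tap convolution with four multiplications instead of six, which is why it can beat direct convolution on compute-bound layers. A minimal 1-D F(2,3) sketch using the standard Winograd transforms:

```python
def winograd_f23(d, g):
    """F(2,3): two outputs of convolving 4 inputs with a 3-tap filter,
    using 4 multiplications (m1..m4) instead of 6."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    return (m1 + m2 + m3, m2 - m3 - m4)

# Matches direct convolution:
# y0 = d0*g0 + d1*g1 + d2*g2 = 6, y1 = d1*g0 + d2*g1 + d3*g2 = 9
assert winograd_f23([1, 2, 3, 4], [1, 1, 1]) == (6.0, 9.0)
```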
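Folding the batch-norm and scale layers into the preceding convolution rewrites the convolution's weights and bias so one layer computes the same function, removing two extra passes over the activations. Caffe splits the operation into a BatchNorm layer (mean/variance) and a Scale layer (gamma/beta); a minimal numpy sketch of the algebra that absorbs both:

```python
import numpy as np

def fold_bn_scale_into_conv(W, b, mean, var, gamma, beta, eps=1e-5):
    """Rewrite conv weights W (out_ch, in_ch, kh, kw) and bias b so that
    conv(x, W_f) + b_f == scale(batchnorm(conv(x, W) + b))."""
    s = gamma / np.sqrt(var + eps)          # per-output-channel scale
    W_f = W * s.reshape(-1, 1, 1, 1)        # scale each output filter
    b_f = (b - mean) * s + beta             # fold mean/shift into the bias
    return W_f, b_f
```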
Caffe_v1.0.0
This release includes:
- Improve performance by 20% ~ 50% with the MKL-DNN engine
- Upgrade MKL engine to mklml_lnx_2018.0.20170425
- Support single-batch-size optimization for the default CAFFE engine