Official TensorFlow implementation of ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression.
Keywords: Swin Transformer · ConvNeXt · Learning-based Codecs · Image Compression · TensorFlow
Please do not hesitate to open an issue to report any problem you find in this repository. You can also email me with questions or comments.
- This repository is built upon the official TensorFlow implementation of Channel-Wise Autoregressive Entropy Models for Learned Image Compression. This baseline is referred to as Conv-ChARM.
- We provide lightweight versions of the models by removing the latent residual prediction (LRP) transform and slicing latent means and scales, as done in the TensorFlow reimplementation of SwinT-ChARM from the original paper Transformer-Based Transform Coding.
- Refer to the TensorFlow Compression (TFC) library to build your own ML models with end-to-end optimized data compression built in.
- Refer to the API documentation for a complete description of the classes and functions in the TensorFlow Compression (TFC) library.
Python >= 3.6
tensorflow_compression
tensorflow_datasets
tensorflow_addons
einops
All packages used in this repository are listed in requirements.txt. To install those, run:
pip install -r requirements.txt
convnext-charm
│
├── conv-charm.py          # Conv-ChARM model
├── conv-charm_lrp.py      # Conv-ChARM model with latent residual prediction (LRP)
├── convnext-charm.py      # ConvNeXt-ChARM model
├── convnext-charm_lrp.py  # ConvNeXt-ChARM model with latent residual prediction (LRP)
├── swint-charm.py         # SwinT-ChARM model
├── swint-charm_lrp.py     # SwinT-ChARM model with latent residual prediction (LRP)
├── utils.py               # Utility scripts
│
├── layers/
│   ├── convNext.py        # ConvNeXt block layers
│   └── swinTransformer.py # Swin Transformer block layers
│
└── figures/               # Documentation figures
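The ConvNeXt block implemented in layers/convNext.py follows the standard design from the ConvNeXt architecture: a 7×7 depthwise convolution, LayerNorm, a pointwise (1×1) expansion, GELU, a pointwise projection, and a residual connection. The following is a schematic NumPy sketch of that block structure for illustration only; it is not the repository's TensorFlow code, and the weight initializations here are arbitrary:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # LayerNorm over the channel (last) axis.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def depthwise_conv7x7(x):
    # 7x7 depthwise convolution, stride 1, zero padding (random toy kernel).
    h, w, c = x.shape
    pad = np.pad(x, ((3, 3), (3, 3), (0, 0)))
    kernel = np.random.default_rng(0).normal(0.0, 0.02, size=(7, 7, c))
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + 7, j:j + 7] * kernel, axis=(0, 1))
    return out

def convnext_block(x, expansion=4):
    # Depthwise 7x7 -> LayerNorm -> 1x1 expand -> GELU -> 1x1 project -> residual.
    c = x.shape[-1]
    rng = np.random.default_rng(1)
    w1 = rng.normal(0.0, 0.02, size=(c, expansion * c))  # pointwise expansion
    w2 = rng.normal(0.0, 0.02, size=(expansion * c, c))  # pointwise projection
    y = depthwise_conv7x7(x)
    y = layer_norm(y)
    y = gelu(y @ w1) @ w2
    return x + y  # residual connection keeps the block shape-preserving

x = np.random.default_rng(2).normal(size=(16, 16, 8))
print(convnext_block(x).shape)  # (16, 16, 8)
```

Because the block is shape-preserving, it can be stacked freely inside the analysis and synthesis transforms at each resolution stage.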
Every model can be trained and tested individually using:
python convnext-charm.py train
python convnext-charm.py evaluate
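Evaluation in learned image compression is typically reported as distortion (PSNR) versus rate (bits per pixel, bpp). As a minimal sketch of how these two metrics are computed in general — not code from this repository, and the function names here are hypothetical:

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    # Peak signal-to-noise ratio in dB between two images in [0, max_val].
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def bits_per_pixel(bitstream_length_bytes, height, width):
    # Rate: total coded bits divided by the number of pixels.
    return 8.0 * bitstream_length_bytes / (height * width)

# Toy example: a 64x64 image and a slightly perturbed "reconstruction".
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
rec = np.clip(img.astype(np.int16) + rng.integers(-2, 3, size=img.shape),
              0, 255).astype(np.uint8)
print(f"PSNR: {psnr(img, rec):.2f} dB, bpp: {bits_per_pixel(2048, 64, 64):.3f}")
```

BD-rate figures such as those in Table 1 below are then derived from these (bpp, PSNR) pairs across several rate points.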
Table 1. BD-rate↓ performance of BPG (4:4:4), SwinT-ChARM, and ConvNeXt-ChARM compared to VTM-18.0 on the four considered datasets.

| Dataset | BPG (4:4:4) | SwinT-ChARM | ConvNeXt-ChARM |
|---|---|---|---|
| Kodak | 20.73% | -3.47% | -4.90% |
| Tecnick | 27.03% | -6.52% | -7.56% |
| JPEG-AI | 28.14% | -0.23% | -1.17% |
| CLIC21 | 26.54% | -5.86% | -7.36% |
| Average | 25.61% | -4.02% | -5.24% |
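The Average row is the arithmetic mean of the four per-dataset BD-rates; a quick sketch to cross-check it from the rounded table entries:

```python
# BD-rate (%) vs. VTM-18.0, transcribed from Table 1.
bd_rates = {
    "BPG444":         [20.73, 27.03, 28.14, 26.54],
    "SwinT-ChARM":    [-3.47, -6.52, -0.23, -5.86],
    "ConvNeXt-ChARM": [-4.90, -7.56, -1.17, -7.36],
}
for codec, values in bd_rates.items():
    print(f"{codec}: {sum(values) / len(values):.2f}%")
```

From these rounded entries the ConvNeXt-ChARM mean comes out as -5.25%, agreeing with the reported -5.24% to within rounding of the per-dataset values.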
If you use this library for research purposes, please cite:
@inproceedings{ghorbel2023convnextcharm,
title={ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression},
author={Ghorbel, Ahmed and Hamidouche, Wassim and Morin, Luce},
booktitle={},
year={2023}
}
This project is licensed under the MIT License. See LICENSE for more details.