Abstract: The entropy bottleneck introduced by Ballé et al. is a common component used in many learned compression models. It encodes a transformed latent representation using a static distribution whose parameters are learned during training. However, the actual distribution of the latent data may vary wildly across different inputs. The static distribution attempts to encompass all possible input distributions, thus fitting none of them particularly well. This unfortunate phenomenon, sometimes known as the amortization gap, results in suboptimal compression. To address this issue, we propose a method that dynamically adapts the encoding distribution to match the latent data distribution for a specific input. First, our model estimates a better encoding distribution for a given input. This distribution is then compressed and transmitted as an additional side-information bitstream. Finally, the decoder reconstructs the encoding distribution and uses it to decompress the corresponding latent data. Our method achieves a Bjøntegaard-Delta (BD)-rate gain of -7.10% on the Kodak test dataset when applied to the standard fully-factorized architecture. Furthermore, considering computational complexity, the transform used by our method is an order of magnitude cheaper in terms of Multiply-Accumulate (MAC) operations compared to related side-information methods such as the scale hyperprior.
- Authors: Mateen Ulhaq and Ivan V. Bajić
- Affiliation: Simon Fraser University
- Links: Accepted at ICIP 2024. [Paper]. [BibTeX citation].
Please cite this work as:
@inproceedings{ulhaq2024encodingdistributions,
  title = {Learned Compression of Encoding Distributions},
  author = {Ulhaq, Mateen and Baji\'{c}, Ivan V.},
  booktitle = {Proc. IEEE ICIP},
  year = {2024},
}
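
To make the pipeline described in the abstract concrete, below is a minimal sketch in PyTorch of the adapt-compress-reconstruct idea: estimate a per-input distribution of the latent, pass it through a small transform pair that stands in for the side-information bitstream, and reconstruct the encoding distribution on the decoder side. This is not the paper's architecture; the class name, layer sizes, and histogram-based estimator are illustrative assumptions, and quantization plus entropy coding of the side information are omitted.

```python
import torch
import torch.nn as nn


class AdaptiveDistributionCodec(nn.Module):
    """Illustrative per-input adaptation of the encoding distribution (not the paper's model)."""

    def __init__(self, num_bins: int = 64, side_dim: int = 16):
        super().__init__()
        self.num_bins = num_bins
        # Lightweight transform pair: compress the estimated distribution into a
        # compact side representation, then reconstruct it at the decoder.
        self.dist_encoder = nn.Sequential(
            nn.Linear(num_bins, 32), nn.ReLU(), nn.Linear(32, side_dim)
        )
        self.dist_decoder = nn.Sequential(
            nn.Linear(side_dim, 32), nn.ReLU(), nn.Linear(32, num_bins)
        )

    def estimate_histograms(self, y: torch.Tensor) -> torch.Tensor:
        """Per-channel histogram estimate of a latent y with shape (C, H, W).

        Hypothetical helper; the paper's estimator and binning scheme may differ.
        """
        lo, hi = y.min().item(), y.max().item()
        hists = torch.stack(
            [torch.histc(ch, bins=self.num_bins, min=lo, max=hi) for ch in y]
        )
        # Normalize counts into per-channel probability mass functions.
        return hists / hists.sum(dim=-1, keepdim=True).clamp_min(1e-9)

    def forward(self, y: torch.Tensor):
        hist = self.estimate_histograms(y)                    # per-input estimate, (C, num_bins)
        side_info = self.dist_encoder(hist)                   # would be quantized and entropy coded
        recon = self.dist_decoder(side_info).softmax(dim=-1)  # decoder-side encoding distribution
        return side_info, recon


# Example: adapt the encoding distribution for one latent tensor.
codec = AdaptiveDistributionCodec()
y = torch.randn(192, 16, 16)                 # latent with 192 channels (arbitrary example size)
side_info, encoding_pmf = codec(y)
print(side_info.shape, encoding_pmf.shape)   # torch.Size([192, 16]) torch.Size([192, 64])
```

In this sketch, only `side_info` would need to be transmitted alongside the latent bitstream; the decoder reproduces `encoding_pmf` from it and uses that distribution to decompress the latent, mirroring the per-input adaptation described above.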