From dcf1cbcff2139f0353492ac23b74187fcf3e8945 Mon Sep 17 00:00:00 2001
From: Felix Dangel <48687646+f-dangel@users.noreply.github.com>
Date: Tue, 12 Dec 2023 08:07:23 -0600
Subject: [PATCH] [DOC] Update changelog, prepare `v0.0.2` (#68)

* [DOC] Update changelog, prepare `v0.0.2`

* [FMT] Add `.md` extension to changelog, auto-format

* [ADD] Forgot to add `changelog.md`

* [FIX] Balance parentheses

* [DOC] Add link to arXiv submission
---
 README.md                              |  2 +-
 changelog                              | 23 ----------
 changelog.md                           | 62 ++++++++++++++++++++++++++
 docs/examples/example_05_structures.py |  4 +-
 singd/optim/optimizer.py               |  6 +--
 5 files changed, 68 insertions(+), 29 deletions(-)
 delete mode 100644 changelog
 create mode 100644 changelog.md

diff --git a/README.md b/README.md
index 665a2e6..44ea125 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 This package contains the official PyTorch implementation of our
 **memory-efficient and numerically stable KFAC** variant, termed SINGD
-([paper](TODO Insert arXiv link)).
+([paper](http://arxiv.org/abs/2312.05705)).
 
 The main feature is a `torch.optim.Optimizer` which works like most PyTorch
 optimizers and is compatible with:
diff --git a/changelog b/changelog
deleted file mode 100644
index edbf255..0000000
--- a/changelog
+++ /dev/null
@@ -1,23 +0,0 @@
-# Changelog
-
-All notable changes to this project will be documented in this file.
-
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
-and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-
-## [Unreleased]
-
-### Added
-
-### Changed
-
-### Deprecated
-
-### Fixed
-
-## [0.0.1] - 2023-10-31
-
-Initial release
-
-[unreleased]: https://github.com/f-dangel/singd/compare/v0.0.1...HEAD
-[0.0.1]: https://github.com/f-dangel/singd/releases/tag/v0.0.1
diff --git a/changelog.md b/changelog.md
new file mode 100644
index 0000000..de6de95
--- /dev/null
+++ b/changelog.md
@@ -0,0 +1,62 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic
+Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+### Added
+
+### Changed
+
+### Deprecated
+
+### Fixed
+
+## [0.0.2] - 2023-12-11
+
+This release adds support for neural networks with in-place activations and also
+comes with performance improvements for convolutions, as well as improvements
+regarding numerical stability in half precision.
+
+### Added
+
+New features:
+
+- Support `Conv2d` layers with `dilation != 1`
+  ([PR](https://github.com/f-dangel/singd/pull/51))
+- Support neural networks with inplace activation functions
+  ([PR](https://github.com/f-dangel/singd/pull/63))
+
+Performance improvements:
+
+- Speed up input processing for `Conv2d` with `groups != 1`
+  ([PR](https://github.com/f-dangel/singd/pull/59))
+- Speed up computation of averaged patches for KFAC-reduce
+  (`kfac_approx='reduce'`) in `Conv2d` using the tensor network approach of
+  Dangel, 2023 ([PR](https://github.com/f-dangel/singd/pull/61))
+
+### Changed
+
+- Move un-scaling of `H_C` into the update step to improve numerical stability
+  when using half precision + gradient scaling
+  ([PR](https://github.com/f-dangel/singd/pull/67))
+
+### Deprecated
+
+No deprecations
+
+### Fixed
+
+No bug fixes
+
+## [0.0.1] - 2023-10-31
+
+Initial release
+
+[unreleased]: https://github.com/f-dangel/singd/compare/v0.0.2...HEAD
+[0.0.2]: https://github.com/f-dangel/singd/releases/tag/v0.0.2
+[0.0.1]: https://github.com/f-dangel/singd/releases/tag/v0.0.1
diff --git a/docs/examples/example_05_structures.py b/docs/examples/example_05_structures.py
index cfd43bd..7941b77 100644
--- a/docs/examples/example_05_structures.py
+++ b/docs/examples/example_05_structures.py
@@ -26,8 +26,8 @@
 # [`structures`](https://readthedocs.org/projects/singd/api/). The first entry
 # specifies the structure of $\mathbf{K}$ and its momentum
 # $\mathbf{m}_\mathbf{K}$, while the second entry specifies the structure of
-# $\mathbf{C}$ and its momentum $\mathbf{m}_\mathbf{C}$ (see the [paper](TODO
-# Insert link to arXiv submission) for details). It is even possible to specify
+# $\mathbf{C}$ and its momentum $\mathbf{m}_\mathbf{C}$ (see the
+# [paper](http://arxiv.org/abs/2312.05705) for details). It is even possible to specify
 # structures on a per-layer basis (see
 # [this](https://singd.readthedocs.io/en/latest/generated/gallery/example_03_param_groups/)
 # example).
diff --git a/singd/optim/optimizer.py b/singd/optim/optimizer.py
index 1315171..47f3537 100644
--- a/singd/optim/optimizer.py
+++ b/singd/optim/optimizer.py
@@ -26,7 +26,7 @@ class SINGD(Optimizer):
     """Structured inverse-free natural gradient descent.
 
-    The algorithm is introduced in [this paper](TODO Insert arXiv link) and
+    The algorithm is introduced in [this paper](http://arxiv.org/abs/2312.05705) and
     extends the inverse-free KFAC algorithm from [Lin et al. (ICML
     2023)](https://arxiv.org/abs/2302.09738) with structured pre-conditioner matrices.
 
@@ -104,8 +104,8 @@ def __init__(
     ):  # noqa: D301
         """Structured inverse-free natural gradient descent optimizer.
 
-        Uses the empirical Fisher. See the [paper](TODO Insert arXiv link) for the
-        notation.
+        Uses the empirical Fisher. See the [paper](http://arxiv.org/abs/2312.05705) for
+        the notation.
 
         Args:
             model: The neural network whose parameters (or a subset thereof) will be