This repository implements an additional experiment, written in TensorFlow, for the following paper.
paper: https://arxiv.org/pdf/1802.08241.pdf
Implementation by the author: https://github.com/amirgholami/HessianFlow
speakerdeck: https://speakerdeck.com/mtkwt/hessian-based-analysis
- Docker version 19.03.2
- docker-compose version 1.24.1
The paper uses the Hessian matrix with respect to the input, whereas this repository analyzes the Hessian matrix with respect to the weight parameters. As in the paper, I ran the experiment on CIFAR-10 image classification. The model architecture is as follows.
Dataset | Model architecture |
---|---|
CIFAR-10 | Conv(3,3,64) - Conv(3,3,64) - MaxPool(2,2) - Conv(3,3,128) - Conv(3,3,128) - MaxPool(2,2) - Dense(256) - Dense(256) - Softmax(10) |
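
A network of this size has on the order of a million or more weights, so the Hessian with respect to the weights is far too large to build explicitly. A common workaround is to estimate its largest eigenvalue by power iteration, which only needs Hessian-vector products. The sketch below illustrates the idea in NumPy on a toy quadratic loss; it is not the code used in this repository. The function names (`hessian_vector_product`, `top_hessian_eigenvalue`), the finite-difference approximation, and the toy matrix `A` are all assumptions made for illustration. In a real TensorFlow experiment the Hessian-vector product would more likely come from automatic differentiation.

```python
import numpy as np


def hessian_vector_product(grad_fn, w, v, eps=1e-4):
    """Approximate H(w) @ v with a central difference of the gradient.

    grad_fn is a hypothetical callable returning the loss gradient at w.
    """
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)


def top_hessian_eigenvalue(grad_fn, w, num_iters=100, seed=0):
    """Estimate the largest-magnitude Hessian eigenvalue by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)

    eigenvalue = 0.0
    for _ in range(num_iters):
        hv = hessian_vector_product(grad_fn, w, v)
        eigenvalue = float(v @ hv)  # Rayleigh quotient, since ||v|| = 1
        norm = np.linalg.norm(hv)
        if norm == 0.0:  # v lies in the null space of the Hessian
            break
        v = hv / norm
    return eigenvalue


# Toy stand-in for a loss: L(w) = 0.5 * w^T A w, so grad L = A w and H = A.
A = np.diag([1.0, 2.0, 5.0])
toy_grad = lambda w: A @ w

estimate = top_hessian_eigenvalue(toy_grad, np.zeros(3))
print(estimate)  # close to 5.0, the largest eigenvalue of A
```

Only gradient evaluations are required, so memory use stays linear in the number of weights rather than quadratic.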
$ docker-compose build
$ UID=${UID} GID=${GID} docker-compose run --rm app /bin/bash
(docker_container)$ python experiment_cifar10_sgd.py