Framework to evaluate deep neural networks with latent space performance metrics


Introduction

This toolset implements a framework for measuring the performance of feed-forward artificial neural network classifiers with the help of generative models. The implementation concerns image classification only. The framework is described in the following arXiv preprint:

Datasets with built-in support: MNIST, CelebA (gender classification), LSUN (scene type classification: bedrooms vs. church outdoors), ImageNet-1k.

For CelebA and LSUN, here are some examples of approximately minimum adversarial perturbations in latent spaces for a non-robust (NR) and a robust (R) classifier:

Dependencies

To run the toolset, you need Python 3 and PyTorch. The dependencies are listed in requirements.txt. To reproduce some of our ImageNet experiments, you will also need the robustness package. To install all dependencies, you may run install.sh. Alternatively, if you would like to minimize the size of the installation, you may install packages only when you run into import errors: many of them are needed only for training the PIONEER generative autoencoder, which you may not need to do.
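For reference, the two installation routes above can be scripted as follows. This is a minimal sketch that assumes it is run from the repository root and that a suitable pip is on the path.

```python
# Minimal installation sketch (assumed to be run from the repository root).
import subprocess

USE_INSTALL_SCRIPT = True  # set to False to install only the pinned pip dependencies

if USE_INSTALL_SCRIPT:
    # Full installation via the provided script.
    subprocess.run(["bash", "install.sh"], check=True)
else:
    # Minimal route: only the packages listed in requirements.txt.
    subprocess.run(["pip", "install", "-r", "requirements.txt"], check=True)
```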

Running: MNIST

The starting point is to run the toolset on MNIST, as all trained models for it are small and already included in this repository.

You can work with the following command-line scripts and Jupyter notebooks:

  • Adversarial.py: calculation of latent space performance metrics, including the search for latent adversarial perturbations.
  • Adversarial.ipynb: this notebook shows some features supported by Adversarial.py in a more user-friendly form. The target dataset is specified in one of the top cells.
  • ClassifierTraining.py: auxiliary script that implements classifier training and evaluation of classifier robustness in the original space.
  • ClassifierTraining.ipynb: this notebook shows some features supported by ClassifierTraining.py in a more user-friendly form. The target dataset is specified in one of the top cells (CelebA and LSUN only). In addition, this notebook shows image generation with robust classifiers, which is most visible on MNIST and CelebA.
  • MNIST.ipynb: this notebook is an adaptation of ClassifierTraining.ipynb for MNIST. In addition, it contains code to train class-specific MNIST WGANs.

Some examples of running the aforementioned .py scripts are given in ClassifierTrainingExperiments.py, ClassifierEvaluationExperiments.py, AdversarialExperiments.py, and LatentAccuracyExperiments.py. These scripts can also be used to reproduce the experiments from our paper.
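As a hypothetical starting point, the scripts can be launched as ordinary Python entry points; their exact command-line arguments are defined inside the scripts themselves and are not listed here, so inspect them before running.

```python
# Hypothetical launcher sketch: the argument lists of Adversarial.py and the
# *Experiments.py scripts are not documented here, so check them first
# (the --help call assumes the scripts are argparse-based).
import subprocess

# Inspect the available options of the latent-space evaluation script.
subprocess.run(["python", "Adversarial.py", "--help"], check=True)

# The experiment scripts collect example invocations and can be used to
# reproduce the experiments from the paper.
subprocess.run(["python", "AdversarialExperiments.py"], check=True)
```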

Running: CelebA and LSUN

CelebA and LSUN classifier models are included in the repository, but training and evaluating classifiers on these datasets requires downloading the datasets themselves. CelebA is downloaded automatically during the first run, but you will need to download LSUN on your own. You need to have the following directories:

Evaluation of latent performance metrics for CelebA and LSUN will also require pretrained PIONEER models. You will need to download the following files (~670 MB each) and place them at the following locations:

Running: ImageNet

To work with ImageNet-1k (1000 classes):

  • Download the archive ILSVRC2012_img_val.tar with the validation data from http://www.image-net.org/challenges/LSVRC/2012/downloads (you will need to register and accept some terms). Unpack this archive into data/ImageNet/ILSVRC2012_img_val so that all the images are directly in this folder, without subfolders (see the unpacking sketch after this list). The corresponding reference labels are already included in this repository.
  • Download this pretrained BigGAN model. Unpack the archive into biggan/weights/BigGAN_I128_hdf5_seed0_Gch96_Dch96_bs1_nDa8_nGa8_Glr1.0e-04_Dlr4.0e-04_Gnlinplace_relu_Dnlinplace_relu_Ginitortho_Dinitortho_Gattn64_Dattn64_Gshared_hier_ema (so that the .pth files are directly inside this folder, without subfolders). For convenience, the relevant BigGAN code is already copied into this repository. If you need BigGAN on its own, use the code from the original repository.
  • Download the robust ImageNet classifier and make it available as imagenet-models/ImageNet.pt. Other (non-robust) classifiers will be downloaded automatically upon first access.
  • Check the notebook AdversarialImageNet.ipynb or the scripts AdversarialImageNet.py, AdversarialImageNetExperiments.py.
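A minimal unpacking sketch for the first step, using only the Python standard library; the archive and target paths are the ones given above, and the validation archive is assumed to contain the images without subfolders.

```python
# Extract ILSVRC2012_img_val.tar so that all images end up directly in
# data/ImageNet/ILSVRC2012_img_val, without subfolders.
import os
import tarfile

archive = "ILSVRC2012_img_val.tar"
target = os.path.join("data", "ImageNet", "ILSVRC2012_img_val")
os.makedirs(target, exist_ok=True)

with tarfile.open(archive) as tar:
    tar.extractall(path=target)  # the validation archive has a flat layout
```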

Working with other image datasets

To work with a custom dataset, you need to implement a new dataset wrapper in latentspace/datasets.py and add support for this dataset in latentspace/cnn.py (implementing a new architecture if needed). Train classifiers for this dataset with ClassifierTraining.py. Then, you will need to train generative models for each image class. Three existing options are provided by latentspace/generative.py:

  • class WGAN: WGANs + image reconstruction with gradient descent (Adam); a sketch of this reconstruction idea is given after this list. This is the simplest and most lightweight option, but for more complex datasets this kind of reconstruction takes longer and may be less precise. An example of training MNIST WGANs is given in MNIST.ipynb.
  • class BigGAN: BigGAN + image reconstruction with gradient descent (Adam). BigGAN can be replaced with another class-conditional GAN.
  • class PIONEER: PIONEER generative autoencoder. A slightly modified copy of PIONEER is included in this repository (pioneer/src), but if you need PIONEER on its own, take it from the original repository. Model training and loading are memory-intensive.
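The gradient-descent reconstruction used by the first two options can be illustrated with the following sketch: given a trained generator G and a target image x, a latent code z is optimized with Adam so that G(z) approximates x. The function name, loss, and hyperparameters are illustrative assumptions, not the repository's actual implementation.

```python
# Illustrative latent reconstruction by gradient descent (Adam).
import torch

def reconstruct_latent(G: torch.nn.Module, x: torch.Tensor,
                       latent_dim: int, steps: int = 500,
                       lr: float = 0.05) -> torch.Tensor:
    # Start from a random latent code and optimize it directly.
    z = torch.randn(1, latent_dim, requires_grad=True)
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(G(z), x)  # pixel-wise reconstruction loss
        loss.backward()
        optimizer.step()
    return z.detach()
```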

Alternatively, you can implement a different subclass of GenerativeModel; a hypothetical sketch is given below. Even if you implement only generation (generate/decode) but not latent-space approximation (encode), the evaluation of latent space metrics that rely only on generation should still work.
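A hypothetical generation-only subclass might look as follows. The method names generate/decode/encode come from the description above, but the actual base-class interface in latentspace/generative.py (including the import path and constructor) should be checked and adapted; this is a sketch, not the repository's API.

```python
# Hypothetical generation-only model; all signatures below are assumptions.
import torch
from latentspace.generative import GenerativeModel  # assumed import path

class MyConditionalGAN(GenerativeModel):
    def __init__(self, generator: torch.nn.Module, latent_dim: int):
        super().__init__()  # adapt if the base class expects arguments
        self.generator = generator
        self.latent_dim = latent_dim

    def generate(self, batch_size: int) -> torch.Tensor:
        # Sample latent codes and decode them into images.
        z = torch.randn(batch_size, self.latent_dim)
        return self.decode(z)

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.generator(z)

    def encode(self, images: torch.Tensor) -> torch.Tensor:
        # Latent approximation is optional: without it, only the
        # generation-based latent space metrics are available.
        raise NotImplementedError
```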
