Hadi M. Dolatabadi, Sarah Erfani, and Christopher Leckie, 2022

This repository contains the official implementation of the ECCV 2022 paper *$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training*.
Abstract: Neural networks are vulnerable to adversarial attacks: adding well-crafted, imperceptible perturbations to their input can modify their output. Adversarial training is one of the most effective approaches to training robust models against such attacks. However, it is much slower than vanilla training of neural networks since it needs to construct adversarial examples for the entire training data at every iteration, hampering its scalability. Recently, Fast Adversarial Training (FAT) was proposed that can obtain robust models efficiently. However, the reasons behind its success are not fully understood, and more importantly, it can only train robust models for $\ell_\infty$-bounded attacks. In this work, we leverage coreset selection to reduce the time complexity of robust training across a variety of objectives, including TRADES, $\ell_p$-PGD, and Perceptual Adversarial Training.
To install the requirements:

```bash
pip install -r requirements.txt
```
Path | Description |
---|---|
master | The main folder containing the repository. |
├ configs | Config files containing the settings. |
├ cords | Coreset selection modules. |
├ misc | Miscellaneous files. |
├ perceptual_advex | Perceptual adversarial training modules. |
├ scripts | Training scripts for different adversarial training objectives. |
│ ├ robust_train_FPAT.py | Perceptual adversarial training (CIFAR-10 and ImageNet-12). |
│ ├ robust_train_l2.py | $\ell_2$-PGD adversarial training. |
│ ├ robust_train_linf.py | $\ell_\infty$-PGD adversarial training. |
│ └ robust_train_TRADES.py | TRADES adversarial training (CIFAR-10). |
├ run_train_FPAT.py | Runner module for perceptual adversarial training. |
├ run_train_l2.py | Runner module for $\ell_2$-PGD adversarial training. |
├ run_train_linf.py | Runner module for $\ell_\infty$-PGD adversarial training. |
└ run_train_TRADES.py | Runner module for TRADES adversarial training. |
To train a robust neural network using coreset selection, first decide on the training objective (here, we provide code for the TRADES, $\ell_2$-PGD, $\ell_\infty$-PGD, and Perceptual Adversarial Training objectives). Then, choose one of the coreset selection methods below (a config sketch showing where this choice is made follows the table):

Method | Description |
---|---|
CRAIG | The plain CRAIG method. |
CRAIGPB | The batch-wise version of the CRAIG method. |
CRAIG-Warm | The CRAIG method with warm-start. |
CRAIGPB-Warm | The batch-wise version of the CRAIG method with warm-start. |
GradMatch | The plain GradMatch method. |
GradMatchPB | The batch-wise version of the GradMatch method. |
GradMatch-Warm | The GradMatch method with warm-start. |
GradMatchPB-Warm | The batch-wise version of the GradMatch method with warm-start. |
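The chosen method goes into the `type` field of the `dss_strategy` entry in the config file. Below is a minimal sketch assuming the CORDS-style config layout of the files in `./configs`; apart from `dss_strategy` and `type`, which this README references, the field names and values here are illustrative, so consult the provided configs for the exact schema.

```python
# Illustrative config sketch; only dss_strategy/type are taken from this README,
# the remaining field names are assumptions modeled on CORDS-style configs.
config = dict(
    dataset=dict(name="cifar10"),      # hypothetical dataset entry
    dss_strategy=dict(
        type="GradMatchPB-Warm",       # any method from the table above, or "Full"
        fraction=0.5,                  # coreset size (mirrors --frac)
        select_every=20,               # selection frequency in epochs (mirrors --freq)
        kappa=0.5,                     # warm-start factor (mirrors --kappa)
    ),
)
```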
To train a model, run:

```bash
python run_train_<OBJ>.py \
    --dataset <DATASET> \
    --cnfg_dir <CONFIG_FILE> \
    --ckpt_dir <CHECKPOINT_PATH> \
    --attack_type <ATTACK> \
    --epsilon <ATTACK_EPS> \
    --alpha <ATTACK_STEP> \
    --attack_iters <ITERS> \
    --lr <LEARNING_RATE> \
    --epochs <NUM_EPOCHS> \
    --frac <CORESET_SIZE> \
    --freq <SELECTION_FREQ> \
    --kappa <WARM_START_FACTOR>
```
where the parameters are defined as follows:

Parameter | Description |
---|---|
OBJ | Training objective (one of FPAT, l2, linf, or TRADES). |
DATASET | Training dataset (currently, each objective can be run on certain datasets only). |
CONFIG_FILE | Configuration file (a few examples are given in the ./configs folder). |
CHECKPOINT_PATH | The save/load path for the trained model. |
ATTACK | Attack type for coreset construction. |
ATTACK_EPS | Maximum perturbation norm. |
ATTACK_STEP | The step size of attack generation. |
ITERS | Total number of iterations for attack generation. |
LEARNING_RATE | The classifier learning rate. |
NUM_EPOCHS | Total number of training epochs. |
CORESET_SIZE | The size of the coreset as a fraction of the training data (between 0 and 1). |
SELECTION_FREQ | Frequency of coreset selection (in epochs). |
WARM_START_FACTOR | The warm-start factor. |
For instance, say we want to run the batch-wise version of the GradMatch method with warm-start (GradMatchPB-Warm) and aim for a roughly 2x reduction in training time. We set the coreset size to 0.5 and run the command below (a sketch of the attack these flags parameterize follows it):
```bash
python run_train_linf.py \
    --dataset cifar10 \
    --cnfg_dir configs/config_gradmatchpb-warm_cifar10_robust.py \
    --ckpt_dir /GradMatch_Example/ \
    --attack_type PGD \
    --epsilon 8 \
    --alpha 1.25 \
    --attack_iters 10 \
    --lr 0.01 \
    --epochs 120 \
    --frac 0.5 \
    --freq 20 \
    --kappa 0.5
```
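For intuition, the attack flags above describe a standard $\ell_\infty$-PGD attack. The following is a minimal, self-contained sketch of such an attack, not the repository's implementation; in particular, it assumes that `--epsilon` and `--alpha` are given in pixel units and scaled by 1/255 internally, and that inputs lie in [0, 1].

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=1.25 / 255, iters=10):
    """Minimal l_inf-PGD sketch: ascend the loss, clip to the eps-ball each step."""
    delta = torch.empty_like(x).uniform_(-eps, eps)  # random start inside the ball
    for _ in range(iters):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta = (x + delta).clamp(0.0, 1.0) - x  # keep the perturbed image valid
    return (x + delta).detach()
```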
To enable a fair comparison, this repository also supports adversarial training on the entire training data. To this end, simply set the `type` of the `dss_strategy` in your config to `Full` (see the `./configs/config_full_cifar10_robust.py` config for an example). For instance, the previous training command changes to the one below (the corresponding config change is sketched after it):
```bash
python run_train_linf.py \
    --dataset cifar10 \
    --cnfg_dir configs/config_full_cifar10_robust.py \
    --ckpt_dir /GradMatch_Example/ \
    --attack_type PGD \
    --epsilon 8 \
    --alpha 1.25 \
    --attack_iters 10 \
    --lr 0.01 \
    --epochs 120 \
    --frac 0.5 \
    --freq 20 \
    --kappa 0.5
```
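Under the same hypothetical config layout sketched earlier, the only change for full training is the strategy type; the coreset-specific fields then no longer apply.

```python
# Illustrative: "Full" disables coreset selection and trains on all data.
config = dict(
    dss_strategy=dict(type="Full"),
)
```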
The primary results of this work are given in the table below. Note that the running time depends heavily on the GPU device and the exact version of each piece of software. Hence, to compare training efficiency, we recommend running all methods from scratch on the same machine (a sketch of a robust-accuracy evaluation loop follows the table).
Table: Clean (ACC) and robust (RACC) accuracy, and total training time (T) of different adversarial training methods. All hyper-parameters were kept the same as full training for each objective. In each case, we evaluate the robust accuracy using an attack with attributes similar to those of the training objective. More details can be found in the paper. The results are averaged over 5 runs.
| Objective | Data | Training | ACC (%) | RACC (%) | T (mins) |
|---|---|---|---|---|---|
| TRADES | CIFAR-10 | Adv. CRAIG (Ours) | 83.03 | 41.45 | 179.20 |
| | | Adv. GradMatch (Ours) | 83.07 | 41.52 | 178.73 |
| | | Full Adv. Training | 85.41 | 44.19 | 344.29 |
| $\ell_\infty$-PGD | CIFAR-10 | Adv. CRAIG (Ours) | 80.37 | 45.07 | 148.01 |
| | | Adv. GradMatch (Ours) | 80.67 | 45.23 | 148.03 |
| | | Full Adv. Training | 83.14 | 41.39 | 292.87 |
| $\ell_2$-PGD | SVHN | Adv. CRAIG (Ours) | 95.42 | 49.68 | 130.04 |
| | | Adv. GradMatch (Ours) | 95.57 | 50.41 | 125.53 |
| | | Full Adv. Training | 95.32 | 53.02 | 389.46 |
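To check RACC for a trained checkpoint, an evaluation loop along the following lines can be used. This is a hedged sketch reusing the hypothetical `pgd_linf` from above, not the repository's evaluation code.

```python
def robust_accuracy(model, loader, attack=pgd_linf, device="cuda"):
    """Percentage of test points still classified correctly under attack."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = attack(model, x, y)  # gradients must stay enabled for the attack
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return 100.0 * correct / total
```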
This repository is built mainly upon older versions of CORDS (COResets and Data Subset selection) and Perceptual Adversarial Robustness. We thank the authors of these two repositories.
If you have found our code or paper beneficial to your research, please consider citing them as:
```bibtex
@inproceedings{dolatabadi2022unleashing,
  title     = {$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training},
  author    = {Hadi Mohaghegh Dolatabadi and Sarah Erfani and Christopher Leckie},
  booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
  year      = {2022}
}

@article{dolatabadi2022adversarial,
  title   = {Adversarial coreset selection for efficient robust training},
  author  = {Hadi Mohaghegh Dolatabadi and Sarah Erfani and Christopher Leckie},
  journal = {International Journal of Computer Vision ({IJCV})},
  year    = {2023}
}
```