Keshigeyan Chandrasegaran¹, Ngoc‑Trung Tran¹, Alexander Binder²,³, Ngai‑Man Cheung¹

¹ Singapore University of Technology and Design (SUTD)
² Singapore Institute of Technology (SIT)
³ University of Oslo (UIO)
ECCV 2022 Oral
Project | ECCV Paper | Pre-trained Models
- Abstract
- About the code
- Running the code
- Sensitivity Assessment results using Forensic feature map dropout
- Color Robust (CR) Universal Detector Results
- FAQ
- Citation
- Acknowledgements
- References
Visual counterfeits are increasingly causing an existential conundrum in mainstream media with rapid evolution in neural image synthesis methods. Though detection of such counterfeits has been a taxing problem in the image forensics community, a recent class of forensic detectors – universal detectors – are able to surprisingly spot counterfeit images regardless of generator architectures, loss functions, training datasets, and resolutions. This intriguing property suggests the possible existence of transferable forensic features (T-FF) in universal detectors. In this work, we conduct the first analytical study to discover and understand T-FF in universal detectors. Our contributions are 2-fold: 1) We propose a novel forensic feature relevance statistic (FF-RS) to quantify and discover T-FF in universal detectors, and 2) our qualitative and quantitative investigations uncover an unexpected finding: color is a critical T-FF in universal detectors.
This codebase is written in PyTorch, and we provide a Dockerfile to run our code. The codebase is documented with clear instructions to run all the code, and is structured as follows:

- `lrp/` : Base PyTorch module containing LRP implementations for ResNet and EfficientNet architectures. This includes all PyTorch wrappers.
- `fmap_ranking/` : PyTorch module to calculate FF-RS (ω) for counterfeit detection.
- `sensitivity_assessment/` : PyTorch module to perform sensitivity assessments for T-FF and color ablation.
- `patch_extraction/` : PyTorch module to extract LRP-max response image regions for every T-FF.
- `activation_histograms/` : PyTorch module to calculate the maximum spatial activation over images for every T-FF.
- `utils/` : Contains all utilities, helper functions and plotting functions.
✔️ PyTorch
✔️ Dockerfile
Install dependencies, download the ForenSynths dataset, and download the pre-trained models.
- Create a new virtual environment and install all the dependencies:
  `pip3 install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111`
- Download and unzip the ForenSynths test dataset by Wang et al. [1] here. We assume the dataset is saved at `/mnt/data/CNN_synth_testset/`.
- Download and unzip the ForenSynths validation dataset by Wang et al. [1] here. We assume the dataset is saved at `/mnt/data/progan_val/`.
- Download the pre-trained models and place them in the `weights/` directory. We also include the weights for the ResNet-50 model published by Wang et al. [1]. The models are available here.
| Architecture | Universal Detector [1] | ImageNet Classifier | Guided-GradCAM [2] | LRP [3] | Command |
|---|---|---|---|---|---|
| ResNet-50 | ✔️ | | ✔️ | | `python src/gradcam.py --arch resnet50 --classifier ud` |
| EfficientNet-B0 | ✔️ | | ✔️ | | `python src/gradcam.py --arch efb0 --classifier ud` |
| ResNet-50 | | ✔️ | ✔️ | | `python src/gradcam.py --arch resnet50 --classifier imagenet` |
| EfficientNet-B0 | | ✔️ | ✔️ | | `python src/gradcam.py --arch efb0 --classifier imagenet` |
| ResNet-50 | ✔️ | | | ✔️ | `python src/ud_lrp.py --arch resnet50 --classifier ud` |
| EfficientNet-B0 | ✔️ | | | ✔️ | `python src/ud_lrp_efb0.py --arch efb0 --classifier ud` |
| ResNet-50 | | ✔️ | | ✔️ | `python src/imagenet_lrp.py` |
| EfficientNet-B0 | | ✔️ | | ✔️ | `python src/imagenet_lrp_efb0.py --arch efb0 --classifier imagenet` |
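For orientation, here is a minimal Grad-CAM-style heatmap computed with plain PyTorch hooks. It is illustrative only, not the repo's `src/gradcam.py` (which produces Guided-GradCAM [2] visualizations); the single-logit detector head follows Wang et al. [1], and the layer choice is an assumption:

```python
# Minimal Grad-CAM-style heatmap sketch (illustrative; not src/gradcam.py).
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(num_classes=1).eval()   # binary real/fake head, as in [1]

acts = {}
model.layer4.register_forward_hook(lambda m, i, o: acts.update(a=o))

x = torch.randn(1, 3, 224, 224)          # stand-in for a preprocessed image
logit = model(x).squeeze()               # the single "fake" logit
grad = torch.autograd.grad(logit, acts["a"])[0]   # d logit / d activations

w = grad.mean(dim=(2, 3), keepdim=True)           # channel importance weights
cam = F.relu((w * acts["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap in [0, 1]
```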
We include step-by-step instructions for running and reproducing all the results and analysis in our paper. We use the ResNet-50 detector and StyleGAN2 images in the following examples. To run the analysis with the EfficientNet-B0 detector, simply pass `--arch efb0 --topk 27`. To analyse other unseen GANs (e.g., BigGAN), simply pass `--gan_name biggan`. Schematic sketches of several of these steps follow the list.
- Calculate FF-RS (ω) for the ProGAN validation set (a schematic sketch of the aggregation follows this list). For convenience, we provide pre-calculated FF-RS (ω) values for all detectors in the `fmap_relevances/` directory, so you may skip this step if you wish.
  `python src/rank_fmaps.py --arch resnet50 --blur_jpg 0.5 --bsize 16 --dataset_dir /mnt/data/ --gan_name progan_val --have_classes True --num_real 50 --num_fake 50 --save_pt_files False`
- Perform sensitivity assessments on StyleGAN2 (unseen GAN); the feature-map dropout mechanism is sketched after this list.
  `python src/transfer_sensitivity_analysis.py --arch resnet50 --blur_jpg 0.5 --bsize 256 --dataset_dir /mnt/data/CNN_synth_testset/ --gan_name stylegan2 --have_classes 1 --num_instances 200 --topk 114`
- Extract LRP-max patches for StyleGAN2 (unseen GAN); see the patch-extraction sketch after this list.
  `python src/get_max_activation_rankings.py --arch resnet50 --blur_jpg 0.5 --bsize 256 --dataset_dir /mnt/data/CNN_synth_testset/ --gan_name stylegan2 --have_classes 1 --num_instances 200 --topk 114`
  `python src/extract_max_activation_patches.py --arch resnet50 --blur_jpg 0.5 --bsize 64 --dataset_dir /mnt/data/CNN_synth_testset/ --gan_name stylegan2 --have_classes 1 --num_instances 20 --topk 114`
- Create the patch collage for StyleGAN2 (Fig. 1 in the main paper).
  `python src/patch_collage.py --arch resnet50 --blur_jpg 0.5 --gan_name stylegan2 --num_instances 5`
- Generate box-whisker plots for color ablation using StyleGAN2 counterfeits (Fig. 4 in the main paper); the color-ablation probe is sketched after this list.
  `python src/grayscale_sensitivity_whisker_plots.py --arch resnet50 --blur_jpg 0.5 --bsize 256 --dataset_dir /mnt/data/CNN_synth_testset/ --gan_name stylegan2 --have_classes 1 --num_instances 200`
- Run a statistical test (Mood's median test) over the maximum activation histograms for T-FF (Fig. 6 in the main paper); see the SciPy example after this list.
  `python src/median_test_activation_histograms.py --arch resnet50 --blur_jpg 0.5 --bsize 128 --dataset_dir /mnt/data/CNN_synth_testset/ --gan_name stylegan2 --have_classes 1 --num_instances 200 --topk 114`
- Calculate the percentage of color-conditional T-FF for StyleGAN2; a sketch of the counting idea follows this list.
  `python src/find_color_conditional_percentage.py --arch resnet50 --blur_jpg 0.5 --gan_name stylegan2 --topk 114`
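The FF-RS (ω) ranking step aggregates per-feature-map LRP relevance over the validation set. The sketch below is a schematic of that idea only: the exact statistic is defined in the paper, and the per-sample relevance tensors (`relevances`) are assumed inputs, not something this snippet computes.

```python
# Schematic FF-RS-style aggregation (hypothetical; not src/rank_fmaps.py):
# score each feature map by the fraction of positive LRP relevance it
# carries, averaged over samples.
import torch

def fmap_relevance_scores(relevances):
    # relevances: iterable of (C, H, W) LRP relevance tensors, one per image
    scores, n = None, 0
    for r in relevances:
        pos = r.clamp(min=0)                         # keep positive evidence only
        per_map = pos.sum(dim=(1, 2))                # total relevance per feature map
        per_map = per_map / (per_map.sum() + 1e-12)  # normalize per sample
        scores = per_map if scores is None else scores + per_map
        n += 1
    return scores / n                                # average score per feature map

# T-FF = feature maps with the largest scores, e.g.:
# omega = fmap_relevance_scores(lrp_maps); tff = omega.topk(114).indices
```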
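The sensitivity assessment drops (zeroes) the top-k ranked feature maps and measures the accuracy drop on unseen GANs. A minimal sketch of such feature-map dropout with a forward hook, assuming a ResNet-50 detector with a single-logit head and hypothetical channel indices:

```python
# Sketch of forensic feature-map dropout (mechanism assumed; not copied from
# src/transfer_sensitivity_analysis.py).
import torch
from torchvision.models import resnet50

model = resnet50(num_classes=1).eval()
topk_idx = torch.tensor([5, 17, 42])      # hypothetical T-FF channel indices

def drop_tff(module, inputs, output):
    output[:, topk_idx] = 0.0             # ablate the selected feature maps
    return output

handle = model.layer4.register_forward_hook(drop_tff)
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))  # forward pass with T-FF dropped
handle.remove()
```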
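LRP-max patch extraction can be pictured as locating the spatial maximum of a T-FF's activation map and cropping the corresponding input region. The helper below is hypothetical (the repo scripts use LRP-max responses and proper receptive-field bookkeeping; here we approximate with the layer stride):

```python
# Hypothetical LRP-max patch extraction helper (illustrative approximation).
import torch

def extract_patch(img, fmap, channel, patch=64):
    # img: (3, H, W) input image; fmap: (C, h, w) activations for that image
    a = fmap[channel]
    idx = torch.argmax(a).item()
    y, x = divmod(idx, a.shape[1])                    # argmax spatial location
    sy, sx = img.shape[1] // a.shape[0], img.shape[2] // a.shape[1]  # stride
    cy, cx = y * sy + sy // 2, x * sx + sx // 2       # centre in image coords
    top = max(0, min(cy - patch // 2, img.shape[1] - patch))
    left = max(0, min(cx - patch // 2, img.shape[2] - patch))
    return img[:, top:top + patch, left:left + patch]
```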
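The color-ablation probe compares a detector's T-FF activations on RGB images against grayscale versions of the same counterfeits. A sketch with placeholder activation values, assuming grayscale conversion as the ablation:

```python
# Sketch of the color-ablation box-whisker comparison (placeholder data).
import torch
import matplotlib.pyplot as plt
from torchvision import transforms

ablate = transforms.Grayscale(num_output_channels=3)  # drops color, keeps shape
# gray_img = ablate(img)  # apply before the detector forward pass

acts_rgb = torch.rand(200)          # placeholder max activations (RGB inputs)
acts_gray = torch.rand(200) * 0.5   # placeholder (color-ablated inputs)

plt.boxplot([acts_rgb.numpy(), acts_gray.numpy()],
            labels=["RGB", "Color-ablated"])
plt.ylabel("Max spatial activation")
plt.savefig("whisker_plot.png")
```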
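The statistical test step is Mood's median test, which is available directly in SciPy; the activation samples below are placeholders:

```python
# Mood's median test over two activation samples via SciPy.
import numpy as np
from scipy.stats import median_test

acts_rgb = np.random.rand(200)          # placeholder activation samples
acts_gray = np.random.rand(200) * 0.5

stat, p, grand_median, table = median_test(acts_rgb, acts_gray)
print(f"p-value = {p:.3e}")  # small p => the two medians differ significantly
```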
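For the color-conditional percentage, a plausible counting rule is sketched below; the threshold and criterion here are assumptions for illustration, and the exact definition is in the paper:

```python
# Hypothetical counting rule: call a T-FF color-conditional if its activation
# collapses under color ablation, and report the percentage of such maps.
import torch

def color_conditional_pct(act_rgb, act_gray, drop=0.5):
    # act_rgb, act_gray: (topk,) summary activations per T-FF (e.g., medians)
    conditional = act_gray < drop * act_rgb   # assumed drop threshold
    return 100.0 * conditional.float().mean().item()

print(color_conditional_pct(torch.rand(114), torch.rand(114) * 0.4))
```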
To train the Color Robust (CR) universal detector:

- Clone the repository by Wang et al. [1] available here.
- Replace the `data/datasets.py` file with our `cr_ud/datasets.py` file.
- Replace the `networks/trainer.py` file with our `cr_ud/trainer.py` file.
- Follow the exact instructions by Wang et al. [1] to train the classifier.
- For training EfficientNet-B0, use the argument `--arch efficientnet-B0`. (A sketch of the color-ablation augmentation idea follows this list.)
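Conceptually, the CR detector removes color as a shortcut during training. A minimal sketch of such a color-ablation augmentation (illustrative only; the actual changes live in `cr_ud/datasets.py` and `cr_ud/trainer.py`):

```python
# Illustrative color-ablated training pipeline; the probability and transform
# choices are assumptions, not the exact recipe in cr_ud/datasets.py.
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomGrayscale(p=0.5),   # hypothetical ablation probability
    transforms.ToTensor(),
])
```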
Sensitivity assessment results using forensic feature map dropout (ResNet-50 detector). Each cell reports AP / Real accuracy / GAN accuracy:

| | ProGAN | StyleGAN2 | StyleGAN | BigGAN | CycleGAN | StarGAN | GauGAN |
|---|---|---|---|---|---|---|---|
| baseline | 100.0 / 100.0 / 100.0 | 99.1 / 95.5 / 95.0 | 99.3 / 96.0 / 95.6 | 90.4 / 83.9 / 85.1 | 97.9 / 93.4 / 92.6 | 97.5 / 94.0 / 89.3 | 98.8 / 93.9 / 96.4 |
| top-k | 69.8 / 99.4 / 3.2 | 55.3 / 89.4 / 11.3 | 56.6 / 90.6 / 13.7 | 55.4 / 86.4 / 18.3 | 61.2 / 91.4 / 17.4 | 72.6 / 89.4 / 35.9 | 71.0 / 95.0 / 18.8 |
| random-k | 100.0 / 99.9 / 96.1 | 98.6 / 89.4 / 96.9 | 98.7 / 91.4 / 96.1 | 88.0 / 79.4 / 85.1 | 96.6 / 81.0 / 96.2 | 97.0 / 88.0 / 91.7 | 98.7 / 91.9 / 97.1 |
| low-k | 100.0 / 100.0 / 100.0 | 99.1 / 95.6 / 95.0 | 99.3 / 96.0 / 95.6 | 90.4 / 83.9 / 85.1 | 97.9 / 93.4 / 92.6 | 97.5 / 94.0 / 89.3 | 98.8 / 93.9 / 96.4 |
Sensitivity assessment results using forensic feature map dropout (EfficientNet-B0 detector). Each cell reports AP / Real accuracy / GAN accuracy:

| | ProGAN | StyleGAN2 | StyleGAN | BigGAN | CycleGAN | StarGAN | GauGAN |
|---|---|---|---|---|---|---|---|
| baseline | 100.0 / 100.0 / 100.0 | 95.9 / 95.2 / 85.4 | 99.0 / 96.1 / 94.3 | 84.4 / 79.7 / 75.9 | 97.3 / 89.6 / 93.0 | 96.0 / 92.8 / 85.5 | 98.3 / 94.1 / 94.4 |
| top-k | 50.0 / 100.0 / 0.0 | 54.5 / 94.3 / 7.0 | 52.1 / 97.3 / 2.6 | 53.5 / 97.4 / 3.8 | 47.5 / 100.0 / 0.0 | 50.0 / 100.0 / 0.0 | 46.2 / 100.0 / 0.0 |
| random-k | 100.0 / 99.9 / 100.0 | 96.5 / 91.9 / 89.8 | 99.2 / 91.2 / 97.5 | 84.5 / 59.4 / 89.1 | 96.9 / 82.6 / 95.8 | 96.7 / 82.5 / 93.3 | 98.1 / 87.8 / 96.2 |
| low-k | 100.0 / 100.0 / 100.0 | 95.3 / 88.7 / 88.3 | 98.9 / 90.8 / 96.1 | 83.5 / 70.8 / 80.8 | 96.6 / 85.2 / 94.1 | 95.4 / 91.0 / 85.4 | 98.1 / 91.2 / 96.4 |
All these results can be reproduced using our pre-calculated forensic feature map relevances.
Color Robust (CR) universal detector results (ResNet-50 detector):

AP | ProGAN | StyleGAN2 | StyleGAN | BigGAN | CycleGAN | StarGAN | GauGAN
---|---|---|---|---|---|---|---|
Baseline [1] (RGB) | 100.0 | 99.1 | 99.3 | 90.4 | 97.9 | 97.5 | 98.8 |
Baseline [1] (Color-ablated) | 99.9 | 89.1 | 96.7 | 75.2 | 84.2 | 89.2 | 97.6 |
CR-Detector (Ours) (RGB) | 100.0 | 98.5 | 99.5 | 89.9 | 96.6 | 96.2 | 99.5 |
CR-Detector (Ours) (Color-ablated) | 100.0 | 98.0 | 99.6 | 87.6 | 91.1 | 95.4 | 99.4 |
Color Robust (CR) universal detector results (EfficientNet-B0 detector):

AP | ProGAN | StyleGAN2 | StyleGAN | BigGAN | CycleGAN | StarGAN | GauGAN
---|---|---|---|---|---|---|---|
Baseline [1] (RGB) | 100.0 | 99.0 | 99.0 | 84.4 | 97.3 | 96.0 | 98.3 |
Baseline [1] (Color-ablated) | 99.9 | 91.0 | 91.0 | 68.4 | 86.5 | 91.8 | 93.7 |
CR-Detector (Ours) (RGB) | 100.0 | 98.1 | 98.1 | 82.3 | 95.7 | 95.9 | 99.0 |
CR-Detector (Ours) (Color-ablated) | 100.0 | 98.8 | 98.8 | 81.0 | 91.3 | 94.8 | 98.8 |
Minor numerical changes in FF-RS (ω) values when repeating the experiments
Note that you may see very small numerical changes when recalculating FF-RS (ω) values, because random resized image crops are used in the computation. We use random resized crops to avoid missing any boundary artifacts in CNN-generated images.
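As a minimal illustration of that nondeterminism (assuming a standard torchvision `RandomResizedCrop` preprocessing, which may differ from the repo's exact transform):

```python
# Two runs of RandomResizedCrop on the same image give different crops,
# which is why recomputed FF-RS (omega) values can differ slightly.
import torch
from torchvision import transforms

crop = transforms.RandomResizedCrop(224)
img = torch.rand(3, 256, 256)
a, b = crop(img), crop(img)
print(torch.allclose(a, b))  # almost surely False
```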
High memory (RAM) usage during FF-RS (ω) calculation
This is expected, since all feature maps for each sample are stored. We plan to release a lightweight version in the future. For convenience, we have released all pre-calculated FF-RS (ω) values in the `fmap_relevances/` directory for both ResNet-50 and EfficientNet-B0 detectors.
Which LRP-rule is used in this implementation?
We use
Did you use any pre-training methods?
Yes, following Wang et al. [1], we use supervised ImageNet-1K initialization and fine-tune all the layers.
What does the $\omega$ distribution look like?
We show the $\omega$ distribution below.
@InProceedings{Chandrasegaran_2022_ECCV,
    author    = {Chandrasegaran, Keshigeyan and Tran, Ngoc-Trung and Binder, Alexander and Cheung, Ngai-Man},
    title     = {Discovering Transferable Forensic Features for CNN-generated Images Detection},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    month     = {Oct},
    year      = {2022}
}
We gratefully acknowledge the following works:
- CNN-generated images are surprisingly easy to spot...for now [1] : https://github.com/peterwang512/CNNDetection
- Advanced Machine Learning Explainability methods (PyTorch) : https://github.com/jacobgil/pytorch-grad-cam
- EfficientNet (PyTorch) : https://github.com/lukemelas/EfficientNet-PyTorch
- EfficientNet LRP (PyTorch) : https://github.com/AlexBinder/LRP_EfficientnetB0
- Experiment tracking with Weights & Biases : https://www.wandb.com/
Special thanks to Lingeng Foo and Timothy Liu for valuable discussion.
[1] Wang, Sheng-Yu, et al. "CNN-generated images are surprisingly easy to spot... for now." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
[2] Selvaraju, Ramprasaath R., et al. "Grad-cam: Visual explanations from deep networks via gradient-based localization." Proceedings of the IEEE international conference on computer vision. 2017.
[3] Bach, Sebastian, et al. "On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation." PLOS ONE 10(7): e0130140, 2015. https://doi.org/10.1371/journal.pone.0130140