
Various techniques to generate adversarial images that manipulate image classifiers into misclassifying them


Abhiram-29/MisclassifyMe


Adversarial Attack Implementation: FGSM and PGD with VGG16

Here, we use two algorithms, FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent), to create images that look identical to the input to the human eye but fool image classification models into misclassifying them. These algorithms work on almost any image classification model, but in my project I have used the VGG16 classifier as an example. The attacks can be targeted (misclassify the input image as a specified target label) or untargeted (misclassify the input image as any label other than its original one). The algorithm is the same for both attacks, but the definition of the loss function changes based on the type of attack.
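Concretely, writing $x$ for the input image, $y$ for its true label, $y_t$ for a chosen target label, $f$ for the classifier, and $L$ for the cross-entropy loss (notation introduced here for illustration, not taken from the code), the two attack types optimise the same perturbation $\delta$ in opposite directions:

$$\text{untargeted:}\quad \max_{\|\delta\|_\infty \le \epsilon} L\big(f(x+\delta),\, y\big) \qquad\qquad \text{targeted:}\quad \min_{\|\delta\|_\infty \le \epsilon} L\big(f(x+\delta),\, y_t\big)$$

The adversarial image is $x_{\text{adv}} = x + \delta$, with every pixel of $\delta$ kept within $\pm\epsilon$ so the change stays imperceptible.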

FGSM (Fast Gradient Sign Method)

This method was introduced by Goodfellow et al. It generates the adversarial image in a single step, making it highly efficient, and it is used extensively to create adversarial examples for training robust classifiers. The algorithm computes the cross-entropy loss of the image classifier with respect to the input image and takes one step in the direction of the sign of the gradient, which increases the loss (in the case of untargeted attacks). In targeted attacks, the step is taken in the opposite direction so as to increase the probability of the target label. Each pixel of the image is modified by at most a small value ($\epsilon$), so the changes are imperceptible to the human eye.
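As a rough sketch, assuming inputs scaled to [0, 1] and a pretrained torchvision VGG16 (the function and argument names below are illustrative, not taken from this repository), a single FGSM step looks like this:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

def fgsm_attack(model, image, label, epsilon=0.007, targeted=False):
    """One-step FGSM. `image` is a (1, 3, H, W) tensor in [0, 1];
    `label` is the true label (untargeted) or the target label (targeted)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Untargeted: step up the loss on the true label.
    # Targeted: step down the loss on the target label instead.
    direction = -1.0 if targeted else 1.0
    adversarial = image + direction * epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

model = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).eval()
```

Because only the sign of the gradient is used, every pixel moves by exactly $\epsilon$, which is what keeps the perturbation visually negligible.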

Input image (Treeing Walker Coonhound)
Generated FGSM perturbation
Image with perturbation applied (classified as Chihuahua)
Actual image of a Chihuahua

Input image (Magpie)
Generated FGSM perturbation
Image with perturbation applied (classified as Crayfish)
Actual image of a Crayfish

PGD (Projected Gradient Descent)

This algorithm was introduced by Madry et al. and builds on top of FGSM. It is an iterative algorithm that tries to find a small change that brings about misclassification. This method produces better-quality images and has a higher success rate for targeted misclassification than FGSM, but it has higher compute requirements due to its iterative nature.
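A minimal sketch of a targeted PGD loop, under the same assumptions as the FGSM sketch above (`epsilon`, `alpha`, and `steps` are illustrative defaults, with the iteration count and learning rate matching the values used for the figures below), might look like this:

```python
def pgd_attack(model, image, target, epsilon=0.03, alpha=0.001, steps=500):
    """Targeted PGD: repeatedly step toward the target label, then project
    the perturbation back into an L-infinity ball of radius epsilon."""
    original = image.clone().detach()
    adversarial = original.clone()
    for _ in range(steps):
        adversarial.requires_grad_(True)
        loss = F.cross_entropy(model(adversarial), target)
        model.zero_grad()
        loss.backward()
        # Minimise the loss on the target label (note the minus sign).
        adversarial = adversarial - alpha * adversarial.grad.sign()
        # Project back into the epsilon-ball around the original image.
        adversarial = original + (adversarial - original).clamp(-epsilon, epsilon)
        adversarial = adversarial.clamp(0, 1).detach()
    return adversarial
```

For an untargeted attack the sign of the update flips and `target` is replaced by the true label, mirroring the FGSM case.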

The images below were generated using targeted PGD with the following hyperparameters: 500 iterations and a learning rate of 0.001.

Input image (Treeing Walker Coonhound)
Image with perturbation applied (classified as Whippet; target was Whippet)
Actual image of a Whippet

Input image (Magpie)
Image with perturbation applied (classified as Koala; target was Koala)
Actual image of a Koala

Clone and run

git clone https://github.com/Abhiram-29/MisclassifyMe.git

cd MisclassifyMe

# install dependencies
pip install -r requirements.txt
