Here, we use two different algorithms, FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent), to create images that look identical to the input to the human eye but fool an image classification model into misclassifying them as something else. These algorithms work on almost all image classification models; in this project I use the VGG16 classifier as an example. The attacks can be targeted (misclassify the input image as a specified target label) or untargeted (misclassify the input image as any label other than its original one). The overall algorithm is the same for both attacks, but the definition of the loss function changes depending on the attack.
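To make that distinction concrete, here is a minimal sketch of how the two loss definitions could look, assuming a PyTorch classifier; the function name attack_loss is purely illustrative and not part of this repository's code.

import torch.nn.functional as F

def attack_loss(model, image, true_label, target_label=None):
    # Quantity the attack maximises by perturbing the image.
    logits = model(image)
    if target_label is None:
        # Untargeted: push the prediction away from the true label
        # by increasing its cross-entropy loss.
        return F.cross_entropy(logits, true_label)
    # Targeted: pull the prediction towards the target label
    # by decreasing its cross-entropy loss (hence the minus sign).
    return -F.cross_entropy(logits, target_label)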
This method was introduced by Goodfellow et al. It generates the adversarial image in a single step, making it highly efficient, and it is used extensively to create adversarial examples for training robust classifiers.
This algorithm works by calculating the cross-entropy loss of the image classifier and taking a step along the sign of the loss gradient to modify the image so that the loss increases (in the case of untargeted attacks). In targeted attacks, the algorithm instead aims to increase the probability of the target label.
Each pixel of the image is only modified by a small value (the step size ε), which keeps the change imperceptible to the human eye.
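As a rough sketch (not the repository's exact implementation), an untargeted FGSM step in PyTorch with a pretrained VGG16 from a recent torchvision could look like this; fgsm_attack, the epsilon value, and the dummy inputs are all illustrative assumptions.

import torch
import torch.nn.functional as F
from torchvision.models import vgg16

def fgsm_attack(model, image, label, epsilon):
    # x_adv = x + epsilon * sign( d(loss)/dx ): a single gradient-sign step,
    # so each pixel moves by at most epsilon.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + epsilon * image.grad.sign()
    # Assumes the image is an unnormalised tensor in [0, 1].
    return adv.clamp(0, 1).detach()

model = vgg16(weights="IMAGENET1K_V1").eval()
image = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed input image
label = torch.tensor([0])            # arbitrary ImageNet class index
adv_image = fgsm_attack(model, image, label, epsilon=8 / 255)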
This algorithm was introduced by Madry et al. and is built on top of FGSM. It is an iterative algorithm that searches for a small perturbation that brings about the misclassification. This method produces better-quality images and has a higher success rate for targeted misclassification than FGSM, but its iterative nature makes it more computationally expensive.
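Below is a hedged sketch of a targeted PGD loop, again assuming PyTorch; the function pgd_targeted and its defaults are illustrative and not this repository's API. Each iteration takes a small signed gradient step towards the target class and then projects the accumulated perturbation back into an L-infinity ball of radius epsilon around the original image.

import torch
import torch.nn.functional as F

def pgd_targeted(model, image, target_label, epsilon=8 / 255,
                 step_size=0.001, steps=500):
    original = image.clone().detach()
    adv = original.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), target_label)
        grad, = torch.autograd.grad(loss, adv)
        # Step *down* the target-class loss to make the target more likely.
        adv = adv.detach() - step_size * grad.sign()
        # Projection: keep the total perturbation within epsilon of the
        # original image, and keep pixel values valid (assumes [0, 1] range).
        adv = original + (adv - original).clamp(-epsilon, epsilon)
        adv = adv.clamp(0, 1)
    return adv.detach()

The projection step is what distinguishes PGD from simply repeating FGSM: without it, the perturbation could grow without bound over the iterations.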
The images below were generated using PGD with a target label set. Hyperparameters: 500 iterations and a learning rate (step size) of 0.001.
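With the hypothetical pgd_targeted sketch above, those settings would correspond to a call such as the following (reusing the model and image stand-ins from the FGSM example; the target class index is arbitrary).

target = torch.tensor([954])  # arbitrary ImageNet target class index
adv_image = pgd_targeted(model, image, target, step_size=0.001, steps=500)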
git clone https://github.com/Abhiram-29/MisclassifyMe.git
cd MisclassifyMe
# install dependencies
pip install -r requirements.txt