This repo hosts a short literature review on how neural networks are easily fooled and surveys several methods for building neural networks that are robust to adversarial attacks. The literature review is in report.pdf, and the slides for a presentation I gave on the topic are in presentation.pdf.
Copied from the abstract:
Deep neural networks, and in particular convolutional neural networks, have seen great success in recent years and have achieved state-of-the-art performance on a wide range of tasks. However, recent work has shown that these networks are vulnerable to adversarial attacks that cause them to fail spectacularly. In the case of image data, perturbations as small as a single pixel have been shown to be enough to fool these networks into assigning incorrect labels with a high level of confidence, and even small random translations and rotations can fool convolutional neural networks in a similar way. A growing body of work has begun to address these issues and offers ways of mitigating the effects of adversarial attacks. In this literature review I will give a brief overview of adversarial attacks and of the methods proposed for building neural networks that are robust against these attacks.
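To make the idea of a small adversarial perturbation concrete, here is a minimal sketch of one standard gradient-based attack, the Fast Gradient Sign Method (FGSM, Goodfellow et al.). It is not necessarily one of the specific attacks covered in the report, and the model, epsilon value, and input shapes are illustrative assumptions only.

```python
# Minimal FGSM sketch: perturb an input in the direction of the sign of the
# loss gradient, bounded by epsilon. Model and data below are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarial version of `x` within an L-infinity ball of radius epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step by epsilon along the gradient sign, then clamp to a valid pixel range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()


if __name__ == "__main__":
    # Toy convolutional classifier and fake 28x28 grayscale images (hypothetical).
    model = nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
        nn.Flatten(), nn.Linear(8 * 28 * 28, 10),
    )
    x = torch.rand(4, 1, 28, 28)
    y = torch.randint(0, 10, (4,))
    x_adv = fgsm_attack(model, x, y, epsilon=0.1)
    print((x_adv - x).abs().max())  # perturbation is bounded by epsilon
```

Even though each pixel changes by at most epsilon, perturbations of this kind are often enough to flip the predicted label while remaining imperceptible to a human viewer, which is the failure mode the review examines.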