In a classification task using a CNN, it is often useful to understand what individual layers respond to when the network discriminates between classes. There are several ways to investigate the behaviour of layers.
With Class Activation Maps (CAM) we can inspect which areas of an image contribute the most to the final classification (Zhou et al., 2016).
Grad-CAM is a generalization of the CAM method, introduced by Selvaraju et al. (2017).
The idea of Grad-CAM is to use gradients to produce a heatmap of the image regions that most influence its classification. Following Selvaraju's notation, given:
- $c$: the predicted class of the image
- $y^c$: the model score for class $c$
- $A$: the activation (feature) map of one layer of the network, indexed by channel $k$. E.g. $A^k$ with $k=6$ refers to the 6th channel of $A$.
- $\partial y^c / \partial A^k$: the derivative of the score w.r.t. the $k$-th channel of the feature map $A$.
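We then calculate:

$$\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}}$$

where $i, j$ run over the spatial positions of $A^k$ and $Z$ is the number of such positions (height times width).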
Every coefficient $\alpha_k^c$ is the global-average-pooled gradient of the predicted score $y^c$ with respect to the feature map $A^k$. These coefficients carry the 'importance' of each feature map for the final score.
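As a minimal sketch of this pooling step, assume the gradients of the score with respect to the chosen layer are already available as a NumPy array of shape (K, H, W), one map per channel. The function name `gradcam_weights` and the toy numbers are illustrative only, not taken from the paper or from any library.

```python
import numpy as np

def gradcam_weights(grads: np.ndarray) -> np.ndarray:
    """Global-average-pool the gradients over the spatial axes.

    grads: shape (K, H, W), the gradient of the score w.r.t. each channel.
    Returns shape (K,), one coefficient per channel.
    """
    return grads.mean(axis=(1, 2))

# Toy gradients for K=2 channels on a 2x2 feature map (made-up numbers).
grads = np.array([
    [[1.0, 3.0], [5.0, 7.0]],    # channel 0: spatial mean is 4.0
    [[-2.0, 0.0], [0.0, -2.0]],  # channel 1: spatial mean is -1.0
])
alphas = gradcam_weights(grads)
print(alphas)
```

A negative coefficient, as for channel 1 here, means that increasing that channel's activation lowers the class score on average.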
We then calculate a linear combination of the activation maps weighted by these coefficients, and apply a ReLU to the result:

$$L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k\right)$$
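The weighted combination can be sketched in NumPy as follows, assuming the activations have shape (K, H, W) and the coefficients have shape (K,). The name `gradcam_map` and the example values are hypothetical and only meant to illustrate the arithmetic.

```python
import numpy as np

def gradcam_map(activations: np.ndarray, alphas: np.ndarray) -> np.ndarray:
    """ReLU of the coefficient-weighted sum of activation maps.

    activations: shape (K, H, W); alphas: shape (K,). Returns shape (H, W).
    """
    # Contract the channel axis: sum over k of alphas[k] * activations[k].
    weighted_sum = np.tensordot(alphas, activations, axes=1)
    # ReLU keeps only the positions that push the class score up.
    return np.maximum(weighted_sum, 0.0)

# Toy example with K=2 channels on a 2x2 feature map (made-up numbers).
activations = np.array([
    [[1.0, 0.0], [2.0, 1.0]],
    [[0.0, 3.0], [1.0, 5.0]],
])
alphas = np.array([2.0, -1.0])
cam = gradcam_map(activations, alphas)
print(cam)
```

In this toy case the weighted sum is [[2, -3], [3, -3]], so the ReLU zeroes the right-hand column and leaves [[2, 0], [3, 0]].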
Note that this map has the spatial dimensions of the activation maps $A^k$, not of the input image. To overlay it on the image, it has to be upsampled to the input resolution.
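Because of this size mismatch, the coarse map is usually resized before being drawn over the image. The sketch below uses nearest-neighbour resizing purely to keep the example dependency-free; smoother interpolation (e.g. bilinear) is a common choice in practice. The helper name `upsample_nearest` and the sizes are assumptions for illustration.

```python
import numpy as np

def upsample_nearest(cam: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a (H, W) map to (out_h, out_w) by nearest-neighbour lookup."""
    h, w = cam.shape
    # For every output row/column, pick the source row/column it falls into.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return cam[np.ix_(rows, cols)]

# A 2x2 map (made-up values) stretched to a hypothetical 4x4 input size.
cam = np.array([[2.0, 0.0], [3.0, 0.0]])
heatmap = upsample_nearest(cam, 4, 4)
print(heatmap.shape)
```

Each source cell is simply repeated over a 2x2 block here, which makes the coarse resolution of the original map visible in the overlay.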
You may want to investigate different layers. Typically the last convolutional layer of a CNN architecture is chosen, as it carries a high-level representation of the image while preserving its spatial information.