# Interpretability of Models
You need to know which features a model is picking up on; otherwise you can end up with a model that is grossly overfitted without noticing. For example, a model could be classifying an image based on background noise instead of the main subject of the image.
This is a deep topic that could fill a whole course by itself.
## Class Activation Map
A class activation map (CAM) is a matrix that shows which parts of the image the model was paying attention to when it classified the image.
Class activation maps can be generated from the output of the last convolution layer, together with the weights of the dense layer that sits after [[Pooling#Global Average Pooling|Global Average Pooling]].
The steps to create a CAM are:
1. Take the feature maps output by the last convolution layer of the CNN.
2. Upscale (zoom) the feature maps to the size of the image using `scipy.ndimage.zoom()`.
3. Get the model's prediction (if it is a classification problem).
4. Get the weights of the dense layer connected to the CNN that lead to the predicted class.
5. Take the dot product of the upscaled features and the weights for that prediction.
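
The steps above can be sketched with plain NumPy. This is a minimal sketch under stated assumptions, not the course implementation: the function name `class_activation_map`, the random stand-in data, and the shapes are all hypothetical, and nearest-neighbour upscaling via `np.repeat` is swapped in for `scipy.ndimage.zoom()` to keep the example dependency-free (the zoom function would give a smoother map).

```python
import numpy as np

def class_activation_map(features, dense_weights, class_index, scale):
    """Hypothetical helper illustrating the CAM steps.

    features:      (h, w, c) output of the last conv layer for one image
    dense_weights: (c, num_classes) weights of the dense layer after pooling
    class_index:   index of the predicted class
    scale:         integer upscaling factor from feature map to image size
    """
    # Step 2: upscale the feature maps (nearest neighbour, an assumption here)
    upscaled = features.repeat(scale, axis=0).repeat(scale, axis=1)
    # Step 4: weights leading to the chosen class, shape (c,)
    class_weights = dense_weights[:, class_index]
    # Step 5: dot product over the channel axis gives an (h*scale, w*scale) map
    return np.dot(upscaled, class_weights)

# Stand-in data: a 7x7x16 feature map and a 10-class dense layer
rng = np.random.default_rng(0)
features = rng.random((7, 7, 16))
dense_weights = rng.random((16, 10))

cam = class_activation_map(features, dense_weights, class_index=3, scale=32)
print(cam.shape)  # (224, 224)
```

In a real model, `features` would come from the last convolution layer and `class_index` from the model's prediction (step 3).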
## Saliency Map
A saliency map shows how important each pixel is to the prediction, not just where the model is looking. It is the gradient of the loss with respect to the input image, which tells you how the loss would change for a small change in each pixel value.
```python
import tensorflow as tf

# Assumes `model` (a softmax classifier) and `image` (a preprocessed image
# batch of shape [batch, height, width, 3]) are already defined.

# Siberian Husky's class ID in ImageNet
class_index = 251
# If you downloaded the cat, use this line instead
# class_index = 282  # Tabby Cat in ImageNet

# number of classes in the model's training data
num_classes = 1001

# convert to one-hot representation to match the softmax activation in the model definition
expected_output = tf.one_hot([class_index] * image.shape[0], num_classes)

with tf.GradientTape() as tape:
    # cast image to float
    inputs = tf.cast(image, tf.float32)
    # watch the input pixels
    tape.watch(inputs)
    # generate the predictions
    predictions = model(inputs)
    # get the loss
    loss = tf.keras.losses.categorical_crossentropy(
        expected_output, predictions
    )

# get the gradient of the loss with respect to the inputs
gradients = tape.gradient(loss, inputs)

# collapse the RGB gradients into one channel by summing their absolute values
grayscale_tensor = tf.reduce_sum(tf.abs(gradients), axis=-1)

# normalize the pixel values to be in the range [0, 255].
# the max value in the grayscale tensor will be pushed to 255.
# the min value will be pushed to 0.
normalized_tensor = tf.cast(
    255
    * (grayscale_tensor - tf.reduce_min(grayscale_tensor))
    / (tf.reduce_max(grayscale_tensor) - tf.reduce_min(grayscale_tensor)),
    tf.uint8,
)

# remove the batch dimension (the channel axis was already summed away)
# to make the tensor a 2d tensor
normalized_tensor = tf.squeeze(normalized_tensor)
```
As the code shows, the key is to run the model inside a gradient tape with the inputs as a watched tensor. You can then calculate the gradients with respect to the inputs, collapse them across the colour channels (here by summing their absolute values), and plot the result on top of the original image to visualize the regions that have the most impact (the largest gradients).
![[Pasted image 20210320183125.png]]
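
Plotting the map on top of the original image can be approximated with simple alpha blending. The sketch below is one possible way to do it, assuming an RGB `uint8` image and a 2D saliency map already scaled to 0..255; the helper name `overlay_heatmap`, the choice of the red channel, and the toy arrays are illustrative assumptions, not part of the note's code.

```python
import numpy as np

def overlay_heatmap(image, heatmap, alpha=0.5):
    """Hypothetical helper: blend a 2D heatmap (0..255) into an RGB image.

    image:   (h, w, 3) uint8 array
    heatmap: (h, w) array with values in 0..255
    alpha:   weight of the heatmap in the blend (0 = image only)
    """
    image = image.astype(np.float32)
    # Put the heatmap in the red channel so strong gradients show up as red
    heat_rgb = np.zeros_like(image)
    heat_rgb[..., 0] = heatmap
    blended = (1.0 - alpha) * image + alpha * heat_rgb
    return np.clip(blended, 0, 255).astype(np.uint8)

# Toy data: a flat grey image with one highly salient pixel
image = np.full((4, 4, 3), 100, dtype=np.uint8)
saliency = np.zeros((4, 4), dtype=np.float32)
saliency[1, 1] = 255

overlay = overlay_heatmap(image, saliency)
```

With a real saliency map, `saliency` would be the `normalized_tensor` from the code above converted to a NumPy array.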
## GradCAM
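This is a combination of Class Activation Maps and Saliency Maps: GradCAM is a gradient-weighted class activation map. Instead of weighting the feature maps of a convolution layer by dense-layer weights, it weights them by gradients taken with respect to those feature maps.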
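
As a reference point, the weighting is commonly written as below. The symbols are assumptions introduced here, not taken from this note: $A^k$ is the $k$-th feature map of the chosen convolution layer, $y^c$ is the score for class $c$, and $Z$ is the number of spatial positions in a feature map. The code in this section is a variant: it differentiates a binary cross-entropy loss rather than $y^c$, and averages rather than sums over the channels.

$$
\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\left( \sum_k \alpha_k^c A^k \right)
$$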
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model

# Assumes `model` is a trained Keras classifier whose output has two
# columns (cat, dog) and that contains a layer named `layer_name`.

def get_CAM(processed_image, actual_label, layer_name='block5_conv3'):
    # model that returns both the chosen conv layer's output and the predictions
    model_grad = Model([model.inputs],
                       [model.get_layer(layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_output_values, predictions = model_grad(processed_image)

        # watch the conv_output_values
        tape.watch(conv_output_values)

        ## Use binary cross entropy loss
        ## actual_label is 0 if cat, 1 if dog
        # get prediction probability of dog
        # If model does well,
        # pred_prob should be close to 0 if cat, close to 1 if dog
        pred_prob = predictions[:, 1]

        # make sure actual_label is a float, like the rest of the loss calculation
        actual_label = tf.cast(actual_label, dtype=tf.float32)

        # add a tiny value to avoid log of 0
        smoothing = 0.00001

        # Calculate loss as binary cross entropy
        loss = -1 * (actual_label * tf.math.log(pred_prob + smoothing)
                     + (1 - actual_label) * tf.math.log(1 - pred_prob + smoothing))
        print(f"binary loss: {loss}")

    # get the gradient of the loss with respect to the outputs of the last conv layer
    grads_values = tape.gradient(loss, conv_output_values)
    # average over batch, height and width: one weight per channel
    grads_values = tf.reduce_mean(grads_values, axis=(0, 1, 2))

    conv_output_values = np.squeeze(conv_output_values.numpy())
    grads_values = grads_values.numpy()

    # weight the convolution outputs with the computed gradients
    # (block5_conv3 has 512 channels; read the count from the array instead)
    for i in range(conv_output_values.shape[-1]):
        conv_output_values[:, :, i] *= grads_values[i]
    heatmap = np.mean(conv_output_values, axis=-1)

    # keep only positive contributions and scale to [0, 1]
    heatmap = np.maximum(heatmap, 0)
    if heatmap.max() > 0:  # avoid dividing by zero on an all-zero map
        heatmap /= heatmap.max()

    del model_grad, conv_output_values, grads_values, loss

    return heatmap
```
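
The weighting step inside `get_CAM` can be checked in isolation on small arrays. The following is a sketch only, assuming the activations and gradients have already been extracted as NumPy arrays of shape `(h, w, c)`; the function name `grad_cam_heatmap` and the toy values are hypothetical and chosen so the result can be verified by hand.

```python
import numpy as np

def grad_cam_heatmap(conv_output, grads):
    """Hypothetical helper: vectorized form of the weighting loop above.

    conv_output: (h, w, c) activations of the chosen conv layer
    grads:       (h, w, c) gradients with respect to those activations
    """
    # One weight per channel: the spatial average of that channel's gradients
    channel_weights = grads.mean(axis=(0, 1))
    # Broadcasting multiplies every channel by its weight
    weighted = conv_output * channel_weights
    # Average over channels, then keep only positive contributions
    heatmap = np.maximum(weighted.mean(axis=-1), 0)
    peak = heatmap.max()
    # Scale to [0, 1] unless the map is all zeros
    return heatmap / peak if peak > 0 else heatmap

# Toy example: channel 0 has positive gradients, channel 1 negative ones
conv_output = np.zeros((2, 2, 2))
conv_output[0, 0, 0] = 4.0
conv_output[1, 1, 1] = 2.0
grads = np.ones((2, 2, 2))
grads[..., 1] = -1.0

toy_heatmap = grad_cam_heatmap(conv_output, grads)
```

Here only the top-left position survives: its activation sits in the positively weighted channel, while the bottom-right activation is in the negatively weighted channel and is clipped to zero.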