# Exercise 02: Image classification with a CNN

## Goal

You will create and train a convolutional neural network to classify MRI scans of the brain. One class contains scans with a brain tumor and the other contains scans without a tumor. The images are available in two folders: the folder ```yes``` contains the images with a tumor and the folder ```no``` those without.

What you will learn in the exercise:

* How to prepare the data
  * split into train and validation data
  * use data augmentation
  * normalize the data
* How to create and train a convolutional neural network
* How to compute metrics to evaluate the model

## Exercise

Write the code in the empty cells below, following the instructions, and execute them. Cells that are already filled in only need to be executed (Shift + Enter).


The next cell switches off warnings so that we do not get distracted by them.

In [None]:
import warnings
warnings.filterwarnings('ignore')

We import the Python modules we will use in the exercise.

* **tensorflow:** Create and use ANNs.
* **numpy:** Vector and matrix calculations
* **classification_report:** Report metrics to evaluate the model
* **confusion_matrix:** Calculate a confusion matrix to evaluate the model
* **ImageDataGenerator:** A tool that makes it easy to access and prepare the data
* **pyplot:** Create plots
* **Counter:** We will use the counter to calculate the ratio between the sizes of the positive and negative data

In [None]:
import tensorflow as tf
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from collections import Counter
print(tf.keras.__version__)

## Exercise 2.1

We create an image data generator that rescales the intensity values in the images and augments the data by flipping the images and applying a small zoom. It also splits the data into a training and a validation set.

Set the missing values for the parameters in the cell below.

In [None]:
datagen = ImageDataGenerator(
    rescale=1.0/255,
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=0.1,
    fill_mode='reflect',
    validation_split=0.15,
)
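
To make the effect of the rescaling and the flips more concrete, here is a minimal NumPy sketch on a tiny, hypothetical image (the values are invented and not taken from the dataset). It only mimics the arithmetic; the generator itself applies the flips and the zoom randomly, so this is an illustration and not how Keras implements them.

```python
import numpy as np

# Hypothetical 2x3 "image" with 8-bit intensities (not taken from the dataset).
image = np.array([[0, 51, 102],
                  [153, 204, 255]], dtype=np.uint8)

# rescale=1.0/255 maps the intensities from [0, 255] to [0, 1].
rescaled = image * (1.0 / 255)

# A horizontal flip mirrors left and right, a vertical flip mirrors top and bottom.
flipped_horizontally = rescaled[:, ::-1]
flipped_vertically = rescaled[::-1, :]

print(rescaled)
print(flipped_horizontally)
print(flipped_vertically)
```

The flips change neither the shape nor the intensity range of the image, which is why they are a cheap way to enlarge a small training set.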

## Exercise 2.2

We create an image generator for the training set. The generator creates a flow from a directory, which means that it loads images from disk only when they are needed.

* Set the path to the input directory, i.e. the directory that contains the ```yes``` and ```no``` folders.
* Provide a list of the names of the two classes, i.e. the names of the subfolders containing the images.
* Resize the images in x and y so that all images have the same size.
* Set the batch size.


In [None]:
path = '/home/baecker/Documents/mri/2024/ml-dl/data/brain-tumor/archive/brain_tumor_dataset/'
classes = ['no', 'yes']
target_sizes = (256, 256)
batch_size = 30
train_generator = datagen.flow_from_directory(
    path,
    classes=classes,
    target_size=target_sizes,
    batch_size=batch_size,
    subset='training',
    color_mode='grayscale',
    shuffle=True,
    class_mode='binary')

We create another image generator for the validation dataset. Since we set a validation split in the image data generator and select the ```training``` and ```validation``` subsets, the training and validation images do not overlap. We shuffle the images in the training set for each epoch, but not in the validation set: shuffling is not necessary there, and keeping the order fixed lets us compare the predictions with ```validation_generator.classes``` later.

In [None]:
validation_generator = datagen.flow_from_directory(
    path,
    classes=classes,
    target_size=target_sizes,
    batch_size=batch_size,
    subset='validation',
    color_mode='grayscale',
    shuffle=False,
    class_mode='binary')
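
The idea behind the non-overlapping subsets can be sketched in plain Python. We assume here that a fixed fraction of each class folder's sorted file list is reserved for validation and the rest is used for training; the helper `split_files` and the file names are hypothetical, and the exact ordering Keras uses internally is an assumption.

```python
def split_files(filenames, validation_split):
    """Split one class folder into disjoint (training, validation) file lists."""
    ordered = sorted(filenames)
    n_validation = int(validation_split * len(ordered))
    # The first part is reserved for validation, the remainder for training.
    return ordered[n_validation:], ordered[:n_validation]

# Hypothetical file names, not the real dataset contents.
files = [f"scan_{i:02d}.jpg" for i in range(20)]
training, validation = split_files(files, validation_split=0.15)

print(len(training), len(validation))
print(set(training) & set(validation))  # empty set: no image is in both subsets
```

Because the split is deterministic, both generators can be created from the same ```datagen``` object without any image ending up in both subsets.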

Each element of the training generator is a tuple consisting of a batch of images and an array containing the corresponding labels, with ```no``` and ```yes``` encoded as the numbers 0 and 1.

In [None]:
print(train_generator[0][0].shape)
print(train_generator[0][1])
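
The number of batches per epoch follows from the number of images and the batch size, and the last batch can be smaller than the others. The sketch below uses a hypothetical image count; for the real data you can compare the result with ```len(train_generator)```.

```python
import math

# Hypothetical number of training images; the real count depends on the dataset.
num_images = 100
batch_size = 30

# The last batch holds whatever is left after the full batches.
num_batches = math.ceil(num_images / batch_size)
last_batch_size = num_images - (num_batches - 1) * batch_size

print(num_batches)
print(last_batch_size)
```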

We display the first two images of the first batch together with their ground-truth labels.

In [None]:
# Fetch the batch once: every access re-applies the random augmentation.
images, labels = train_generator[0]
for image, label in zip(images[:2], labels[:2]):
    print(label)
    plt.imshow(image[:, :, 0], cmap='gray')  # drop the single channel axis
    plt.show()

## Exercise 2.3

Create a CNN consisting of a convolutional part followed by a Flatten layer and a fully connected part. Display the summary of the model at the end using ``model.summary()``. The summary can help you check whether the architecture is reasonable. Check the sizes of the feature maps and the number of parameters in the fully connected part. Do not make the model too big: too many parameters can lead to overfitting.

Here are some of the layer types you might want to use:
    
    - tf.keras.layers.Conv2D(nr_of_convolutions, 
                             (filter_size_x, filter_size_y), 
                             padding=padding_mode, 
                             activation=activation_function,
                             input_shape=(image_height, image_width, nr_of_channels) 
                             )
        - You need to set the input_shape only for the first layer
    - tf.keras.layers.AveragePooling2D(pool_size=(height, width))
    - tf.keras.layers.MaxPooling2D(pool_size=(height, width))
    - tf.keras.layers.Flatten()
    - tf.keras.layers.Dense(number_of_units, activation=activation_function)  
    - tf.keras.layers.Dropout(rate)
    - The output unit is just another dense layer with an appropriate activation function

The model is created by passing a list of layers to ``tf.keras.models.Sequential(list_of_layers)``. Assign the result to the variable ```model```. If necessary, consult the Keras documentation to find out which parameters each type of layer accepts.

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (5, 5), padding='same', activation='relu', input_shape=(256, 256, 1)),
    tf.keras.layers.AveragePooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')])
# tf.keras.utils.plot_model(model, show_shapes=True)
model.summary()
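
To check the summary by hand, we can recompute the feature map sizes and parameter counts for the model above. This is a small sketch in plain Python; the helper functions `conv2d_params` and `dense_params` are introduced only for this illustration. Pooling, Flatten and Dropout layers have no trainable parameters.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

height, width = 256, 256                 # input size, one grayscale channel

conv1 = conv2d_params(5, 5, 1, 32)       # padding='same' keeps 256 x 256
height, width = height // 2, width // 2  # 2 x 2 pooling halves each side
conv2 = conv2d_params(3, 3, 32, 64)
height, width = height // 2, width // 2  # second pooling: 64 x 64

flattened = height * width * 64          # length of the Flatten output
dense1 = dense_params(flattened, 128)
dense2 = dense_params(128, 32)
output = dense_params(32, 1)

total = conv1 + conv2 + dense1 + dense2 + output
print(flattened, dense1, total)
```

Almost all parameters sit in the first dense layer, because it is connected to every value of the flattened feature maps. Adding further convolution and pooling stages before the Flatten layer is one way to shrink it.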

## Exercise 2.4

Set the optimizer, the loss function and at least one additional metric, for example accuracy, by calling the method ``model.compile(optimizer=optimizer, loss=loss_function, metrics=list_of_metrics)``.

For example, to create the Adam optimizer, use ``tf.keras.optimizers.Adam(learning_rate=0.001)``.

In [None]:
additional_metric = 'accuracy'
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    # One sigmoid output with labels 0/1: name the binary loss explicitly.
    loss='binary_crossentropy',
    metrics=[additional_metric])
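
For a single sigmoid output and labels 0 and 1, the cross-entropy loss is the binary cross-entropy. The following sketch computes it by hand for a few hypothetical predictions; the function and the clipping constant `eps` are written for this illustration and are not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over labels and predicted probabilities."""
    losses = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(losses) / len(losses)

# Hypothetical labels and sigmoid outputs.
labels = [1, 0, 1]
confident = [0.9, 0.1, 0.8]
unsure = [0.5, 0.5, 0.5]

print(binary_crossentropy(labels, confident))
print(binary_crossentropy(labels, unsure))
```

A model that always predicts 0.5 has a loss of ln(2), about 0.693, so a loss that stays near this value means the model has not learned anything useful yet.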

We calculate the ratio between the numbers of images in the two classes. The resulting class weights are used during training so that errors on the smaller class count more in the loss, which compensates for the class imbalance.

In [None]:
counter = Counter(train_generator.classes)                          
max_val = float(max(counter.values()))       
class_weights = {class_id : max_val/num_images for class_id, num_images in counter.items()}
print(class_weights)
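
The same calculation on a small, hypothetical label list shows what the weights look like: the larger class gets the weight 1 and the smaller class a weight greater than 1.

```python
from collections import Counter

# Hypothetical label list: 2 images of class 0 and 3 images of class 1.
labels = [0, 0, 1, 1, 1]

counter = Counter(labels)
max_val = float(max(counter.values()))
class_weights = {class_id: max_val / count for class_id, count in counter.items()}

print(class_weights)  # {0: 1.5, 1: 1.0}
```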

## Exercise 2.5

Train the network by calling ``model.fit(...)``. The method returns the history of the loss, the validation loss (``val_loss``) and the additional metrics. Set the number of epochs.

In [None]:
history = model.fit(
    train_generator,
    class_weight=class_weights,
    # No batch_size arguments here: the generators already yield batches.
    validation_data=validation_generator,
    epochs=80,
    verbose=1,
)

## Exercise 2.6

We plot the history of the loss and the validation loss. If the loss decreases over time, the model is learning. If the validation loss does not follow the training loss, for example if it starts to rise while the training loss keeps falling, the model is not generalizing, i.e. it is overfitting.

Is the model learning? Does it show signs of overfitting?

In [None]:
plt.plot(history.history['loss'][1:])
plt.plot(history.history['val_loss'][1:])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train loss', 'validation loss'], loc='lower left')
plt.show()
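
Besides looking at the plot, we can read the same information from the numbers. The sketch below uses invented loss curves (not results from this exercise) and a deliberately simple rule of thumb: it finds the epoch with the lowest validation loss and checks whether the validation loss has risen again since then.

```python
# Hypothetical loss curves, not results from this exercise.
loss = [0.70, 0.55, 0.42, 0.31, 0.22, 0.15]
val_loss = [0.69, 0.58, 0.50, 0.49, 0.53, 0.60]

# Epoch (0-based) at which the validation loss was lowest.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Validation loss rising after its minimum while the training loss keeps
# falling is a typical sign of overfitting.
overfitting = val_loss[-1] > val_loss[best_epoch] and loss[-1] < loss[best_epoch]

print(best_epoch, overfitting)
```

With real training runs the curves are noisy, so a single rising value is not yet proof of overfitting; look at the trend over several epochs.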

In [None]:
plt.plot(history.history[additional_metric])
plt.plot(history.history['val_' + additional_metric])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['accuracy', "val_accuracy"], loc='lower right')
plt.show()

In [None]:
model.evaluate(validation_generator, verbose=2)

We calculate the confusion matrix for the validation set and display a report of different metrics.

In [None]:
Y_pred = model.predict(validation_generator)
# Threshold the sigmoid outputs at 0.5 and flatten to a 1-D array of class ids.
y_pred = (Y_pred > 0.5).astype(int).ravel()
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
print('Classification Report')
target_names = list(validation_generator.class_indices.keys())
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))
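
The values in the classification report can be derived from the confusion matrix. The sketch below does this by hand for a hypothetical matrix (the counts are invented), using the scikit-learn layout in which rows are the true classes and columns the predicted classes. The metrics are computed for the ```yes``` class.

```python
# Hypothetical confusion matrix in scikit-learn layout:
# rows are true classes, columns are predicted classes, order ['no', 'yes'].
matrix = [[10, 4],
          [2, 20]]

tn, fp = matrix[0]  # true 'no':  predicted 'no', predicted 'yes'
fn, tp = matrix[1]  # true 'yes': predicted 'no', predicted 'yes'

precision = tp / (tp + fp)  # share of predicted tumors that are real
recall = tp / (tp + fn)     # share of real tumors that were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(round(precision, 3), round(recall, 3), round(f1, 3), round(accuracy, 3))
```

For a tumor classifier the recall of the ```yes``` class deserves particular attention, since a false negative means a missed tumor.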