Malaria is a life-threatening disease caused by parasites that are transmitted to people through the bites of infected female Anopheles mosquitoes. It is preventable and curable.
In 2017, there were an estimated 219 million cases of malaria in 90 countries and an estimated 435,000 malaria deaths. The WHO African Region carries a disproportionately high share of the global malaria burden: in 2017, the region was home to 92% of malaria cases and 93% of malaria deaths.
Malaria is caused by Plasmodium parasites, which are spread to people through the bites of infected female Anopheles mosquitoes, called "malaria vectors." There are 5 parasite species that cause malaria in humans, and 2 of these species, P. falciparum and P. vivax, pose the greatest threat.
Where malaria is no longer endemic (such as in the United States), health-care providers may not be familiar with the disease. Clinicians seeing a malaria patient may not consider malaria among the potential diagnoses and may not order the needed diagnostic tests. Laboratorians may lack experience with malaria and fail to detect parasites when examining blood smears under the microscope.
Malaria is an acute febrile illness. In a non-immune individual, symptoms usually appear 10–15 days after the infective mosquito bite. The first symptoms (fever, headache, and chills) may be mild and difficult to recognize as malaria. If not treated within 24 hours, P. falciparum malaria can progress to severe illness, often leading to death.
Microscopic Diagnosis
Malaria parasites can be identified by examining a drop of the patient’s blood under the microscope, spread out as a “blood smear” on a microscope slide. Prior to examination, the specimen is stained to give the parasites a distinctive appearance. This technique remains the gold standard for laboratory confirmation of malaria. However, it depends on the quality of the reagents and of the microscope, and on the experience of the laboratorian.
# Importing the relevant libraries
from tensorflow.keras.layers import MaxPooling2D, SeparableConv2D,BatchNormalization
from tensorflow.keras.layers import Dropout,Flatten,Dense,Input
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
from sklearn.model_selection import train_test_split
from skimage.transform import resize
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
import numpy as np
import matplotlib.pyplot as plt
import cv2
import os
from imutils import paths
import skimage
from tensorflow.keras import Sequential
# reading in the dataset
dataset = r'C:\Users\animu\Downloads\malaria\Data'
# creating a dictionary to store and iterate through the dataset
args = {}
args['dataset'] = dataset
# separating the data features from the labels and storing them in lists
ipaths = list(paths.list_images(args['dataset']))
features = []
labels = []
for i in ipaths:
    # the parent folder name is the class label
    label = i.split(os.path.sep)[-2]
    image = cv2.imread(i)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (64,64))
    labels.append(label)
    features.append(image)
data = np.array(features)/255.0
labels = np.array(labels)
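The loading loop above converts each image from OpenCV's BGR channel order to RGB and then scales the pixel values into [0, 1]. As a rough sketch of what those two steps do, here is a NumPy-only version on a made-up 1x2 image (the pixel values are illustrative, not taken from the dataset):

```python
import numpy as np

# Hypothetical 1x2 image in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels, which is the effect of
# cv2.cvtColor(image, cv2.COLOR_BGR2RGB) on a 3-channel image.
rgb = bgr[..., ::-1]

# Dividing by 255.0 maps uint8 values in [0, 255] to floats in [0.0, 1.0].
scaled = rgb / 255.0

print(rgb[0, 0].tolist())          # blue pixel, now in RGB order
print(scaled.dtype, scaled.min(), scaled.max())
```

Scaling to [0, 1] keeps the network inputs in a small, consistent range, which is the usual reason for the division by 255.0.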
# Visualizing the data
infected_images = os.listdir(os.path.join(dataset, 'Malaria'))
normal_images = os.listdir(os.path.join(dataset, 'Normal'))
def cell_image_plotter(i):
    # cv2.imread returns BGR, so convert to RGB before plotting with matplotlib
    uninfected = cv2.imread(os.path.join(dataset, 'Normal', normal_images[i]))
    uninfected = cv2.cvtColor(uninfected, cv2.COLOR_BGR2RGB)
    uninfected = skimage.transform.resize(uninfected, (150,150,3))
    malaria = cv2.imread(os.path.join(dataset, 'Malaria', infected_images[i]))
    malaria = cv2.cvtColor(malaria, cv2.COLOR_BGR2RGB)
    malaria = skimage.transform.resize(malaria, (150,150,3), mode = 'reflect')
    # parasitized cell on the left, uninfected cell on the right
    paired = np.concatenate((malaria,uninfected), axis = 1)
    print('Malaria Parasitized vs Uninfected Red Blood Cell')
    plt.figure(figsize = (10,5))
    plt.imshow(paired)
    plt.show()
for i in range(5):
    cell_image_plotter(i)
# The Malaria infected cells on the left can be clearly
# distinguished by the granulation or small dots present within them.
Malaria Parasitized vs Uninfected Red Blood Cell
Malaria Parasitized vs Uninfected Red Blood Cell
Malaria Parasitized vs Uninfected Red Blood Cell
Malaria Parasitized vs Uninfected Red Blood Cell
Malaria Parasitized vs Uninfected Red Blood Cell
# Transforming the labels to categorical values
binarizer = LabelBinarizer()
labels = binarizer.fit_transform(labels)
labels = to_categorical(labels)
labels
array([[1., 0.], [1., 0.], [1., 0.], ..., [0., 1.], [0., 1.], [0., 1.]], dtype=float32)
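To make the encoding step above more concrete, here is a small NumPy-only stand-in for `LabelBinarizer` followed by `to_categorical`. It is not the code the notebook runs. It only mirrors the result under the assumption that classes are ordered alphabetically, so 'Malaria' maps to column 0 and 'Normal' to column 1. The four labels are made up for illustration.

```python
import numpy as np

# Illustrative labels, not the real dataset.
toy_labels = np.array(['Malaria', 'Normal', 'Normal', 'Malaria'])

# np.unique returns the sorted class names and, for each label, its class index.
classes, class_index = np.unique(toy_labels, return_inverse=True)

# Picking rows of the identity matrix turns each index into a one-hot vector.
one_hot = np.eye(len(classes), dtype=np.float32)[class_index]

print(classes.tolist())
print(one_hot.tolist())
```

With two columns per row, the labels line up with the two softmax outputs of the final `Dense(2)` layer defined later.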
# Now that the features and labels are stored in the appropriate format, we can split our data for training
X_train, X_test, y_train, y_test = train_test_split(data, labels,
                                                    random_state = 7,
                                                    shuffle = True,
                                                    stratify = labels,
                                                    test_size = .2)
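The `stratify` argument asks for a split that keeps the class proportions the same in the training and test sets. The following pure-Python sketch illustrates the idea only. It is a simplified stand-in, not scikit-learn's algorithm, and the helper name `stratified_split` is hypothetical. The class sizes (1000 per class) are an assumption inferred from the 1600/400 split and the 200/200 support figures reported further down.

```python
import random

def stratified_split(labels, test_size, seed):
    """Split indices so each class contributes the same fraction to the test set."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # shuffle within each class
        n_test = round(len(indices) * test_size)  # same fraction for every class
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Assumed class sizes, see the note above.
toy_labels = ['Malaria'] * 1000 + ['Normal'] * 1000
train_idx, test_idx = stratified_split(toy_labels, test_size=0.2, seed=7)

test_malaria = sum(1 for i in test_idx if toy_labels[i] == 'Malaria')
print(len(train_idx), len(test_idx), test_malaria)
```

Without stratification, a random 20% sample could end up with noticeably more of one class than the other, which would skew the per-class metrics.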
# creating more images using image augmentation
training_data_aug = ImageDataGenerator(validation_split = .2,
                                       horizontal_flip = True,
                                       rotation_range = 45,
                                       fill_mode = "nearest"
                                       )
# creating the sequential model
model = Sequential()
model.add(SeparableConv2D(16,kernel_size = (5,5),padding = 'same', activation = 'relu', input_shape = data.shape[1:4]))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(SeparableConv2D(32, kernel_size = (5,5),padding = 'same', activation = 'relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(SeparableConv2D(64, kernel_size= (5,5),padding = 'same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Flatten())
model.add(Dense(126, activation = 'relu'))
model.add(Dropout(.5))
model.add(Dense(2, activation = 'softmax'))
model.summary()
Model: "sequential_36"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
separable_conv2d_7 (Separabl (None, 64, 64, 16)        139
_________________________________________________________________
batch_normalization_6 (Batch (None, 64, 64, 16)        64
_________________________________________________________________
max_pooling2d_42 (MaxPooling (None, 32, 32, 16)        0
_________________________________________________________________
separable_conv2d_8 (Separabl (None, 32, 32, 32)        944
_________________________________________________________________
batch_normalization_7 (Batch (None, 32, 32, 32)        128
_________________________________________________________________
max_pooling2d_43 (MaxPooling (None, 16, 16, 32)        0
_________________________________________________________________
separable_conv2d_9 (Separabl (None, 16, 16, 64)        2912
_________________________________________________________________
batch_normalization_8 (Batch (None, 16, 16, 64)        256
_________________________________________________________________
max_pooling2d_44 (MaxPooling (None, 8, 8, 64)          0
_________________________________________________________________
flatten_34 (Flatten)         (None, 4096)              0
_________________________________________________________________
dense_57 (Dense)             (None, 126)               516222
_________________________________________________________________
dropout_53 (Dropout)         (None, 126)               0
_________________________________________________________________
dense_58 (Dense)             (None, 2)                 254
=================================================================
Total params: 520,919
Trainable params: 520,695
Non-trainable params: 224
_________________________________________________________________
data.shape[1:4]
(64, 64, 3)
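The parameter counts in the summary can be checked by hand. The sketch below assumes the Keras defaults for the layers used above (`depth_multiplier=1` and a bias term for `SeparableConv2D`, and four values per channel for `BatchNormalization`, two of them non-trainable). The helper function names are hypothetical.

```python
def separable_conv2d_params(kernel, c_in, c_out):
    """Depthwise kernel + pointwise (1x1) kernel + one bias per output channel."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + c_out

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per unit."""
    return n_in * n_out + n_out

def batch_norm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

counts = [
    separable_conv2d_params(5, 3, 16),   # first separable conv on the RGB input
    batch_norm_params(16),
    separable_conv2d_params(5, 16, 32),
    batch_norm_params(32),
    separable_conv2d_params(5, 32, 64),
    batch_norm_params(64),
    dense_params(8 * 8 * 64, 126),       # flattened 8x8x64 feature map
    dense_params(126, 2),
]
total = sum(counts)
non_trainable = 2 * (16 + 32 + 64)       # moving statistics of the three BN layers

print(counts)
print(total, total - non_trainable, non_trainable)
```

Almost all of the parameters sit in the first dense layer. The separable convolutions stay small because each one factors a full convolution into a per-channel spatial filter and a 1x1 channel mix.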
# compiling the model
learning_rate = 1e-3
epochs = 10
batch_sizes = 16
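# use true division here: learning_rate // epochs floors to 0.0 and would disable the decay
opt = Adam(learning_rate = learning_rate,
           decay = learning_rate / epochs)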
model.compile(loss= 'binary_crossentropy', optimizer = opt, metrics= ['accuracy'])
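The value passed as `decay` depends heavily on which division operator is used. The sketch below compares floor division with true division for the hyperparameters above and shows the learning rate each would produce. It assumes the time-based schedule that the legacy Keras `decay` argument applies per update step, `lr_t = lr / (1 + decay * t)`. The helper `decayed_lr` is hypothetical and written only for this illustration.

```python
learning_rate = 1e-3
epochs = 10
steps_per_epoch = 100  # matches the 100 steps per epoch in the training log

floor_decay = learning_rate // epochs   # floor division of 0.001 by 10 gives 0.0
true_decay = learning_rate / epochs     # true division gives roughly 1e-4

def decayed_lr(initial_lr, decay, step):
    """Time-based decay, assumed to match the legacy Keras `decay` behaviour."""
    return initial_lr / (1.0 + decay * step)

last_step = epochs * steps_per_epoch
lr_with_floor = decayed_lr(learning_rate, floor_decay, last_step)
lr_with_true = decayed_lr(learning_rate, true_decay, last_step)

print(floor_decay, true_decay)
print(lr_with_floor, lr_with_true)
```

With a decay of 0.0 the learning rate stays at its initial value for the whole run. With `learning_rate / epochs` it shrinks by about 9% over the 1000 update steps.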
# fitting the model with the augmented data
generator = model.fit(
    training_data_aug.flow(X_train, y_train, batch_size = batch_sizes),
    steps_per_epoch = len(X_train)//batch_sizes,
    validation_data = (X_test, y_test),
    validation_steps = len(X_test)//batch_sizes,
    epochs = epochs
)
Epoch 1/10
100/100 [==============================] - 22s 223ms/step - loss: 0.0714 - accuracy: 0.9787 - val_loss: 0.1317 - val_accuracy: 0.9750
Epoch 2/10
100/100 [==============================] - 19s 194ms/step - loss: 0.0620 - accuracy: 0.9875 - val_loss: 0.1493 - val_accuracy: 0.9600
Epoch 3/10
100/100 [==============================] - 20s 196ms/step - loss: 0.0577 - accuracy: 0.9831 - val_loss: 0.1259 - val_accuracy: 0.9650
Epoch 4/10
100/100 [==============================] - 19s 194ms/step - loss: 0.0641 - accuracy: 0.9806 - val_loss: 0.1758 - val_accuracy: 0.9625
Epoch 5/10
100/100 [==============================] - 20s 204ms/step - loss: 0.0492 - accuracy: 0.9831 - val_loss: 0.1513 - val_accuracy: 0.9650
Epoch 6/10
100/100 [==============================] - 22s 220ms/step - loss: 0.0681 - accuracy: 0.9806 - val_loss: 0.4792 - val_accuracy: 0.9300
Epoch 7/10
100/100 [==============================] - 20s 199ms/step - loss: 0.0434 - accuracy: 0.9862 - val_loss: 0.1811 - val_accuracy: 0.9575
Epoch 8/10
100/100 [==============================] - 19s 193ms/step - loss: 0.0486 - accuracy: 0.9844 - val_loss: 0.1356 - val_accuracy: 0.9550
Epoch 9/10
100/100 [==============================] - 20s 195ms/step - loss: 0.0607 - accuracy: 0.9819 - val_loss: 0.0882 - val_accuracy: 0.9675
Epoch 10/10
100/100 [==============================] - 20s 197ms/step - loss: 0.0758 - accuracy: 0.9862 - val_loss: 0.1302 - val_accuracy: 0.9675
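The validation loss in the log above moves around quite a bit from epoch to epoch, so the last epoch is not necessarily the best one. As a small illustration, the snippet below copies the `val_loss` values from the log into a list and finds the epochs with the lowest and highest validation loss. In the notebook itself the same list would come from `generator.history['val_loss']`.

```python
# val_loss values copied from the training log, epochs 1 to 10
val_loss = [0.1317, 0.1493, 0.1259, 0.1758, 0.1513,
            0.4792, 0.1811, 0.1356, 0.0882, 0.1302]

# Epochs are 1-based in the log, list indices are 0-based.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
worst_epoch = max(range(len(val_loss)), key=val_loss.__getitem__) + 1

print(f'lowest val_loss  {val_loss[best_epoch - 1]} at epoch {best_epoch}')
print(f'highest val_loss {val_loss[worst_epoch - 1]} at epoch {worst_epoch}')
```

One common way to act on this is to pass Keras callbacks such as `EarlyStopping` or `ModelCheckpoint` (monitoring `val_loss`) to `model.fit`. That is a suggestion, not something the original run used.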
# Visualizing the test predictions
length = 4
width = 5
fig, ax = plt.subplots(length,width, figsize = (13,13))
ax = ax.ravel()
pred = model.predict(X_test, batch_size = batch_sizes)
for i in np.arange(0,length*width):
    ax[i].imshow(X_test[i])
    ax[i].set_title('Prediction = {}\n True = {}'.format(pred.argmax(axis =1)[i], y_test.argmax(axis =1)[i]))
    ax[i].axis('off')
plt.subplots_adjust(wspace = 1, hspace =1)
#calculating the prediction accuracy and printing the classification report
y_prediction = model.predict(X_test, batch_size = batch_sizes)
y_prediction = np.argmax(y_prediction, axis = 1)
print(classification_report(y_test.argmax(axis = 1),
                            y_prediction, target_names = binarizer.classes_))
print(f'model accuracy = {accuracy_score(y_test.argmax(axis=1), y_prediction)*100}%')
              precision    recall  f1-score   support

     Malaria       0.99      0.94      0.97       200
      Normal       0.94      0.99      0.97       200

    accuracy                           0.97       400
   macro avg       0.97      0.97      0.97       400
weighted avg       0.97      0.97      0.97       400

model accuracy = 96.75%
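To show how the figures in the report relate to each other, the sketch below computes precision, recall, F1 and accuracy from a confusion matrix. The matrix itself is an assumption: it is one set of counts that is consistent with the rounded report and the 96.75% accuracy (387 of 400 correct), inferred by hand and not printed by the original run. The notebook could obtain the real one with the already imported `confusion_matrix`.

```python
# Rows are true classes, columns are predicted classes, in the order (Malaria, Normal).
# Inferred from the rounded report; not an output of the original run.
confusion = [[188, 12],
             [1, 199]]
classes = ['Malaria', 'Normal']

def per_class_metrics(matrix, k):
    """Precision, recall and F1 for class k of a square confusion matrix."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                   # true class k, predicted otherwise
    fp = sum(row[k] for row in matrix) - tp    # predicted k, true class otherwise
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

metrics = {name: per_class_metrics(confusion, k) for k, name in enumerate(classes)}
accuracy = sum(confusion[k][k] for k in range(len(classes))) / sum(map(sum, confusion))

for name, (p, r, f1) in metrics.items():
    print(f'{name:>8}  precision={p:.4f}  recall={r:.4f}  f1={f1:.4f}')
print(f'accuracy={accuracy:.4f}')
```

Read this way, the lower recall for the Malaria class means that most of the errors are infected cells predicted as Normal, which is the costlier kind of mistake for a screening tool.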
# plotting the training and validation loss and accuracy
# plotting loss
plt.plot(generator.history['loss'], label = 'Training Loss')
plt.plot(generator.history['val_loss'], label = 'Validation Loss')
plt.legend()
# save before plt.show(); once the figure has been shown, savefig would write an empty image
plt.savefig('training_validation_loss')
plt.show()
# plotting accuracy
plt.plot(generator.history['accuracy'], label = 'Training Accuracy')
plt.plot(generator.history['val_accuracy'], label = 'Validation Accuracy')
plt.legend()
plt.show()
# saving the model
model.save(r'C:\Users\animu\Downloads\malaria\malaria_classifier.v3')
INFO:tensorflow:Assets written to: C:\Users\animu\Downloads\malaria\malaria_classifier.v3\assets