AI Insights

Analyzing Brain Scan Images for Early Detection and Diagnosis of Alzheimer’s

April 1, 2024



Introduction

Alzheimer’s disease, a debilitating brain condition, leads to a progressive decline in essential cognitive functions, such as memory, thinking, learning, and organizing skills, ultimately impairing an individual’s ability to perform basic daily tasks. This disease is part of a broader category of neurological disorders, which, according to the World Health Organization, account for 9% of all global mortality. Notably, Alzheimer’s and other forms of dementia rank among the top ten leading causes of death worldwide, highlighting the urgent need for effective medical interventions. 

Despite the considerable progress in medical technology, the early detection and accurate diagnosis of these conditions present significant challenges. The complexity of the brain and the subtlety of the changes involved necessitate advanced diagnostic tools and methodologies to identify these diseases at an early stage, thereby offering a window for timely intervention and management.

Alzheimer’s disease predominantly impacts individuals over the age of 65, with the likelihood of developing the condition increasing with age beyond this threshold. However, Alzheimer’s is not exclusively a disease of the elderly; a subset of patients, typically in their 40s or 50s, are diagnosed with early-onset Alzheimer’s disease. Although this form of the disease is considerably rarer, accounting for less than 10% of all Alzheimer’s cases, it underscores the variability and unpredictability of the condition’s onset.

Globally, Alzheimer’s disease is alarmingly common, affecting approximately 24 million people. The prevalence of Alzheimer’s increases with age, affecting about one in ten individuals over the age of 65. This rate escalates significantly among those over 85, with nearly a third of this age group suffering from the condition. The widespread prevalence of Alzheimer’s underscores the critical need for continued research into its causes, early detection methods, and treatment options to mitigate its impact on individuals, families, and healthcare systems worldwide.

The trajectory of Alzheimer’s disease prevalence presents a deeply concerning picture for global health. According to a study by Ron Brookmeyer, Elizabeth Johnson, Kathryn Ziegler-Graham, and H. Michael Arrighi, available on ScienceDirect, the global prevalence of Alzheimer’s was approximately 26.6 million in 2006. This figure is projected to quadruple by 2050, by which time 1 in 85 individuals worldwide would be living with Alzheimer’s disease. This increase underscores not only the growing impact of the disease on individuals and families but also the immense pressure it places on healthcare systems globally.

Furthermore, the study estimates that about 43% of those living with the disease will require a high level of care, equivalent to that provided in a nursing home setting. This projection highlights the urgent need for advancements in prevention, diagnosis, and treatment strategies to manage the burgeoning global burden of Alzheimer’s disease and to support the millions of individuals and their caregivers who are affected by this devastating condition.

Importance of Early Detection of Alzheimer’s Disease

The projections from Brookmeyer and colleagues cited above (a worldwide prevalence of 26.6 million in 2006, expected to quadruple by 2050, with about 43% of cases requiring nursing-home-level care) underscore the pressing need for advancements in the early detection of Alzheimer’s disease.

The importance of early detection of Alzheimer’s cannot be overstated, as it offers significant medical, emotional, and social benefits for individuals diagnosed with the condition. Medically, an early diagnosis opens the door to a range of treatment options that can potentially slow the progression of the disease. It also provides individuals with the opportunity to participate in clinical trials, offering hope not just for their own treatment but also contributing to scientific research that can aid future patients. Early detection encourages individuals to prioritize their health, guiding them towards a lifestyle that may help in managing the disease more effectively.

From an emotional and social perspective, understanding the root cause of their symptoms can significantly reduce anxiety for patients and their families. It allows families to plan for the future, make the most of their time together, and access a wide array of resources and support programs designed to help them navigate the challenges associated with Alzheimer’s. Together, these benefits highlight the critical role that early detection plays in improving the quality of life for those affected by Alzheimer’s disease, making it a key area of focus in the ongoing battle against this debilitating condition.

The additional time afforded by early detection of Alzheimer’s disease to plan for the future is invaluable, both on a personal and an economic level. For individuals diagnosed and their families, early detection offers the crucial opportunity to make informed decisions regarding legal, financial, and end-of-life matters. This foresight not only ensures that the wishes of those diagnosed are respected but also alleviates potential stresses on family members, enabling them to navigate future challenges with a clear understanding of their loved one’s preferences.

Furthermore, the economic implications of early diagnosis are significant. For families and the U.S. government alike, early detection can lead to substantial cost savings in medical and long-term care expenses. A proactive approach to diagnosing Alzheimer’s during the stage of mild cognitive impairment, before the onset of full-blown dementia, presents a profound financial impact. It is estimated that among all Americans currently alive, if those who will develop Alzheimer’s could be diagnosed in the earlier stages of the disease, it would result in a collective savings of approximately $7 trillion in health and long-term care costs.

Use of AI in Medical Imaging

“AI will lead to increased use of quantitative imaging and structured reports, but he noted that it’s a far cry from replacing the radiologist. Instead, deep-learning algorithms will be tools for the radiologist. While computers can do things humans cannot, that argument works both ways; it also makes the case for computers working hand in hand with humans”​

– Dr. Eliot Siegel, M.D., RadSite’s Chief Technology Officer and Standards Committee Chair, podcast, 21 Feb 2017 (gotowebinar.com)

Fig – Global AI in medical imaging market: AI-enabled medical imaging solutions market, 2017-2027

Project Goals

The objective of this project is to harness the capabilities of artificial intelligence, particularly through machine learning and computer vision techniques, to revolutionize the early detection and diagnosis of Alzheimer’s disease. Our focus is on developing an AI model capable of analyzing brain scan images to discern patterns that may signify the presence of these neurological disorders, ultimately making predictions with a high degree of accuracy. 

By leveraging AI technology, we aim to create a tool that can complement current diagnostic methods, offering a more objective and potentially earlier indication of Alzheimer’s and related conditions. This innovative approach holds the promise of enhancing patient outcomes by facilitating timely intervention and management strategies, thereby advancing the fight against these debilitating diseases.

Tools Used 

Discussions and Version Control

  • Slack: Utilize Slack channels for real-time communication, discussion, and collaboration among team members.
  • DagsHub: Employ DagsHub for version control, tracking changes, and managing project documentation in a collaborative environment.
  • Google Workspace: Utilize Google Workspace for file sharing, document collaboration, and scheduling meetings.

Data Pre-processing & Manipulation

  • Nibabel: Utilize Nibabel for reading and writing neuroimaging data formats such as NIfTI and DICOM.
  • OpenCV: Use OpenCV for image processing tasks such as resizing, cropping, and filtering.
  • Scikit-Image: Employ Scikit-Image for a variety of image processing tasks including segmentation, feature extraction, and image enhancement.
  • Keras: Utilize Keras for building and training neural networks, particularly for preprocessing steps within the model pipeline.

Data Visualization

  • Matplotlib: Use Matplotlib for creating static, interactive, and publication-quality visualizations to explore and present data insights.
  • Seaborn: Utilize Seaborn for statistical data visualization, providing an aesthetically pleasing and informative representation of data distributions and relationships.

Augmentation

  • imgaug: Employ imgaug for data augmentation techniques such as flipping, rotation, and scaling to increase the diversity of the training dataset.

Modeling

  • TensorFlow: Utilize TensorFlow for building and training deep learning models, leveraging its flexibility and scalability for complex neural network architectures.
  • PyTorch: Use PyTorch for its dynamic computational graph construction and ease of use in building and training neural networks.
  • CNN Transfer Learning: Implement transfer learning techniques with pre-trained convolutional neural network models to leverage knowledge learned from large datasets for tasks with limited training data.

Model Testing and Evaluation

  • Deepchecks: Employ Deepchecks for comprehensive model evaluation, including performance metrics, visualization of results, and comparison against baselines.

Deployment

  • Streamlit: Utilize Streamlit for building interactive web applications to deploy trained models, allowing users to interactively input data and view predictions.
  • Hugging Face Spaces: Utilize Hugging Face Spaces for hosting and sharing the deployed model demo.
  • Docker and FastAPI: Containerize the application using Docker for easy deployment and utilize FastAPI for building RESTful APIs to serve model predictions efficiently.

This setup provides a comprehensive ecosystem for developing, training, evaluating, and deploying machine learning models for Alzheimer’s disease detection, while ensuring effective communication, collaboration, and version control throughout the project lifecycle.

Data Sources & Data Sets

| Dataset | Description | Data Source |
| --- | --- | --- |
| Alzheimer’s Dataset | MRI images in four classes (Mild Demented, Moderate Demented, Non Demented, Very Mild Demented), provided as both a training and a testing set. There are a total of 6400 images. | Kaggle: Alzheimer’s Dataset (4 class of Images) |
| ADNI Baseline 3T-Pre-processed | Collections of uniformly pre-processed images of 300 patients in NIfTI (.nii) format belonging to 3 classes: AD (Alzheimer’s), CN (Control Normal) and MCI (Mild Cognitive Impairment) | ADNI |

EDA, Pre-processing and Augmentation

For the exploratory data analysis (EDA), preprocessing, and augmentation stages of the project, we will primarily focus on a Kaggle dataset containing MRI images categorized into four classes: MildDemented, ModerateDemented, NonDemented, and VeryMildDemented.

The dataset consists of a total of 6400 grayscale images, each with dimensions (208, 176). These images are pre-processed MRI scans, and the pixel intensities range between 0 and 255.

We have chosen this dataset for analysis because of its ample size, providing a substantial number of images to work with compared to other datasets like ADNI, where the number of subjects is limited.

During the EDA phase, we will explore the distribution of images across different classes, analyze pixel intensity distributions, and visualize sample images to gain insights into the dataset’s characteristics.

For preprocessing, we will standardize image dimensions, normalize pixel intensities, and address any issues such as noise or artifacts that may affect model performance.

Augmentation techniques will be applied to increase the diversity of the training dataset, including flipping, rotation, zooming, and adjusting brightness and contrast. This process will help improve model generalization and robustness.

Overall, leveraging this Kaggle dataset for EDA, preprocessing, and augmentation will lay a solid foundation for subsequent model development and evaluation, facilitating the early detection and diagnosis of Alzheimer’s disease through machine learning techniques.

The dataset is highly class-imbalanced, which can bias the trained model toward the majority classes and distort its evaluation. Thus, there is a need for data augmentation or a weighted loss function. The distribution of classes in the training and testing folders is shown below.

| Class | Training Data | Testing Data | Class Totals |
| --- | --- | --- | --- |
| NonDemented | 2560 | 640 | 3200 (50 %) |
| VeryMildDemented | 1792 | 448 | 2240 (35 %) |
| MildDemented | 717 | 179 | 896 (14 %) |
| ModerateDemented | 52 | 12 | 64 (1 %) |
| Train/Test totals | 5121 (80 %) | 1279 (20 %) | 6400 |
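
One common way to build the weighted loss mentioned above is inverse-frequency class weighting. The sketch below derives such weights from the training counts in the table; the formula (total / (number of classes × class count)) is an assumed heuristic for illustration, not necessarily the weighting used in this project.

```python
# Training-set counts taken from the class distribution table.
train_counts = {
    "NonDemented": 2560,
    "VeryMildDemented": 1792,
    "MildDemented": 717,
    "ModerateDemented": 52,
}

def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}

class_weights = inverse_frequency_weights(train_counts)
for name, weight in class_weights.items():
    print(f"{name:>18}: {weight:.3f}")
```

With this scheme the rarest class (ModerateDemented) receives by far the largest weight, so its few images contribute to the loss about as much as the most common class does.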

Fig – Class Distribution in Training Set

ADNI Data Exploration

For the data exploration of the ADNI dataset, we will focus on three primary classes used for classification:

  • AD (Alzheimer’s Disease): This class represents patients diagnosed with Alzheimer’s disease.
  • MCI (Mild Cognitive Impairment): Patients in this class exhibit mild symptoms of cognitive impairment and have a risk of progressing to Alzheimer’s disease with age or disease progression.
  • CN (Control Normal): This class includes patients who show no evidence of Alzheimer’s disease.

The structural MRI (sMRI) data in the ADNI dataset is volumetric (3D) and can be viewed along three planes: Axial, Sagittal, and Coronal. Each volume is a grid of voxels (3D pixels), and the .nii file dimensions of (256, 256, 170) give the number of voxels along each of the three axes, that is, the number of slices available in each direction.
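
As a rough illustration of how a volume of this shape is sliced, the sketch below uses a synthetic NumPy array in place of a real scan (loading an actual .nii file is not shown). Which array axis corresponds to the axial, sagittal, or coronal plane depends on the orientation stored in the file, so the axes are left unnamed here on purpose.

```python
import numpy as np

# Synthetic stand-in for a loaded scan with the dimensions quoted above.
volume = np.zeros((256, 256, 170), dtype=np.float32)

def middle_slices(vol):
    """Return the central 2D slice along each of the three array axes."""
    i, j, k = (n // 2 for n in vol.shape)
    return {
        "axis0": vol[i, :, :],
        "axis1": vol[:, j, :],
        "axis2": vol[:, :, k],
    }

slices = middle_slices(volume)
for name, plane in slices.items():
    print(name, plane.shape)
```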

All relevant information regarding the subjects and collected data is contained in a .csv file. The Image Data ID serves as a unique identifier, while the Subject field may repeat depending on the category of scans, such as MPRAGE, MPRAGE-repeat, Survey, etc.

This dataset structure allows for comprehensive analysis, as it includes both clinical information and volumetric MRI data. Exploratory analysis will involve examining the distribution of classes, investigating potential relationships between clinical variables and imaging data, and understanding the variability in image acquisition across different subjects and scan types.

Additionally, preprocessing steps will be crucial to standardize the data and ensure consistency across different scans and subjects. This may involve normalization of image intensities, registration of images to a common space, and handling missing or corrupted data.

ITK-SNAP is a valuable tool used for visualizing and analyzing neuroimaging data, particularly NIFTI (.nii) files. It provides a user-friendly interface for viewing volume slices across the three-dimensional planes: Axial, Sagittal, and Coronal.

ITK-SNAP

Fig – The .csv file showing the patients’ metadata and scan information.

Kaggle Data Preprocessing

In the preprocessing pipeline for the Kaggle dataset, the initial step involved converting images from the DICOM format to PNG for standardized handling. Subsequently, images were resampled using SimpleITK to ensure a consistent voxel size across the dataset, followed by noise reduction filtering to enhance image quality. Various filters, including the Median, Gaussian, Bilateral, and Non-local Means filters, were applied to reduce noise, which could compromise diagnostic accuracy and hinder feature extraction. Intensity normalization was performed to ensure uniform pixel intensity distribution across images.

Feature extraction and classical segmentation techniques were then employed to identify relevant structures and regions of interest within the images. Additionally, image augmentation techniques were applied to increase dataset diversity and improve model generalization. Throughout the preprocessing phase, the Peak Signal-to-Noise Ratio (PSNR) served as a metric to quantitatively assess image quality. PSNR compares the maximum possible pixel value with the mean squared error between the original and the filtered image, expressed in decibels; higher values indicate that filtering introduced less distortion, which helps ensure the fidelity of the processed images for subsequent analysis and model development.
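
The PSNR metric can be written in a few lines. The following is a minimal sketch assuming 8-bit images (maximum value 255); the function name and the synthetic test image are illustrative and not taken from the project code.

```python
import numpy as np

def psnr(reference, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    reference = np.asarray(reference, dtype=np.float64)
    processed = np.asarray(processed, dtype=np.float64)
    mse = np.mean((reference - processed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Synthetic example: a random image and a noisy copy of it.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(208, 176)).astype(np.float64)
noisy = np.clip(original + rng.normal(0, 10, original.shape), 0, 255)
print(f"PSNR of noisy copy: {psnr(original, noisy):.2f} dB")
```

When comparing filters, the same reference image is passed with each filtered result, and the filter with the higher PSNR has changed the image less.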

Fig – Comparison of the Median, Gaussian, bilateral and nl_means filters with the original image on a sample .dcm image.

Fig – PSNR metric boxplots for the Median, Gaussian, bilateral and nl_means filters.

Intensity Normalization

Intensity normalization is a critical step in preparing medical images for analysis. Histogram Equalization and CLAHE enhance contrast by redistributing pixel intensities so that the histogram spans the full intensity range. CLAHE, specifically, operates on local tiles and clips each tile’s histogram, which limits noise amplification, prevents image washout, and preserves important features.
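
To make the histogram-stretching idea concrete, here is a minimal NumPy sketch of global histogram equalization for 8-bit images. It is an illustration only: it does not implement CLAHE, and the function name and toy image are assumptions.

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Global histogram equalization for an unsigned-integer grayscale image."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]            # CDF value at the darkest occupied level
    denom = image.size - cdf_min
    if denom == 0:                       # constant image: nothing to stretch
        return image.copy()
    # Map each grey level through the normalized CDF.
    lut = np.round((cdf - cdf_min) / denom * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(np.uint8)
    return lut[image]

# A low-contrast toy image that only uses grey levels 100..103.
low_contrast = np.tile(np.array([100, 101, 102, 103], dtype=np.uint8), (4, 1))
equalized = equalize_histogram(low_contrast)
print(np.unique(low_contrast), "->", np.unique(equalized))
```

The four neighbouring grey levels are spread over the whole 0 to 255 range, which is the contrast gain visible in the equalized images.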

Z-score normalization is another common method used in medical image preprocessing. By subtracting the mean and dividing by the standard deviation, pixel intensities are rescaled to zero mean and unit variance (the shape of the distribution itself is unchanged), aiding in comparison and analysis across different images.

Zero-One normalization is a straightforward technique that scales pixel values to a range between 0 and 1. By subtracting the minimum intensity and dividing by the intensity range, this method ensures consistency in pixel values across the dataset, facilitating effective analysis and model training.

Percentile normalization is a variation of zero-one normalization that scales pixel values based on percentiles. By subtracting a percentile value and dividing by the difference between percentiles, this method accommodates variations in pixel intensity distribution, ensuring robustness in image analysis and feature extraction.
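
The three scaling methods above differ only in which statistics they subtract and divide by. The sketch below shows one possible NumPy version of each; the 1st/99th percentile defaults and the clipping to [0, 1] in the percentile variant are assumptions chosen for illustration.

```python
import numpy as np

def zscore_normalize(img):
    """Rescale to zero mean and unit variance."""
    img = img.astype(np.float64)
    std = img.std()
    return (img - img.mean()) / std if std > 0 else img - img.mean()

def zero_one_normalize(img):
    """Rescale so the minimum maps to 0 and the maximum to 1."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def percentile_normalize(img, low=1.0, high=99.0):
    """Like zero-one scaling, but anchored on percentiles to resist outliers."""
    img = img.astype(np.float64)
    p_lo, p_hi = np.percentile(img, [low, high])
    if p_hi <= p_lo:
        return np.zeros_like(img)
    return np.clip((img - p_lo) / (p_hi - p_lo), 0.0, 1.0)

rng = np.random.default_rng(0)
scan = rng.integers(0, 256, size=(208, 176))
z, zo, pct = zscore_normalize(scan), zero_one_normalize(scan), percentile_normalize(scan)
print(round(z.mean(), 6), round(z.std(), 6), zo.min(), zo.max())
```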


Fig – Histogram Equalization. Note how the histogram equalization and CLAHE methods stretch the intensity histograms, making the overall image sharper and higher in contrast.

Feature extraction

Feature extraction is a crucial step in analyzing medical images, enabling the identification of key structures and patterns relevant to diagnosis and treatment. Among the techniques used for feature extraction, edge detection plays a pivotal role in delineating object boundaries and highlighting important structural details.

The Roberts edge detection algorithm (the Roberts cross operator) approximates the image gradient with a pair of 2×2 diagonal difference kernels, highlighting areas with rapid changes in pixel intensity.

Sobel edge detection, on the other hand, emphasizes areas with high first-order derivatives in both horizontal and vertical directions, providing a comprehensive representation of edge structures.

Scharr edge detection uses 3×3 kernels similar to Sobel but with coefficients chosen for better rotational symmetry, which gives more accurate gradient estimates, particularly in images with high-frequency content.

Prewitt edge detection is similar to Sobel but utilizes a slightly different kernel, offering an alternative approach to edge extraction with comparable effectiveness.

Canny edge detection stands out as a multi-stage algorithm renowned for producing high-quality edge maps with minimal noise. By employing multiple steps, Canny edge detection delivers precise binary edge maps, facilitating accurate feature extraction for subsequent analysis and interpretation.
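
As an example of the gradient-based operators above, the following is a small NumPy-only sketch of Sobel edge detection. The helper `filter2d` is a hypothetical name for a plain cross-correlation with replicated borders; a production pipeline would more likely call an image library’s built-in filter.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter2d(image, kernel):
    """Cross-correlate a 2D image with a small kernel, replicating border pixels."""
    kh, kw = kernel.shape
    padded = np.pad(image.astype(np.float64),
                    ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def sobel_magnitude(image):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    return np.hypot(gx, gy)

# A vertical step edge: dark left half, bright right half.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
edges = sobel_magnitude(step)
print(edges[0])
```

The response is non-zero only in the two columns next to the step, which is the behaviour the edge maps in the figure rely on.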

Fig - Edge detection for feature extraction


Corner and keypoint detection are fundamental techniques in computer vision, enabling the identification of distinctive features crucial for various applications such as object recognition, image stitching, and 3D reconstruction.

Harris Corner Detection algorithm identifies corners by analyzing variations in intensity within a sliding window, effectively pinpointing areas with significant changes in pixel values.

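Shi-Tomasi Corner Detection, a refinement of the Harris method, scores each window by the smaller eigenvalue of the local gradient (structure) matrix M rather than by the Harris response R = det(M) − k·trace(M)², which enables more precise corner detection and localization.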
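
The difference between the two corner scores is easiest to see on a single 2×2 structure matrix. The sketch below assumes the matrix M has already been computed for a window, and uses k = 0.04 as a conventional but assumed value for the Harris constant.

```python
import numpy as np

def harris_score(M, k=0.04):
    """Harris response: det(M) - k * trace(M)^2."""
    return float(np.linalg.det(M) - k * np.trace(M) ** 2)

def shi_tomasi_score(M):
    """Shi-Tomasi response: the smaller eigenvalue of M."""
    return float(np.linalg.eigvalsh(M).min())

corner = np.array([[2.0, 0.0], [0.0, 2.0]])  # strong gradients in both directions
edge = np.array([[2.0, 0.0], [0.0, 0.0]])    # strong gradient in one direction only

for name, M in (("corner", corner), ("edge", edge)):
    print(name, round(harris_score(M), 4), round(shi_tomasi_score(M), 4))
```

Both scores are high for the corner-like matrix and low (negative for Harris, zero for Shi-Tomasi) for the edge-like one.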

For keypoint detection, the Scale-Invariant Feature Transform (SIFT) algorithm is widely used due to its robustness in detecting corners, blobs, and circles across different scales. SIFT is particularly valuable in scenarios where images may vary in size or orientation.

Alternatively, the Oriented FAST and Rotated BRIEF (ORB) algorithm leverages FAST and BRIEF techniques for keypoint detection and matching based on intensity variations. ORB offers efficient performance and is commonly used for tasks such as one-shot facial recognition and keypoint-based image alignment.

Fig - Feature extraction with Harris corner detection (left), Shi-Tomasi (middle) and ORB keypoint detector (right)​


Image Segmentation

Image segmentation plays a critical role in identifying and delineating regions of interest within medical scans, enabling precise analysis and diagnosis. Several methods are employed for this purpose, each tailored to specific characteristics of the images and the desired segmentation outcomes.

Multi-Otsu Thresholding is a technique used for segmenting pixels into multiple classes based on intensity levels. It determines several thresholds by maximizing the variance between the resulting intensity classes. These thresholds, shown as red lines on the histogram in the figure, classify pixels into distinct intensity levels, facilitating the identification of different structures or tissues within the image.
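
For three classes, the thresholds can be found by brute force over all threshold pairs. The following NumPy sketch assumes an 8-bit image and is meant only to illustrate the criterion; the function name is hypothetical and library implementations are considerably faster.

```python
import numpy as np

def multi_otsu_three_classes(image, levels=256):
    """Search for two thresholds that maximize the between-class variance."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                       # cumulative class probability
    mu = np.cumsum(p * np.arange(levels))      # cumulative first moment
    best_score, best = -1.0, (0, 0)
    for t1 in range(levels - 2):
        for t2 in range(t1 + 1, levels - 1):
            w = (omega[t1], omega[t2] - omega[t1], omega[-1] - omega[t2])
            if min(w) <= 0:
                continue                       # skip splits with an empty class
            m = (mu[t1], mu[t2] - mu[t1], mu[-1] - mu[t2])
            # Sum of w_k * mean_k^2, which differs from the between-class
            # variance only by a constant, so it has the same maximizer.
            score = sum(mk * mk / wk for mk, wk in zip(m, w))
            if score > best_score:
                best_score, best = score, (t1, t2)
    return best

# Toy image with three intensity groups.
toy = np.repeat(np.array([10, 100, 200], dtype=np.uint8), 4).reshape(3, 4)
t1, t2 = multi_otsu_three_classes(toy)
labels = np.digitize(toy, [t1, t2], right=True)  # 0: <= t1, 1: <= t2, 2: above
print(t1, t2)
print(labels)
```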

Region-based Segmentation, on the other hand, employs a combination of techniques to identify and delineate regions of interest. Initially, a Sobel filter is applied to generate an elevation map, highlighting potential regions within the image. Markers are strategically placed to guide the segmentation process, providing information on the desired boundaries or regions of interest. The watershed algorithm is then applied, utilizing the marker guidance to segment the image into distinct regions. Additionally, this method fills holes in regions and labels connected components, resulting in comprehensive segmentation that accurately represents the underlying anatomical structures or abnormalities within the medical scans.

Fig - Multi-Otsu Thresholding with Three Classes


Fig – Region based Segmentation. Showing the elevation map, markers and segmentation binary (top) and the final segmentation image (bottom).

Image Augmentation

Image augmentation plays a crucial role in improving the robustness and generalization ability of deep learning models, particularly in scenarios where data availability is limited or class imbalances are present. In this project, various augmentation techniques are employed to enhance dataset diversity and address class imbalances effectively.

Keras ImageDataGenerator augmentation is utilized during model training to address class imbalances by applying scaling and geometric transformations to images, thereby increasing the variability of the dataset and improving model performance.

Cut, Paste, and Learn Synthesis is employed for synthetic data generation, augmenting the dataset with artificially generated images to further increase diversity and reduce the class imbalance problem. This technique aids in ensuring that the model is exposed to a wide range of scenarios, enhancing its ability to generalize to unseen data.

Furthermore, a combination of augmentation techniques is applied using the imgaug package, incorporating horizontal and vertical flipping, scaling, translation, brightness and contrast adjustment, Gaussian blurring, additive Gaussian noise, saturation adjustment, shear, and contrast-limited adaptive histogram equalization (CLAHE). By leveraging these combined augmentation methods, the dataset variability is significantly enhanced, contributing to the overall robustness and effectiveness of the trained models in accurately identifying and diagnosing Alzheimer’s disease from brain scan images.
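
A few of the listed transformations can be sketched without any augmentation library. The example below is a NumPy-only illustration, not the imgaug pipeline used in the project; the probabilities, brightness range, and noise level are assumed values.

```python
import numpy as np

def augment(image, rng):
    """Apply random flips, a brightness change, and Gaussian noise to one image."""
    out = image.astype(np.float64)
    if rng.random() < 0.5:
        out = out[:, ::-1]                        # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]                        # vertical flip
    out = out * rng.uniform(0.8, 1.2)             # brightness scaling
    out = out + rng.normal(0.0, 5.0, out.shape)   # additive Gaussian noise
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(42)
scan = rng.integers(0, 256, size=(208, 176), dtype=np.uint8)
augmented = [augment(scan, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

Each call draws new random parameters, so repeated calls on the same scan yield different training samples of the same size.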

Model Development

Architecture

The custom Convolutional Neural Network (CNN) architecture employed in this project comprises two convolutional layers, each followed by a max-pooling layer, batch normalization layer, and dropout layer. These layers are instrumental in feature extraction, allowing the model to capture intricate patterns present in the input data. The ReLU (Rectified Linear Unit) activation function is utilized throughout the network to introduce non-linearity, enhancing the model’s capacity to learn complex representations from the input data.

In the final stage of the architecture, dense layers are employed to perform classification tasks. The softmax activation function is utilized in the last dense layer, which consists of four neurons representing the four classes of the dataset required for multi-class classification. The softmax function ensures that the output probabilities sum up to one, providing a probabilistic interpretation of the model’s predictions across the different classes. This architecture, with its combination of convolutional, activation, pooling, and dense layers, enables effective feature extraction and classification, making it well-suited for the task of diagnosing Alzheimer’s disease from brain scan images.

Max pooling layers play a crucial role in the convolutional neural network (CNN) architecture by downsampling the spatial dimensions of the input. This downsampling process retains essential features while effectively reducing computational complexity, making the network more efficient in processing large volumes of data.

Batch normalization layers are employed to normalize activations within each layer of the network. By reducing internal covariate shift, batch normalization accelerates training convergence and stabilizes the learning process. This normalization technique ensures that the network’s parameters are updated in a consistent and efficient manner, leading to improved overall performance.

Dropout layers serve as regularization measures to mitigate overfitting in the CNN model. By randomly dropping a fraction of connections during training, dropout layers prevent the network from becoming overly reliant on specific features or patterns in the data. This regularization technique promotes the generalization of the model, enhancing its ability to perform effectively on unseen data and improving overall robustness.

Fig – Model Summary. This summary shows the two convolution layers. The first layer has 64 filters with a kernel size of (3 x 3) and padding set to “same”. The second convolution layer has 16 filters with a kernel size of (5 x 5), again with “same” padding.
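
The layer sizes in the summary can be checked with a little arithmetic. The sketch below assumes a single-channel 208×176 input, “same” padding, and 2×2 max pooling after each convolution; the pooling size is an assumption, and batch-normalization and dense-layer parameters are left out.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Trainable parameters of a Conv2D layer: weights plus one bias per filter."""
    kh, kw = kernel_size
    return kh * kw * in_channels * filters + filters

def trace_feature_maps(input_shape, conv_layers, pool=2):
    """Follow the (height, width, channels) shape through conv + max-pool stages."""
    h, w, c = input_shape
    rows = []
    for kernel_size, filters in conv_layers:
        params = conv2d_params(kernel_size, c, filters)
        c = filters                    # "same" padding keeps h and w unchanged
        h, w = h // pool, w // pool    # max pooling shrinks h and w
        rows.append((params, (h, w, c)))
    return rows

summary = trace_feature_maps((208, 176, 1), [((3, 3), 64), ((5, 5), 16)])
for params, shape in summary:
    print(f"conv params: {params:>6}  output after pooling: {shape}")
```

Under these assumptions the first convolution has 640 parameters and the second 25,616, and the feature map entering the dense layers is 52×44×16.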

Data Imbalance

The Synthetic Minority Over-sampling Technique (SMOTE) is a valuable method employed to tackle the class imbalance problem commonly encountered in machine learning tasks. By synthesizing new instances for the minority class, SMOTE effectively balances the class distribution, ensuring that the model receives sufficient representation from all classes.

The SMOTE algorithm achieves this by generating synthetic examples of the minority class through interpolation between existing minority class instances. This approach helps prevent the model from becoming biased towards the majority class, which often occurs when the dataset is heavily skewed towards one class. By augmenting the minority class with synthetic examples, SMOTE promotes a more equitable representation of all classes, thereby improving the model’s ability to learn and generalize effectively across different class distributions. Overall, SMOTE is a powerful technique for addressing class imbalance issues and enhancing the performance of machine learning models in classification tasks.
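
The interpolation step can be sketched directly in NumPy. This is a simplified illustration of the SMOTE idea, not the library implementation; it assumes each minority sample is already a flat feature vector, and k = 5 neighbours is an assumed default.

```python
import numpy as np

def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    minority = np.asarray(minority, dtype=np.float64)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)
    # Pairwise squared distances via |a|^2 + |b|^2 - 2ab.
    sq = np.sum(minority ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * minority @ minority.T
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)          # sample to start from
    partner = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                   # position along the connecting segment
    return minority[base] + gap * (minority[partner] - minority[base])

rng = np.random.default_rng(1)
minority_features = rng.normal(size=(52, 8))       # e.g. 52 ModerateDemented samples
synthetic = smote_oversample(minority_features, n_new=100)
print(synthetic.shape)
```

Because every synthetic row lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.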

Hyperparameter Optimization using KerasTuner

KerasTuner, a hyperparameter optimization tool, is used to search efficiently for a good set of hyperparameters for the convolutional neural network architecture. Employing Bayesian optimization, a probabilistic model-based approach, KerasTuner iteratively explores the hyperparameter space to find configurations that maximize the validation accuracy of the model.

Various parameters are systematically varied within specified ranges during the exploration process. For instance, the number of convolutional layers ranges from 1 to 15, while the number of filters for each convolutional layer varies between 16 and 128, with filter sizes ranging from 3×3 to 5×5. Additionally, parameters such as the dropout rate and the number of neurons in dense layers are also adjusted to uncover the best combination of hyperparameters.

The optimization process trains each candidate model for 10 epochs and runs a maximum of 20 search iterations (trials), so it samples the parameter space rather than searching it exhaustively, and the best-performing configuration found is kept. Through this approach, KerasTuner facilitates the development of well-tuned convolutional neural network architectures, enhancing their effectiveness in solving complex classification tasks like Alzheimer’s disease detection from brain scan images.

Model Training

In the initial preprocessing stage, the dataset is divided into training data (80%) and testing data (20%) sets. Subsequently, the training data is further split to create a validation dataset, constituting 20% of the training data. This ensures that model performance is evaluated on a separate validation set during training, aiding in the assessment of generalization capabilities and preventing overfitting.

Once the data imbalance problem is addressed using techniques like SMOTE, the model is trained with 100 epochs and a batch size of 16. Hyperparameter tuning is facilitated by KerasTuner, which systematically explores the hyperparameter space to identify the optimal configuration for the convolutional neural network architecture.

During training, the learning rate is set to 0.002, and the Adam optimizer is employed to optimize the model parameters. Adam optimizer offers efficient convergence and adaptability to various optimization landscapes, making it suitable for training deep learning models.

For the multi-class classification problem of diagnosing Alzheimer’s disease, the categorical cross-entropy loss function is chosen. This loss function is well-suited for scenarios where each sample belongs to exactly one class, ensuring that the model optimizes its parameters to minimize classification errors across all classes.
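
For reference, the softmax output and the categorical cross-entropy loss can be written out in NumPy as follows. This is a minimal sketch of the standard formulas with one-hot labels, not the framework implementation used for training.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean over samples of -sum_c y_true[c] * log(y_prob[c])."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

# Two samples, four classes; labels are one-hot encoded.
logits = np.array([[4.0, 1.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0, 0.3]])
labels = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 0]], dtype=np.float64)
probs = softmax(logits)
print(probs.sum(axis=1), categorical_cross_entropy(labels, probs))
```

A model that outputs a uniform distribution over the four classes scores log(4) ≈ 1.386, which is a useful baseline when reading the loss curves.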

Throughout the training process, the accuracy metric of both the training and validation datasets is monitored. This allows for the assessment of model performance and generalization capabilities, ensuring that the model achieves high accuracy not only on the training data but also on unseen validation data. By monitoring accuracy metrics, any potential overfitting or underfitting issues can be identified and addressed effectively.

Fig – Training and Validation Loss and Accuracy. The decrease in loss and the increase in accuracy are plotted against epochs for both the training and validation datasets.

Experiments with other models

In addition to the final model, several alternative approaches were explored to further enhance the performance of the Alzheimer’s disease detection system:

  1. Transfer Learning with Pre-trained Models: Transfer learning techniques were applied using pre-trained models such as VGG19 and EfficientNetV2. By leveraging the knowledge learned from large-scale datasets, these pre-trained models were fine-tuned on the Alzheimer’s dataset to expedite convergence and potentially improve classification accuracy.
  2. Exploration with fast.ai Library: The fast.ai library was utilized to experiment with various pre-trained models, including ResNet18, ConvNext_tiny_in22k, VGG16, and RegNetX_080. By leveraging fast.ai’s streamlined APIs and comprehensive model zoo, these pre-trained architectures were fine-tuned and evaluated for their suitability in Alzheimer’s disease classification tasks.
  3. Error Level Analysis (ELA) with ResNet50: Error Level Analysis (ELA) was employed as an additional technique for image classification, particularly with the ResNet50 pre-trained model. ELA helps identify regions of an image that may have been digitally manipulated or altered, providing insights into potential areas of interest for Alzheimer’s disease diagnosis.
  4. Exploration of Loss Functions: Alternative loss functions, such as weighted cross-entropy loss, were investigated to address class imbalance issues within the dataset. By assigning different weights to each class based on their frequency or importance, weighted cross-entropy loss aims to mitigate the impact of class imbalance and improve model performance in multi-class classification tasks.

These exploratory approaches underscore the iterative nature of model development, as various techniques and methodologies are tested and evaluated to identify the most effective strategies for Alzheimer’s disease detection from brain scan images. Through this iterative process, insights are gained into the strengths and limitations of different approaches, ultimately leading to the refinement and optimization of the final model architecture.
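The weighted cross-entropy idea from item 4 can be sketched as follows. The inverse-frequency weighting scheme shown is one common choice and is an assumption here, since the article does not say how the weights were set; the class counts are hypothetical and are not the dataset's real distribution.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (n_classes * count), so rarer classes weigh more."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]


def weighted_cross_entropy(y_true, y_pred, weights, eps=1e-12):
    """Mean cross-entropy where each class's term is scaled by its class weight."""
    total = 0.0
    for target, probs in zip(y_true, y_pred):
        total -= sum(
            w * t * math.log(max(p, eps)) for w, t, p in zip(weights, target, probs)
        )
    return total / len(y_true)


# Hypothetical class counts for a 4-class problem (illustration only).
counts = [100, 10, 500, 390]
weights = inverse_frequency_weights(counts)

uniform = [[0.25, 0.25, 0.25, 0.25]]
minority_loss = weighted_cross_entropy([[0, 1, 0, 0]], uniform, weights)
majority_loss = weighted_cross_entropy([[0, 0, 1, 0]], uniform, weights)
print(weights, minority_loss, majority_loss)
```

For the same predicted probability, a mistake on the rare class costs far more than one on the most common class, which pushes the model to pay attention to under-represented classes.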

Testing and validation of the Model

After training for 100 epochs, the model performed well on both the training and validation data, as the loss and accuracy values show.

Metric      Training Data    Validation Data
Loss        0.0175           0.0404
Accuracy    0.9946           0.9922

To check how well the model generalizes to unseen data, and whether it has overfit the training data, the metrics must also be computed on the test dataset.

Fig – Confusion Matrix. The confusion matrix shows that the model predicts most labels correctly, including those of the minority class.

When evaluated on the test data, the model achieves an accuracy of 0.9922.

Fig – Classification Report.
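The figures in a classification report all derive from the confusion matrix. The sketch below shows the arithmetic, assuming rows are true classes and columns are predicted classes. The matrix values are hypothetical and are not the project's results.

```python
def per_class_metrics(cm):
    """Return overall accuracy and per-class (precision, recall, f1) tuples.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    report = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append((precision, recall, f1))
    return accuracy, report


# Hypothetical 4-class confusion matrix (illustration only).
cm = [
    [50, 0, 0, 0],
    [0, 8, 2, 0],
    [1, 0, 99, 0],
    [0, 0, 0, 40],
]
accuracy, report = per_class_metrics(cm)
print(accuracy)
for precision, recall, f1 in report:
    print(round(precision, 4), round(recall, 4), round(f1, 4))
```

In this made-up example the smallest class has perfect precision but a recall of only 0.8, even though overall accuracy is high. That is why per-class recall matters on imbalanced data: overall accuracy alone can hide weak performance on a rare class.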

Deployment

The deployment architecture for the Alzheimer’s disease detection system follows a modular and scalable structure, with distinct front-end and back-end components:

  • Front-End: The front-end component of the system is developed using the Streamlit framework, which allows for the creation of an interactive and visually appealing user interface. Streamlit enables easy integration of data visualization, input forms, and result displays, enhancing the user experience.
  • Back-End: The back-end component of the system utilizes a Docker container to encapsulate the machine learning model and serve it as a REST API. Docker abstracts away the complexities of environment setup and dependency management, ensuring consistency and reproducibility across different deployment environments. This approach enhances scalability and facilitates easy deployment across various platforms and infrastructures.
  • Deployment: The application is deployed on Hugging Face, a platform designed for hosting and sharing machine learning models and applications as RESTful APIs. Leveraging Hugging Face Spaces streamlines the deployment process, providing a seamless experience for deploying, managing, and scaling machine learning applications.

By separating the front-end and back-end components and leveraging tools like Streamlit, Docker, and Hugging Face Spaces, the deployment architecture ensures flexibility, scalability, and efficiency in deploying the Alzheimer’s disease detection system, ultimately providing a user-friendly and accessible solution for detecting and diagnosing Alzheimer’s disease from brain scan images.
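To illustrate the front-end/back-end split, here is a minimal sketch of how a client might package an image for the back-end's REST API and read the reply, using only the Python standard library. The endpoint URL, the raw-bytes upload format, and the `label`/`confidence` response fields are all hypothetical; the article does not document the actual API contract.

```python
import json
import urllib.request

# Hypothetical endpoint; the real service URL is not given in the article.
API_URL = "http://localhost:8000/predict"


def build_predict_request(image_bytes, api_url=API_URL):
    """Build (but do not send) a POST request carrying raw image bytes."""
    return urllib.request.Request(
        api_url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def parse_prediction(response_body):
    """Extract the predicted label and confidence from a JSON response body."""
    payload = json.loads(response_body)
    return payload["label"], float(payload["confidence"])


# Placeholder bytes stand in for an uploaded scan; the JSON mimics a possible reply.
request = build_predict_request(b"placeholder-image-bytes")
label, confidence = parse_prediction('{"label": "class_2", "confidence": 0.97}')
print(request.get_method(), request.full_url, label, confidence)
```

Keeping the request and response handling in small functions like these means the user interface only depends on the API contract, so the model container can be rebuilt or replaced without touching the front-end.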

Fig – Model Deployment.

Summary

The Alzheimer’s disease classification task involved a comprehensive approach from data collection to model deployment:

  1. Data Collection: Data was collected from various sources, including Kaggle and ADNI. The Kaggle dataset was chosen for its large number of well-annotated images, totaling 6400 images across 4 classes. The dataset was shuffled and split into train, test, and validation sets.
  2. Data Pre-processing: Various techniques were applied to enhance the images, including sharpening filters and contrast and intensity improvement methods. Classical feature extraction and segmentation methods were employed to identify prominent sections in the images before model building.
  3. Data Augmentation: To address class imbalance, a variety of augmentation methods were explored, such as Keras image generator augmentation, Cut, Paste, and Learn synthesis, and a combination of augmentation techniques. The Keras method was ultimately used to achieve the final results.
  4. Model Development: A custom Convolutional Neural Network (CNN) architecture was developed for the classification task. Although other methods, including pre-trained models, were tested, the best results were obtained with the custom model. Hyperparameter tuning was performed using Keras Tuner, and the model’s training progress was monitored using accuracy metrics.
  5. Model Testing and Evaluation: The final model demonstrated excellent performance on both the validation and testing datasets, achieving an accuracy of 99.22% on the test data. The classification report indicated good precision, recall, and F1-score, suggesting that the model classified most of the images correctly. The confusion matrix further confirmed the model’s performance across classes, and the close agreement between training, validation, and test accuracy suggested little overfitting.
  6. Model Deployment: The best-fit model was deployed to a Hugging Face Space using a Streamlit app for the front-end and a Docker container as the back-end with a RESTful API. This deployment architecture ensured scalability, modularity, and ease of use for end-users.

Overall, the comprehensive approach from data collection to model deployment resulted in a robust and effective system for Alzheimer’s disease detection from brain scan images.

By Omdena Toronto, Canada Chapter
