Detecting Pathologies Through Computer Vision in Ultrasound

Project completed! Results attached!

Envisionit Deep AI is an innovative medical technology company using Artificial Intelligence to transform medical imaging diagnosis and democratize access to healthcare. In this two-month Omdena Challenge, 50 technology changemakers have been building an Ultrasound solution that detects the type and location of different pathologies. The solution works with 2D images and can also process a video stream.


The Problem

Health care services in Africa are under-resourced and overused. Africa is the world's youngest continent, and pneumonia is the number one cause of death there in children younger than five. Breast cancer is the most frequently diagnosed cancer among women, impacting over two million women worldwide each year and causing the greatest number of cancer-related deaths among women. While breast cancer rates are higher among women in more developed regions, rates are increasing in nearly every region globally, and some of the most rapidly increasing incidence rates are in African countries.

Ultrasound is a relatively inexpensive and portable modality for diagnosing life-threatening diseases and for use at the point of care. The procedure is non-invasive and quickly gives doctors the information necessary to make a diagnosis. Sonography machines are becoming steadily smaller, which makes them more accessible to developing countries.

An AI solution integrated with the mobile ultrasound tool aims to achieve radiology-level performance for diagnosis on ultrasound images. This will help deliver impactful and feasible medical solutions to countries that face significant resource challenges.


The Project Goals

Envisionit Deep AI sees the AI solution split into the following components:



1. Image Pre-processing and Normalization

Envisionit Deep AI has access to several Ultrasound datasets that will be provided. These datasets come in a number of different formats, resolutions, and quality settings, as will be the case in production environments. Different practices and/or hospitals use different Ultrasound equipment that stores images in a variety of formats and quality settings. The most common storage format is DICOM, archived in a practice/hospital PACS (picture archiving and communication system). Envisionit Deep AI already has the ability to interface with PACS platforms to retrieve and exchange images. However, for this project, we believe that an additional image normalization routine has to be developed to ensure consistency of images across the training, testing, and production datasets.

Envisionit Deep AI will provide a number of datasets for the purposes of investigation, design, and development of an image pre-processing routine that would normalize incoming images.
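
One possible shape for such a routine is sketched below in plain Python. It rests on assumptions the project description does not fix: frames are already decoded into 2D grayscale pixel grids (DICOM parsing is left out), every frame is resampled to a fixed size with nearest-neighbor sampling, and intensities are rescaled to the range [0, 1]. The function names and the 256×256 default are hypothetical.

```python
from typing import List

Image = List[List[float]]  # 2D grayscale frame, row-major


def rescale_intensity(image: Image) -> Image:
    """Min-max scale pixel values to [0, 1]; a flat image maps to all zeros."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / span for p in row] for row in image]


def resize_nearest(image: Image, height: int, width: int) -> Image:
    """Resample to (height, width) with nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def normalize(image: Image, height: int = 256, width: int = 256) -> Image:
    """Hypothetical pipeline: fixed size first, then a fixed intensity range."""
    return rescale_intensity(resize_nearest(image, height, width))
```

Running the same function on training, testing, and production inputs is what gives the consistency the paragraph above asks for; the specific resampling and scaling choices would need to be validated against the real datasets.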


2. Model Training

Even though the title of this component is about training an AI model, training should be preceded by algorithm selection/design. The algorithm should be capable of fulfilling the following requirements:

  • Identification of pathologies from a set of pre-defined pathologies on a given image/video frame. Initially, we’d look at 10+ pathologies/labels, but the algorithm should be capable of identifying more pathologies/labels.
  • Identification of the location of the pathologies from the first requirement above: object detection rather than classification only.


Envisionit Deep AI will annotate the initial training dataset (normalized images from component/work item #1 above) with the pathologies/labels and their locations. The team will provide Envisionit Deep AI with the format of the annotations as well as any additional details, data, and/or information relevant to the selected algorithm.
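
The annotation format itself is left to the team and the selected algorithm. Purely as an illustration, a record carrying labels and their locations could look like the sketch below. The field names, the pixel-coordinate bounding box convention, and the JSON serialization are all assumptions, not the format the project used.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class BoundingBox:
    label: str   # pathology name from the pre-defined label set
    x_min: int   # pixel coordinates, origin at the top-left corner
    y_min: int
    x_max: int
    y_max: int

    def __post_init__(self) -> None:
        if self.x_min >= self.x_max or self.y_min >= self.y_max:
            raise ValueError("box must have positive width and height")


@dataclass
class ImageAnnotation:
    image_key: str            # hypothetical object key of the normalized image
    boxes: List[BoundingBox]  # an empty list means no pathology was annotated

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ImageAnnotation":
        raw = json.loads(text)
        return cls(raw["image_key"], [BoundingBox(**b) for b in raw["boxes"]])
```

Validating boxes at construction time keeps malformed annotations out of the training set before they reach the model.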

Envisionit Deep AI uses AWS as well as its own on-premise hardware for model training and will provide an environment where the model can ultimately be trained. To ensure alignment with Envisionit Deep AI's currently deployed training environment, the new Ultrasound model training environment should consist of the following:

  • AWS S3 bucket location where training and testing images are located together with the relevant annotation files/objects.
  • Docker container with all relevant tools, platforms, and libraries installed and pre-configured.
  • Container configuration file as well as container deployment instructions/parameters.


The Docker training container should connect to the S3 bucket, download the training dataset, and start the training. The container should also provide an ability or interface that allows Envisionit Deep AI and the Omdena collaborators to interrogate progress as well as current training efficiency – loss, mean average precision, and other available metrics.
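
A minimal sketch of the container's start-up step follows, under stated assumptions. `sync_dataset` takes any client that behaves like `boto3.client("s3")` (it only uses `get_paginator("list_objects_v2")` and `download_file`), and `log_metrics` appends one JSON line per training step so that progress can be polled from outside the container. Both function names, the bucket layout, and the JSON-lines metrics file are hypothetical choices, not the project's actual interface.

```python
import json
import os
from typing import Any, List


def sync_dataset(s3_client: Any, bucket: str, prefix: str, dest_dir: str) -> List[str]:
    """Download every object under `prefix` into `dest_dir`; return local paths.

    `s3_client` is assumed to behave like `boto3.client("s3")`.
    """
    downloaded = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):  # skip folder placeholder objects
                continue
            relative = key[len(prefix):].lstrip("/")
            local_path = os.path.join(dest_dir, relative)
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded


def log_metrics(path: str, step: int, loss: float, mean_ap: float) -> None:
    """Append one JSON line per step so progress can be tailed or polled."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"step": step, "loss": loss, "mAP": mean_ap}) + "\n")
```

Passing the client in, rather than creating it inside the function, keeps credentials and region configuration in the container's deployment parameters and makes the download step easy to exercise without network access.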


3. Model Validation, including Field Specialist review

This step involves all the relevant tools to extract model performance metrics, as well as a UI that lets a Field Specialist (Radiologist) perform an independent model validation with their own dataset. This dataset may or may not include images from the testing dataset used during model training.

The aim of this validation step is to ensure the adequate performance of the model. Envisionit Deep AI requires an accuracy of 95% or above before a model is considered ready for field pilots and ultimate deployment. Only models with a combined (automated and Field Specialist validation) accuracy of 98% are considered for production environments.
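
The two thresholds can be read as a simple gate, sketched below. The text does not say how the automated and Field Specialist results are combined, so this sketch assumes their correct/total counts are pooled, and it treats 98% as a minimum. The function name and the stage labels are hypothetical.

```python
PILOT_THRESHOLD = 0.95       # automated accuracy needed for field pilots
PRODUCTION_THRESHOLD = 0.98  # combined accuracy needed for production


def readiness(auto_correct: int, auto_total: int,
              spec_correct: int = 0, spec_total: int = 0) -> str:
    """Return "not_ready", "field_pilot", or "production".

    Assumption: "combined" accuracy pools the automated and the
    Field Specialist validation counts into one ratio.
    """
    if auto_total <= 0:
        raise ValueError("automated validation needs at least one sample")
    if auto_correct / auto_total < PILOT_THRESHOLD:
        return "not_ready"
    if spec_total == 0:
        return "field_pilot"  # no specialist review yet, so no production
    combined = (auto_correct + spec_correct) / (auto_total + spec_total)
    return "production" if combined >= PRODUCTION_THRESHOLD else "field_pilot"
```

For example, a model that is correct on 990 of 1,000 automated test images and 490 of 500 specialist-reviewed images has a pooled accuracy of 1,480 / 1,500 ≈ 98.7% and would pass both gates under these assumptions.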


4. Field Deployment, including collecting Concordance / Discordance feedback

With all AI models deployed by Envisionit Deep AI, users are able to provide concordance/discordance feedback. This feedback is collected not only as a simple status (Agree/Disagree) but also by allowing users to amend AI predictions: adjusting the locations of the identified pathologies and adding/removing pathologies from the identified set.

These amendments are then recorded, verified, and used for further model training. Envisionit Deep AI already has tools and mechanisms to record such feedback. However, we need to ensure that the recorded feedback is adequate for the selected algorithm’s annotation specifications.

Envisionit Deep AI typically includes the following components in the field deployment of a model:

  • AWS S3 bucket with the latest approved model, configuration files, and other components necessary to run the algorithm. This excludes the tools, platform, and code.
  • Docker container(s) with REST-enabled services that would receive an image and run image pre-processing/normalization routines to ensure consistency.
  • Docker container(s) with REST-enabled services that would receive an image/video stream, process it, and provide a response in the form of:
    • Set of labels and coordinates of identified pathologies
    • Augmented image with the labels and coordinates overlaid on the original image.
  • Docker container(s) with REST-enabled services that would receive an original image as well as a set of labels/coordinates of the identified pathologies and generate one or more heatmap images based on the configuration/input parameters.
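
How the heatmap service turns labels and coordinates into an image is not specified. One simple interpretation, shown here only as a sketch, accumulates a per-box confidence score over the pixels each box covers and scales the peak to 1.0. The box tuple layout, the presence of a score, and the function name are assumptions.

```python
from typing import List, Tuple

# x_min, y_min, x_max, y_max, score; max coordinates are exclusive
Box = Tuple[int, int, int, int, float]


def boxes_to_heatmap(width: int, height: int, boxes: List[Box]) -> List[List[float]]:
    """Accumulate box scores per pixel, then scale the peak to 1.0."""
    heat = [[0.0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max, score in boxes:
        # Clip to the image so out-of-frame coordinates cannot raise.
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                heat[y][x] += score
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat
```

Overlapping detections reinforce each other in this scheme, so regions flagged by several boxes stand out; a production service would more likely render the grid with a color map and blend it over the original image.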


Envisionit Deep AI predominantly uses NVIDIA-based GPUs, and all cloud and on-premise Docker hosts support NVIDIA GPUs inside the containers themselves.


5. Concordance / Discordance Field Specialist review & addition to the training dataset

This is an important step to ensure that the model is not fed incorrect data for further training. Concordance/discordance feedback is therefore validated by a radiologist before it is added to the training set (uploaded to the AWS S3 bucket that the training containers use).

A certain level of automation should be available to ensure that Envisionit Deep AI internal staff as well as any Field Specialist consultants are not overwhelmed with feedback validation.

This includes automated quorum testing using a number of different models. The models may differ in training stage or input dataset, and may be limited to a specific type of image, image quality, or image content (e.g., a specific body part). Only images whose discordance feedback deviates significantly from the expected norm should be forwarded for human validation.
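
A quorum check of this kind could be as simple as the sketch below. It assumes each model's prediction and the user's corrected feedback can both be reduced to a set of labels for the image, and it forwards the image to a radiologist when fewer than a configurable share of models match the feedback. The two-thirds default and the function name are hypothetical, and comparing label sets ignores location differences that a real check would also need to consider.

```python
from typing import List, Set


def needs_human_review(feedback_labels: Set[str],
                       model_predictions: List[Set[str]],
                       quorum: float = 2 / 3) -> bool:
    """Return True when fewer than `quorum` of the models match the feedback."""
    if not model_predictions:
        return True  # nothing to compare against, so a radiologist decides
    votes = sum(1 for labels in model_predictions if labels == feedback_labels)
    return votes / len(model_predictions) < quorum
```

Feedback that most models independently agree with can be queued for the training set with less scrutiny, while feedback that contradicts the ensemble is exactly the case where a specialist's time is best spent.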

