Normalized Difference Vegetation Index — You Don’t Always Need Deep Learning for Satellite Imagery

March 21, 2021

While looking for an ML solution to understand the relationship between climate change and forced displacement in Somalia, we found that Deep Learning demanded more resources than we could afford. Instead, we used satellite imagery indices, which combine image bands in simple ways, to get the information and data we needed for our project.

Author: Vishal R

Deep Learning algorithms are loosely inspired by how the human brain works. The brain is a powerful computer, so it is no surprise that algorithms modeled on it need a lot of processing power. This is one of the main disadvantages of Deep Learning.

A Few Other Disadvantages

A large amount of data

Training a Deep Learning model requires a lot of data, and that is not all. In a classification task, the classes should be roughly balanced for good results. A heavily skewed class distribution can bias the model toward the majority classes.

Possibility of overfitting

While training a model, there is a high chance that it will end up overfitting the data if training is not stopped at the right point. Methods such as early stopping can help prevent overfitting, but they can sometimes stop training too soon and lead to poor results.

Overfitting. Source: Wikipedia
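
To make the early stopping trade-off concrete, here is a minimal sketch in plain Python. The function name early_stopping_epoch, the patience parameter, and the validation losses are all hypothetical and only illustrate the idea; they are not taken from the project.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch


# Made-up validation losses, one per epoch.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.50]
print(early_stopping_epoch(losses, patience=2))
print(early_stopping_epoch(losses, patience=4))
```

With patience=2 the loop stops at epoch 4 and keeps epoch 2, so it never sees the lower loss at epoch 5. With patience=4 it reaches epoch 5. This is the sense in which early stopping can prevent overfitting and still give a worse result.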

Hyperparameter tuning

A neural network has a lot of hyperparameters that can be adjusted to improve the results. But identifying the best values for these hyperparameters requires a lot of experimentation. Given the amount of hardware resources a neural network needs, every experiment is expensive, so this search is not easy.

The Problem We Faced

While working on the UNHCR challenge of linking displacement to climate change, we wanted to identify land features like water, vegetation, and buildings. Initially, we planned to train a deep learning model to get this done. But here is the problem: we were using satellite images (Landsat 8, MODIS) as a data source. These images are multispectral (a Landsat 8 scene has 11 bands, and MODIS records 36) and may take up around 1 GB of storage space per image.

So, building such a model was not a good option, considering the hardware resources we would need and the limited time we had.

Normalized Difference Indexes

While looking for a solution to the problem we were facing, we were introduced to NDVI (Normalized Difference Vegetation Index), which uses a simple formula to identify the presence of vegetation in a given satellite image. It relies on the fact that healthy vegetation reflects far more near-infrared (NIR) light, and somewhat more green light, than other wavelengths, while its chlorophyll absorbs most red and blue light. NDVI contrasts the NIR and red bands as (NIR - Red) / (NIR + Red), which ranges from -1 to 1, with higher values indicating denser, healthier vegetation. Similarly, water and built-up areas have their own characteristic relationships between bands, which other normalized difference indexes exploit.
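
As a rough illustration, NDVI can be computed per pixel with NumPy. This is a sketch under assumptions, not the project's actual code: the function name ndvi and the toy reflectance values are made up, and loading the bands from a real scene (for Landsat 8, band 5 is NIR and band 4 is red) is left out.

```python
import numpy as np


def ndvi(nir, red):
    """Return (NIR - Red) / (NIR + Red) per pixel.

    Pixels where NIR + Red is zero are set to NaN instead of dividing by zero.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Toy 2x2 reflectance values (made up for illustration, not project data).
nir_band = np.array([[0.50, 0.30], [0.05, 0.0]])
red_band = np.array([[0.10, 0.30], [0.15, 0.0]])
print(ndvi(nir_band, red_band))
```

The top-left pixel, with high NIR and low red, gets a clearly positive value, as dense vegetation would. The pixel with equal NIR and red sits at zero, and the empty pixel stays NaN so it can be ignored later.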

Satellite imagery vegetation index calculation

Satellite imagery water index calculation

Satellite imagery built-up index calculation
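
The water and built-up indexes follow the same normalized difference pattern with different band pairs. The sketch below assumes the commonly used definitions NDWI = (Green - NIR) / (Green + NIR) and NDBI = (SWIR - NIR) / (SWIR + NIR); the article does not say which variants the project used, so treat the band choices, the function names, and the sample reflectances as assumptions. For Landsat 8, green is band 3, NIR is band 5, and the first shortwave-infrared band is band 6.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) per pixel, NaN where a + b is zero."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out


def ndwi(green, nir):
    """Water index (assumed variant): high where green exceeds NIR."""
    return normalized_difference(green, nir)


def ndbi(swir, nir):
    """Built-up index (assumed variant): high where SWIR exceeds NIR."""
    return normalized_difference(swir, nir)


# Made-up reflectances for three pixels: open water, vegetation, built-up.
green = np.array([0.08, 0.10, 0.15])
nir = np.array([0.02, 0.45, 0.25])
swir = np.array([0.01, 0.20, 0.35])

print("NDWI:", ndwi(green, nir))
print("NDBI:", ndbi(swir, nir))
```

In this toy data only the water pixel has a positive NDWI and only the built-up pixel has a positive NDBI, which is what lets simple thresholds separate the land features without training a model.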

Using these indexes, we created heatmaps of how vegetation, water, and built-up area varied across individual districts, which made it easier to correlate them with the displacement data we had. This is a clear example of a case where Deep Learning could be used, but a simpler alternative saves a lot of resources and time and can also give better results.

Heatmap of NDVI values per district - Source: Omdena
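
The per-district values behind a heatmap like this one could be computed by averaging the index over the pixels of each district. The sketch below assumes a raster of district IDs aligned pixel by pixel with the index raster. That raster, the function name mean_index_per_district, and the numbers are hypothetical; in practice the district boundaries would first be rasterized from a shapefile.

```python
import numpy as np


def mean_index_per_district(index_raster, district_ids):
    """Average an index raster over each district label, ignoring NaN pixels."""
    values = np.asarray(index_raster, dtype=np.float64).ravel()
    labels = np.asarray(district_ids).ravel()
    valid = ~np.isnan(values)
    means = {}
    for district in np.unique(labels[valid]):
        # Select the valid pixels that belong to this district.
        means[int(district)] = float(values[valid & (labels == district)].mean())
    return means


# Toy NDVI raster and a matching raster of district IDs (made-up values).
ndvi_raster = np.array([[0.6, 0.4, np.nan],
                        [0.1, 0.3, 0.2]])
districts = np.array([[1, 1, 2],
                      [2, 2, 2]])
print(mean_index_per_district(ndvi_raster, districts))
```

The result is one number per district, which can be joined to the displacement figures for the same district and time period.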

The UNHCR project with Omdena was my first experience working as part of a team. I had the role of a Machine Learning Engineer and it was a great experience. I had always wondered how people collaborate in machine learning, where the whole team's output feeds into a single model, and this experience gave me a clear understanding of how it works. I also learned a lot: I was working with people from different backgrounds and different levels of experience, and each of them had something unique to offer in terms of knowledge.

I believe community learning programs like this are the way to go for anyone with any level of skill.