Data-driven decision making and signal processing with Google Earth Engine to meet the electricity and water demand in Nigeria.
The Nigerian NGO Renewable Africa #RA365 has the mission to install off-grid solar containers to mitigate the lack of electricity access in the country, where only half of the population of 198 million has stable access to the power supply. We approached the problem of where to install them with solar data science.
The demand – A known Problem
The Demographic and Health Surveys (DHS) provide a large amount of data on African and other developing countries.
This dataset has been used by several researchers and plots similar to the above can be found throughout the internet and literature.
However, the dataset for Nigeria is based on a 2015 survey of only about 1,000 households per state and does not specify their precise geographic location within each state. Nevertheless, it shows the critical state of energy access in Nigeria. For example, of the 1194 households sampled in Sokoto state, only 20% (239) had access to electricity in 2015.
Our approach — Nighttime images
We quickly came up with the idea of comparing nighttime satellite imagery against the geographic location of the population.
Although the nightlights seem quite straightforward to use, we still needed to find where all the Nigerian houses are located and then check whether or not they are lit up at night (demand).
We initially thought of using a UNet-like model to detect or segment house roofs in overhead imagery, which has already been done in several machine learning competitions. However, we then came across the population dataset from WorldPop, which is also available in Google Earth Engine and uses ground surveys and machine learning to fill the gaps.
GRID3 is another dataset from the same group, which has been validated during vaccination campaigns and provides much higher resolution and precision.
With both datasets in hand, the math seems easy: demand = population and no lights.
Here is the code snippet link.
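As a rough illustration of the "population and no lights" rule, here is a minimal NumPy sketch. It is not the project's GEE script: the function name, the toy rasters and both threshold values are made up for this example, and it assumes the two rasters have already been resampled to the same grid.

```python
import numpy as np

def demand_mask(population, lights, min_population, max_light):
    """Flag pixels that are populated but dark at night."""
    population = np.asarray(population, dtype=float)
    lights = np.asarray(lights, dtype=float)
    if population.shape != lights.shape:
        raise ValueError("both rasters must share the same grid")
    # Demand = enough people AND (almost) no light.
    return (population >= min_population) & (lights <= max_light)

# Toy 2x3 rasters; values and thresholds are illustrative only.
population = np.array([[0.0, 12.0, 40.0],
                       [3.0, 25.0, 60.0]])
lights = np.array([[0.0, 0.1, 5.0],
                   [0.0, 0.2, 0.3]])

mask = demand_mask(population, lights, min_population=10.0, max_light=0.5)
print(mask)
```

The bright, populated pixel in the top-right corner is excluded, and so are the dark pixels with almost no population.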
Some challenges to overcome
However, two challenges remain. First, we have to take into consideration the noise present in each of the datasets. Second, we have to find the optimal places to install the solar containers within the immense sea of electricity demand in Nigeria.
We also used a few sample villages (where the electricity supply was known) to calibrate the thresholds of minimal population density and minimal light levels used by the algorithm.
Building the location heatmap
A large part of the container installation cost is due to the wiring and distribution of the electricity. This cost has a nonlinear relationship to the distance between the panel and the house to be supplied with energy: supplying nearby houses is much cheaper.
For example, supplying a house 200 m away from the energy source should cost more than twice as much as supplying one at 100 m.
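To make the "more than 2x" statement concrete, the sketch below uses a power-law cost with an exponent above 1. This particular formula and the exponent of 1.5 are assumptions chosen purely for illustration; the article does not state the actual cost model.

```python
def wiring_cost(distance_m, unit_cost=1.0, exponent=1.5):
    """Hypothetical convex cost model: cost grows faster than distance."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return unit_cost * distance_m ** exponent

ratio = wiring_cost(200) / wiring_cost(100)
print(f"cost(200 m) / cost(100 m) = {ratio:.2f}")  # prints 2.83
```

Any exponent greater than 1 gives a ratio above 2 when the distance doubles, which is the property the text relies on.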
We assume that, because of the wiring cost, the optimal solar panel location in relation to a household approximately follows a Gaussian distribution. Therefore, both the noisy nightlights and the electricity demand itself can be smoothed by applying Gaussian convolution filters in order to find the best spots for the solar panel installation.
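The smoothing step can be sketched with a separable Gaussian filter in plain NumPy. This is a stand-in for whatever kernel the GEE script applies: the sigma value, the 3-sigma kernel radius, the zero padding at the edges and the toy raster are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma, normalised to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_smooth(raster, sigma):
    """Smooth a 2-D raster with a separable Gaussian filter (zero-padded edges).

    Assumes each raster dimension is at least as long as the kernel.
    """
    raster = np.asarray(raster, dtype=float)
    kernel = gaussian_kernel(sigma)
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, raster, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

# Two demand pixels two cells apart on a 9x9 grid (toy data).
demand = np.zeros((9, 9))
demand[4, 3] = 1.0
demand[4, 5] = 1.0

heatmap = gaussian_smooth(demand, sigma=1.0)
peak = np.unravel_index(heatmap.argmax(), heatmap.shape)
print(peak)
```

In this toy case the highest value of the heatmap falls on the empty cell between the two demand pixels, which is the kind of "best spot" the smoothing is meant to reveal.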
Finally, we tried several image segmentation techniques to capture the clusters of demand. The best technique in GEE turned out to be the very simple "connected components" algorithm.
Additionally, we can sum the population density over each area to estimate the total population in each cluster.
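The clustering and the per-cluster population sum can be illustrated outside GEE with a small breadth-first labelling routine. This is a plain Python/NumPy stand-in, not the GEE call used in the project; 4-connectivity, the function names and the toy rasters are assumptions made for this sketch.

```python
from collections import deque

import numpy as np

def label_components(mask):
    """Label 4-connected regions of a boolean mask; 0 is background."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    n_rows, n_cols = mask.shape
    current = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue  # already reached from an earlier seed
        current += 1
        labels[r, c] = current
        queue = deque([(r, c)])
        while queue:
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < n_rows and 0 <= nj < n_cols
                if inside and mask[ni, nj] and not labels[ni, nj]:
                    labels[ni, nj] = current
                    queue.append((ni, nj))
    return labels, current

def population_per_cluster(labels, n_clusters, population):
    """Sum the population raster over each labelled cluster."""
    weights = np.asarray(population, dtype=float).ravel()
    totals = np.bincount(labels.ravel(), weights=weights,
                         minlength=n_clusters + 1)
    return totals[1:]  # drop the background bin

# Toy demand mask and population raster.
mask = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 1],
                 [0, 0, 0, 1]], dtype=bool)
population = np.array([[10, 20, 5, 0],
                       [0, 30, 0, 40],
                       [7, 0, 0, 50]])

labels, n_clusters = label_components(mask)
totals = population_per_cluster(labels, n_clusters, population)
print(n_clusters, totals)
```

Sorting the clusters by their population total gives a simple ranking of candidate areas.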
GEE allows you to export the raster as a TIF file, which can then be processed with GeoPandas to find the contours and centroids of the clusters and link them back to Google Maps for further exploration.
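The centroid and map-link step can be sketched as follows. The project used GeoPandas on the exported raster; this example swaps in plain NumPy on an already labelled array so that it stays self-contained. The raster origin, the pixel size and the toy labels are hypothetical values, and the raster is assumed to be north-up in longitude/latitude.

```python
import numpy as np

def cluster_centroids(labels):
    """Mean (row, col) position of every labelled cluster, keyed by label."""
    centroids = {}
    for label in np.unique(labels):
        if label == 0:
            continue  # background
        rows, cols = np.nonzero(labels == label)
        centroids[int(label)] = (float(rows.mean()), float(cols.mean()))
    return centroids

def pixel_to_lonlat(row, col, origin_lon, origin_lat, pixel_size):
    """Convert pixel indices to the lon/lat of the pixel centre.

    Assumes a north-up raster whose origin is its top-left corner.
    """
    lon = origin_lon + (col + 0.5) * pixel_size
    lat = origin_lat - (row + 0.5) * pixel_size
    return lon, lat

def maps_url(lon, lat):
    """Google Maps search link for a coordinate pair (latitude first)."""
    return f"https://www.google.com/maps/search/?api=1&query={lat:.5f},{lon:.5f}"

# Toy labelled raster with two clusters; the geotransform values are made up.
labels = np.array([[1, 1, 0],
                   [0, 0, 0],
                   [0, 2, 2]])

urls = {}
for label, (row, col) in cluster_centroids(labels).items():
    lon, lat = pixel_to_lonlat(row, col, origin_lon=5.0, origin_lat=13.0,
                               pixel_size=0.01)
    urls[label] = maps_url(lon, lat)

print(urls[1])
```

Each link opens the cluster centre in Google Maps, which is convenient for a quick visual check before a field survey.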
We showed how to combine satellite imagery and population data to create an interactive map and a list of the Nigerian regions with the highest demand for electricity, which are the candidate sites for solar container installation.
The NGO Renewable Africa will use these tools to survey and validate the locations before installing the solar panels. This should have a real impact on the lives of thousands of people in need. Additionally, this report can be used to show where the demand lies and to help pressure the local government into action.
We also hope that the initiative is taken up by neighboring and other developing countries, as all the methodology and code used here can be easily transferred to other locations.
Source code for both GEE and the Colab notebook is available here.