Machine Learning For Rooftop Detection and Solar Panel Installment

Machine Learning For Rooftop Detection and Solar Panel Installment

By Harshita Chopra


Solar energy is a promising and freely available resource for managing the forthcoming energy crisis, without hurting the environment. Unlike conventional fossil fuels, it won’t run out anytime soon.

Fact — There’s enough solar energy hitting the Earth every hour to meet all of humanity’s power needs for an entire year.

Let’s face it, what’s cooler than the sun powering your home? And that is quite literally true.

Fun fact — Solar panels also act as “roof shades” to keep buildings cool. They absorb the sun’s rays, directing them away from the roof, whereas a roof without panels would allow heat to penetrate into the building.

As people around the world look for ways to “go green” and protect the earth, solar panels provide an excellent option. But the utility industry needs smart systems that can help improve the integration of renewables in an effective way.

Solar AI, a Singapore based startup incubated as a part of ENGIE Factory, collaborated with Omdena, to pull off a mission to hyper-scale the deployment of distributed solar and the transition towards 100% renewables by modernizing the way rooftop solar is sold.


The problem statement

The rooftop solar assessment process can be time consuming and expensive, taking anywhere between 1 hour to 2 full days to calculate the solar potential of each rooftop. In the solar industry, this has resulted in the cost of sales taking up to 30–40% of total project costs, significantly worsening the unit economics of solar projects.

By automating these evaluations with Artificial Intelligence, Solar AI aims to drastically reduce the cost of this process and make this information easily available for both building owners as well as solar energy companies.

So we had a mission to accomplish within eight weeks:

Combining multiple models that can automatically identify rooftops and detect rooftop features using machine learning like obstacles, material, slopes and area from high-resolution satellite imagery.


The solution

Solar AI provided us with high-resolution satellite imagery in Singapore. With these huge and detailed images in hand, we had a list of tasks to perform.

The 2 GB size of one image fascinated me enough, to begin with pre-processing and creating thousands of smaller tiles out of it — using just a few lines of code bundled up in a function.


Snapshot of a few tiles created from the huge image / Source:



The power of annotations

Even the most technically advanced algorithms cannot address or solve a problem without the right data. We know having access to data is quite valuable, but having access to data with a learnable structure is the biggest competitive advantage nowadays. That’s the power of data annotation.


A quirky image with hundreds of rooftops / Source:


Our wonderful team of collaborators volunteered to annotate thousands of rooftops in 500+ tiles. We pulled off a smarter method of annotating the buildings, by mapping the OSM data on the raster layer (TIF format tile) in the QGIS software.

The consistent determination of the annotators resulted in a perfectly labeled dataset for Supervised Machine Learning algorithms.

The food for models was ready!


Scanning images of rooftops via machine learning 

The major task was to detect rooftops in a given image using machine learning & computer vision models.

Not just this, we also had to determine their type/structure such as Flat-roof, Hip-roof, Shed-roof, or any other. Hence, this became an instance segmentation problem.

We tried out a number of models such as Mask R-CNN, YOLACT (You Only Look At CoefficienTs), Dectectron2, and more. After training on different batches of annotations as they were delivered, we kept seeing improvement in results. Eventually, the best performing model was selected to go ahead with other tasks.




Zooming in on your rooftops 

Now that we had the bounding boxes and mask contours of various rooftops, trapped properly in a data frame, we were ready to start the analysis of individual rooftops. After extracting and zooming into masks of each detected roof, we needed the following attributes:

  • Obstacle Detection
  • Area of the roof (excluding obstacles)
  • The material of the roof
  • Detecting faces of Hip/Shed roof
  • The orientation of individual slopes


Calculating “Area Available” for panels

For the calculation of a rooftop’s effective area, the area occupied by obstacles has to be subtracted from the whole. So that gives rise to the task of identifying obstacles.

Due to the lack of labeled data for obstacle detection, our genius team shifted their thought process towards an unsupervised approach of edge detection and creating contours. By setting a threshold on contour colors, obstacles were distinguished from plain area to a great extent.

An effective area was therefore mathematically calculated as the difference between total area and obstacle area in terms of pixels, which was then converted into meters squared.


Roof Materials / Source:


Quality of the roof

Because solar panels are installed on your home’s rooftop, it is important to understand how different roof materials may influence this process.

Generally, they range from concrete, metal, roof tiles, eternit to composite shingles.

This task also required a labeled dataset, so I decided to jump in to find a solution where we could skip annotations per se. Using Open Street Maps, we created a small but fruitful dataset labeled with roofs and their materials. A deep learning-based Image Classification model was then created which identifies the material of the roof and gives the probability scores for each class.


Which way do solar panels face?




Orientation, or the direction your roof faces, may have a large impact on how productive roof-mounted solar panels will be. Your system will generate the most energy when it gets as many hours of light exposure per day as possible. In most places, the ideal power generation angle is 30–40 degrees.




The task of identifying many faces of a hip roof was a challenging one. After multiple attempts with different approaches, the task team managed to create an appreciable mathematical model that could identify the facets as well as the angles they’re inclined on, using some constructive utility functions. The output was the orientation of different roof facets.


Conclusion: Putting it all together

The outputs of all the tasks were captured systematically in a data frame. Keeping in mind that we computed various attributes based on pixel values, we converted them back to geographic coordinates at the end. This allowed us to project the data on satellite images of a particular CRS (Coordinate Reference System).

After merging everything into an automated pipeline and many rounds of reviews, evaluation, fixing bugs, and testing — our software was ready to be delivered.

Solar AI is extremely happy with the final deliverables, and this is something that makes the experience even more worthwhile. As CEO Bolong Chew puts its:

“This work went beyond our wildest expectations and we’re extremely happy. We set the bar really high and the team delivered. It was an amazing experience.”

Increasing Solar Adoption in the Developing World through Machine Learning and Image Segmentation

Increasing Solar Adoption in the Developing World through Machine Learning and Image Segmentation


The problem

How to Increase Solar Adoption in the developing world through Image Segmentation? Applied in India.


The solution

Step 1: Identification of the Algorithm: Image Segmentation

We initially started with the goal of increasing Solar Adoption using Image Segmentation algorithms from computer vision. The goal was to segment the image into roofs and non-roofs by identifying the edges of the roofs. Our first attempt was to use the Watershed image segmentation algorithm. The Watershed algorithm is especially useful when extracting touching or overlapping objects are in the images. The algorithm is very fast and computation inexpensive. In our case, the average computing time for one image was 0.08 sec.

Below are the results from the Watershed algorithm.

Images of rooftops in Delhi, India

Original image(left). The output from the Watershed model(right)


As you can see the output was not very good. Next, we implemented Canny Edge Detection. Like Watershed, this algorithm is also widely used in computer vision and tries to extract useful structural information from different visual objects. In the traditional Canny edge detection algorithm, there are two fixed global threshold values to filter out the false edges. However, as the image gets complex, different local areas will need very different threshold values to accurately find the real edges. So there is a technique called auto canny, where the lower and upper bound are automatically set. Below is the Python function for auto canny:



Snippet of the code for Image Segmentation



The average time taken by a Canny edge detector on one image is approx. 0.1 sec, which is very good. And the results were better than the Watershed algorithm, but still, the accuracy is not enough for practical use.

Image of Delhi rooftops for understanding the algorithm

The output from the Canny Edge detection algorithm


Both of the above techniques use Image Segmentation and work without understanding the context and content of the object we are trying to detect (i.e. rooftops). We may get better results when we train an algorithm with the objects (i.e. rooftops) looks like. Convolutional Neural Networks are state-of-the-art technology to understand the context and content of an image and are being used here to increase Solar Adoption Awareness using Image Segmentation technique.

As mentioned earlier, we want to segment the image into two parts — a rooftop or not a rooftop. This is a Semantic segmentation problem. Semantic segmentation attempts to partition the image into semantically meaningful parts and to classify each part into one of the predetermined classes.

Explaining what segmentation is using a basic example

Semantic Segmentation (picture taken from


In our case, each pixel of the image needs to be labeled as a part of the rooftop or not.

Differentiating image into two segments, Roof and Non roof part

We want to segment the image into two segments — roof and not roof(left) for a given input image(right).


Step 2: Generating the Training Data

To train a CNN model we need a dataset with rooftops satellite images with Indian buildings and their corresponding masks. There is no public dataset available for Indian buildings’ rooftops images with masks. So, we had to create our own dataset. A team of students tagged the images and created masked images (as below).

And here are the final outputs after masking.


Roof top satellite images converted into image segmented photo


Although the U-Net model is known to work with fewer images for data but to begin with, we had only like 20 images in our training set which is way below for any model to give results even for our U-Net. One of the most popular techniques to deal with less data is Data Augmentation. Through Data Augmentation we can generate more data images using the ones in our dataset by adding a few basic alterations in the original ones.

For example, in our case, any Rooftop Image when rotate by a few degrees or flipped either horizontally or vertically would act as a new rooftop image, given the rotation or flipping is in an exact manner, for both the roof images and their masks. We used the Keras Image Generator on already tagged images to create more images.

Augmenting Data

Data Augmentation


Step 3: Preprocessing input images

We tried to sharpen these images. We used two different sharpening filters — low/soft sharpening and high/strong sharpening. After sharpening we applied a Bilateral filter for noise reduction produced by sharpening. Below are some lines of Python code for sharpening

Code for Low Sharpening Filter

Low sharpening filter

Code for high sharpening filter

High sharpening filter



And below are the outputs.


Satellite view of buildings


Google Images


Step 4: Training and Validating the model

We generated training data of 445 images. Next, we chose to use U-Net architecture. U-net was initially used for Biomedical image segmentation, but because of the good results it was able to achieve, U-net is being applied in a variety of other tasks. is one of the best network architecture for image segmentation. In our first approach with the U-Net model, we chose to use RMSProp optimizer with a learning rate of 0.0001, Binary cross-entropy with Dice loss (implementation taken from here). We ran our training for 200 epochs and the average(last 5 epochs) training dice coefficient was .6750 and the validation dice coefficient was .7168

Here are the results of our first approach from the Validation set (40 images):

Predicted and Targeted Image

Predicted (left), Target (right)


Predicted (left), Target (right)



As you can see, in the predicted images there are some 3D traces of building structure in the middle and corners of the predicted mask. We have found out that this is due to the Dice loss. Next, we used Adam optimizer with a learning rate 1e-4 and a decay rate of 1e-6 instead of RMSProp. We used IoU loss instead of BCE+Dice loss and binary accuracy metric from Keras. The training was performed for 45 epochs. The Average(last 5 epochs) training accuracy was: 0.862 and the average validation accuracy was: 0.793. Below are some of the predicted masks on the Validation set from the second approach:


And here are the results form the test data:



Test data




Test data



More About Omdena

Omdena is an innovation platform for building AI solutions to real-world problems through the power of bottom-up collaboration.

Stay in touch via our newsletter.

Be notified (a few times a month) about top-notch articles, new real-world projects, and events with our community of changemakers.

Sign up here