AI Insights

Supervised Machine Learning for Damage Assessment in Agriculture

November 30, 2021



Developing a supervised machine learning prediction model to tackle farm pest challenges in agriculture, along with a personal story on overcoming impostor syndrome while working with Omdena AI changemakers from around the world.

The Project Problem Statement

The partner for this project is OKO, an agricultural insurance company that offers a product to farmers in West African emerging markets, with the intention of paying claims promptly through mobile technology.

In this endeavor, OKO has faced pests as a challenge on the farms it insures. One recent and noticeable pest is the Fall ArmyWorm (FAW). In large numbers, it has destroyed tonnes of crops that could feed thousands.

Armyworms are caterpillar pests of grass pastures and cereal crops, where they attack grains and feed on leaves. A very hungry caterpillar is rampaging through crops across the world, leaving a trail of destruction in its wake. The Fall ArmyWorm (FAW), also known as Spodoptera frugiperda (fruit destroyer), loves to eat corn but also plagues many other crops vital to human food security, such as rice and sorghum.

Further details can be found on our OKO challenge page. The rest of this article describes the steps followed to meet the goal of the project: assessing damage to agricultural crops using machine learning and satellite imagery techniques.

Image source: UN

As part of the task, I contributed my remote sensing knowledge, which eventually fed into the progress that led to the following solutions to this challenge.

Data Collection, Pre-processing and Modelling

After consuming lots of information about the nature of Fall ArmyWorms (FAW) and how to sense their damage on leaves via satellite images, I suggested a strategy to start off the project. Seemingly, the majority of the collaborators had a similar vision.

From my remote sensing knowledge, the Landsat and Sentinel platforms came to mind first because of their open-data policies. Even with this resourcefulness, I felt some reluctance and lethargy creeping in, with the thought that I already knew enough about manipulating satellite imagery. That was not the case, and I had to get down to learning like everyone else.

The next phase brought a higher level of confusion, since the data was scarce and lacked proper annotations. We had to find a way around this and prove that a model could be developed once good data was available.

During the collection of image data from Sentinel via Google Earth Engine, I shared a QGIS technique for creating rectangular masks. The masks could be used to collect image scenes of a standard size for training models.
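To illustrate the idea, here is a minimal sketch of collecting fixed-size Sentinel-2 patches around sample points with the Earth Engine Python API. The coordinates, date range, cloud filter, and the 'patches' Drive folder are illustrative assumptions rather than the exact parameters we used; the rectangle mimics the QGIS masks by buffering a point and taking its bounding box.

```python
# A minimal sketch (not the exact project pipeline) of exporting fixed-size
# rectangular Sentinel-2 patches with the Earth Engine Python API.
import ee

ee.Initialize()

def export_patch(lon, lat, name, half_size_m=1280):
    # Build a square region around the point, mimicking the QGIS rectangle masks.
    point = ee.Geometry.Point([lon, lat])
    region = point.buffer(half_size_m).bounds()

    # Median Sentinel-2 surface-reflectance composite over the season of interest.
    image = (ee.ImageCollection('COPERNICUS/S2_SR')
             .filterBounds(region)
             .filterDate('2021-06-01', '2021-09-30')
             .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))
             .median()
             .select(['B2', 'B3', 'B4', 'B8']))  # blue, green, red, NIR

    task = ee.batch.Export.image.toDrive(
        image=image,
        description=name,
        folder='patches',          # hypothetical Drive folder
        region=region,
        scale=10,                  # Sentinel-2 10 m bands
        maxPixels=1e9)
    task.start()

# Example: one sample point in a maize-growing area (illustrative coordinates).
export_patch(-7.55, 12.64, 'faw_sample_001')
```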

The technique was adopted, and a shout-out goes to Nandhini, who adapted it to simplify the collection of the image scenes and whom I could consult whenever I faced coding obstacles. This also suppressed my impostor syndrome and pushed me to contribute more valuable information.

During pre-processing and modeling, I interacted with other team members Ketut, Sagar, and Solomon Mulugheta so that they could understand the challenge better and we could brainstorm. I also learned proper coding patterns from code written by Ketut and Sagar, which gave me more insight for delivering my model.

With more interaction via meetings and code from other contributors, I could focus on building a ‘prototype’ ResNet-50 based classification and area prediction model.

The model demonstrated promising results with the limited data collected so far, but it required more data to ascertain whether it would perform better. As collaborators, we developed a collection pipeline that increased the quantity of the data, so there was a glimmer of hope that the model would perform well.

With the collected data, I proceeded to pre-process it, generate the vegetation indices, resize the images, and train the model. It did not turn out to be easy, as there were challenges at every turn of pre-processing and training.

Challenges during the project:

  • Limited compute resources (mostly RAM)
  • A data set too large to store on a single Google Drive, so I spread the data across several drives during processing
  • Class imbalances

Above all, the major challenge was annotation, as the data collected by FAO did not match the land cover on the ground. Some points fell in urban areas instead of the agricultural land that was the focus of the project. If you are interested in learning more about overcoming data challenges in agricultural projects, check out 7 Steps to Build a Quality Satellite Imagery Dataset for Agricultural Applications.

ResNet-50 Model

For its proven classification and recognition accuracy, I adapted ResNet-50, a CNN with 50 layers that are crucial for learning and retaining image features.

The model was built to classify four levels of damage due to FAW using the already collected and labeled data, and to predict the extent of damage from the difference in vegetation indices between image scenes.
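To make the two outputs concrete, here is a minimal sketch of how such a dual-output ResNet-50 could be wired up in PyTorch, assuming torchvision's standard ResNet-50 backbone. The class name, head layout, and equal loss weighting are illustrative assumptions, not the exact project code.

```python
# A minimal sketch of a ResNet-50 with two outputs: a 4-class infestation-level
# head and a single-value regression head for damaged area.
import torch
import torch.nn as nn
from torchvision import models

class FAWDamageNet(nn.Module):
    def __init__(self, num_levels=4):
        super().__init__()
        backbone = models.resnet50(weights=None)   # expects a 3-channel input
        in_features = backbone.fc.in_features
        backbone.fc = nn.Identity()                # reuse the pooled features
        self.backbone = backbone
        self.level_head = nn.Linear(in_features, num_levels)  # damage-level logits
        self.area_head = nn.Linear(in_features, 1)             # damaged-area estimate

    def forward(self, x):
        features = self.backbone(x)
        return self.level_head(features), self.area_head(features).squeeze(1)

model = FAWDamageNet()
dummy = torch.randn(2, 3, 224, 224)       # batch of 3-channel pseudo-images
level_logits, area = model(dummy)

# Joint loss: cross-entropy for the infestation level, MSE for the area.
level_loss = nn.CrossEntropyLoss()(level_logits, torch.tensor([0, 2]))
area_loss = nn.MSELoss()(area, torch.tensor([0.1, 0.4]))
loss = level_loss + area_loss
```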

A prototype model trained on a few sampled Sentinel-2A images achieved a classification accuracy of 47% and an area-prediction RMSE of 0.2699.

After that demonstration, I experimented with the larger data set: accuracy rose to about 94% on classification, with an RMSE below 0.39 on predicting area. A point to note is that this accuracy is only promising; for a workable model, a balanced data set is crucial.

A ResNet-50 model strictly accepts an input of 3 channels. Thus, I stacked 3 vegetation indices (GCI, NDVI, and SAVI) to form a pseudo-image with 3 channels. Additionally, I configured the model to output the class/level of infestation and the area under damage. Below are preliminary results of the model.
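As an illustration of that stacking step, here is a minimal sketch of computing the three indices from Sentinel-2 band arrays and assembling them into a 3-channel pseudo-image. It assumes the bands have already been read into NumPy arrays (for example from the exported patches) and uses OpenCV only for resizing; the resize target and constants are assumptions.

```python
# A minimal sketch of building the 3-channel pseudo-image from Sentinel-2 bands.
import numpy as np
import cv2

def pseudo_image(green, red, nir, size=(224, 224), eps=1e-6, L=0.5):
    green, red, nir = (np.asarray(b, dtype=np.float32) for b in (green, red, nir))
    ndvi = (nir - red) / (nir + red + eps)            # Normalized Difference Vegetation Index
    gci = nir / (green + eps) - 1.0                   # Green Chlorophyll Index
    savi = (nir - red) / (nir + red + L) * (1.0 + L)  # Soil Adjusted Vegetation Index (L = 0.5)
    stacked = np.dstack([gci, ndvi, savi])            # (H, W, 3) pseudo-image
    return cv2.resize(stacked, size)                  # match the ResNet-50 input size
```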

Figure 01: Loss on Area Prediction Training

Figure 02: RMSE on Area Prediction Training

Figure 03: Accuracy on Infestation Level Training

Figure 04: Loss on Infestation Level Training

Lessons and Key Takeaways

  • Doing projects gets one out of a comfort zone
  • Obstacles make one grow resilient
  • Consultation: a challenge shared is ¾ way solved

Overcoming the Impostor Syndrome

If not for Omdena, I would not be giving this appraisal of my experience before and after the Fall ArmyWorm (FAW) project. I stumbled upon Omdena and its content via connections on LinkedIn.

From that eureka moment, I consumed everything I could find about Omdena. I got to learn about the Omdena projects, and I pushed to acquire the necessary skills so that my applications would be considered. About four months in, some of the impostor syndrome had dissipated and I was a little more confident.

Therefore, I started applying furiously, but my destined moment came in July when I was picked to join the OKO project.

Coming from a background with a blurry sense of direction, I hoped this project would be a beacon: a chance to find a mentor, a sense of direction, and certainty about the field of data science and its relationship with engineering.

My background was built on novice, patchy rote-learning of several software technologies like PHP, C, C++, JavaScript, Java, HTML/CSS, and Django (Python). Unfortunately, I could not do much with that knowledge.

I joined a team of about 50 collaborators with different backgrounds and skills. Being part of it reduced my impostor syndrome, since I learned that you don’t have to know everything and that, with consultation, you can learn anything new.

The whole two-month project was a mentorship session in itself, as senior colleagues in software engineering, machine learning, and previous Omdena projects were special and thoughtful guides. The project also sharpened my view of a complete project from start to end, from negotiation to clear scoping of the project’s phases.

Conclusions

Looking back, the two-month period was a time when I grew confident in my skills: collaborating, finding the best way to communicate with team members, and managing my time to ensure delivery within the set time frame, among others.

Using remote sensing satellite images is always a challenge, but I think the most challenging part of this project was assuming the labels provided were correct and could be used.

Now, I can comfortably describe or introduce myself as a Machine Learning Engineer, and, as the saying goes, learning is an everyday process.

Big thank you to all collaborators who made this project a success.

This article is written by Bebeto Nyamwamu.

Ready to test your skills?

If you’re interested in collaborating, apply to join an Omdena project at: https://www.omdena.com/projects
