To improve the performance of a model that distinguishes weeds from crops in images, as part of a crop vs weed segmentation pipeline running on an edge device, we explored different image segmentation techniques. Here we present a walkthrough of the preprocessing and pipelining for the Semantic Segmentation exploration.

 

Authors: Sijuade Oguntayo, Pankaja Shankar, Shubham Gandhi, Marjan Gbohadian, Melania Abrahamian, Nyan Swan Aung (Brian)

 

 

Introduction

In partnership with Omdena and WeedBot, an impact-driven startup developing a laser weeding robot that recognizes and removes weeds with a laser beam to enable pesticide-free food production, a team of Machine Learning Engineers from around the world worked for over two months to develop a high-speed, high-precision image segmentation model that runs in 12 milliseconds or faster on the Nvidia Xavier edge device.

 

 

Photo by Weedbot


 

We explored two types of Image Segmentation — Instance & Semantic Segmentation. In this article, we shall be doing a walkthrough of the preprocessing and pipelining for the Semantic Segmentation exploration. A different article looks into the Instance Segmentation exploration here.

 

Photo by Markus Spiske on Unsplash


 

 

 

Image Segmentation

Semantic Segmentation was considered because it tends to be less computationally intensive than Instance Segmentation. One reason for this is that Semantic Segmentation performs only pixel-wise classification of objects, without attempting to separate individual instances. Instance Segmentation, on the other hand, performs both pixel-wise semantic segmentation and object detection (bounding boxes).

We concluded that distinguishing pixels belonging to crops from those belonging to weeds does not necessarily require “instance awareness”. You can read more here about the difference between Instance & Semantic Segmentation.

 

 

Components

Feature Extraction

One of the feature extraction methods we decided upon is based on this paper. It details how to generate 11 additional channels that can improve model accuracy for crop vs weed classification.

RGB + Additional Engineered channels


 

Engineered channels calculation feature extraction

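As a concrete illustration, a minimal NumPy sketch of a few of these engineered channels (ExG, ExR, CIVE) is shown below, using the standard formulas from the precision-agriculture literature; the exact channel set and constants used in the project follow the paper above and may differ in detail.

```python
import numpy as np

def engineered_channels(rgb):
    """Compute a few engineered vegetation-index channels from an
    RGB image of shape (H, W, 3) with values in [0, 255].

    ExG, ExR, and CIVE use the standard formulas over the
    chromatic (normalized) r, g, b coordinates.
    """
    img = rgb.astype(np.float32)
    total = img.sum(axis=-1, keepdims=True) + 1e-8   # avoid division by zero
    r, g, b = np.moveaxis(img / total, -1, 0)        # chromatic coordinates

    exg = 2.0 * g - r - b                            # Excess Green
    exr = 1.4 * r - g                                # Excess Red
    cive = 0.441 * r - 0.811 * g + 0.385 * b + 18.78745  # CIVE index

    return np.stack([exg, exr, cive], axis=-1)
```

Channels like these are stacked alongside the original RGB (and HSV, etc.) to form the multi-channel input image described later.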

 

Included below are some examples of the extracted channels — 

 

Original RGB Image


 

HSV


 

EXG


 

EXR


 

CIVE


 

 

Target Preprocessing

The dataset provided by WeedBot contained annotations in the COCO format for the carrots. For our purposes, each image was converted to a 3-channel binary mask corresponding to the Background, Carrot, and Weed classes. This forms the target image to be predicted. With this representation, we could simply extract whichever channels we needed for further work.
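Once per-class binary masks have been rasterized from the COCO polygons (that step is omitted here), combining them into the 3-channel target can be sketched as follows; the background channel is simply everything not covered by a carrot or a weed.

```python
import numpy as np

def build_target(carrot_mask, weed_mask):
    """Combine per-class binary masks into the 3-channel target.

    carrot_mask, weed_mask: boolean arrays of shape (H, W), e.g.
    rasterized from the COCO polygon annotations. Channel order
    (Background, Carrot, Weed) matches the article.
    """
    background = ~(carrot_mask | weed_mask)
    return np.stack([background, carrot_mask, weed_mask], axis=-1).astype(np.uint8)
```

Each pixel then belongs to exactly one channel, which is what the pixel-wise classification model predicts.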

 

3 channel mask


 

Original RGB Image


 

 

Binary Masks

 

Background Binary Mask


 

Carrot

 

Carrot Binary Mask


 

Weeds

 

Weed Binary Mask


 

 

Original RGB


 

Background-Red, Carrot-Green, Weed-Blue


 

Problem

The next step in the pipeline was to experiment with different existing open-source segmentation models, including U-Net, PSPNet, and Bonnet, as well as custom architectures. For this, we split into multiple teams and compared results.

It became imperative that we all work with exactly the same datasets, and run tests on the same validation images, so that the results of the different models could be compared directly. One implication was that each team had to spend time performing many of the same preprocessing steps before reaching the segmentation part of the pipeline, and many of those steps might need to be performed within the training loop.

 

Hub by Activeloop to the rescue

Activeloop was a technical partner on this project, and we made use of their platform to overcome this hurdle.

We performed the preprocessing half of the model pipeline, which resulted in a 15-channel input image and a 3-channel output image, and uploaded the result to Activeloop. This meant that the image preprocessing could be performed once, centrally, and shared.

For the modeling, everyone could call up the same image dataset without the need to worry about any of the preprocessing or image loading steps that are typical in computer vision problems as the image data is streamed directly to the model as arrays/tensors. 

Let’s dive in. 

Install hub by Activeloop — 

pip3 install hub

Register for an account on Activeloop or on the command line using — 

hub register

To log in with a registered account, on the command line — 

hub login

Each of the masks was saved to disk with the same file names but in different folders.

In my_schema, we define the input type as “image”, and the target arrays as the output. We also define the dimensions of both the input and target. 

We define a load_transform function that loads the different images and masks, combines them into the input and target formats, and uploads the array to Activeloop. 

Next, we pass in a list representing the image names, and for each image name in the list, we run load_transform. The tag string takes our hub username and the name we wish to store the repository as. 

Computing the transformation: 100%|██████████| 19.4k/19.4k [18:09<00:00, 17.8 items/s]

We can now load the image dataset simply with — 

tag = "username/repo_name"
ds = hub.load(tag)

 

Another advantage of this method was that we could quickly set up the pipeline in Colab notebooks; each team could make a copy of the same notebook and get straight to experimenting without spending any time on environment setup.

Here’s a link to the initial Colab notebook shared to begin experimenting with different architectures. 

Activeloop offers a nice visualization dashboard to view both the input channels, as well as the target images. 

 

Dashboard display for input images


 

Toggle to view target

Dashboard display for Target


 

 

References

Milioto, A., Lottes, P., & Stachniss, C. (2018). Real-time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs. https://arxiv.org/pdf/1709.06764.pdf

Mishra, A. (2020). Faster Machine Learning Using Hub by Activeloop. https://towardsdatascience.com/faster-machine-learning-using-hub-by-activeloop-4ffb3420c005

 

 
