By Laura Clark Murray, Joanne Burke, and Rishika Rupam
A team of AI experts and data scientists from 12 countries on 4 continents worked collaboratively with the World Resources Institute (WRI) to support efforts to resolve land conflicts and prevent land degradation.
The Problem: Land conflicts get in the way of land restoration
Among its many initiatives, WRI, a global research organization, is leading the way on land restoration — restoring land that has lost its natural productivity and is considered degraded. According to WRI, land degradation reduces the productivity of land, threatening the economy and people’s livelihoods. This can lead to reduced availability of food, water, and energy, and contribute to climate change.
Restoration can return vitality to the land, making it safe for humans, wildlife, and plant communities. While significant restoration efforts are underway around the world, local conflicts get in the way. According to John Brandt of WRI, “Land conflict, especially conflict over land tenure, is a really large barrier to the work that we do around implementing a sustainable land use agenda. Without having clear tenure or ownership of land, long-term solutions, such as forest and landscape restoration, often are not economically viable.”
And though governments have instituted policies to deal with land conflicts, knowing where conflicts are underway and how each might be addressed is not a simple task. Says Brandt, “Getting data on where these land conflicts, land degradation, and land grabs occur is often very difficult because they tend to happen in remote areas with very strong language barriers and strong barriers around scale. Events occur in a very distributed manner.” WRI turned to Omdena to use AI and natural language processing techniques to tackle this problem.
The Project Goal: Identify news articles about land conflicts and match them to relevant government policies
“We’re very excited that the results from this partnership were very accurate and very useful to us.
We’re currently scaling up the results to develop sub-national indices of environmental conflict for both Brazil and Indonesia, as well as validating the results in India with data collected in the field by our partner organizations. This data can help supply chain professionals mitigate risk in regards to product-sourcing. The data can also help policymakers who are engaged in active management to think about what works and where those things work.” — John Brandt, World Resources Institute.
The Use Case: Land Conflicts in India
In India, the government has committed to restoring 26 million hectares of land by 2030. India is home to 1.35 billion people and has 28 states, 22 languages, and more than 1,000 dialects. In a land as vast and varied as India, gathering and collating information about land conflicts is a monumental task.
The team looked to news stories, using a collection of 65,000 articles from India for the years 2017–2018, extracted by WRI from GDELT, the Global Database of Events, Language, and Tone Project.
Identifying news articles about land conflicts
Land conflicts around land ownership include those between the government and the public, as well as personal conflicts between landowners. Other types of conflicts include those between humans and animals, such as humans invading habitats of tigers, leopards, or elephants, and environmental conflicts, such as floods, droughts, and cyclones.
The team used natural language processing (NLP) techniques to classify each news article in the 65,000-article collection as pertaining to land conflict or not. While this problem can be tackled without any automation, it would take people years to read and assess every article, whereas the right machine learning or deep learning model can do it in seconds.
A subset of 1,600 newspaper articles from the collection was hand-labeled as “positive” or “negative” to serve as examples of proper classification. For example, an article about a tiger attack would be hand-labeled as “positive”, while an article about local elections would be labeled as “negative”.
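To illustrate how hand-labeled examples can drive a classifier, here is a minimal sketch of a bag-of-words Naive Bayes model with add-one smoothing. The article does not say which model the team used, so Naive Bayes is a stand-in chosen for brevity, and the four training sentences and all function names are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(labeled_articles):
    """Count documents and words per label from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_articles:
        doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return doc_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented stand-ins for the hand-labeled subset described above.
TRAIN = [
    ("Tiger attacks villager near forest reserve", "positive"),
    ("Farmers protest land acquisition for highway", "positive"),
    ("Local elections see record voter turnout", "negative"),
    ("Cricket team wins the final match", "negative"),
]

model = train_naive_bayes(TRAIN)
label = predict(model, "Villagers protest after tiger attack near forest")
print(label)
```

With only four training sentences this is a toy. The point is the workflow: label a subset by hand, fit a model on it, then apply the model to the unlabeled articles.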
To prepare the remaining 63,400 articles for an AI pipeline, each article was pre-processed to remove stop words, such as “the” and “in”, and to lemmatize words, returning them to their root form. Coreference resolution was also applied during pre-processing to increase accuracy. A topic modeling approach was used to further categorize the “positive” articles by the type of conflict, such as Land, Forest, Wildlife, Drought, Farming, Mining, and Water. With refinement, the classification model achieved an accuracy of 97%.
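The stop-word removal and lemmatization steps can be sketched as follows. This is a simplified illustration, not the team's pipeline: the stop-word set and the lemma lookup table are tiny invented stand-ins, and a real project would more likely rely on an NLP library's stop-word list and lemmatizer.

```python
import re

# Assumption: a tiny illustrative stop-word list, not a complete one.
STOP_WORDS = {"the", "in", "a", "an", "of", "and", "to", "were", "was", "by", "on"}

# Assumption: a hand-written lookup table standing in for a real lemmatizer.
LEMMAS = {
    "villagers": "villager",
    "forests": "forest",
    "attacked": "attack",
    "displaced": "displace",
    "floods": "flood",
}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and map words to root forms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(token, token) for token in tokens if token not in STOP_WORDS]


tokens = preprocess("The villagers were displaced by floods in the forests")
print(tokens)
```

Reducing each article to root-form content words means that "floods" and "flood" count as the same feature, which gives the downstream classifier and topic model less noise to work through.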
With the subset of land conflict articles successfully identified, NLP models were built to identify four key components within each article: actors, quantities, events, and locations. To train the model, the team hand-labeled 147 articles with these components. Using an approach called Named Entity Recognition, the model processed the database of “positive” articles to flag these four components.
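To make the four components concrete, the sketch below pulls actors, quantities, events, and locations out of a sentence using keyword lists and a regular expression. This is a rule-based stand-in, not Named Entity Recognition as the team trained it on hand-labeled articles; the word lists, the quantity pattern, and the sample sentence are all invented for illustration.

```python
import re

# Assumption: small illustrative word lists; a trained NER model would
# learn these categories from labeled text instead of fixed lists.
ACTORS = {"farmers", "villagers", "government", "forest department"}
EVENTS = {"protest", "eviction", "flood", "attack"}
LOCATIONS = {"Odisha", "Assam", "Maharashtra"}

# A number followed by a unit, e.g. "1,200 hectares" or "40 families".
QUANTITY_RE = re.compile(
    r"\b\d[\d,]*(?:\.\d+)?\s+(?:hectares|acres|people|families)\b"
)


def find_terms(terms, text):
    """Return the terms that occur in the text as whole words, sorted."""
    return sorted(
        term for term in terms
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )


def extract_components(text):
    """Flag the four components in one article's text."""
    lowered = text.lower()
    return {
        "actors": find_terms(ACTORS, lowered),
        "quantities": QUANTITY_RE.findall(text),
        "events": find_terms(EVENTS, lowered),
        "locations": find_terms(LOCATIONS, text),  # case-sensitive names
    }


result = extract_components(
    "Farmers in Odisha staged a protest after the government "
    "acquired 1,200 hectares."
)
print(result)
```

Fixed lists like these miss anything they do not already contain, which is the main reason to train a model on labeled examples instead.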
Matching land conflict articles to government policies
Numerous government policies exist to deal with land conflicts in India. The Policy Database was composed of 19 policy documents relevant to land conflicts in India, including policies such as the “Land Acquisition Act of 2013”, the “Indian Forest Act of 1927”, and the “Protection of Plant Varieties and Farmers’ Rights Act of 2001”.
A text similarity model was built to compare two text documents and determine how close they are in terms of context or meaning. The model made use of the “Cosine similarity” metric to measure the similarity of two documents irrespective of their size.
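As a rough illustration of the metric, the sketch below computes cosine similarity over simple word-count vectors and picks the closest policy for an article. The article does not describe how the team represented documents, so raw word counts are an assumption, and the policy names and summaries here are invented placeholders rather than real policy text.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Represent a document as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def best_policy(article, policies):
    """Return the name of the most similar policy and all scores."""
    article_vec = term_counts(article)
    scores = {
        name: cosine_similarity(article_vec, term_counts(text))
        for name, text in policies.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical policy summaries, invented for this example.
POLICIES = {
    "forest policy": "forest land conservation and protection of forest produce",
    "acquisition policy": "compensation for land acquisition and rehabilitation of displaced families",
}

article = "families displaced by land acquisition demand compensation"
name, scores = best_policy(article, POLICIES)
print(name)
```

Because the score depends on the angle between the vectors and not on their length, a short news article and a long policy document can still score highly if they use words in similar proportions.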
The Omdena team built a visual dashboard to display the land conflict events and the matching government policies. In this example, the tool displays geo-located land conflict events across five regions of India in 2017 and 2018.
Underlying this dashboard are the NLP models that classify news articles related to land conflict and land degradation and match them to the appropriate government policy.
The results of this pilot project have been used by the World Resources Institute to inform their next stage of development.
Want to watch the full demo day?
Check out the entire recording (including a live demonstration of the tool).