4 Steps of Using Latent Dirichlet Allocation for Topic Modeling in NLP
April 12th, 2021

Topic modeling is a technique you have probably heard of many times if you are into Natural Language Processing (NLP). It is commonly used for document clustering, not only in text analysis but also in applications such as search and recommendation en

Uncovering Infrastructural Needs Using Topic Modelling and NLP
April 6th, 2021

In the Omdena-ACET challenge, we turned to online information sources, such as social media, newspapers, scientific articles, and websites of institutions involved with infrastructure. Each source provides an abundance of information that’s not possible to anal

NLP Pipeline: Understanding Land Ownership in Kenya through Network Analysis
March 17th, 2021

An end-to-end NLP pipeline, from collecting and preparing more than 32,000 notices, legal entities, and court documents to building a web-based dashboard displaying land ownership in Kenya. The purpose of this project is to boost Kenya’s efforts to restore degraded land i

NLP Data Preparation: From Regex to Word Cloud Packages and Data Visualization
March 11th, 2021

“Regex” and “Word Clouds” for Natural Language Processing (NLP) data preparation? Yes! Regex, short for “regular expression”, is not an outdated technique for finding and extracting text data. It is still one of the basic techniques used in scrapin

How to Webscrape 700,000 PDFs for Natural Language Processing in 14 Hours to Help the Planet
March 6th, 2021

To identify financial incentives for forest and landscape restoration in LATAM, we needed to webscrape policy documents from government pages, combining more than 1,300 keywords for each of the 108 targeted states. We did this 61 times faster (than using a lap

NLP Analysis and Feature Engineering: Untangling the Data To Help NGOs Get Funding
March 4th, 2021

A Natural Language Processing (NLP) analysis pipeline walkthrough for feature extraction, scraping Twitter, Google, and 1,200 PDF files through automated APIs. The overall approach allowed us to gather data that visualizes several billion dollars of not-for-profit grant
