At this demo day, we talked about the inevitable data challenges and roadblocks that come up in real-world AI projects. The insights shared came from our experiences with more than 20 AI projects, working with partners including the UN Refugee Agency (UNHCR), the World Resources Institute, the World Energy Council, and numerous NGOs and corporations.
Omdena is a collaborative platform to build innovative, ethical, and efficient AI solutions to real-world problems. Since our founding in May 2019, over 1250 AI experts from more than 80 countries have come together on Omdena projects to address significant issues related to hunger, sexual harassment, land conflicts, gang violence, wildfire prevention, and energy poverty.
We’ve seen that our approach to AI development, bottom-up collaboration among diverse team members, fosters the innovation and creativity that break down data roadblocks. Innovation is inherent in the Omdena process.
We shared three Omdena projects to act as case studies for these innovative approaches to tackling data challenges.
Data Roadblock 1: Incomplete Data Sets
In the real world, datasets are rarely complete. We find that with large teams of dozens of people, data gathering, cleaning, and wrangling happen at remarkable speed. And by taking a bottom-up approach, we have multiple sub-teams looking at data problems from different angles, which allows innovative approaches to be explored.
In the following case study, the Omdena team worked out ways to identify safe routes in a city in the aftermath of an earthquake, where the relevant data sets were inconsistent and unreliable.
Case Study: Disaster Response: Improving the Aftermath Management of an Earthquake
In collaboration with Istanbul’s Impact Hub innovation center, Omdena data scientists combined satellite imagery of Istanbul with street map data in order to build a tool that facilitates family reunification by indicating the shortest and safest route between two points after an earthquake.
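One way to picture a "shortest and safest" route is as a shortest-path search over a street graph in which each edge cost combines distance with a risk penalty. The sketch below is illustrative only: this write-up does not describe the team's actual method, and the toy graph, the per-street `risk` scores (imagined here as coming from satellite-image analysis), and the `risk_weight` parameter are all hypothetical.

```python
import heapq

def safest_route(graph, start, goal, risk_weight=5.0):
    """Dijkstra's algorithm with edge cost = length * (1 + risk_weight * risk).

    graph: {node: [(neighbor, length_m, risk), ...]}, risk in [0, 1] (assumed scale).
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    best = {start: 0.0}      # lowest known cost to reach each node
    previous = {}            # back-pointers used to rebuild the path
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            path = [goal]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, length_m, risk in graph.get(node, []):
            new_cost = cost + length_m * (1.0 + risk_weight * risk)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf"), []

# Toy street graph with made-up lengths (meters) and risk scores.
toy_graph = {
    "A": [("B", 100, 0.9), ("C", 150, 0.0)],
    "B": [("D", 100, 0.0)],
    "C": [("D", 150, 0.1)],
    "D": [],
}
cost, path = safest_route(toy_graph, "A", "D")
```

With the default penalty, the search avoids the short but high-risk street A to B and returns the longer route through C. Setting `risk_weight=0` reduces the search to a plain shortest-distance route, which makes the trade-off between distance and safety easy to inspect.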
“Omdena’s approach to AI development is by far the best that I have seen in 2019.” - Semih Boyaci, Co-Founder, Impact Hub Istanbul
You can learn more about this project here:
Data Roadblock 2: No Data
We don’t see the lack of data as a showstopper. On projects without data, the team starts by asking: What do we need to know to address the problem? Where might that data live? If it doesn’t exist, how can we create it from something that does exist? Here the diversity of the team members is very powerful.
We’ve seen time and again the impact of bringing together people with vastly different professional and life experiences. Our teams are typically 30% or more female. On any project, we’ll have on average 14 countries represented. Our collaborators range in age from 17 to 65. Not only does this diversity lead to ethical and trusted solutions, but it also fosters creativity and alternative ideas about what data is relevant and where to find it.
In the following project, we looked at how to assess post-traumatic stress disorder among people who have experienced trauma in low-resource environments. In this case, the team started with no data in hand.
Case Study: Building a Chatbot for Post-Traumatic Stress Disorder (PTSD) Assessment
32 Omdena collaborators developed a machine learning-driven chatbot for PTSD assessment in war and refugee zones.
Through the collaborative efforts of the project community, the team identified and annotated suitable patient data. The teams applied linear classifiers for Natural Language Processing (NLP) for PTSD risk assessment and transfer learning for data augmentation.
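To make "linear classifiers for NLP" concrete, here is a minimal sketch of one of the simplest linear classifiers, a perceptron over bag-of-words features. It is a stand-in, not the project's model: the write-up does not say which classifier or features the team used, and the example phrases and labels below are invented for illustration and have no clinical meaning.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()

def train_perceptron(examples, epochs=20):
    """Train a bag-of-words perceptron.

    examples: list of (text, label) pairs with label +1 or -1.
    Returns (weights, bias); the score is bias + sum of token weights.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for text, label in examples:
            tokens = tokenize(text)
            score = bias + sum(weights[t] for t in tokens)
            if label * score <= 0:       # misclassified (or on the boundary)
                for t in tokens:
                    weights[t] += label  # nudge weights toward the label
                bias += label
                mistakes += 1
        if mistakes == 0:                # every example classified correctly
            break
    return weights, bias

def predict(weights, bias, text):
    score = bias + sum(weights.get(t, 0.0) for t in tokenize(text))
    return 1 if score > 0 else -1

# Invented toy phrases: +1 = flag for follow-up, -1 = no flag.
toy_examples = [
    ("nightmares and flashbacks every night", 1),
    ("slept well and felt calm", -1),
    ("constant flashbacks and panic", 1),
    ("calm day with friends", -1),
]
weights, bias = train_perceptron(toy_examples)
flagged = predict(weights, bias, "flashbacks and nightmares again")
not_flagged = predict(weights, bias, "felt calm today")
```

A real system would need far more data, proper evaluation, and clinical oversight. The point here is only that a linear model scores text as a weighted sum of its features, which keeps the decision easy to inspect.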
You can learn more about this project here:
Data Roadblock 3: Disparate Data Sources
Relevant data doesn’t typically come packaged in just one form. We often need to meld disparate data sources to get at a solution. Through collaboration, sub-teams focused on separate data and AI techniques come together to integrate those efforts to derive insights about the problem.
In the following project, the goal was to uncover domestic violence in India hidden due to COVID lockdowns. Among the many challenges the team addressed was the integration of data culled from disparate sources.
Case Study: Analyzing Domestic Violence through Natural Language Processing
This project was done with the award-winning Red Dot Foundation. Within Omdena’s collaborative platform, the team set out to craft a dataset that reveals domestic violence and online harassment patterns in India during COVID-19 lockdowns. The AI experts scraped data from news articles as well as social media and applied various natural language processing (NLP) techniques, such as topic modeling, document annotation, and stacked machine learning models.
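Before techniques like topic modeling can run across news articles and social media posts together, the records usually need to share one schema. The sketch below shows one possible way to do that: normalize each source into a common record type, drop near-verbatim duplicates, and sort by date. The field names (`headline`, `body`, `published`, `content`, `created_at`) and the sample records are hypothetical placeholders, not the project's actual data format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    source: str  # "news" or "social"
    text: str
    date: str    # ISO date, YYYY-MM-DD

def from_news(article):
    # Hypothetical news record: {"headline", "body", "published"}
    text = f'{article["headline"]}. {article["body"]}'.strip()
    return Document("news", text, article["published"][:10])

def from_social(post):
    # Hypothetical social media record: {"content", "created_at"}
    return Document("social", post["content"].strip(), post["created_at"][:10])

def merge_sources(news, social):
    """Map both sources to Document, drop duplicate texts, sort by date."""
    seen = set()
    merged = []
    for doc in [from_news(a) for a in news] + [from_social(p) for p in social]:
        # Compare texts case-insensitively with whitespace collapsed.
        key = " ".join(doc.text.lower().split())
        if key and key not in seen:
            seen.add(key)
            merged.append(doc)
    return sorted(merged, key=lambda d: d.date)

# Placeholder records for illustration only.
news = [{"headline": "Example headline", "body": "Example body.",
         "published": "2020-04-10T08:00:00"}]
social = [
    {"content": "  Example post text ", "created_at": "2020-04-02 12:30"},
    {"content": "example post   text", "created_at": "2020-04-05 09:00"},
]
corpus = merge_sources(news, social)
```

Keeping the `source` field on every record lets later analysis compare patterns across channels instead of flattening them into one undifferentiated corpus.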
You can learn more about this and related projects here:
More about Omdena
Omdena is the collaborative platform to build innovative, ethical, and efficient AI and Data Science solutions to real-world problems.