
Detecting Hate Speech in Tamil Language Using Natural Language Processing

Challenge Completed!



Background

Hate speech is a pervasive problem, particularly in linguistically diverse societies. In Sri Lanka, Tamil-language hate speech appears across several domains, including religion, gender, community, and politics. Addressing it is critical to fostering social harmony and limiting the harm caused by divisive rhetoric. To that end, Omdena partnered with DreamSpace Academy (DSA), with support from the NYU Center on International Cooperation and the Netherlands Ministry of Foreign Affairs, to build a hate speech detection system for Tamil.

Objective

The primary goal of this initiative was to develop an AI-based solution capable of accurately detecting and classifying hate speech in the Tamil language. By building a robust model, the project aimed to provide actionable insights for identifying hate speech and mitigating its impact. The solution also sought to create a framework that could potentially be adapted for other languages, including Sinhala.

Approach

A global team of 50 AI changemakers collaborated over two months on the problem. Their work covered:

  • Data Collection and Preprocessing: Aggregating and analyzing a diverse dataset of Tamil text, encompassing various types of hate speech.
  • Natural Language Processing (NLP): Leveraging advanced language modeling techniques to power an API that can classify Tamil sentences as either hate speech or non-hate speech.
  • Classification System: Developing a categorization framework for different hate speech types, such as religion-based, gender-based, community-based, and political hate speech.
  • Visualization Tools: Creating AI-powered graphs and lexicon reports to provide insights into patterns of hate speech usage.
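The project page does not say which model architecture powers the API, so the following is only a minimal sketch of the binary classification step, assuming a character n-gram Naive Bayes baseline rather than the team's actual language model. The class name `NaiveBayesTextClassifier`, the helper `char_ngrams`, the labels, and the four toy sentences are all hypothetical and exist only for illustration. Character n-grams are used here because they need no Tamil-specific tokenizer.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams; works on any Unicode script."""
    padded = f" {text.strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over character n-grams (illustrative baseline)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n              # n-gram length
        self.alpha = alpha      # Laplace smoothing constant
        self.ngram_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.label_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.ngram_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(doc_count / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log(
                    (counts[gram] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data (assumption): mild insults stand in for the "hate" class.
train_texts = [
    "நீ மோசமானவன்",
    "அவர்கள் மோசமானவர்கள்",
    "இன்று நல்ல நாள்",
    "உணவு நன்றாக இருந்தது",
]
train_labels = ["hate", "hate", "non-hate", "non-hate"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("அவன் மோசமானவன்"))
print(model.predict("நல்ல நாள்"))
```

A real system would need a far larger labelled dataset and a held-out evaluation set. Extending the same class to the four hate speech categories would only require passing category names as labels.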

Results and Impact

The project successfully delivered impactful solutions:

  • An AI-enabled system for detecting and categorizing hate speech in Tamil.
  • A classification framework that identifies hate speech based on religion, gender, community, and politics.
  • Visual tools, such as graphs and statistical reports, to aid in understanding hate speech dynamics.
  • The ability to retrain the model, enhancing its adaptability for other languages, such as Sinhala.

These advancements provide stakeholders with actionable insights to combat hate speech, promoting inclusivity and reducing societal tensions.
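The format of the project's lexicon reports is not described, so the following is a hedged sketch of one simple form such a report could take: the most frequent tokens per hate speech category. The function `lexicon_report`, the whitespace tokenization, and the placeholder tokens (`சொல்1`, `சொல்2`, and so on) are assumptions made for illustration, not the project's data or method.

```python
from collections import Counter, defaultdict


def lexicon_report(labelled_texts, top_k=3):
    """Map each category to its top_k most frequent whitespace-separated tokens."""
    counts = defaultdict(Counter)
    for category, text in labelled_texts:
        counts[category].update(text.split())
    return {
        category: counter.most_common(top_k)
        for category, counter in counts.items()
    }


# Placeholder tokens (assumption) stand in for real flagged vocabulary.
sample = [
    ("religion", "சொல்1 சொல்2 சொல்1"),
    ("gender", "சொல்3 சொல்3 சொல்4"),
    ("religion", "சொல்1 சொல்5"),
]

for category, top_terms in lexicon_report(sample, top_k=2).items():
    print(category, top_terms)
```

Counts like these can feed directly into bar charts per category, which is one plausible route to the graphs mentioned above.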

Future Implications

The findings of this project have far-reaching potential:

  • Policy Development: Informing policies aimed at curbing hate speech and fostering societal cohesion.
  • Educational Interventions: Incorporating AI tools in educational campaigns to raise awareness about the consequences of hate speech.
  • Further Research: Expanding the model’s capabilities to other languages and exploring cross-cultural applications of hate speech detection tools.

This pioneering initiative is a significant step toward leveraging AI for social good, providing a scalable and adaptable framework for addressing hate speech worldwide.

This challenge was hosted with our friends at the NYU Center on International Cooperation and DreamSpace Academy.

