
Building Open Source NLP Libraries and Tools for the Arabic Language

Project completed!



In this special two-month Omdena Challenge, a first-of-its-kind community-driven project, 50 AI changemakers build open-source Arabic NLP libraries. The resulting tools help overcome current adoption challenges and make Arabic NLP applications more accessible.

The Problem

Arabic is one of the most widespread and widely used languages. Known as the "language of the Ḍād", it is distinguished by its rich stock of words and forms, and it is phonetically distinctive, containing all the sounds found in the other Semitic languages. It is also flexible: it accommodates derived and synonymous terms alike and has a fitting expression for every occasion.

We recognize the importance of the Arabic language and its standing among the peoples of the Middle East and the world, and we seek to include Arabic among the languages that can be readily used in artificial intelligence and natural language processing applications.

  • Arabic is the 5th most spoken language in the world and the first language of the countries of the Arab world, making it extremely important worldwide.
  • Arabic is grammatically complex and has relatively free word order, both of which pose significant challenges for Arabic NLP applications.
  • Arabic has three main varieties: Classical Arabic, Modern Standard Arabic, and Dialectal Arabic.
  • Tools built by big tech and accessible to most of the world are limited to translating only a few of the most popular languages.

The project outcomes

The envisioned deliverables can be broken down into two main areas: 

  • Build open-source Arabic NLP libraries for sentiment analysis, morphological modeling, dialect identification, and named entity recognition
  • Build 5 to 8 core functions to support Arabic NLP (lemmatization, stop-word removal, text tokenization, word embeddings, part-of-speech tagging, etc.), similar to NLTK but for Modern Standard Arabic.
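
To make the second deliverable more concrete, here is a minimal sketch of what two of the listed core functions (tokenization and stop-word removal) could look like. It is not the project's actual code or API: the function names, the tiny stop-word list, and the choice to strip diacritics before tokenizing are all assumptions made for illustration, and it uses only the Python standard library.

```python
import re

# Assumption: treat the short-vowel marks (tashkeel, U+064B..U+0652) and the
# tatweel elongation character (U+0640) as noise to strip before tokenizing.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# A token is a run of letters from the basic Arabic block (U+0621..U+064A).
TOKEN = re.compile(r"[\u0621-\u064A]+")

# Hypothetical, deliberately tiny stop-word list; a real library would
# ship a much larger, curated one.
STOP_WORDS = {"في", "من", "على", "عن", "و"}


def normalize(text):
    """Remove diacritics and tatweel so that word forms compare equal."""
    return DIACRITICS.sub("", text)


def tokenize(text):
    """Split text into Arabic word tokens, dropping punctuation and digits."""
    return TOKEN.findall(normalize(text))


def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    sample = "الكتاب على الطاولة"
    print(remove_stop_words(tokenize(sample)))
```

Stripping diacritics first is a common simplification for Modern Standard Arabic text, but it discards information that tasks such as morphological modeling may need, so a real library would likely make it optional.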

A Community-Driven Initiative: Omdena Country Chapter Leads

This project is facilitated by Omdena's Country Chapter Leads in the following Arabic-speaking countries. We welcome partnerships to spread this initiative to as many countries and communities as possible.

Mohamed Amr, Egypt Chapter Lead

Rasha Salim, Iraq Chapter Lead

Naoufal Rahali, Morocco Chapter Lead

First Omdena Project?

Join the Omdena community to make a real-world impact and develop your career

Build a global network and get mentoring support

Earn money through paid gigs and access many more opportunities



Your benefits

Address a significant real-world problem with your skills

Get hired at top companies by building your Omdena project portfolio (via certificates, references, etc.)

Access paid projects, speaking gigs, and writing opportunities



Requirements

Good English & Excellent Arabic Language Skills

A very good grasp of computer science and/or mathematics

Student, (aspiring) data scientist, (senior) ML engineer, data engineer, or domain expert (no need for AI expertise)

Programming experience with C/C++, C#, Java, Python, JavaScript, or similar

Understanding of data analysis, data collection, and/or natural language processing






