Building Open Source NLP Libraries and Tools for the Arabic Language
This special two-month Omdena Challenge is a first-of-its-kind, community-driven project in which 50 AI changemakers build open-source Arabic NLP libraries. The resulting tools aim to overcome current adoption challenges and make Arabic NLP applications more accessible.
The Problem
Arabic is one of the most widespread and widely used languages. Known as the "language of the Ḍād" (لغة الضاد), it is distinguished by its rich stock of words and forms. It is also phonetically distinctive, since it contains all the sounds found in the other Semitic languages. It is flexible as well, accommodating derived and synonymous expressions and offering a fitting expression for every occasion.
We recognize the importance of the Arabic language and its standing among the peoples of the Middle East and the world, and we aim to make Arabic one of the languages that can be used easily in artificial intelligence and natural language processing applications.
- Arabic is the 5th most spoken language in the world and the first language of the countries of the Arab world, making it extremely important worldwide.
- Arabic is grammatically complex and has relatively free word order, both of which pose significant challenges for Arabic NLP applications.
- Arabic has three main varieties: Classical Arabic, Modern Standard Arabic, and Dialectal Arabic.
- Tools built by big tech and accessible to the majority of the world are limited to translating only a few of the most popular languages.
The project outcomes
The envisioned deliverables can be broken down into two main areas:
- Build open-source Arabic NLP libraries for sentiment analysis, morphological modeling, dialect identification, and named entity recognition
- Build 5 to 8 core functions to support Arabic NLP (lemmatization, stop-word removal, tokenization, word embeddings, part-of-speech tagging, etc.), similar to NLTK but for Modern Standard Arabic.
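To give a feel for what such core functions might look like, here is a minimal sketch in plain Python covering diacritic removal, tokenization, and stop-word filtering. It is not the project's actual library or API: the function names, the regular expressions, and the tiny stop-word list are assumptions made for illustration only.

```python
import re

# Arabic diacritics (tashkeel, U+064B..U+0652) plus the tatweel/kashida (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# A token is a run of basic Arabic letters (U+0621..U+064A).
# This is a simplification: digits, Latin text, and punctuation are dropped.
TOKEN = re.compile(r"[\u0621-\u064A]+")

# Tiny illustrative stop-word list (hypothetical; a real list would be much larger).
STOP_WORDS = {"في", "من", "على", "عن", "و"}


def normalize(text):
    """Strip diacritics and tatweel so that surface variants compare equal."""
    return DIACRITICS.sub("", text)


def tokenize(text):
    """Split normalized text into runs of Arabic letters."""
    return TOKEN.findall(normalize(text))


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    sentence = "ذهب الولد، من البيت!"
    print(remove_stop_words(tokenize(sentence)))
```

A real library would also need to handle clitics, letter-variant normalization (for example the different forms of alef), and dialectal spelling, none of which this sketch attempts.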
A Community-Driven Initiative: Omdena Country Chapter Leads
This project is facilitated by Omdena's Country Chapter Leads in the following Arab countries. We welcome partnerships to spread this initiative to as many countries and communities as possible.
First Omdena Project?
Join the Omdena community to make a real-world impact and develop your career
Build a global network and get mentoring support
Earn money through paid gigs and access many more opportunities
Your benefits
Address a significant real-world problem with your skills
Get hired at top companies by building your Omdena project portfolio (via certificates, references, etc.)
Access paid projects, speaking gigs, and writing opportunities
Requirements
Good English & Excellent Arabic Language Skills
A very good grasp of computer science and/or mathematics
Student, (aspiring) data scientist, (senior) ML engineer, data engineer, or domain expert (no need for AI expertise)
Programming experience with C/C++, C#, Java, Python, JavaScript, or similar
Understanding of Data Analysis, Data Collection, and/or Natural Language Processing
This project is hosted with our friends at
Application Form
Become an Omdena Collaborator