Local Chapter: Addis Ababa, Ethiopia Chapter
Coordinated by: Ethiopia
Status: Completed
Project Duration: 23 Mar 2023 - 30 Apr 2023
Ethiopia, the oldest independent country in Africa and the only one on the continent with its own alphabet, has a population of almost 120 million people. It is a land of enormous diversity, with more than 80 languages and over 200 dialects. Amharic, or Amharigna, is one of the working languages of the country, along with Oromigna and Tigrigna.
The rest of the world is rapidly adopting Machine Learning and AI to take advantage of the available language data. Countries with low-resource languages, such as Ethiopia, have fallen behind. It's time for them to catch up. Using current language technologies effectively can bring benefits in a variety of ways, such as increasing literacy, preserving legacy languages, enabling large-scale analysis, and improving efficiency. More data is available on the internet today than ever before, but leveraging it to build useful projects remains a challenge.
The current problem with Amharic language processing is that too few resources and tools are available for public use. Most research projects remain on university shelves. This project, the first in a series of NLP-related projects on local languages, aims to build and consolidate capacity in Amharic language processing by leveraging the latest available data.
Week 1
Initiation, platforming and teaming
Week 2
Data collection stage
Week 3
Data collection and processing
Week 4
Processing
Corpus preparation
End-to-end NLP project with a low-resource language (Amharic)
Working on a project