Digitizing the Documents and Books for Accessibility With Machine Translation and NLP
Background
Bhutan, a nation with 18 distinct languages, has struggled to preserve and share its cultural heritage because few of its texts have been digitized or translated. Dzongkha is the only one of these languages with a native literary tradition, so digitizing Bhutanese literature and translating it into English was seen as an urgent step toward making it accessible, preserving it, and keeping it relevant in a globalized world.
Objective
The project aimed to:
- Develop a machine translation model for Dzongkha-to-English translation.
- Digitize Bhutanese documents and store them in the cloud.
- Promote Bhutan’s cultural heritage globally and support the growth of its digital creative industries.
Approach
- Ideathon Launch: Omdena, in collaboration with Druk Holding & Investments, hosted a virtual ideathon to identify impactful technological solutions. Akash Phaniteja Nellutla’s solution won, becoming the foundation for this project.
- Global Collaboration: A team of 50 AI engineers from around the world, including Bhutanese contributors, worked together for eight weeks (August to October 2022).
- Technical Implementation:
  - Developed the machine translation model using Natural Language Processing (NLP) techniques.
  - Ensured that translations used logical sentence structures, so that the output was accurate and preserved the meaning of the source.
  - Deployed a user-friendly Streamlit application for public use.
- Tools and Data Sources:
  - Employed state-of-the-art NLP models.
  - Used Bhutanese texts and documents as training data for the model.
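The source does not describe the model architecture or its preprocessing, so the following is only a minimal sketch of how a sentence-level Dzongkha-to-English pipeline might be wired together. It assumes that text is segmented on the Tibetan-script shad (།), which roughly marks clause and sentence boundaries in Dzongkha. The `translate_sentence` callable and `placeholder_model` are hypothetical stand-ins for whatever trained model the team actually used.

```python
from typing import Callable, List

# Tibetan mark shad (།), assumed here to be the segment delimiter.
SHAD = "\u0f0d"


def split_sentences(text: str) -> List[str]:
    """Split Dzongkha text on the shad, dropping empty segments."""
    segments = []
    for part in text.split(SHAD):
        part = part.strip()
        if part:  # a double shad (།།) yields an empty part; skip it
            segments.append(part)
    return segments


def translate_document(text: str, translate_sentence: Callable[[str], str]) -> str:
    """Translate a document segment by segment and join the results with spaces."""
    return " ".join(translate_sentence(s) for s in split_sentences(text))


def placeholder_model(sentence: str) -> str:
    # Hypothetical stand-in: a real system would call the trained model here.
    return f"[EN translation of {len(sentence)} characters]"
```

Translating one segment at a time keeps each model input short and makes it easy to swap in a different model, since `translate_document` only depends on a function that maps one string to another.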
Results and Impact
- Machine Translation Model: Successfully translated Dzongkha text into English with logical and accurate sentence structures.
- Deployed Application: Built a Streamlit app for easy access to translation capabilities.
- Cultural Impact: The model aids the digitization and preservation of Bhutanese literature, fostering its globalization and integration into modern IT and creative industries.
- Economic Growth: Supports Bhutan’s principles of Gross National Happiness by driving advancements in IT, creative industries, and digital heritage.
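The app's code is not shown in the source. As a rough illustration, a Streamlit front end for a translation model can be as small as the sketch below. The `translate` function is a hypothetical placeholder for the trained model, and the page title and widget labels are assumptions. Streamlit is imported optionally so that the helper can be exercised without the UI installed.

```python
try:
    import streamlit as st  # third-party; install with `pip install streamlit`
except ImportError:
    st = None  # lets the helper below be used without the UI


def translate(text: str) -> str:
    """Hypothetical placeholder for the trained Dzongkha-to-English model."""
    text = text.strip()
    if not text:
        return ""
    return f"[English translation of {len(text)} characters of input]"


def render_app() -> None:
    """Draw a one-page translation form."""
    st.title("Dzongkha to English Translation")
    source = st.text_area("Dzongkha text")
    if st.button("Translate"):
        result = translate(source)
        if result:
            st.write(result)
        else:
            st.warning("Enter some text to translate.")


# Only render the page when Streamlit is available.
if st is not None:
    render_app()
```

Saved as `app.py`, a script like this would typically be started with `streamlit run app.py`.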
Future Implications
This project sets a precedent for using AI-driven solutions to preserve and promote cultural heritage. The model’s success opens avenues for further research into translating other Bhutanese languages and scaling up digitization efforts. It also shows how technology can align with Bhutan’s principles of Gross National Happiness, fostering a sustainable, tech-driven economy while maintaining cultural integrity.