Creating an Automated Redaction Wizard That Utilizes Optical Character Recognition
50 AI engineers collaborated to create an automated redaction wizard that utilizes optical character recognition, natural language processing, and machine learning algorithms.
The problem
Most industries use some amount of redaction, but some use it more than others. The medical field, for example, has requirements under HIPAA to protect personal health information (PHI). When documents are redacted, they can be used or published by a wider audience than originally intended without compromising confidentiality.
Redaction is also commonly used to protect other kinds of sensitive information, including personally identifiable information (PII), such as:
- Social security numbers
- Driver’s license numbers
- Financial details
- Proprietary information or trade secrets
- Addresses, dates of birth, and names
- Certain information on legal or medical documents
Performing this process manually is time-consuming and prone to human error, which makes it inefficient, especially when a large number of documents is involved.
Redactable is an online tool that offers various ways to redact official documents. Search automation, pattern redaction, and manual redaction are some of the options provided by our platform. Our goal is to elevate our users’ experience and improve our document redaction process by harnessing the power of AI and recent advances in the field of Natural Language Processing.
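To make the idea of pattern redaction concrete, here is a minimal sketch in Python. It is not Redactable's implementation: the `PATTERNS` table, the `redact_patterns` function, and the two regular expressions are assumptions chosen for illustration, and real identifiers come in more formats than these patterns cover.

```python
import re

# Hypothetical patterns for illustration only; real-world formats vary widely.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # e.g. 123-45-6789
    "DATE": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),   # e.g. 04/12/1987
}


def redact_patterns(text, patterns=PATTERNS):
    """Replace every match of each pattern with a [LABEL] placeholder."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Patient DOB 04/12/1987, SSN 123-45-6789."
print(redact_patterns(sample))
```

Fixed patterns like these work for rigidly formatted values, but they cannot find names or free-form addresses, which is where a learned model becomes useful.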
The project outcomes
The team built a document redaction pipeline that included collecting, processing, and labeling a custom dataset, training and testing multiple state-of-the-art NLP models, and building a training pipeline that allows the model to improve its performance over time.
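The final step of such a pipeline, turning model predictions into redacted text, can be sketched as follows. This is an assumption about how the output stage might look, not the team's code: the `redact_spans` function and the `(start, end, label)` span format are hypothetical, and the example predictions are written by hand rather than produced by a trained model.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def redact_spans(text: str, spans: List[Span]) -> str:
    """Replace each character span with a [LABEL] placeholder.

    Overlapping spans are merged and keep the label of the earliest span.
    """
    merged: List[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))

    pieces = []
    cursor = 0
    for start, end, label in merged:
        pieces.append(text[cursor:start])  # keep text before the span
        pieces.append(f"[{label}]")        # replace the span itself
        cursor = end
    pieces.append(text[cursor:])           # keep the remaining tail
    return "".join(pieces)


text = "Jane Doe lives at 12 Elm Street."
# Hand-written stand-in for the output of a named-entity model.
address_start = text.index("12 Elm Street")
predicted = [(0, 8, "NAME"), (address_start, address_start + 13, "ADDRESS")]
print(redact_spans(text, predicted))
```

Keeping the replacement step separate from the model means the same function can serve predictions from any of the candidate models, as long as they are converted to character spans first.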