Habit Building Challenge for Soft Skills Analysis using Speech and Text Data with Advanced NLP Techniques

Local Project Hyderabad, India Chapter

Coordinated by the Lead of India, Mohammad Yahiya

Status: Ongoing

Project background.

We live in the knowledge economy, and skills are your calling card. Some estimates state that 50% of jobs will be affected by automation by 2030. Non-routine cognitive skills, often described as “soft skills”, have been increasing in importance since the first industrial revolution. Soft skills have the potential to provide the only long-term competitive advantage in the job market of the future and to open up equal opportunities for culturally diverse remote employees. At the same time, soft skills are the hardest to learn because of their abstract nature and context dependency. Traditional coaching platforms are unscalable by design. Online learning marketplaces and MOOCs provide content that does not stick. Self-improvement apps add to daily distractions and fail to maintain engagement.

Edtech Startup is a tech-enabled skill development tool for companies that aims to help engineers build skills such as self-confidence, self-awareness, empathy, communication, influencing, and taking ownership. Edtech will connect learning tools and knowledge sources by automatically fetching the user’s content, recommendations, and insights into the Edtech learning pipeline. It will then help users act on them in the context of their lives through a customizable mix of tools, e.g., spaced-out notifications, self-reflection, community Q&As, and speed-dating with peers. In short, we provide a single source of truth for soft skills, with a repository of resources, definitions, and recipes for achieving goals.

The problem.

Building on the previous two challenges, this project aims to create an AI system that can identify problem areas in the user’s interpersonal communication skills and recommend resources for improvement. So far we have built two components: a pipeline that ingests user audio data and identifies possible weaknesses in communication skills, and a database of AI-curated tips. The current challenge will focus on integrating the two into a coherent whole. We will build a recommender system using the Google Cloud Platform and work to improve the audio pipeline using insights drawn from our database of tips. We will also work on the security of the system by training a GAN for user identification.
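To make the integration step concrete, here is a minimal sketch of how a weakness reported by the audio pipeline could be matched against the tips database. It is not the project's recommender: the `TIPS` list, the `recommend` function, and the bag-of-words cosine similarity are all assumptions chosen for illustration, and a production system on GCP would likely use learned embeddings instead.

```python
import math
import re
from collections import Counter

# Hypothetical tips; the real project keeps AI-curated tips in its own database.
TIPS = [
    "Pause briefly instead of using filler words such as um and uh.",
    "Ask open questions and listen actively to show empathy.",
    "Slow your speaking pace and breathe to sound more confident.",
]

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(weakness, tips, top_k=1):
    """Return up to top_k tips ranked by word overlap with the weakness text."""
    query = Counter(tokenize(weakness))
    scored = [(cosine(query, Counter(tokenize(tip))), tip) for tip in tips]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop tips that share no words with the weakness description.
    return [tip for score, tip in scored[:top_k] if score > 0]

if __name__ == "__main__":
    print(recommend("too many filler words like um", TIPS))
```

Word overlap is the simplest possible matching signal; it is used here only so the example runs without external libraries.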

ASR model for Habit Building Project:

Edtech Platform Startup is looking to design and develop a communication skills assessment model based on spoken language data. Given a user’s speech/audio data, the ML/DL model should identify the speaker’s stance (positive, negative, or neutral), assess their fluency in the language, and deduce personality traits from the speech. We are also open to additional inputs, such as video and text analysis, in the modeling process, to see whether they yield an effective implementation for the categories set.

ASR model
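As a rough illustration of the assessment step, the sketch below computes two simple fluency indicators and a coarse stance label from a transcript. It assumes an ASR model has already produced the transcript and the recording duration; the word lists, the `analyze_transcript` function, and the lexicon-based stance rule are hypothetical placeholders, not the model the project trains.

```python
import re

# Illustrative word lists (assumptions); a trained model would replace them.
FILLERS = {"um", "uh", "er", "hmm"}
POSITIVE = {"good", "great", "happy", "agree", "love"}
NEGATIVE = {"bad", "terrible", "unhappy", "disagree", "hate"}

def analyze_transcript(transcript, duration_seconds):
    """Return speaking rate, filler-word ratio, and a coarse stance label."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words or duration_seconds <= 0:
        raise ValueError("need a non-empty transcript and a positive duration")

    filler_count = sum(w in FILLERS for w in words)
    positive_hits = sum(w in POSITIVE for w in words)
    negative_hits = sum(w in NEGATIVE for w in words)

    if positive_hits > negative_hits:
        stance = "positive"
    elif negative_hits > positive_hits:
        stance = "negative"
    else:
        stance = "neutral"

    return {
        "words_per_minute": len(words) * 60.0 / duration_seconds,
        "filler_ratio": filler_count / len(words),
        "stance": stance,
    }

if __name__ == "__main__":
    print(analyze_transcript("Um I think this is a great idea uh yes", 30))
```

Speaking rate and filler ratio are only proxies for fluency, and personality traits are out of reach for a rule-based sketch like this one.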

Recipes of Soft Skills model:

Edtech Platform is looking to develop a scraping tool that would 1) aggregate the information about soft-skill-related issues available on the Internet, 2) summarize it, eliminating repetitive content, and 3) extract actionable learning items from the resulting text.

Soft skills recipes problem statement
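Steps 2 and 3 of the scraping tool can be sketched on already-scraped sentences as follows. This is a simplified stand-in under stated assumptions: near-duplicates are dropped with Jaccard similarity rather than a real summarizer, and "actionable" is approximated by a sentence starting with a verb from a small hand-picked list. The names `ACTION_VERBS`, `deduplicate`, and `actionable` are hypothetical.

```python
import re

# Assumed list of imperative verbs that signal an actionable tip.
ACTION_VERBS = {"ask", "listen", "practice", "write", "pause", "try", "use", "avoid"}

def tokens(sentence):
    """Set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def deduplicate(sentences, threshold=0.8):
    """Keep a sentence only if it is not too similar to one already kept."""
    kept = []
    for sentence in sentences:
        current = tokens(sentence)
        if all(jaccard(current, tokens(k)) < threshold for k in kept):
            kept.append(sentence)
    return kept

def actionable(sentences):
    """Select sentences whose first word is a known action verb."""
    selected = []
    for sentence in sentences:
        first = re.findall(r"[a-z']+", sentence.lower())[:1]
        if first and first[0] in ACTION_VERBS:
            selected.append(sentence)
    return selected

if __name__ == "__main__":
    scraped = [
        "Practice active listening every day.",
        "Practice active listening every day!",
        "Empathy is an important soft skill.",
        "Ask for feedback after each meeting.",
    ]
    print(actionable(deduplicate(scraped)))
```

The threshold of 0.8 is an arbitrary starting point and would need tuning on real scraped text.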

Project goals.

  • Speech processing
  • Combining Speech & Text Model
  • Model building
  • Using GANs in Speech Model (Generative AI modeling)
  • Model deployment
  • Recommender systems

Project plan.

  • Week 1

    • Week 1 – 2: Understanding the problem statement; data collection and preparation; exploring pre-trained ASR and text models.

  • Week 2

    • Week 3 – 4: Model training and fine-tuning of pre-trained models; finalizing the model.

  • Week 3

    • Week 5 – 7: Exploring other possibilities such as data mining, text analysis, recommender systems, and generative AI.

  • Week 4

    • Model Deployment on GCP/Streamlit
