
Building an AI Chatbot for Interview Preparation using NLP

March 12, 2024



Introduction

In the digital age, AI chatbots are stepping into new roles, including that of job interviewer. Their main objective is to conduct preliminary interviews with candidates, assess their suitability for open positions, and gather the information required for further evaluation. These automated interviewers are gaining traction across industries such as healthcare, retail, and restaurants. For companies looking to streamline their recruiting process and reduce costs, AI screening agents are an attractive option: human resources departments have traditionally been a cost center, and chatbots can ease the burden on recruiters, helping them manage a larger pool of applicants efficiently.

Summary

With the increasing popularity of conversational agents like Siri, Alexa, and Google Assistant, there is growing interest in developing chatbots that can help automate various tasks, including job interviews. Moreover, advancements in natural language processing (NLP) and machine learning have made it easier than ever before to create sophisticated chatbots that can understand human speech and respond intelligently. Companies spend significant resources on recruitment, screening candidates, scheduling interviews, and making hiring decisions. AI-powered interview chatbots help organizations reduce these costs while still ensuring high-quality hires.

Our project delves into the evolving landscape of non-technical jobs, focusing on their demand, interview processes, and anticipated trends across various sectors. Market research drawing on published articles was conducted to select the most in-demand jobs. Relevant questions and answers were then collected from diverse sources, including articles, user posts, and large-scale web scraping with Selenium, to create a structured knowledge base. The collected data underwent thorough cleaning with the Natural Language Toolkit (NLTK), and similar questions were grouped using cosine similarity.

Next, questions were categorized by interview phase using BERT-large and zero-shot classification to ensure their quality and readiness for analysis. ChromaDB is used to store the questions along with the metadata associated with each question, including the target position and interview phase. The NLP framework and tools were selected based on compatibility, scalability, and flexibility. The chatbot performs a semantic search to find questions; if no suitable questions are found, an LLM is used as a fallback to generate them.

A Streamlit mic recorder was used to record speech and convert it into text, empowering the chatbot to understand spoken language. Each question and the candidate's answer to it are stored in a dictionary for evaluation. Candidate answers are evaluated with an LLM chain that assigns a rating and provides qualitative feedback. A user-friendly conversational design was crafted with the Streamlit UI to enhance the user experience. The interview chatbot is deployed on a Streamlit server, and its performance is monitored in real-world scenarios.

Task

Omdena Hyderabad Local Chapter brought together over 60 collaborators from countries around the world. Collaborators were asked to collect information on non-technical jobs, focusing on their demand and collecting required data for HR interviews. After collaborators collected the data, they diligently undertook the crucial step of cleaning and preprocessing, ensuring the information was refined and ready. Collaborators worked on application development, leveraging their diverse skills to create a cohesive and innovative solution.

Problem Statement

People are rejected from job interviews for a variety of reasons, which vary with individual circumstances and the employer’s expectations. Most candidates do not succeed at interviews due to:

  • Lack of confidence: Sometimes candidates have great knowledge or exposure, but they can’t muster up the courage to share that knowledge or can’t embody that experience in their personality or body language. This can be due to nervousness or anxiety, but most times it is due to lack of preparation, which is the foundation of confidence and charisma.
  • Lack of experience in interviews: Failing to explain the fundamentals or basic principles of their domain/niche or not being able to explain their last year’s project or any previous internship experience is a red flag.
  • Lack of communication skills: This means weak spoken English or poor articulation (in any language). Not talking enough (because you feel you should speak English but can’t) or talking too much (because you don’t know how to articulate to the point) signals that a communication gap would arise in the future.
  • Lack of the right attitude: As a candidate, you need three things at the beginning of your career: humility, enthusiasm, and a desire to learn. Aptitude without an empowering attitude is a deal-breaker.
  • Lack of self-awareness: Given our current education system, it’s okay to not have a lot of knowledge or exposure at the beginning of your career. But if you don’t know about your strengths, highest values, interests, and most importantly, why you are a fit for a particular career or role, it’s again a red flag.

For practicing through mock interviews, candidates have very few AI-driven, cost-effective platforms that provide accurate feedback based on their performance during the interview.

During the hiring process, there may be multiple rounds, such as written tests and online exams, to narrow down the pool of job applicants. Once a candidate successfully passes each stage, they must undergo a final face-to-face evaluation to determine whether they are indeed suitable for the position. For this reason, the Omdena Hyderabad Local Chapter decided to build an LLM-based chatbot that conducts mock HR-round interviews for aspiring candidates. The team focused on job roles where the HR round is the crucial point in the interview.

Goals and Objectives

The project seeks to create an interactive chatbot that prepares individuals for interviews. One of the main goals of a chatbot for mock interviews is to improve users’ interview skills through repeated practice and exposure to various types of interview questions. A chatbot can help build candidates’ confidence by allowing them to rehearse their responses in a low-pressure setting before facing actual interviews. Mock interviews with a chatbot can reduce candidates’ anxiety levels associated with job interviews by giving them opportunities to become familiar with the process and format.

Chatbots can analyze users’ responses and provide immediate feedback on areas like answers, grammar, and other aspects, helping candidates refine their interview techniques. Chatbots make interview training available anytime, anywhere, without requiring a physical coach or scheduling constraints, which helps widen its reach. Compared to traditional methods like hiring trainers or attending workshops, chatbots offer a cost-effective solution for improving interview skills, especially at scale.


The main objectives of the project are as follows:

  • Build a platform to help job aspirants attend mock interviews and identify areas of improvement.
  • Explore the possibilities of LLM-based apps in human development.
  • Build skill sets of participants in NLP, LLM-based apps, prompt engineering, and retrieval augmented generation (RAG).

Market Research Analysis

The objective of this task is to identify the top ten job roles that have been in high demand from employers over the past few years:

  • Customer Service
  • Hospitality
  • Sales and Marketing
  • Teaching
  • Documentation Manager
  • Construction Manager
  • Investment Banker
  • Business Analyst
  • Truck Driver
  • Healthcare and Services [1]

The three most suitable job positions were finalized with input from domain experts such as HR professionals and experienced hiring managers, supplemented by internet articles. They are listed below:

  • Customer Service Representative
  • Sales and Marketing
    • Sales Manager
    • Marketing Manager
  • Healthcare and Services
    • Nurse
    • Medical Assistant

Datasets for AI Chatbot for Interview

Data Collection

The data collection phase is a crucial step in the project where relevant information is systematically gathered to address specific objectives. This phase involves the careful and organized acquisition of data from various sources.

With the assistance of domain experts, the interview process was refined to identify the essential data elements to capture during data collection, as well as the common threads and themes across the interview processes for different job positions, so that a generic design could be put in place.

Thousands of questions and answers were gathered directly from websites such as blogs and articles, and collected with Selenium-based web scraping for sites containing large amounts of data. Web scraping with Selenium is a powerful technique for extracting data from websites: it automates the collection process, saving significant time and effort, and lets you interact with websites just like a human user. Chrome was used as the default browser for scraping. The team also obtained multiple answers for individual questions. [2]
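As an illustrative sketch of this kind of scraping: the article does not name the specific sites, so the URL and CSS selectors below are placeholders, and `pair_up` is a hypothetical helper for pages where questions and answers alternate in a single list.

```python
def pair_up(texts):
    """Pair an alternating [question, answer, question, answer, ...] list
    into (question, answer) tuples."""
    return list(zip(texts[0::2], texts[1::2]))

def scrape_qa(url, question_css, answer_css):
    """Collect parallel lists of question and answer elements from one page.
    Requires the `selenium` package and a Chrome driver; imported lazily so
    the pure helper above stays usable without a browser."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # Chrome was the project's default browser
    try:
        driver.get(url)
        questions = [e.text for e in driver.find_elements(By.CSS_SELECTOR, question_css)]
        answers = [e.text for e in driver.find_elements(By.CSS_SELECTOR, answer_css)]
        return list(zip(questions, answers))
    finally:
        driver.quit()

if __name__ == "__main__":
    # Placeholder URL and selectors -- every real site needs its own.
    rows = scrape_qa("https://example.com/interview-questions", ".question", ".answer")
    print(len(rows), "Q&A pairs collected")
```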

Data Preprocessing

Data preprocessing is the process of transforming raw data into an understandable format. Preprocessing of data is mainly to check the data quality. The quality can be checked by the following:

  • Accuracy: To check whether the data entered is correct or not.
  • Completeness: To check whether the data is available.
  • Consistency: To check whether the same data is consistent across all the places where it is kept.
  • Believability: The data should be trustworthy.
  • Interpretability: The understandability of the data.

Collected questions and answers are checked manually for mismatches. Unwanted questions and answers which are not related to the topic are removed. Bad questions were replaced with good and meaningful questions.

Data Preprocessing: Question Grouping

In this project, collaborators took different approaches. One of them used the Natural Language Toolkit (NLTK) for text preprocessing, since it is the main Python library for working with human language data. It provides easy-to-use interfaces to more than 50 corpora and lexical resources such as WordNet, alongside a suite of text-processing libraries for tagging, parsing, classification, stemming, tokenization, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.

NLTK is available for Windows, macOS, and Linux. Best of all, NLTK is a free, open-source, community-driven project. It has some disadvantages as well: it is slow and can struggle to meet the demands of production usage.

The learning curve is somewhat steep. NLTK offers features such as entity extraction, part-of-speech tagging, tokenization, and text classification.

First, unwanted columns that are not required for preprocessing are removed. Text cleaning then removes punctuation marks and converts everything to lowercase, making the text more consistent. Finally, similar questions are grouped using cosine similarity, which measures how close two questions are in context or meaning; a similarity threshold of 0.5 allows for efficient categorization and organization of questions. [3]
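The grouping step can be sketched with a plain bag-of-words cosine similarity and the 0.5 threshold. This self-contained version approximates the project's NLTK preprocessing with a regex tokenizer, and the greedy group-by-representative strategy is an assumption about how the grouping was organized.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Lowercase, strip punctuation, and count word occurrences (bag of words)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_questions(questions, threshold=0.5):
    """Greedily place each question in the first group whose representative
    (first member) it matches above the threshold; otherwise start a new group."""
    groups = []
    for q in questions:
        vq = vectorize(q)
        for group in groups:
            if cosine_similarity(vq, vectorize(group[0])) >= threshold:
                group.append(q)
                break
        else:
            groups.append([q])
    return groups
```

With the 0.5 threshold, near-duplicates like "Tell me about yourself" and "Tell me something about yourself" land in one group while unrelated questions start new ones.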

Data Preprocessing: Categorizing for Interview Phase

One approach used BERT, which stands for Bidirectional Encoder Representations from Transformers, a large language model developed by Google. The goal was a more comprehensive and accurate understanding of the various categories. By leveraging BERT's contextual understanding, the model adeptly captured intricate nuances in language, enabling a more precise categorization of queries. The deployment of BERT-large in this project exemplifies how advanced natural language processing can significantly enhance efficiency and effectiveness. [4]

The other was zero-shot classification, a task in natural language processing where a model trained on a set of labeled examples can classify new examples from previously unseen classes. The team used the Transformers library's zero-shot-classification pipeline to run zero-shot text classification models. [5]
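A minimal sketch of that pipeline, assuming the candidate labels are the interview phases; the model choice (`facebook/bart-large-mnli`) and the `top_phase` helper are illustrative, not confirmed by the article.

```python
# Candidate labels drawn from the interview phases; the exact set per role varies.
PHASES = ["Introduction", "General", "Behavioral", "Situational", "Technical", "Conclusion"]

def top_phase(result):
    """Pick the highest-scoring label from a zero-shot pipeline result dict,
    which has parallel 'labels' and 'scores' lists."""
    best = max(range(len(result["labels"])), key=lambda i: result["scores"][i])
    return result["labels"][best]

def classify_question(question):
    """Label one question with its most likely interview phase.
    Requires the `transformers` package; imported lazily so the pure
    helper above works without it."""
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    return top_phase(classifier(question, candidate_labels=PHASES))
```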

After cleaning the questions, grouping them by similarity, and classifying them, we ended up with around 460 questions for Customer Service Representative, 331 for Sales and Marketing, and 1,262 for Healthcare and Services: 2,053 questions in the combined dataset.

Data Preprocessing: Evaluation Specific

Following the preprocessing of questions and answers, each answer underwent individual assessment to categorize it as either good, average, or poor. The goal was to ensure that every question had multiple answers spanning the range of quality levels. If multiple answers were not available, additional answers were collected. Ratings were assigned to these answers solely to test the prompt to be used when evaluating candidate answers.

Interview Flow

Although teams varied in their approach to the interview flow, all teams shared two common elements: introductory questions at the beginning and summarizing/concluding at the end.

  • Customer Service Representative
    • Introduction
    • General
    • Behavioral
    • Situational
    • Conclusion
  • Sales and Marketing
    • Introduction
    • Behavioral
    • Technical
    • Role Specific
    • Conclusion
  • Healthcare and Services
    • Introduction
    • Behavioral
    • Communication
    • Technical
    • Conclusion

The selection process leading to the interview stage is elaborated upon in the question generation section. [6]

Mock Interview Flow

Data Storage

After standardizing the columns for the different job roles and the interview flow, all of the collected data is combined and stored in a vector database, which is designed to store, manage, and index massive quantities of high-dimensional vector data efficiently. Interest in these databases is growing rapidly because they create additional value for generative artificial intelligence (AI) use cases and applications. We used ChromaDB, an open-source vector database, as the vector store, and open-source HuggingFace embeddings for embedding the text. [7]
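A rough sketch of loading the questions and their metadata into ChromaDB; the collection name, the `build_records` helper, and the use of ChromaDB's default embedding function (the project used HuggingFace embeddings) are assumptions for illustration.

```python
def build_records(questions):
    """Turn (question, position, phase) triples into the parallel
    ids/documents/metadatas lists that collection.add() expects."""
    ids = [f"q{i}" for i in range(len(questions))]
    documents = [q for q, _, _ in questions]
    metadatas = [{"position": pos, "phase": phase} for _, pos, phase in questions]
    return ids, documents, metadatas

def store_questions(questions):
    """Create an in-memory ChromaDB collection and insert the records.
    Requires the `chromadb` package; imported lazily."""
    import chromadb

    client = chromadb.Client()  # in-memory; a persistent client can be used for disk storage
    collection = client.create_collection("interview_questions")
    ids, documents, metadatas = build_records(questions)
    # Documents are embedded automatically by the collection's embedding function.
    collection.add(ids=ids, documents=documents, metadatas=metadatas)
    return collection
```

The metadata stored with each vector is what later lets the chatbot filter the semantic search by position and interview phase.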

Question Generation

When the application loads, the user has to provide their name and target job position as input. At the beginning, the user also provides a summary of their education, experience, and so on. The chatbot generates a list of relevant questions to ask the candidate, using the candidate's position and profile summary as inputs. The application loads the ChromaDB instance into memory, including the question 'collection'.

A collection, in ChromaDB terminology, is simply a data structure that contains the vector embeddings of the questions gathered in the data collection phase, plus the metadata associated with each question, including position and interview phase. The application uses a combination of semantic search and retrieval-augmented generation (RAG) to generate the questions asked of the candidate. [8]

The table above lists, for each job position, the interview phases, their sequence, and the number of questions to generate. Questions are first generated using semantic search. For each interview phase, if semantic search cannot produce all of the required questions, RAG is used to generate additional ones. The "Preset Question 1" and "Preset Question 2" columns are used when neither method (semantic search nor RAG) can generate the questions for a particular interview phase.
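The three-tier fallback can be sketched as a small helper. `search_fn` and `generate_fn` are stand-ins for the ChromaDB semantic-search query and the LLM call, and `presets` stands in for the preset-question columns; all three names are illustrative.

```python
def pick_questions(phase, needed, search_fn, generate_fn, presets):
    """Fill `needed` question slots for one interview phase:
    1. semantic search, 2. LLM (RAG) generation, 3. preset questions.
    Note: presets are consumed from the front of the caller's list."""
    questions = search_fn(phase, needed)              # 1. semantic search
    if len(questions) < needed:                       # 2. RAG / LLM fallback
        questions += generate_fn(phase, needed - len(questions))
    while len(questions) < needed and presets:        # 3. preset questions
        questions.append(presets.pop(0))
    return questions[:needed]
```

For example, if the search returns one question and the LLM another, the remaining slot is filled from the presets.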

Speech to Text

Our HR interview chatbot requires verbal answers, since strong communication skills are vital; text input is therefore discouraged. Key points:

  • Prioritizing verbal interactions in HR chatbot interviews.
  • Encouraging spoken answers to promote better communication.
  • Deprioritizing typed replies within the scope of this project.

In the project, the Streamlit mic recorder records candidate answers, and a speech-to-text function converts the recordings into text and stores them in a dictionary with the questions as keys. Each answer is displayed to the candidate after the question is answered, and the dictionary is used for evaluation after the interview ends.

Evaluation Framework

In the evaluation framework, the team worked on developing an evaluation component to evaluate answers generated by the user. There were several tools available for the evaluation framework:

  • Huggingface
  • Mistral 7b model
  • Tokenizer
  • Bitsandbytes 4bit quantization
  • Pipeline
  • Langchain
  • Agents
  • Chains
  • LLM chain
  • Simple Sequential Chain
  • Sequential Chain
  • Chat prompt template

After experimenting with different approaches, the team settled on chains from the LangChain library. LangChain serves as a gateway to the dynamic field of large language models, offering a clear structure for transforming raw inputs into refined, human-like responses. The most basic chain in this system is the LLMChain, widely recognized and fundamental. It combines a PromptTemplate, a model (either a large language model or a chat model), and an optional output parser.

It works by taking a user's input and passing it to the first element in the chain, a PromptTemplate, which formats the input into a particular prompt. The formatted prompt is then passed to the next (and final) element in the chain, an LLM.

The main purpose of the evaluation component is to evaluate the answer provided by the candidate and provide a rating for the answer as poor, average, or good, and then provide qualitative feedback.

The component's internal workflow begins with ingesting our collected data into a vector database. Evaluation begins after the interview ends, iterating through the generated questions and the candidate's answers.

The evaluation consists of three main stages. First, retrieval: each question-answer pair is embedded, the vector database is searched for similar questions, answers, and positions, and three reference answers are fetched, one for each rating. Second, augmentation: using few-shot learning, the fetched data is passed alongside the question-answer pair to make the model more contextually aware of the task. Finally, the LLM generates the rating and qualitative feedback using the supplied examples. [9]
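The augmentation stage can be sketched as assembling a few-shot prompt from the retrieved reference answers. The template wording below is illustrative, not the project's exact prompt.

```python
def build_eval_prompt(question, candidate_answer, reference_answers):
    """Fold the retrieved reference answers (one per rating level) into a
    few-shot prompt for the evaluation LLM.

    reference_answers: dict mapping a rating ('poor'/'average'/'good')
    to an example answer fetched from the vector database."""
    examples = "\n".join(
        f"Example of a {rating} answer: {answer}"
        for rating, answer in reference_answers.items()
    )
    return (
        f"Question: {question}\n"
        f"{examples}\n"
        f"Candidate answer: {candidate_answer}\n"
        "Rate the candidate answer as poor, average, or good, "
        "then give qualitative feedback."
    )
```

The resulting string is what the chain would pass to the LLM, grounding its rating in one concrete example per quality level.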


Chatbot UI Design

The User Interface (UI) is the point of human-computer interaction and communication in a device. In our project, we use the Streamlit web user interface. The UI has a sidebar menu and a main page containing three tabs: Q&A, History, and Results.

When the application first loads, the user enters their username and job position in the Candidate Profile sidebar menu and presses Start Mock Interview. The main page (the panel on the right), containing the three tabs, then loads with information about the interview.

In the Q&A tab, the application loads a summary question first, followed by six mock interview questions. After answering each question, the candidate can check their answers in the History tab. Once all questions are answered, the Results tab is triggered to give a summary of the interview.

Application Deployment

We are pleased to announce that we have successfully deployed our newly developed application on the Streamlit cloud server; you can access the application using this link. Our dedicated team members have rigorously tested its functionality and usability from various locations around the globe. After thorough examination, we can confirm that it is performing optimally with no reported issues.

To ensure a seamless user experience, we conducted extensive testing covering different scenarios, devices, and network environments. This robust testing phase allowed us to identify and rectify potential bugs, resulting in an efficient and reliable product ready for deployment.

In summary, we are confident that our application will meet your expectations and deliver outstanding performance. We invite you to explore its features and enjoy the benefits it offers. Should you encounter any difficulties or require assistance, please do not hesitate to contact our support team. [10]

AI Chatbot for Interview application developed on Streamlit

Conclusion

We are thrilled to present our newly developed chatbot for mock interviews. Our team has worked hard to ensure that it provides an engaging and effective interviewing experience for job seekers. With its sophisticated natural language processing capabilities and customizable question sets, our chatbot offers a flexible solution for organizations looking to streamline their hiring process.

We believe that our chatbot can help candidates prepare for real-world interviews by simulating common questions and scenarios. At the same time, it can also assist recruiters by freeing up their time to focus on other important tasks. We are committed to continuously improving our product based on user feedback and emerging technologies.

Limitation

The team was able to cover only a limited number of job positions. All questions are generated at the beginning, using the role and summary as input, and do not adapt to the candidate's answers. Evaluation relies on the ratings assigned to the answers in the collected data. Candidates must describe their background in a summary for question generation.

Future Directions

Our future directions include:

  • Making the application available for multiple job roles.
  • An interview orchestrator that generates new questions based on responses to previous questions, to maintain a consistent interview flow.
  • Evaluating answers with the STAR or similar methods to rate them.
  • Letting candidates upload a resume/CV, instead of typing a summary, and extracting the candidate's profile summary automatically.

Github Repository and Models

Ready to test your skills?

If you’re interested in collaborating, apply to join an Omdena project at: https://www.omdena.com/projects
