Leveraging Machine Learning to Predict Accomplishment Rate of Startups

Status: Completed

Project Duration: 25 Jun 2021 - 25 Aug 2021

Open Source resources available from this project

Project background.

More than 100 million startups are launched per year, which is about 3 startups per second, yet more than 50% of them fail within their first four years. The United States leads the world in the number of startups, with almost three times as many as most other countries combined. Startups are vital for economic growth in every country: they bring new ideas, knowledge, innovation, and ample employment opportunities. Anticipating the success rate of startups helps investment firms give better advice and helps investors identify firms with growth potential before they invest. On average, 9 out of 10 startups fail, for a variety of reasons.

The problem.

The highly volatile nature of startups makes their success rate difficult to estimate, which motivates using machine learning or deep learning to build a predictive model.

The first step is to collect data on startups in Pennsylvania (or the USA in general) from online platforms through web scraping, Google Forms, and questionnaires, and then to analyze the data to select attributes. Once the data has been analyzed and preprocessed, candidate machine learning and deep learning models are identified and implemented to predict startup success.
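The preprocessing and modeling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the records, the field names (`funding_usd`, `team_size`, `sector`, `success`), and the choice of logistic regression are all hypothetical. It is written in plain Python so that it runs without dependencies.

```python
import math

# Hypothetical, hand-made records for illustration only; not project data.
RECORDS = [
    {"funding_usd": 50_000, "team_size": 2, "sector": "retail", "success": 0},
    {"funding_usd": 2_000_000, "team_size": 12, "sector": "software", "success": 1},
    {"funding_usd": 150_000, "team_size": 3, "sector": "retail", "success": 0},
    {"funding_usd": 900_000, "team_size": 8, "sector": "health", "success": 1},
    {"funding_usd": 75_000, "team_size": 2, "sector": "software", "success": 0},
    {"funding_usd": 1_200_000, "team_size": 10, "sector": "health", "success": 1},
]

NUMERIC = ["funding_usd", "team_size"]  # assumed numeric attributes


def fit_preprocessor(records):
    """Learn the min/max of each numeric column and the set of sectors."""
    ranges = {
        col: (min(r[col] for r in records), max(r[col] for r in records))
        for col in NUMERIC
    }
    sectors = sorted({r["sector"] for r in records})
    return ranges, sectors


def transform(record, ranges, sectors):
    """Min-max scale numeric fields and one-hot encode the sector."""
    features = []
    for col in NUMERIC:
        lo, hi = ranges[col]
        features.append((record[col] - lo) / (hi - lo) if hi > lo else 0.0)
    features.extend(1.0 if record["sector"] == s else 0.0 for s in sectors)
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Estimated probability of success for one feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, weights, bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


ranges, sectors = fit_preprocessor(RECORDS)
X = [transform(r, ranges, sectors) for r in RECORDS]
y = [r["success"] for r in RECORDS]
weights, bias = train_logistic(X, y)
probs = [predict_proba(x, weights, bias) for x in X]
```

Here `probs` holds the estimated success probability for each training record. A real run would evaluate on held-out data rather than the training set, and would likely use an established library instead of a hand-written optimizer.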

The project results will be made open source. The aim is to build efficient predictive models that can help not only entrepreneurs but also other stakeholders, such as investors, shareholders, suppliers, and customers/clients.

Project goals.

- Collect data on startups from public databases, web pages, Google Forms, and similar sources.
- Analyze the data and identify the factors that influence startup success using a sound methodology.
- Perform exploratory data analysis.
- Prepare machine learning models.
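
As a rough illustration of the exploratory data analysis goal, the sketch below counts missing values and compares success rates across groups. The rows and column names (`sector`, `funding_usd`, `success`) are hypothetical placeholders, not data collected by the project.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows for illustration only; not project data.
ROWS = [
    {"sector": "retail", "funding_usd": 50_000, "success": 0},
    {"sector": "retail", "funding_usd": None, "success": 0},
    {"sector": "software", "funding_usd": 2_000_000, "success": 1},
    {"sector": "software", "funding_usd": 75_000, "success": 0},
    {"sector": "health", "funding_usd": 900_000, "success": 1},
]


def missing_counts(rows, columns):
    """Count rows where each column is missing (None)."""
    return {c: sum(1 for r in rows if r.get(c) is None) for c in columns}


def success_rate_by(rows, key):
    """Fraction of successful startups within each value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["success"])
    return {k: sum(v) / len(v) for k, v in groups.items()}


def median_by_outcome(rows, column):
    """Median of a numeric column per outcome, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        if row[column] is not None:
            groups[row["success"]].append(row[column])
    return {outcome: median(values) for outcome, values in groups.items()}


missing = missing_counts(ROWS, ["sector", "funding_usd"])
rates = success_rate_by(ROWS, "sector")
funding_medians = median_by_outcome(ROWS, "funding_usd")
```

Summaries like these help decide which attributes are worth keeping as model inputs and which need cleaning first.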

Project plan.

    Learning outcomes.

    1. Web Scraping

    2. Data preprocessing and EDA

    3. Machine Learning Models

    4. Model Deployment
