This project is driven by the urgent need to address air pollution in Mexico City, a complex and multifaceted problem with serious health and economic consequences for residents.
The city has struggled with air pollution for decades, and the problem persists despite various efforts to reduce emissions and improve air quality. New approaches are needed to make a real impact.
According to a 2018 study by the Mexican Institute for Competitiveness, air pollution in Mexico City costs the economy approximately $10 billion per year in healthcare costs and lost productivity.
According to a 2020 report by the Mexico City government, air pollution was responsible for more than 9,000 premature deaths in 2019 in the Mexico City Metropolitan Area.
Specifically, this project aims to develop machine learning models that predict air pollution levels in different parts of the city from factors such as weather data, traffic patterns, and industrial activity.
Data Gathering and Cleaning
Identify and gather relevant data on air pollution levels in Mexico City from publicly available sources such as the Mexican government and international organizations.
Clean and preprocess the data, including handling missing values and removing duplicates.
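The cleaning step above could be sketched with pandas as follows. This is a minimal illustration, not the project's actual pipeline: the column names (station, timestamp, pm25), the station labels, and the rule of filling gaps of at most two consecutive readings are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_readings(df: pd.DataFrame) -> pd.DataFrame:
    """Remove duplicate readings and fill short gaps in the pm25 column.

    Column names and the gap limit are illustrative assumptions.
    """
    out = df.copy()
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    # A reading is a duplicate if the same station reports the same timestamp twice.
    out = out.drop_duplicates(subset=["station", "timestamp"])
    out = out.sort_values(["station", "timestamp"]).reset_index(drop=True)
    # Linearly interpolate within each station, filling at most 2 consecutive
    # missing readings and only when valid values exist on both sides.
    out["pm25"] = out.groupby("station")["pm25"].transform(
        lambda s: s.interpolate(limit=2, limit_area="inside")
    )
    # Drop whatever is still missing rather than inventing values.
    return out.dropna(subset=["pm25"]).reset_index(drop=True)


# Small made-up sample: one duplicated row and two missing values.
raw = pd.DataFrame(
    {
        "station": ["station_a", "station_a", "station_a", "station_a", "station_b"],
        "timestamp": [
            "2019-01-01 00:00",
            "2019-01-01 01:00",
            "2019-01-01 01:00",
            "2019-01-01 02:00",
            "2019-01-01 00:00",
        ],
        "pm25": [20.0, np.nan, np.nan, 30.0, np.nan],
    }
)
cleaned = clean_readings(raw)
print(cleaned)
```

In this sample the duplicated station_a row is removed, the single gap between two valid station_a readings is interpolated, and the lone station_b reading is dropped because it has no neighbors to interpolate from. Whether to interpolate, drop, or flag missing values would depend on the real data.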
Exploratory Data Analysis and Visualization
Conduct exploratory data analysis to understand the patterns and trends in the air pollution data.
Visualize the data using tools such as Tableau to identify any correlations or trends.
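Before moving to a visualization tool, the exploratory step could start with a few pandas summaries. The sketch below runs on synthetic data generated purely for illustration; the variable names (pm25, wind_speed, temperature) and the relationships between them are assumptions, not findings about Mexico City.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for two weeks (illustrative only, not real measurements).
rng = np.random.default_rng(0)
hours = pd.date_range("2019-01-01", periods=24 * 14, freq=pd.Timedelta(hours=1))
n = len(hours)
hour_of_day = hours.hour.to_numpy()

wind_speed = rng.uniform(0.5, 6.0, size=n)
temperature = 15 + 8 * np.sin(2 * np.pi * (hour_of_day - 9) / 24) + rng.normal(0, 1, n)
# Assumed relationship for the demo: stronger wind disperses particles.
pm25 = 45 - 4 * wind_speed + rng.normal(0, 3, n)

df = pd.DataFrame(
    {"pm25": pm25, "wind_speed": wind_speed, "temperature": temperature},
    index=hours,
)

# Pairwise Pearson correlations between the variables.
corr = df.corr()

# Average pm25 for each hour of the day (a diurnal profile).
hourly_profile = df.groupby(df.index.hour)["pm25"].mean()

# Daily mean pm25, useful for spotting longer-term trends.
daily_mean = df["pm25"].resample("D").mean()

print(corr.round(2))
print(hourly_profile.head())
print(daily_mean.head())
```

The same three summaries (correlation matrix, hour-of-day profile, daily means) can be exported to CSV and loaded into Tableau, or plotted directly with matplotlib.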
Model Development and Evaluation
Develop machine learning models using scikit-learn to predict air pollution levels based on various factors such as weather data, traffic patterns, and industrial activity.
Evaluate the performance of the models using appropriate metrics such as R-squared and mean squared error.
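A minimal sketch of the modeling and evaluation steps with scikit-learn is shown below. It uses synthetic data and a plain linear regression as a baseline; the feature names (wind_speed, traffic_index, industrial_index) and the coefficients used to generate the target are hypothetical, and the project may well use other models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic features standing in for weather, traffic, and industrial activity.
rng = np.random.default_rng(42)
n = 500
wind_speed = rng.uniform(0.5, 6.0, n)
traffic_index = rng.uniform(0.0, 1.0, n)
industrial_index = rng.uniform(0.0, 1.0, n)
X = np.column_stack([wind_speed, traffic_index, industrial_index])

# Made-up target: pollution falls with wind and rises with traffic and industry.
y = 30 - 4 * wind_speed + 25 * traffic_index + 10 * industrial_index + rng.normal(0, 3, n)

# Hold out 20% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# The two metrics named in the plan: R-squared and mean squared error.
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print(f"R-squared: {r2:.3f}")
print(f"Mean squared error: {mse:.3f}")
```

A random split is used here for simplicity. With real time-ordered pollution readings, a chronological split (training on earlier data, testing on later data) is likely a better choice, since it avoids evaluating the model on periods it has effectively already seen.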
Project Presentation and Documentation
Create a final report summarizing the project and its results, including visualizations and explanations of the machine learning models.
Present the project to stakeholders, including potential policy makers and community leaders, to showcase the potential impact of the project.
Document the code and methodology used in the project for future reference and replication.
Team members will develop technical skills in data analysis, machine learning, and data visualization using tools such as scikit-learn, folium, matplotlib, and Tableau. They will also gain a deeper understanding of air pollution and its causes and effects.