Welcome to the Colombia Chapters!
There are 2 active chapters in Colombia:
- Bogota, Colombia
- EAFIT University, Colombia
Apply here to be a chapter lead for other cities and/or universities in Colombia
EAFIT University, Colombia Chapter
Project starts: 07.07.2022
EAFIT University, Colombia Chapter – Suggested routes within Medellín's public transportation system (Subway)
Chapter Lead – Zcharick Yaday Romeo Campuzano
The Medellín metro was inaugurated in November 1995 and currently carries 1.5 million people daily.
However, in recent days, several stations of the Medellín metro have seen long lines at the entrances and delays in train circulation, which make passengers late for work and reduce satisfaction with the service.
It is important to mention that the municipality is already presenting solutions, such as the arrival of new trains and the construction of the future 80th Street subway that will reach more parts of the city and new social sectors.
The project focuses on passengers and on helping them avoid long lines and congestion inside stations. We will therefore run an algorithm that shows the possible routes from point A to point B, along with the approximate travel time, the congestion on each route, and the best times to take the subway. With this information, passengers can better organize their time and choose the best route and time to reach their destination.
THE PROJECT GOALS:
- Build a model showing the different routes from one point to another, as well as the approximate travel time based on the data set provided by the Medellín metro system.
- Make suggestions on the best route and time to travel on the Medellín metro system.
- Provide the necessary information to passengers before taking any Medellín subway ride.
THE LEARNING OUTCOMES
- Learn how to extract data using the Google Maps API.
- Boost your analytical skills by doing exploratory data analysis with real-world data.
- Learn more about Dijkstra's routing algorithm and its importance today.
THE TASKS & TIMELINE:
Understanding the problem
Data collection on the coordinates and distances of the streets of Medellín
Data to DataFrame (pandas)
DataFrame to graph (NetworkX)
Dijkstra result on the graph, shown on a map (Folium)
Data visualization model
Write the final report
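As a rough illustration of the routing step in the tasks above, here is a minimal sketch of Dijkstra's algorithm using only the Python standard library. The station names and travel times in `TOY_NETWORK` are made up for the example and are not Medellín metro data. The function name `shortest_route` is also an assumption. The project itself plans to build the graph with NetworkX.

```python
import heapq


def shortest_route(graph, origin, destination):
    """Return (total_minutes, [stations]) for the fastest route.

    `graph` maps each station to a dict of {neighbor: minutes}.
    Returns (inf, []) when the destination cannot be reached.
    """
    best = {origin: 0.0}      # best known travel time to each station
    previous = {}             # predecessor on the best known route
    queue = [(0.0, origin)]   # min-heap ordered by travel time
    visited = set()

    while queue:
        minutes, station = heapq.heappop(queue)
        if station in visited:
            continue          # stale queue entry, already settled
        visited.add(station)
        if station == destination:
            break
        for neighbor, edge_minutes in graph.get(station, {}).items():
            candidate = minutes + edge_minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = station
                heapq.heappush(queue, (candidate, neighbor))

    if destination not in best:
        return float("inf"), []

    # Walk the predecessor chain back from the destination.
    route = [destination]
    while route[-1] != origin:
        route.append(previous[route[-1]])
    route.reverse()
    return best[destination], route


# Hypothetical toy network: stations A-D with travel times in minutes.
TOY_NETWORK = {
    "A": {"B": 4, "C": 10},
    "B": {"A": 4, "C": 3, "D": 8},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 8, "C": 2},
}

minutes, route = shortest_route(TOY_NETWORK, "A", "D")
print(minutes, route)
```

One possible way to account for congestion, again only as an assumption, is to add a penalty to each edge weight before running the search, so that crowded segments look "longer" to the algorithm.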
Politics Fake News Detector in LATAM (Latin America)
Since the Cambridge Analytica scandal, a Pandora's box has been opened around the world, bringing to light campaigns, some involving current Latin American leaders, that manipulate public opinion through social media to win elections. There is a common and simple pattern that combines platforms such as Facebook with fake news, allowing candidates to build a nefarious narrative for their own benefit. This is a growing concern for our democracies, as many of these practices have spread widely across the region and more people are gaining access to the internet. It is therefore necessary to be able to advise the population, and to do that we must be able to quickly spot these plots online before the damage is irreversible.
Once we can detect irregularities in online news activity, even roughly, we may be able to counter disinformation with the help of additional research. As less time is spent searching for those occurrences, more time can go into validating the results and uncovering the truth. This enables researchers, journalists and organizations to help people make informed judgments about whether what circulates as public opinion is true, so that they can identify on their own whether someone is trying to manipulate them for political benefit.
If this matter isn't tackled with enough urgency, we might see the rise of a new dark era in Latin American politics, where many unscrupulous parties and people manage to gain power and control the lives of many. The results of the project can therefore support both private and public organizations in their future analyses and activities. Additionally, researchers and students could use the outcomes for their own research or for learning purposes.
The Project Goals
– To gather and clean datasets from different newspapers and news outlets in LATAM.
– To predict whether news coverage of a given topic shows a political affiliation.
– To compare how different news sources report a specific story and determine whether there are irregularities between them.
– To understand and visualize the information patterns in the news by creating a visualization dashboard.
The Learning Outcomes
1. How to gather and clean text datasets from news for data modeling.
2. How to use data visualization tools for further app creation and data reporting.
3. How to create a classification model with NLP libraries.
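For readers new to text classification, the following is a minimal sketch of a multinomial Naive Bayes classifier written with the standard library only. It stands in for the NLP libraries mentioned above and is not the project's actual model. The class name `NaiveBayesText`, the labels and the tiny training set are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def fit(self, documents, labels):
        for tokens, label in zip(documents, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already-cleaned training documents and topic labels.
docs = [
    ["reforma", "tributaria", "impuestos"],
    ["inflacion", "impuestos", "banco"],
    ["gol", "partido", "seleccion"],
    ["seleccion", "torneo", "gol"],
]
labels = ["economy", "economy", "sports", "sports"]

model = NaiveBayesText().fit(docs, labels)
print(model.predict(["impuestos", "banco"]))
```

The same structure would apply to other label sets, provided enough labeled articles are available to train on.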
The Tasks & Timeline
|Week 1||Week 2||Week 3||Week 4|
– Data collection
– Data collection
– Data cleaning
– Topics Analysis
– Unsupervised Model Creation
|Week 5||Week 6||Week 7||Week 8|
– Division by Branches
– Political Party Classifier (If feasible)
– Map Visualization
– Streamlit App
– Streamlit App
Proposals of workshops topics for the challenge:
Select the workshops you would like us to organize for your project from the list below, and share your thoughts and needs during the kick-off meeting. If you would like us to organize a workshop on a topic not listed here, please mention it and we will try to find a speaker for it.
- Data gathering and cleaning for News/text
- Topic modeling NLP
- Streamlit – creating interactive visualizations
Building a Career Recommendation System to make Suggestions to Students
Colombia’s dropout rate has increased 37% over the last few years, giving the country the second-highest dropout rate in Latin America, right after Bolivia. Different factors can be related to this rapid increase, including the characteristics of new students and the regulations of academic institutions.
This study assumes, as mentioned in the World Bank article, that this high dropout rate correlates with four factors: first, students’ lack of academic preparation, due mainly to the low-quality education they receive in high school; second, the shortage of financial means among low-income students; third, the long duration of undergraduate programs; and fourth, academic institutions’ strict regulations concerning career changes.
The objective of the project is to help tertiary students choose a study program based on rationality. This may be relevant as their future jobs and possible career opportunities might depend on it. The proposal consists of the construction of a model based on historical data collected from scores that high-school students have obtained after taking ICFES (Colombian Institute for the Promotion of Higher Education) tests.
The structure of the data and the relationships within it are analyzed in order to facilitate the creation of a system that can recommend a plan giving high-school students an academically oriented route. This project expects to guide students in identifying undergraduate programs that accurately match their skills at the beginning of their academic careers, thus potentially reducing dropout, increasing student satisfaction, and improving their experience at the university.
Recommendations are based on the results of the Colombian standardized Saber 11 examination (which is similar to SAT [Scholastic Assessment Test] scores in the U.S.), and on how other students with similar characteristics (demographic, socio-economic, family information) performed in their undergraduate tests and the Colombian standardized Saber Pro exam (which is similar to GRE [Graduate Record Examination] scores in the U.S.).
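The idea of recommending programs based on how similar students performed can be sketched as a nearest-neighbor lookup. The example below is a simplified illustration under several assumptions: similarity is measured only on two made-up Saber 11 area scores (the description above also mentions demographic and socio-economic features), the records in `HISTORY` are invented, and `recommend_programs` is a hypothetical helper, not the project's model.

```python
import math
from collections import defaultdict


def recommend_programs(applicant_scores, history, k=3):
    """Rank programs by the mean Saber Pro score of the k most similar students.

    Similarity is the Euclidean distance between Saber 11 score vectors.
    """
    def distance(record):
        return math.dist(applicant_scores, record["saber11"])

    neighbors = sorted(history, key=distance)[:k]

    # Group the neighbors' Saber Pro results by the program they studied.
    results = defaultdict(list)
    for record in neighbors:
        results[record["program"]].append(record["saber_pro"])

    averages = [(program, sum(s) / len(s)) for program, s in results.items()]
    return sorted(averages, key=lambda item: -item[1])


# Hypothetical past students: (area 1, area 2) Saber 11 scores,
# the program they enrolled in, and their later Saber Pro score.
HISTORY = [
    {"saber11": (80, 60), "program": "Engineering", "saber_pro": 170},
    {"saber11": (78, 62), "program": "Engineering", "saber_pro": 160},
    {"saber11": (75, 70), "program": "Economics", "saber_pro": 150},
    {"saber11": (40, 85), "program": "Literature", "saber_pro": 180},
    {"saber11": (45, 90), "program": "Literature", "saber_pro": 175},
]

recommendations = recommend_programs((79, 61), HISTORY, k=3)
print(recommendations)
```

In practice the features would need scaling, and categorical information such as region or socio-economic level would need encoding before distances are meaningful.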
The Project Goals
1. Build a career recommendation model (rule-based or machine learning) based on the ICFES dataset provided.
2. Provide insights on how the Colombian student population is distributed around the country.
3. Suggest areas in which students should improve, based on ICFES data, in order to reduce the lack of preparation for higher education.
The Learning Outcomes
1. Understand basic concepts about recommendation systems and their importance in society nowadays.
2. Learn about machine learning frameworks and how to measure your model's results.
3. Boost your analytical skills by doing exploratory data analysis with real-world data.
The Tasks & Timeline
Identifying Insights and Perspectives on Colombia’s National Strike in Social Networks using NLP
The objective of the project is to identify insights, perspectives, sentiments and fake news about Colombia’s national strike on social networks, using natural language processing (NLP) or other artificial intelligence techniques, to get a much better understanding of what is happening.
The identification of these insights and perspectives may allow the main stakeholders to learn first-hand what is being said about the strike in social networks and identify ideas for resolving the disagreement between the parties. In addition, it is very important to inform people in a simple and truthful way about what is happening in the country with the national strike or with other social problems as well.
The main idea is that people who receive information daily in digital environments can use and benefit from the results of this challenge. The project can also be adapted to future social unrest.
The project results will be made open source, with the aim of helping ride-sharing service providers, regulatory bodies, policy makers and others, while educating aspiring data scientists in solving real-world problems.
The Project Goals
1. Perform web scraping to extract information from social networks
2. Build an NLP model that identifies opinions, perspectives and sentiments of people in social networks. This will allow us to observe differences between social networks.
3. Summarize the opinions on each topic in the context of the national strike through a dashboard to better understand the issues.
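As a very rough sketch of the second and third goals, the example below scores posts with a small word list and averages the scores per social network, which is the kind of aggregate a dashboard could display. A lexicon-based score is a deliberately simple stand-in for the NLP model the project intends to build. The word lists, network names and posts are all invented.

```python
from collections import defaultdict

# Tiny illustrative sentiment lexicons (assumptions, not a validated resource).
POSITIVE = {"apoyo", "paz", "dialogo", "acuerdo"}
NEGATIVE = {"violencia", "bloqueo", "crisis", "miedo"}


def sentiment_score(text):
    """Count positive words minus negative words in a post."""
    tokens = text.lower().split()
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def mean_sentiment_by_network(posts):
    """Average the sentiment score of (network, text) pairs per network."""
    scores = defaultdict(list)
    for network, text in posts:
        scores[network].append(sentiment_score(text))
    return {network: sum(v) / len(v) for network, v in scores.items()}


# Hypothetical posts, already stripped of accents and punctuation.
POSTS = [
    ("network_a", "apoyo al dialogo y la paz"),
    ("network_a", "otro bloqueo y mas violencia"),
    ("network_b", "crisis y miedo por el bloqueo"),
]

summary = mean_sentiment_by_network(POSTS)
print(summary)
```

Comparing the averages across networks gives a first, coarse view of how the tone of the conversation differs between them.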
Colombia Chapter Leads
Zcharick is a marketing student at EAFIT. She enjoys working in a team and looking for solutions to different problems. She is creative and energetic, always looking for something new to do and learn.
She loves hiking and connecting with nature. She also loves social work and is currently a volunteer and monitor at her university.