Welcome to the Italy Local Chapter!
Apply here to be a chapter lead for other cities and/or universities in Italy.
Read about our Upcoming Challenge!
Project Starts: 27.06.2022
Duration: 4 Weeks
Trieste, Italy Chapter – Using social media to study the phenomenon of Long Covid in Italy
1. Long Covid is a post-viral syndrome characterised by prolonged, multisystem involvement and significant disability. According to a survey-based study of an international cohort, the symptoms experienced by affected people are numerous and diverse (e.g. DOI: https://doi.org/10.1016/j.eclinm.2021.101019). Six months after recovery, the most frequent are fatigue, post-exertional malaise, and cognitive dysfunction. It can take several months for patients to recover from systemic and neurological/cognitive symptoms. There is evidence that children are also affected.
2. Countries across Europe are underestimating the potential impact of Long Covid. According to numerous scientific studies, the incidence of the syndrome could be substantial: more than 40% of survivors could be affected (e.g. https://doi.org/10.1093/infdis/jiac136). The impact on the quality of life of millions of people and the burden on health systems are evident.
3. The Italian Ministry of Health recently launched a project called CCM (https://www.iss.it/it/web/guest/long-covid-razionale) with the aim of monitoring the scale and clinical management of Long Covid, surveying the centres that assist Long Covid patients in order to define a national network, and building a dedicated information network.
In this 4-week project, the goal is to leverage the Twitter stream of Covid-19 chatter to better understand and characterise the Long Covid phenomenon in Italy using NLP techniques. The publicly available dataset curated by Georgia State University (https://github.com/thepanacealab/covid19_twitter) collects Long Covid related tweets in several languages. By extracting the information relevant to Italy, we plan to estimate the magnitude of the phenomenon, the most frequent symptoms, and the impact on the quality of life, ability to work, and mental health of affected people. We could possibly also integrate data mined from other social media such as Facebook.
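As a rough illustration of the "extracting information relevant to Italy" step, the sketch below filters tweet metadata rows by language or country before any hydration takes place. The column names (`tweet_id`, `lang`, `country_code`), the sample rows, and the helper `select_italy_ids` are assumptions made for this example and should be checked against the actual dataset files.

```python
import csv
import io

# Hypothetical sample that mimics a TSV of tweet metadata.
# The column names are assumptions, not taken from the dataset documentation.
SAMPLE_TSV = (
    "tweet_id\tdate\ttime\tlang\tcountry_code\n"
    "1001\t2022-06-01\t10:00:00\tit\tIT\n"
    "1002\t2022-06-01\t10:05:00\ten\tUS\n"
    "1003\t2022-06-02\t11:30:00\tit\tNULL\n"
    "1004\t2022-06-02\t12:00:00\ten\tIT\n"
)


def select_italy_ids(tsv_file, lang="it", country="IT"):
    """Return the IDs of tweets written in Italian or geotagged in Italy."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [
        row["tweet_id"]
        for row in reader
        if row["lang"] == lang or row["country_code"] == country
    ]


italy_ids = select_italy_ids(io.StringIO(SAMPLE_TSV))
print(italy_ids)
```

Matching on language or country is a deliberately broad choice: it keeps Italian-language tweets that carry no location, at the cost of including Italian speakers abroad. A stricter filter would require both conditions.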
The Project Goals
Build a dataset to help study Long Covid in Italy
- Hydrate the available Twitter dataset https://github.com/thepanacealab/covid19_twitter and extract the data relevant to Italy.
- Possibly (depending on resources) collect Long-Covid related data from Facebook.
- Use NLP and sentiment analysis to extract information on symptoms, quality of life, ability to work, and psychological and mental health aspects of long haulers.
- Build a dashboard to visualise the relevant insights.
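A very simple starting point for the symptom extraction goal is keyword matching over hydrated tweet text, sketched below. This is plain keyword counting, not a full NLP or sentiment analysis pipeline. The Italian keyword lexicon, the sample tweets, and the helper `count_symptoms` are all hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical Italian keyword lexicon, invented for this example.
SYMPTOM_KEYWORDS = {
    "fatigue": ["stanchezza", "affaticamento", "spossatezza"],
    "cognitive dysfunction": ["nebbia mentale", "difficoltà di concentrazione"],
    "breathlessness": ["fiato corto", "affanno"],
}


def count_symptoms(tweets):
    """Count how many tweets mention each symptom at least once."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for symptom, keywords in SYMPTOM_KEYWORDS.items():
            # Word boundaries avoid matching a keyword inside a longer word.
            if any(
                re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
                for keyword in keywords
            ):
                counts[symptom] += 1
    return counts


# Invented sample tweets, not real data.
sample_tweets = [
    "Dopo il Covid ho ancora tanta stanchezza e nebbia mentale.",
    "Long Covid: affanno anche salendo le scale.",
    "Sei mesi dopo, la stanchezza non passa.",
]

symptom_counts = count_symptoms(sample_tweets)
for symptom, n in symptom_counts.most_common():
    print(f"{symptom}: {n}")
```

Per-symptom counts of this kind could feed the dashboard directly, while a trained model would be needed to handle negation, sarcasm, or symptoms described without any of the listed keywords.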
The Learning Outcomes
- Twitter and Facebook API
- NLP and sentiment analysis
- Data collection, EDA, data cleaning
- Dashboard for data visualisation
The Tasks & Timeline
|Week 1||Week 2||Week 3||Week 4|
Introduction to NLP and sentiment analysis
Exploration of Covid-19 Twitter Dataset
Defining search criteria, e.g. keywords and potential findings
(Other) Social networks data mining (depends on resources)
Evaluate tools for data visualisation
Deploy public dashboard for data visualisation
Italy, Trieste Chapter Lead
My name is Elena. I am a software engineer with 10+ years of experience, an MSc in Physics and a PhD in Mathematics. I like coding in C++ and Python and experimenting with Deep Learning techniques to solve computer vision and image processing problems. I look forward to being involved in projects that leverage AI for public utility.