Augmenting Public Safety Through AI and Machine Learning


In this demo day, we took a close look at the tremendous potential AI offers for making communities safer by helping to reduce, prevent, and respond to crime. When it comes to public safety, it is often critical to act quickly. AI technologies can supplement the work of people by taking on monotonous and time-consuming tasks that humans cannot do effectively at scale. Natural language processing can read and analyze public communications and news reports to detect potential problem areas and get ahead of violence. Of course, this work must be done responsibly and ethically.

Sharing her perspective on the impact that AI can have in keeping people safe was an expert in the field, ElsaMarie D’Silva, the Founder & CEO of the Red Dot Foundation. The Red Dot Foundation’s award-winning platform Safecity crowdsources personal experiences of sexual violence and abuse in public spaces. ElsaMarie is listed as one of BBC Hindi’s 100 Women, and her work has been recognized by numerous UN organizations and the SDG Action Festival.

To go a little deeper into the application of AI for public safety, we shared Omdena projects that took innovative approaches to make communities safer.

 

Case Study 1: Preventing sexual harassment through a safe-path finder algorithm

“UN Women states that 1 in 3 women face some kind of sexual assault at least once in their lifetime.”

With the first case study, the Omdena team drew upon Safecity’s crowdsourced data about sexual harassment in public spaces and leveraged open-source data to build heatmaps and calculate safe routes through major cities in India. Part of the solution is a sexual harassment category classifier with 93 percent accuracy and several models that predict places with a high risk of sexual harassment incidents to suggest safe routes.
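As a rough illustration of how predicted risk can steer route suggestions, here is a minimal sketch of a risk-weighted shortest-path search. It is not the Omdena team's implementation: the toy street graph, the per-intersection risk scores, and the `risk_weight` parameter are all assumptions made up for the example.

```python
import heapq

def safest_route(graph, risk, start, goal, risk_weight=5.0):
    """Dijkstra search where entering a node costs distance + risk_weight * risk."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, distance in graph[node].items():
            new_cost = cost + distance + risk_weight * risk.get(neighbour, 0.0)
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return None  # goal is unreachable
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]

# Hypothetical street graph: intersection -> {neighbour: distance in km}
graph = {
    "A": {"B": 1.0, "C": 1.5},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.5, "D": 1.5},
    "D": {"B": 1.0, "C": 1.5},
}
# Hypothetical predicted risk per intersection (0 = low, 1 = high)
risk = {"A": 0.0, "B": 0.9, "C": 0.1, "D": 0.0}

shortest = safest_route(graph, risk, "A", "D", risk_weight=0.0)
safest = safest_route(graph, risk, "A", "D", risk_weight=5.0)
```

With `risk_weight=0.0` the search degenerates to a plain shortest path through the high-risk intersection B, while a positive weight trades extra distance for the lower-risk detour through C.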

 


 

Case Study 2: Understanding gang violence patterns and actors through Twitter analysis

Our team worked in partnership with Voice 4 Impact, an award-winning NGO whose solution to violence in our communities addresses the questions people worldwide are asking: “How do we keep missing the signs?”

The Omdena team made use of natural language processing, a family of AI techniques that analyze text to understand what is being communicated. Machine learning algorithms were used to understand gang language, and AI models were built to detect violent messages on Twitter without profiling. The aim is to predict, and ultimately prevent, gang violence.

 


 

Case Study 3: Analyzing Domestic Violence through Natural Language Processing (NLP)

Finally, we presented Omdena’s work to uncover domestic violence in India hidden due to COVID lockdowns. This work is part of a project with the award-winning Red Dot Foundation and Omdena’s collaborative platform to build solutions to better understand domestic violence and online harassment patterns during COVID-19. The project used natural language processing techniques with social media, government reports, and other text content to create a dataset with which Safecity could mobilize local efforts to protect and support domestic violence victims.

 

 


 

 

 

 


 

Overcoming Data Challenges through the Power of Diverse & Collaborative Teams


In this demo day, we talked about the inevitable data challenges/roadblocks that come up in real-world AI projects. The insights shared came from our experiences with more than 20 AI projects, working with partners including the UN Refugee Agency (UNHCR), the World Resources Institute, the World Energy Council, and numerous NGOs and corporations.

Omdena is a collaborative platform to build innovative, ethical, and efficient AI solutions to real-world problems. Since our founding in May 2019, over 1250 AI experts from more than 80 countries have come together on Omdena projects to address significant issues related to hunger, sexual harassment, land conflicts, gang violence, wildfire prevention, and energy poverty.

We’ve seen that our approach to AI development, bottom-up collaboration among diverse team members, fosters the innovation and creativity that break down data roadblocks. Innovation is inherent in the Omdena process.

We shared three Omdena projects to act as case studies for these innovative approaches to tackling data challenges.

 

Data Roadblock 1: Incomplete Data Sets

In the real world, datasets are rarely complete. We find having large teams of dozens of people means that data gathering, cleaning, and wrangling happen at a phenomenal speed. And by taking a bottom-up approach, we have multiple sub-teams looking at data problems from different angles, allowing for innovative approaches to be explored.

In the following case study, the Omdena team worked out ways to identify safe routes in a city in the aftermath of an earthquake, where the relevant data sets were inconsistent and unreliable.

 

Case Study: Disaster Response: Improving the Aftermath Management of an Earthquake

In collaboration with Istanbul’s Impact Hub innovation center, Omdena data scientists combined satellite imagery of Istanbul with street map data in order to build a tool that facilitates family reunification by indicating the shortest and safest route between two points after an earthquake.

“Omdena’s approach to AI development is by far the best that I have seen in 2019” (Semih Boyaci, Co-Founder, Impact Hub Istanbul)


 

 

Data Roadblock 2: No Data

We don’t see a lack of data as a showstopper. On projects without data, the team starts by asking: What do we need to know to address the problem? Where might that data live? If it doesn’t exist, how can we create it from something that does exist? Here the diversity of the team members is very powerful.

We’ve seen time and again the impact of bringing together people with vastly different professional and life experiences. Our teams are typically 30% or more female. On any project, we’ll have on average 14 countries represented. Our collaborators range in age from 17 to 65. Not only does this diversity lead to ethical and trusted solutions, but it also fosters creativity and alternative ideas about what data is relevant and where to find it.

In the following project, we looked at how to assess post-traumatic stress disorder among those who have suffered trauma in low-resource environments. In this case, the team started with no data in hand.

 

Case Study: Building a chatbot for post-traumatic stress disorder (PTSD) assessment

32 Omdena collaborators developed a machine learning-driven chatbot for PTSD assessment in war and refugee zones.

 

The unique aspect of the project was that we did not start with a data set.

Through the collaborative efforts of the project community, the team identified and annotated suitable patient data. The teams applied linear classifiers for Natural Language Processing (NLP) for PTSD risk assessment and transfer learning for data augmentation.


 

Data Roadblock 3: Disparate Data Sources

Relevant data doesn’t typically come packaged in just one form. We often need to meld disparate data sources to get at a solution. Through collaboration, sub-teams focused on separate data and AI techniques come together to integrate those efforts to derive insights about the problem.

In the following project, the goal was to uncover domestic violence in India hidden due to COVID lockdowns. Among the many challenges the team addressed was the integration of data culled from disparate sources.

 

Case Study : Analyzing Domestic Violence through Natural Language Processing

This project was done with the award-winning Red Dot Foundation. Within Omdena’s collaborative platform, the team set out to craft a dataset that reveals domestic violence and online harassment patterns in India during COVID-19 lockdowns. The AI experts scraped data from news articles and social media and applied various natural language processing (NLP) techniques such as topic modeling, document annotation, and stacked machine learning models.

 

 


 

 

 

More about Omdena

Omdena is the collaborative platform to build innovative, ethical, and efficient AI and Data Science solutions to real-world problems. 

Analyzing Domestic Violence in Lockdowns: From No Data to Building an ML Classifier for Tweets


By Omdena Collaborator Harshita Chopra

 
Data mining, topic modeling, document annotations, NLP, and stacking machine learning models: A complete journey.

Artificial Intelligence and its possibilities have always fascinated me. Making machines learn through data is nothing short of amazing. When I learned about Omdena and its wonderful initiative of bringing AI to social good through the power of global collaboration, I couldn’t stop myself from participating in its empowering challenges.

I felt truly content to be given the role of a Machine Learning Engineer in my first challenge. Connecting with a team of 50 fabulous collaborators from various countries around the world, including domain experts, data scientists, and AI experts — felt like a golden opportunity to gain knowledge in the best possible way.

It provided a space to create value out of my ideas and learn from the enhancements. The harmonizing environment gave me the experience of leading task groups and interacting with some really innovative minds.

In this blog post, I’ll walk you through a major part of the project I led and contributed to for the past month.

 

The Problem

There has been a surge in Domestic Violence (DV) and online harassment cases during COVID-19 lockdowns in India. Homes are no longer a safe place for victims trapped in abusive relationships with their family members.

Domestic violence involves a pattern of psychological, physical, sexual, financial, and emotional abuse. Acts of assault, threats, humiliation, and intimidation are also considered acts of violence.

Data substantiating Domestic Violence from government resources is only available in summary form. Incidents are largely reported via phone calls, which makes collecting the data and subsequently mapping it difficult.

The goal of the challenge was to collect and analyze data from different social media platforms or news sources so as to gain insights on the rise in DV incidents during the nation-wide lockdown.

 

The Solution

Social media platforms are a huge and largely untapped resource for social data and evidence. They generate a vast amount of data every day on a variety of topics. Consequently, they represent a key source of information for anyone seeking to study various issues, even socially stigmatized and taboo topics like Domestic Violence.

Victims experiencing abuse need early access to specialized services such as health care, crisis support, and legal guidance. Social support groups therefore play a leading role in raising awareness and in offering victims various dimensions of social support: emotional, instrumental, and informational.

Red Dot Foundation plans to deal with this challenge. When the victims seek help, it is important to identify and analyze those critical posts and acknowledge the help needed with more immediate impact.

Tasks were divided to mine data from different sources: Twitter, Reddit, YouTube, News articles, Government reports, and Google trends. After the acquisition of huge amounts of data, the next step was filtering out relevant posts through topic modeling and keywords. This was followed by annotation of data and then building an NLP based machine learning classifier.

In this blog post, tweets are in the spotlight!

 

 

Scraping data with the right queries

Tweets needed to be extracted from both the pre-lockdown and the lockdown periods in order to judge the surge in domestic violence. Hence, we took a time frame of January 2020 to May 2020.

Tweepy, a popular Python client for the Twitter API, could only retrieve tweets from the past seven days through the standard search endpoint, which was a serious limitation. Hence, we needed an alternative for mining old tweets with their location.

GetOldTweets3 is a fantastic Python library for this task. Twitter’s advanced search can do wonders for generating your customized query. In order to extract harassment-related posts, here are a few examples of queries we used:

 

 

 

Using AND combinations of relationship words with actions and nouns yields good results. The until and since attributes hold the limits of the time frame.

The setNear() feature accepts a location name (like Delhi, Maharashtra, India, etc.) or the latitude and longitude of that region. The central point of India is at approximately (22, 80) degrees latitude and longitude. The setWithin() feature accepts the radius around that point, and 1800 km generally covers India and nearby places.
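To make the query structure concrete, here is a minimal sketch of a helper that combines relationship words and action words into a single search string. It is an illustration, not the queries the team actually ran: `build_query` is a hypothetical helper, the word lists are examples, and folding the time frame and location into the string with the `since:`, `until:`, `near:`, and `within:` search operators is an assumption (in the library described above they are set through separate attributes).

```python
def build_query(relationships, actions, since, until, near=None, within_km=None):
    """Build a search string of the form (rel1 OR rel2) (act1 OR act2) since:... until:..."""
    def any_of(words):
        return "(" + " OR ".join(words) + ")"

    # A space between the two groups acts as AND in Twitter search syntax.
    parts = [any_of(relationships), any_of(actions), f"since:{since}", f"until:{until}"]
    if near is not None:
        parts.append(f'near:"{near}"')
    if within_km is not None:
        parts.append(f"within:{within_km}km")
    return " ".join(parts)

query = build_query(
    relationships=["wife", "husband"],
    actions=["beat", "abuse", "harass"],
    since="2020-01-01",
    until="2020-05-31",
    near="India",
    within_km=1800,
)
```

Keeping the word lists as parameters makes it easy to run many variations of the same query shape with different keywords.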

After executing more such queries with different keywords, we had thousands of tweets in hand, some on relevant topics and some irrelevant.

 

Data needs to be classified  –  Would topic modeling work?

Since a considerable number of tweets in our huge datasets were not related to the kind of harassment we were looking for, we needed some filtering. Classifying tweets into broad topics was the goal. Topic modeling was the first thing that clicked.

Topic modeling is an unsupervised learning process that automatically identifies the topics present in a collection of documents and derives hidden patterns exhibited by a text corpus, thus assisting better decision making.

Latent Dirichlet Allocation (LDA) is one of the most popular techniques for this. It assumes that documents are produced from a mixture of topics. Those topics then generate words based on their probability distribution. Given a dataset of documents, LDA backtracks and tries to figure out what topics would create those documents in the first place.
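The generative story that LDA assumes can be sketched in a few lines of standard-library Python. This only simulates the forward process (topic mixture, then words); it does not perform the inference step that recovers topics from real documents. The vocabulary, the two topic-word distributions, and the Dirichlet parameters below are invented for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word_probs, vocabulary, n_words, alpha, rng):
    """Generate one document the way LDA assumes documents are produced."""
    topic_mix = sample_dirichlet(alpha, rng)  # this document's mixture of topics
    words = []
    for _ in range(n_words):
        # Pick a topic according to the mixture, then a word from that topic.
        topic = rng.choices(range(len(topic_mix)), weights=topic_mix)[0]
        word = rng.choices(vocabulary, weights=topic_word_probs[topic])[0]
        words.append(word)
    return topic_mix, words

rng = random.Random(0)
vocabulary = ["awareness", "helpline", "campaign", "husband", "beat", "police"]
# Hypothetical topics: topic 0 favors awareness words, topic 1 favors incident words.
topic_word_probs = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],
]
topic_mix, document = generate_document(topic_word_probs, vocabulary, 8, [0.5, 0.5], rng)
```

Fitting LDA runs this story in reverse: given only the documents, it estimates the topic-word distributions and each document's topic mixture.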

 

 

Topic 0 words are generally included in awareness posts or #BanTiktok posts due to its inappropriate content.
Topic 1 words are headlines or real victim stories.

 

Topic modeling works best when the topics are considerably distinct or not too related to each other.

The generated topics didn’t meet our goal of classifying tweets as relevant or irrelevant: since the whole dataset talks about kinds of harassment, the topics were too closely related. Hence we had to pick another approach.

Rule-based classification turned out to be a more precise approach for this task. I created three sets of keywords to look for: relationships, violence verbs, and not-relevant words. The following algorithm was implemented in Python to filter out some documents.

 

Relationships: [List of keywords like wife, husband, housemaid etc.]
Verbs        : [List of keywords like harass, beat, abuse etc.]
Not-relevant : [List of keywords like webinar, politics, movie etc.]
Iterating through document:
{ 
  R1: (any word in Relationships) AND (any word in Verbs) -> 'Keep'
  R2: (any word in Not-relevant) -> 'Discard'
}
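The rules above can be sketched in runnable Python as follows. This is an illustrative reading of the pseudocode rather than the project's code: the keyword sets contain only the examples listed above, the rule that Discard takes precedence when both rules match is an assumption, and matching is on exact lowercase tokens (a real pipeline would likely match on lemmas so that "harassed" also hits "harass").

```python
import re

RELATIONSHIPS = {"wife", "husband", "housemaid"}
VERBS = {"harass", "beat", "abuse"}
NOT_RELEVANT = {"webinar", "politics", "movie"}

def classify(document):
    """Return 'Keep', 'Discard', or 'Unmatched' for one document."""
    tokens = set(re.findall(r"[a-z]+", document.lower()))
    # R2: any not-relevant word discards the document (assumed to override R1).
    if tokens & NOT_RELEVANT:
        return "Discard"
    # R1: a relationship word AND a violence verb keeps the document.
    if tokens & RELATIONSHIPS and tokens & VERBS:
        return "Keep"
    return "Unmatched"

labels = [
    classify("Her husband beat her during the lockdown"),
    classify("Webinar: when a husband and wife abuse social media"),
    classify("Stay home, stay safe"),
]
```

Documents that match neither rule are returned as "Unmatched" here so that they can be reviewed separately instead of being silently dropped.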

 

The dataset was now filtered largely according to our needs. Next comes the task of modeling, but training a supervised model requires annotations.

 

Deciding Labels and Annotating Tweets

Eminent domain experts helped come up with the categories used to classify tweets based on their context. Proper labeling guidelines were set up, and training sessions helped collaborators label tweets properly, keeping the edge cases in mind. For document classification, a quick and awesome tool called Doccano was used. Several collaborators helped by taking up queues of data points and annotating them. The following labels were used:

  • DV_OPINION_ADVOCATE
    (advocating against domestic abuse)
  • DV_OPINION_DENIER
    (denying the existence of domestic abuse)
  • DV_OPINION_INFO_NEWS
    (stating factual information or news)
  • DV_STORY
    (describing an incident of domestic abuse)
  • NON_D_VIOLENCE_ABOUT
    (other kinds of harassment)
  • NON_D_VIOLENCE_DIRECTED
    (harassment directed at individual or community)
  • NO_VIOLENCE

 

Figure: Analytics derived from NER.

 

 

And data for modeling is ready!

After all the annotations and some fabulous work with collaborators, we’re ready with an incredible training dataset.

Tinkering with Natural Language Processing…

Once the texts had been pre-processed (lowercasing, removing URLs, punctuation, and stopwords, followed by lemmatization), we were ready to play around with modeling techniques.
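A minimal sketch of such a pre-processing pipeline is shown below, using only the standard library. The stopword set and the lemma lookup table are tiny hypothetical stand-ins; in practice these steps would come from an NLP library's stopword list and lemmatizer.

```python
import re
import string

# Tiny stand-in for a real stopword list.
STOPWORDS = {"the", "a", "an", "is", "was", "her", "his", "to", "and", "of", "in"}
# Tiny stand-in for a real lemmatizer: maps inflected forms to a base form.
LEMMAS = {"beaten": "beat", "beats": "beat", "wives": "wife", "abused": "abuse"}

def preprocess(text):
    """Lowercase, strip URLs and punctuation, drop stopwords, then lemmatize."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [token for token in text.split() if token not in STOPWORDS]
    return [LEMMAS.get(token, token) for token in tokens]

tokens = preprocess("Her husband BEATS her, see https://example.com/story!")
```

URLs are removed before punctuation so that stripping "://" and "." does not leave URL fragments behind as tokens.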

To convert words to vectors (machines learn through numbers), we experimented with a TF-IDF vectorizer. It gave good results, but our training vocabulary was very limited, while the inference data would contain a greater variety of words. Therefore, we decided to use pre-trained word embeddings.
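The vocabulary limitation can be seen in a small hand-rolled TF-IDF sketch. This is not the vectorizer used in the project; the three training documents are invented, and the smoothed IDF formula is one common variant chosen for the example.

```python
import math
from collections import Counter

def fit_idf(tokenized_docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Weight term counts by IDF; terms never seen in training are dropped."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}

train_docs = [
    ["husband", "beat", "wife"],
    ["wife", "abuse", "helpline"],
    ["helpline", "campaign"],
]
idf = fit_idf(train_docs)
seen = tfidf_vector(["husband", "abuse"], idf)
unseen = tfidf_vector(["spouse", "assault"], idf)  # words absent from training
```

The second tweet is reduced to an empty vector because none of its words were in the training vocabulary, even though it is clearly on topic. Pre-trained embeddings avoid this by placing related words such as "spouse" and "husband" close together in vector space.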

Our model used FastText English word vectors (wiki-news-300d-1M.vec) and IndicNLP word vectors (indicnlp.v1.hi.vec) for the Hindi and Hinglish text present in the documents.

Since tweets related to DV stories were relatively few in number, data augmentation was used on these, creating new sentences by replacing the original words with synonyms.

nlpaug is a library for textual augmentation in machine learning experiments. The goal is to improve model performance by generating augmented textual data. It’s also able to generate adversarial examples to prevent adversarial attacks.
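The idea behind synonym-based augmentation can be sketched without any library. The snippet below is a simplified stand-in, not the augmentation library's API: the synonym table is hypothetical and hand-written, whereas a real augmenter would draw synonyms from a lexical resource or from word embeddings.

```python
import random

# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "beat": ["hit", "struck"],
    "husband": ["spouse", "partner"],
    "scared": ["afraid", "frightened"],
}

def augment(sentence, rng, replace_prob=0.5):
    """Create a new sentence by randomly swapping known words for a synonym."""
    new_words = []
    for word in sentence.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < replace_prob:
            new_words.append(rng.choice(options))
        else:
            new_words.append(word)
    return " ".join(new_words)

rng = random.Random(42)
original = "my husband beat me and i am scared"
variants = [augment(original, rng) for _ in range(3)]
```

Each variant keeps the label of the original sentence, which is what makes this a cheap way to add samples to an under-represented class.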

 

 

Bringing into play  – Machine Learning Model(s)

A number of models, including BERT, SVM, XGBoost, and many more, were evaluated. Since the differences between some similar classes were really minute, we needed to combine two sets of labels.

After combining similar labels:

 

 

Limitations faced: data under the various classes was not easily separable because 3 classes plainly talked about Domestic Violence (story, opinion, news/info), which made it tough for the classifier to spot marginal variations in semantics.

Also, DV_STORY had the fewest samples even though it was the most relevant class.

Hence, to deal with the imbalanced dataset, undersampling with NeighbourhoodCleaningRule from the imbalanced-learn library was used. The resampled data was fed to stacked models.

Stacking is a way of combining predictions from multiple different types of ML models. It introduces the concept of a meta-learner: a model trained on the predictions of the base (level 0) models to produce the final prediction.

 

 

Source: GeeksforGeeks

 

Level 0 learners:
– Random Forest Classifier
– Support Vector Classifier
– MLP Classifier
– LGBM Classifier

Level 1 meta-learner:
SVC with hyperparameter tuning and custom class weights.
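A stacked setup along these lines can be sketched with scikit-learn's StackingClassifier. This is a minimal illustration under several assumptions, not the project's model: the data is synthetic, the LGBM learner is left out to keep the example dependent on scikit-learn only, and `class_weight="balanced"` with default hyperparameters stands in for the tuned, custom-weighted SVC described above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic, imbalanced 5-class data standing in for the embedded tweets.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=10,
    n_classes=5, weights=[0.1, 0.2, 0.2, 0.2, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Level 0 learners (LGBM omitted in this sketch).
level0 = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svc", SVC()),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
]

# Level 1 meta-learner, trained on cross-validated level 0 predictions.
stack = StackingClassifier(
    estimators=level0,
    final_estimator=SVC(class_weight="balanced"),
    cv=3,
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)
```

Training the meta-learner on cross-validated predictions, rather than on predictions for data the base models have already seen, keeps it from simply trusting whichever base model overfits the most.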

 

Class Encoding — 0: DV_INCIDENT, 1: DV_OPINION, 2: DV_OPINION_INFO_NEWS, 3: NON_D_VIOLENCE_ABOUT, 4: NO_VIOLENCE

 

This pretty much sums up the modeling. The model was used to predict labels on an 8,000-row dataset of tweets. The misclassifications were skimmed through and corrected in some crucial classes in order to deliver the best data.

I feel glad to be a part of this incredible community of change-makers. Making some great connections through this amazing journey is the icing on the cake.

So excited to collaborate in the upcoming mind-blowing projects, making the world a better place using AI for good!

 

 

 


 
Demo Day Insights: How COVID-19 Pandemic Policies Affected Vulnerable Populations

By Devika Bhatia & Laura Clark Murray

 

A team of 28 AI experts and data scientists collaborated to gauge the impact of pandemic policies implemented in response to COVID-19 on vulnerable populations. The aim was to find correlations and encourage data-driven policymaking that lessens adversity for the most vulnerable populations around the world.

The entire data analysis including a live demonstration can be found in the demo day recording at the end of the article.

 

COVID-19 pandemic policy impacting the world’s vulnerable populations

At the onset of the pandemic in 2020, the World Health Organization urged governments to take “urgent and aggressive action” against COVID-19. Many governments reacted with strict measures such as closing borders and quarantining entire cities. Governments all over the world enacted these policies without fully analyzing the factors that impact their effectiveness. Nor did they consider how these policies might deepen the problems for vulnerable populations in different regions.

 

The project goal: Conduct data-driven impact-analyses on how various pandemic policies affect the well-being of vulnerable populations.

 

Defining “Vulnerability”

An important step of the project was to define “vulnerability” with respect to the particular context. The project focused on three factors: employment and wage loss, access to health, and domestic violence. To identify the vulnerable population for each of these categories, the team looked to the Inequality-adjusted Human Development Index and considered people above 65 years of age and women.

 

Source: UNDP

 

 

 

 

Assessing policies and their effects

The team looked at 17 types of policies from the Oxford COVID-19 Government Response Tracker, across the categories of containment, economic response, and health systems. The policies explored included closing of public transportation, stay-at-home requirements, income support, COVID-19 testing policy, and emergency investment in healthcare.

To analyze the effects of these policies, three key aspects were considered:

  • Time of policy enactment: comparing the time of policy enactment with the effect on a target variable
  • Stringency metric: the degree of intensity of the policies enacted
  • Google Mobility Dataset: quantifies the movement of people in places (e.g. grocery stores vs. parks)

 

Domestic violence as a ‘Shadow pandemic’

The analysis indicated that domestic violence is a growing shadow pandemic: countries displayed a relationship between a decrease in mobility and an increase in Google search rates for relevant topics, coupled with an increase in the number of domestic violence-related articles.

The number of news articles related to both COVID-19 and domestic violence started to increase a couple of weeks after the first lockdown measures were implemented in Europe (end of February).

 

Figure 1: Ratio of news articles by recording date

 

The data shows a strong relationship between a decrease in mobility and an increase in Google search rates of domestic abuse topics in many countries. In the countries considered, other than Japan, the peak in search rates has doubled or even tripled, as seen in these graphs of the data from France and India.
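One simple way to quantify such a relationship is a Pearson correlation between the two time series. The sketch below is illustrative only and is not necessarily the measure the team used; the weekly values are made up for the example and are not the project's data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up weekly values for illustration only; not the project's data.
mobility_change = [0, -5, -20, -45, -60, -55]  # % change from baseline
search_interest = [20, 22, 30, 48, 60, 58]     # relative search index

r = pearson(mobility_change, search_interest)
```

A coefficient close to -1 means search interest rises as mobility falls. Correlation alone does not establish that lockdowns caused the increase, which is why the analysis also draws on news coverage.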

 

Figure 2a: Search trend and mobility change (%) by date for France, for two policy categories: School Closing and Workplace Closing

Figure 2b: Search trend and mobility change (%) by date for India, for two policy categories: School Closing and Workplace Closing

 

The results indicate that the problem of domestic violence could be much bigger than indicated by news stories.

 

Access to healthcare

A further angle in this challenge was the effect of COVID-19 pandemic-response policy measures on access to healthcare, specifically for non-COVID patients. The vulnerable population was defined based on age, existing chronic medical conditions, and physical access to care facilities. The analysis was focused on England and Wales, where there was significant relevant data.

The team found higher mortality among patients with non-COVID chronic diseases during the pandemic than for the same group in previous years. The data shows a correlation between medical appointment status (whether an appointment was kept, changed, or canceled), the stringency of the pandemic policies enacted for the region, and the mobility of the population in that region. In other words, the stringency of pandemic policy and the resulting restrictions on the mobility of a population may cause the medically vulnerable to miss or avoid regular medical care. This may be contributing to the increase in non-COVID deaths among this group.

 

 

 

The economic impact of pandemic policies

Closures, lockdowns, and decreased mobility have led to wage and employment loss. Though some governments have instituted income support policies, the timing of that aid correlates with employment loss. In countries where income support policies were put in place at roughly the same time as stringent lockdown policies such as workplace closings, the unemployment rate remained relatively flat. This was the case, for example, in Sweden and Belgium. In contrast, a delay in the implementation of income support policies correlates with an increased unemployment rate, as was seen in the United States.

Income support policies may affect individuals in the labor force differently. Many countries have undergone employment and wage loss in the informal economy, wherein enterprises, jobs, and workers are not protected by the state.

The team set out to identify the most economically vulnerable populations in this context. The analysis focused on those countries with stringent lockdowns that have implemented income support policies, and in which the population works in sectors highly-impacted by the pandemic policy, such as accommodation and food service, manufacturing, and retail trade.

Some of the results of this analysis are represented here by a mapping of countries according to the stringency of their pandemic policies and the share of their labor force working in highly impacted sectors. Each country is represented as a circle, the color and size of which indicate the vulnerability of the workforce in terms of the share of the workforce involved in informal labor.

 

Figure 3: Vulnerability ranked by informality rate. The circle size denotes vulnerability, defined in terms of the percentage of workers in high-impact sectors and the share of the workforce involved in informal labor.

 

This type of typology of labor force vulnerability during the pandemic may be useful in indicating which groups to attend to with income support policies.

 

Conclusions

While government lockdown policies were designed to slow the spread of COVID-19, they had direct and indirect negative effects on their populations.

  • We found that non-COVID deaths among those with existing health conditions increased during the pandemic, for the population studied. For this medically vulnerable population, we found a relationship between the stringency of lockdown policy, the level of mobility within a locality, and the delivery of non-COVID, potentially life-saving, healthcare.
  • Domestic violence emerged as a growing “shadow pandemic”. We found a strong relationship between a decrease in mobility of a population and indicators of domestic violence.

 

To offset the economic impact of anti-contagion policies, many governments instituted income support policies.

  • We determined that the timing of income support policies mattered. For the locations studied, when income support policies were implemented at the same time as lockdown measures, unemployment rates stayed flat. In contrast, in countries where income support policies were delayed, unemployment rate curves remained steep even after policy implementation.
  • The team created economic vulnerability assessments of countries, by considering the stringency of lockdown policies and the share of the labor force involved in highly-impacted sectors and in the informal economy. Income support policies may be more effective when such vulnerability is considered.

 

Our objective with these results is to support policymakers in finding the most effective ways to minimize the suffering of those most vulnerable.

 

Find all insights in the demo day recording

 
All Collaborators from this project

We thank our partner organizations, AI for Peace, SH4P, and PWG, as well as all Omdena collaborators who made the project a success.

 


Domestic Violence - The Shadow Pandemic of COVID-19


 

By Omdena Collaborator Elke Klaassen


 

 

The Problem: Effects of policy measures on the vulnerable population

 

To prevent the spread of COVID-19, many governments have been taking strict measures such as closing borders, imposing nationwide lockdowns, and setting up quarantine facilities. While these measures may ensure that social distancing is followed seriously, they may have indirect effects on the economy and adverse effects on the well-being of people, especially the vulnerable population. To help governments make data-driven policy decisions that deal effectively with issues arising during COVID-19, such as domestic violence, Omdena provided an enabling platform for AI experts, data scientists, and domain experts to study the effects of COVID-19 policy measures on the vulnerable population. This article describes the results of one of the many facets of this challenge, which used data analysis to examine the impact of COVID-19 on domestic violence.

The goal of this task was to get a better grip on domestic violence during COVID-19 and gauge the scale of the problem. To this end, different data sources were used, including news articles, policy data, mobility trends, and domestic violence search rates. The results indicate that the problem of domestic violence could be much bigger than indicated by some of the key figures mentioned in the news. Further, restrictions on movement and strict enforcement of lockdowns may have amplified the issue. Domestic violence can be described as a shadow pandemic, and it is essential to understand the gravity of the problem and ensure redressal and support for survivors and vulnerable populations.

 

 

Domestic violence: a growing shadow pandemic of COVID-19

UN Women recently labeled the increase in violence against women ‘a growing shadow pandemic’. As a consequence of COVID-19 lockdown measures, many victims find themselves confined in close proximity to their abusers. The world is witnessing a sharp rise in the number of helpline calls and domestic violence reports, as illustrated in the following infographic. This highlights the pressing need to reflect on the pre-existing and growing incidence of domestic violence and to sensitize organizations and communities at the grassroots level to provide help and support.

 

 

Infographic on COVID-19 and domestic violence, adapted from UN Women.

 

 

 

The shadow pandemic’s size— news coverage

The news is replete with reports and cases of domestic violence and its surge during the pandemic. During March beginning, the increase in domestic violence in China received coverage in the news. In the Hubei Province the number of reported cases had tripled in February, compared to the same period last year. Weeks later, similar articles appeared from all over the world.

To get a first impression of the gravity and spread of this shadow pandemic, a dataset of about 80,000 Covid-19-related news articles was used. This dataset was created using GDELT to query relevant articles and news-please to extract their contents. The dataset has been used for several analyses in the Omdena AI pandemic challenges. To identify the news articles related to domestic violence, the corpus was filtered using domestic violence-related keywords. In total, 1,500 articles were linked to both Covid-19 and domestic violence.
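
The keyword filtering step can be sketched roughly as follows. This is a minimal illustration, not the code used in the challenge: the keyword list, the `filter_dv_articles` helper, and the sample records are all hypothetical, and it assumes each article is a dictionary with a `text` field.

```python
import re

# Hypothetical keyword list; the keywords actually used in the challenge
# are not given in this article.
DV_KEYWORDS = ["domestic violence", "domestic abuse", "intimate partner violence"]

# One case-insensitive pattern that matches any keyword as a whole phrase.
DV_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in DV_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def filter_dv_articles(articles):
    """Return the articles whose text mentions at least one keyword."""
    return [a for a in articles if DV_PATTERN.search(a["text"])]


# Illustrative sample records, not real data.
sample_articles = [
    {"date": "2020-03-20", "text": "Helplines report a rise in domestic violence during lockdown."},
    {"date": "2020-03-21", "text": "Borders close as Covid-19 cases climb."},
    {"date": "2020-03-22", "text": "Charities warn of Domestic Abuse risks at home."},
]

dv_articles = filter_dv_articles(sample_articles)
print(len(dv_articles), "of", len(sample_articles), "articles mention domestic violence")
```

A plain keyword match like this is cheap to run over tens of thousands of articles, but it can pick up false positives, which is one reason to check the relevance of the filtered subset afterwards.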

 

 

Covid19 and domestic violence-related articles

To assess the relevance of the subset of domestic violence-related news articles, LDA topic modeling was performed using gensim. Three topics were modeled, and one of them clearly shows that the subset covers domestic violence. The word cloud of this topic is shown in the figure.

 

Graph between Number of news articles and Date

Number of both Covid19 and domestic violence-related news articles over time.

 

The absolute increase in domestic violence-related articles

The number of news articles related to both Covid-19 and domestic violence started to increase a couple of weeks after the first lockdown measures were implemented in Europe (end of February).

 

Relative increase

The increasing trend in domestic violence-related articles could simply be explained by an overall increase in Covid-19-related articles. To study whether the topic of domestic violence has become more dominant in the discussion, the ratio of domestic violence-related articles to the total number of Covid-19-related articles is plotted. An increasing trend can be observed, indicating that the issue of domestic violence has become more prominent since the onset of the pandemic.
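
The difference between the absolute and the relative trend can be illustrated with a small sketch. The `dv_share_by_date` helper and the article counts below are made up for illustration; they only assume that a publication date is known for every article.

```python
from collections import Counter


def dv_share_by_date(all_dates, dv_dates):
    """Share of Covid-19 articles per date that also cover domestic violence."""
    total = Counter(all_dates)
    dv = Counter(dv_dates)
    # Only dates with at least one Covid-19 article appear in `total`,
    # so the division is always defined.
    return {d: dv[d] / total[d] for d in sorted(total)}


# Illustrative counts, not real data: the total number of articles grows
# fourfold, while domestic violence-related articles grow eightfold.
all_dates = ["2020-03-01"] * 10 + ["2020-04-01"] * 40
dv_dates = ["2020-03-01"] * 1 + ["2020-04-01"] * 8

shares = dv_share_by_date(all_dates, dv_dates)
for date, share in shares.items():
    print(date, round(share, 2))
```

In this toy example the share rises from one article in ten to two in ten, so the topic gains ground even after accounting for the overall growth in coverage.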

 

 

Graph between Ratio of Domestic Violence News Articles and Date

Domestic violence-related news articles relative to Covid19-related news articles.

 

 

The shadow pandemic’s size — search rates

The data mentioned in the news is typically in summary form, similar to the key figures shown in the UN Women infographic. To get a more detailed picture of the extent of the shadow pandemic, three datasets were used:

  • Policy data:
    Oxford COVID-19 Government Response Tracker (OxCGRT), which covers the policy measures taken in 152 countries (accessed on May 8, 2020).
  • Mobility data:
    Google COVID-19 Community Mobility Reports, which indicate the percentage changes in mobility patterns in 132 countries (accessed on May 8, 2020). The data is relative (_rel) to the mobility patterns between January 7 and February 7, 2020. To limit noise, a moving average (_ma) filter of 7 days (1 week) was applied.
  • Search data:
    Google Trends data, which indicates the search trend of a certain topic over time (accessed on May 8, 2020). To get the percentage change (_rel) in search rates, this data is made relative to a baseline period as well (Jan 3 to Feb 13). To limit noise, a moving average filter (_ma) of 14 days (2 weeks) was applied to the Google Trends data.
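
The two preprocessing steps in the list above, the change relative to a baseline (_rel) and the moving average (_ma), can be sketched as follows. This is a minimal illustration under stated assumptions: the baseline is taken as the mean of the baseline period, the moving average is a trailing one, and the function names and numbers are hypothetical. The article does not specify these details, and a window of 2 is used here only to keep the example short (the analysis used 7 and 14 days).

```python
def relative_change(values, baseline_values):
    """Percentage change of each value relative to the mean of the baseline period."""
    baseline = sum(baseline_values) / len(baseline_values)
    return [100.0 * (v - baseline) / baseline for v in values]


def moving_average(values, window):
    """Trailing moving average; early points average over the values seen so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Illustrative daily search interest, not real Google Trends data.
baseline_period = [40, 50, 60]   # mean of the baseline period is 50
observed = [50, 75, 100, 100]

search_rel = relative_change(observed, baseline_period)
search_rel_ma = moving_average(search_rel, window=2)

print(search_rel)     # [0.0, 50.0, 100.0, 100.0]
print(search_rel_ma)  # [0.0, 25.0, 75.0, 100.0]
```

Expressing both mobility and search rates as percentage changes from a pre-pandemic baseline puts them on a comparable scale, and the smoothing keeps day-to-day fluctuations from dominating the comparison.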

 

The data analysis focuses on countries that are present in all three datasets and that have sufficient Google Trends data available, defined as data for at least 50% of the considered time period (Jan 3 to May 8). This resulted in a total of 53 countries being included in the analysis.
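
The country selection can be sketched as a set intersection followed by a coverage check. The helper names and the toy data below are hypothetical; the sketch assumes each dataset is a dictionary keyed by country and that missing daily values are stored as `None`.

```python
def has_enough_data(series, min_share=0.5):
    """True if at least `min_share` of the daily values are present (not None)."""
    if not series:
        return False
    present = sum(1 for v in series if v is not None)
    return present / len(series) >= min_share


def select_countries(policy, mobility, trends, min_share=0.5):
    """Countries present in all three datasets with sufficient search data."""
    common = set(policy) & set(mobility) & set(trends)
    return sorted(c for c in common if has_enough_data(trends[c], min_share))


# Toy data for illustration only.
policy = {"France": [], "Japan": [], "Vietnam": []}
mobility = {"France": [], "Japan": []}
trends = {
    "France": [10, 12, None, 15],     # 75% of days available
    "Japan": [None, None, None, 5],   # 25% of days available
    "Vietnam": [1, 2, 3, 4],          # complete, but missing from mobility data
}

selected = select_countries(policy, mobility, trends)
print(selected)  # ['France']
```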

Search trend data is relevant for studying the scale of the problem only where a person seeking help has access to the internet and a certain level of trust that societal organizations can offer help. Evidently, these two conditions are not met to the same degree in every country. This is reflected, among other things, in the Human Development Data, for example in the percentage of the (female) population that has access to the internet. Hence, the results should be considered with these conditions, caveats, and nuances in mind.

Nevertheless, the use of search rates has a clear advantage. A victim’s path to seeking and receiving help is expected to consist of several steps, and every succeeding step requires more courage. The most basic step might be to browse the web for ways to deal with, and seek help for, domestic violence. Hence, search rate data might reflect the scale of the real problem more accurately than the number of domestic violence reports, because a web search is probably the first step a victim takes in seeking assistance.

 

Correlation between policy measures, mobility, and domestic violence search rates

 

The first step in the analysis is to study correlations between the different features in the dataset. The correlation plot for France is shown below. A strong negative correlation (-0.95) between workplace mobility and domestic violence search rates can be observed. As expected, workplace mobility also correlates strongly with the workplace closing policy measure implemented by the government.
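
For readers unfamiliar with the measure, the sketch below computes a Pearson correlation coefficient between two series by hand. It assumes that the correlation plot reports Pearson coefficients, which the article does not state explicitly, and the two series are invented for illustration rather than taken from the data for France.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented series: workplace mobility falls while search rates rise
# (both as percentage change from baseline).
workplace_rel_ma = [0, -10, -30, -50, -60]
search_rel_ma = [0, 20, 35, 80, 90]

r = pearson(workplace_rel_ma, search_rel_ma)
print(round(r, 2))
```

A value close to -1 means that the two series move in opposite directions almost in lockstep; it does not by itself show that one causes the other.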

 

 

Correlation plot of the different features of the policy, mobility, and search rate dataset (France).


 

 

Graph between Search Trend and Mobility Change and Date

Policy measures, mobility, and search rate trends over time (France).

 

In the figure, workplace mobility and domestic violence search rates are plotted over time. The negative correlation between the two variables is visible: workplace mobility decreases while, at the same time, domestic violence search rates increase. Compared to the baseline, search rates almost doubled (an increase of nearly 100%). This indicates that searches for information related to domestic violence increased as workplace mobility declined and people found themselves stuck at home.

 

Regression models to quantify the effect of mobility on domestic violence search rates

Regression models were used to assess the size and significance of the relationship between workplace mobility and domestic violence search rates.

 

Regression model results of the impact of mobility on domestic violence search rates (France).


 

The straight line in the scatter plot shows the output of the regression model for the case study of France. The relationship between mobility and domestic violence search rates is significant, and the slope indicates that with every 1% decrease in mobility, domestic violence search rates increase by 1.4%.
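
To make the interpretation of the slope concrete, the sketch below fits a straight line by ordinary least squares. The article does not specify the type of regression model, so ordinary least squares is an assumption here, and the data points are constructed so that the slope comes out at -1.4; they are not the actual data for France. The significance test (p-value) is not reproduced.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Constructed example: each 10-point drop in mobility goes along with
# a 14-point rise in search rates (both as percentage change from baseline).
mobility_change = [0, -10, -20, -30, -40]
search_change = [0, 14, 28, 42, 56]

slope, intercept = fit_line(mobility_change, search_change)
print(round(slope, 2), round(intercept, 2))
```

A negative slope means that search rates rise as mobility falls, and its magnitude gives the size of the change in search rates per unit change in mobility.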

The results of the models for the top 10 and bottom 10 countries are listed below. In the top 10 countries, decreasing mobility correlates with a steep increase in domestic violence search rates. In the bottom 10 countries, the opposite trend is observed: mobility and domestic violence search rates decrease at the same time. To further study and explain the results of the different models, the individual plots for six of the top 10 and six of the bottom 10 countries are shown in the next sections.

 

 

Tabular format describing top 10 countries to bottom 10 countries defining Pvalues, Coefficient, Country, and Significance.

 

 

Countries illustrating a strong relationship between a decrease in mobility and an increase in domestic violence

The individual figures for the first six of the top 10 countries are shown. In these countries, there is a strong relationship between a decrease in mobility and an increase in domestic violence search rates.

 

Graph Between Search Trend and Mobility Change % vs date for 6 countries namely, Vietnam, Japan, South Africa, Germany, France, and Belgium.

 

  • With the exception of Japan, search rates at their peak doubled or even tripled compared to the baseline in each of the illustrated countries.
  • Although the coefficient for Japan is relatively high, the peak in search rates is ‘just’ 60%. This is due to a relatively limited decrease in mobility, likely because of less strict lockdown measures in this country.
  • Vietnam stands out with a peak in domestic violence search rates of more than triple the baseline. The issue of domestic violence in light of social distancing in Vietnam is stressed in this article as well, which states that the number of people in need of shelter has doubled compared to 2018 and 2019.
  • The figures for Germany, France, Belgium, and South Africa clearly illustrate the increasing trend in domestic violence search rates as mobility drops.

 

Countries not illustrating a relationship between a decrease in mobility and an increase in domestic violence

The individual figures for the final six of the bottom 10 countries are displayed below. They show a positive relationship between mobility and domestic violence search rates.

 

Graph Between Search Trend and Mobility Change % vs date for 6 countries namely, Australia, Thailand, South Korea, Jamaica, El Salvador, and Philippines.

 

  • First of all, the plot for Australia stands out, showing a sharp increase in domestic violence search rates towards the end of February. This sudden rise is assumed to be a consequence of the bushfires that occurred around this time. This relationship is also expressed in this article: ‘The bushfires’ hidden aftermath: Surging risk of domestic abuse’.
  • In South Korea, lockdown measures can be considered more targeted than strict blanket measures, which could explain why the trend for this country differs from the others.
  • For the Philippines, Thailand, El Salvador, and Jamaica, a simultaneous drop in domestic violence search rates and mobility is visible. This does not mean that there have been fewer domestic violence incidents. Various other factors can influence the observed search rate trends. For example, the peaks in search rates in these countries in late February and early March could be explained by the (media) attention given to domestic violence around International Women’s Day on March 8. There was a large turnout for the marches held that day, both in Asia and Latin America.
 

 

Action is needed to mitigate the increase in domestic violence

This article studies the impact of the Covid-19 global pandemic on domestic violence. The increase in domestic violence can be viewed as the ‘growing shadow pandemic’. This is stressed by the news as well — there is an increasing trend in the number of articles that cover the issue. Some of these articles give insight into the gravity and scale of the ‘growing shadow pandemic’ in summary form. For example, the Infographic of UN Women, shown at the beginning of this article, mentions that in France, Argentina, Cyprus and Singapore domestic violence emergency calls and reports have increased by more than 30%.

 

The results indicate that the problem of domestic violence could be much bigger than indicated by some of the key figures in the news

The analysis of Google mobility and search rate trends shows that the effect of lockdown measures, such as the closing of workplaces, on domestic violence search rates can be much higher than 30%. In the countries where the inverse relationship between mobility and domestic violence search rates is strongest, search rates have doubled, and in some cases more than tripled. A search query could be considered the most accessible step in seeking help. This could explain why the results in this article indicate that the problem of domestic violence could be much bigger than the previously mentioned key figures suggest.

It is important to note that many other factors can influence the search rate results. The extent to which search rates accurately reflect the growing scale of the problem of domestic violence also depends on the situation in each country. As stated before, a victim is only expected to perform a search query if s/he has access to the internet and a certain level of trust that societal organizations can offer help. These conditions could explain why a strong relationship is found in many European countries in this study.

The aim of this work is to help raise awareness of the issue of domestic violence. Although some countries have taken steps to mitigate the problem, the results clearly indicate that the issue persists. In this light, the UN recently published a brief with ‘recommendations to be considered by all sectors of society, from governments to international organizations and to civil society organizations in order to prevent and respond to violence against women and girls, at the onset, during, and after the public health crisis with examples of actions already taken’.

 

 

 

More About Omdena

 

Omdena is an innovation platform for building AI solutions to real-world problems through the power of bottom-up collaboration.

 
 
