A Chatbot Warning System Against Online Predators

Using Natural Language Processing to warn children against online predators.

The following work is part of the Omdena AI Challenge on preventing online violence against children, implemented in collaboration with John Zoltner at Save the Children US.


Protecting Children

Today, children face an evolving threat: online violence. Violence and harassment of children have been growing rapidly for more than 20 years, and with recent events closing schools for over 1 billion children around the world, children are more vulnerable than ever. Online predators use Internet avenues popular with children and adolescents, such as game chat rooms, to lure them into sexual exploitation and even in-person assaults.

Protection against online sexual violence varies greatly from platform to platform. Some gaming platforms include a profanity filter that looks for problematic words and replaces them with a string of asterisks. Outside gaming, many chat platforms still have no safeguards in place to protect children from predatory adult conversations. However, chat logs can reveal how a predator attempts to lure children and young adults into risky situations, such as sharing photos, using web cameras, and sexting (sexual texting).

Pattern recognition techniques can automate the identification of these conversations for potential law enforcement intervention.

Unfortunately, manual review consumes many person-hours and spans many message categories, which makes these patterns all the more difficult to identify. It is a challenging task, but one worth tackling, and we elaborate on our approach in the rest of the article.


Online Predators



First, let us establish our working definitions.

  • The New Oxford American Dictionary defines a predator as a “person or group that ruthlessly exploits others.”
  • Daniel M. Filler, writing in the Virginia Journal of Social Policy & the Law (2003), extends the term to sexual predator: “a person seen as obtaining or trying to obtain sexual contact or favor with another person in a metaphorically ‘predatory’ manner.”


The Solution — Data Engineers Unite!



The team focused on the solution’s Predator Analysis portion.


Our solution aimed to reduce person-hours and to provide a near real-time warning system that alerts the child when the conversation’s sentiment changes. The team used a semi-supervised approach to evaluate whether a conversation poses a low, medium, or high risk to the child. The system evaluates each phrase or sentence and returns a sentiment warning when warranted. The data for our chatbot (predator-pseudo-victim conversations) was collected from interactions between a predator and a law enforcement officer or volunteer posing as a child.

The chatbot was designed to learn from non-predatory and predatory conversations and distinguish between them. Additionally, it would be able to recognize inappropriate messages whether they came from the predator’s or the child’s side; the corpus also contained adult-like conversations initiated from the child’s side.


The Dataset

The team consolidated and cleaned nearly 500 chat log files containing exchanges between a predator and a pseudo-victim. The collection grew into a corpus of more than 807,000 messages, ranging from “hello” to explicit remarks. Creating the dataset proved laborious; I volunteered more than 630 hours on labeling alone. Messages were labeled with attributes such as gender (male or female, as self-identified in the chats), role (predator or victim), and the conversation’s risk level. Nearly half of the project time was dedicated to building and parsing the dataset properly.

This dataset was split into training, development, and test sets. The training set held 75 percent of all messages, from which the chatbot learned the contextual format and nuances of conversation. The development set, 10 percent of the data, was held away from the chatbot until after model selection to validate the model; the remaining 15 percent formed the test set.
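As a sketch, the split might look like the following; the 15 percent test share follows from the stated 75/10 split, and the random seed is an illustrative choice:

```python
import random

def split_messages(messages, train_frac=0.75, dev_frac=0.10, seed=42):
    """Shuffle and split messages into train/dev/test sets.

    The 75/10 split follows the article; the remaining 15 percent
    becomes the held-out test set.
    """
    msgs = list(messages)
    random.Random(seed).shuffle(msgs)
    n = len(msgs)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    return msgs[:n_train], msgs[n_train:n_train + n_dev], msgs[n_train + n_dev:]

train, dev, test = split_messages(range(807000))
```

In practice, one would split by conversation rather than by individual message, so that exchanges from the same chat never leak across the sets.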

This two-minute video discusses how the team assembled the chatbot’s dataset.



Data Format and Storage

The data was housed in a relational database, which grew large enough to serve as a nexus providing uniquely formatted datasets for the machine learning pipeline.

During the labeling process, a few issues arose around how to semantically define a conversation. With many different log formats, ranging from AOL Messenger to SMS and other online platforms, sentences would start and stop at different points. I adopted a format similar to the widely used Cornell Movie-Dialogs Corpus, which provided a standard structure that made the data easy to parse. Additionally, the corpus contained chat slang, abbreviations, and number-for-word substitutions, like “l8r” for “later” and “b4” for “before”, which required a team consensus on how to handle these tokens. The team did not focus on timestamps due to extremely varied formatting, missing values, and their lack of importance to the overall project.
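A lightweight way to handle such tokens is a per-token normalization map applied before modeling; the dictionary below is a small illustrative sample, not the team’s actual list:

```python
# Illustrative chat-slang normalization map (not the team's full list).
SLANG = {
    "l8r": "later",
    "b4": "before",
    "u": "you",
    "r": "are",
}

def normalize(message: str) -> str:
    """Lowercase a message and expand known slang tokens."""
    return " ".join(SLANG.get(tok, tok) for tok in message.lower().split())

normalize("see u l8r")  # "see you later"
```

A real pipeline would also strip punctuation before lookup, so "l8r!" still matches.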


The Model

Many models were candidates for the chatbot’s internal workings. The team’s main goal was a local, offline solution for now, to reduce privacy concerns and legal issues. Future iterations of this project would revisit these constraints with appropriate development operations.



Basic sequence-to-sequence model diagram.


The selected model centered on a Long Short-Term Memory (LSTM) network cell, arranged in a sequence-to-sequence configuration. LSTMs have long proved well-suited to sequential data, and our application uses this ability to help the chatbot predict the next plausible word in a response.
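To make the LSTM cell’s sequential bookkeeping concrete, here is a single cell step in plain NumPy; the shapes and random weights are illustrative, and a real seq2seq model stacks trained cells in an encoder and a decoder:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step: gates computed from input x and previous hidden h_prev.

    W has shape (4*hidden, input+hidden); b has shape (4*hidden,).
    """
    z = W @ np.concatenate([x, h_prev]) + b
    H = h_prev.size
    i, f, o, g = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # cell state update
    h = sigmoid(o) * np.tanh(c)                         # new hidden state
    return h, c

rng = np.random.default_rng(0)
x = rng.normal(size=8)           # e.g. an 8-dim word embedding
h = c = np.zeros(16)             # hidden size 16
W = rng.normal(scale=0.1, size=(64, 24))
b = np.zeros(64)
h, c = lstm_step(x, h, c, W, b)
```

The gating is what lets the cell carry context across a long message sequence instead of forgetting it at every step.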

For the sentiment analysis portion, we focused our efforts on an ensemble learning model as well as a support vector machine to help predict when a conversation changed from benign to risky.
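A minimal sketch of the SVM side using scikit-learn; the six messages and their labels below are invented stand-ins for the labeled chat corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented examples standing in for the labeled chat corpus.
texts = [
    "hi how was school today", "do you like video games",
    "what are you wearing right now", "send me a photo dont tell anyone",
    "you can trust me this is our secret", "lets meet up alone",
]
labels = ["low", "low", "medium", "high", "high", "high"]

# TF-IDF features over unigrams and bigrams, fed to a linear SVM.
risk_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
risk_clf.fit(texts, labels)
print(risk_clf.predict(["this is our secret dont tell anyone"]))
```

With a corpus of realistic size, the same pipeline yields the low/medium/high signal the warning system consumes.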



Our team successfully built a chatbot and a sentiment analysis model independently. The chatbot learned from more than 807,000 messages how to parse sentences and structure a proper response. Its limited vernacular stemmed from limited training time and framework constraints.

The greatest performance challenge centered on the chosen platform: TensorFlow 1.0.0 imposed limitations. The code did produce a conversation-capable entity, but the model needs more training data to move beyond proof-of-concept and be deployed in an application.

The project successfully employed message sentiment analysis and was able to warn the user of potentially risky conversations initiated by online predators. The sentiment analysis classified conversations as low, medium, or high risk.

Future work will move this project to a fully functioning TensorFlow 2.1.0 environment, eliminating other frameworks, including PyTorch. The internal model will receive an updated LSTM structure, and performance will be improved with GPU acceleration, such as NVIDIA’s cuDNN library.

Augmenting Public Safety Through AI and Machine Learning

In this demo day, we took a close look at the tremendous potential AI offers for making communities safer by helping to reduce, prevent, and respond to crime. When it comes to public safety, it is often critical to act quickly. AI technologies can supplement the work of people, taking on monotonous and time-consuming tasks that would be impossible for humans to do effectively. Natural language processing can read and analyze public communications and news reports to detect potential problem areas and get ahead of violence. Of course, this work must be done responsibly and ethically.

Sharing her perspective on the impact that AI can have in keeping people safe was an expert in the field, ElsaMarie D’Silva, the Founder & CEO of the Red Dot Foundation. The Red Dot Foundation’s award-winning platform Safecity crowdsources personal experiences of sexual violence and abuse in public spaces. ElsaMarie is listed as one of BBC Hindi’s 100 Women, and her work has been recognized by numerous UN organizations and the SDG Action Festival.

To go a little deeper into the application of AI for public safety, we shared Omdena projects that took innovative approaches to make communities safer.


Case Study 1: Preventing sexual harassment through a safe-path finder algorithm

UN Women states that “1 in 3 women face some kind of sexual assault at least once in their lifetime.”

With the first case study, the Omdena team drew upon Safecity’s crowdsourced data about sexual harassment in public spaces and leveraged open-source data to build heatmaps and calculate safe routes through major cities in India. Part of the solution is a sexual harassment category classifier with 93 percent accuracy and several models that predict places with a high risk of sexual harassment incidents to suggest safe routes.





You can learn more about this and related projects here:


Case Study 2: Understanding gang violence patterns and actors through Twitter analysis

Our team worked in partnership with Voice 4 Impact, an award-winning NGO whose solution to violence in our communities addresses the questions people worldwide are asking: “How do we keep missing the signs?”

The Omdena team made use of natural language processing techniques, AI techniques that analyze text to understand what is being communicated. Machine learning algorithms were used to understand gang language, and AI models were built to detect violent messages on Twitter without profiling. The aim is to predict, and ultimately prevent, gang violence.




You can learn more about this and related projects here:


Case Study 3: Analyzing Domestic Violence through Natural Language Processing (NLP)

Finally, we presented Omdena’s work to uncover domestic violence in India hidden due to COVID lockdowns. This work is part of a project with the award-winning Red Dot Foundation and Omdena’s collaborative platform to build solutions to better understand domestic violence and online harassment patterns during COVID-19. The project used natural language processing techniques with social media, government reports, and other text content to create a dataset with which Safecity could mobilize local efforts to protect and support domestic violence victims.






You can learn more about this and related projects here:







Preventing Sexual Harassment Through a Path Finding Algorithm Using Nearby Search

Building a pathfinding algorithm in combination with heatmaps to identify ‘safe spots’ relative to a user’s coordinates and directions 

By Daniel Ma

The results from this case study depend on previous work on heatmaps, which predict places at high risk of sexual harassment incidents. Below are simulations using Nearby Search and Directions API.



Figure 1: Zoomed-out look of simulated ‘hotspots’ given coordinate boundaries in Mumbai


Again, this heatmap layer, generated using the gmaps Python package, is completely simulated, but it does give insight into what we could potentially work with. Next, we need to find nearby ‘safe spots’ and walking directions to them using a pathfinding algorithm. Thankfully, we can leverage the Google Maps API for this, namely the combination of the Nearby Search and Directions APIs:



Figure 2: Hospitals found within 800m walking distance using Nearby Search (Point A is the origin, Point Bs are destinations)


To quickly detail what was done here:

  • Used the Nearby Search API to find the nearest hospitals within 800 meters
  • Used the Nearby Search results to create a list of coordinates for these locations
  • Used the Directions API to find walking directions to these places
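The first step can be sketched by composing the Nearby Search request URL; no request is actually sent here, and YOUR_API_KEY is a placeholder:

```python
from urllib.parse import urlencode

# Compose a Places Nearby Search request URL (web-service endpoint).
# No HTTP call is made here; YOUR_API_KEY is a placeholder.
BASE = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
params = {
    "location": "19.0760,72.8777",  # origin in Mumbai (lat,lng)
    "radius": 800,                   # meters
    "type": "hospital",
    "key": "YOUR_API_KEY",
}
url = f"{BASE}?{urlencode(params)}"
```

The Directions API request is composed the same way, with origin and destination coordinates and a walking travel mode.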

If you do want to take a look and try it out on your own, check out the documentation for the Nearby Search API and the Directions API.

At this time, my fellow team members informed me that they had pushed a working version of the heatmap model and that we were able to incorporate its first version. So away with the simulated heatmap: here is what it actually looks like!



Figure 3: Heatmap of predicted sexual harassment incidents in Mumbai


A few quick points about this figure:

  • Given latitude and longitude boundaries and a 28-by-28 array of risk scores, we interpolate the specific coordinates of each spot or ‘grid point’ to create this lattice-looking overlay (coordinate calculations were done using the haversine formula)
  • Risk scores are discrete, ranging from 0 to 4, with 0 being the safest and 4 the most dangerous
  • The area of each ‘grid point’ works out to be ~1.74 km² (fairly large, meaning the resolution isn’t the finest)
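The haversine calculations mentioned above compute the great-circle distance between two coordinates; a compact version, assuming a mean Earth radius of 6,371 km:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    R = 6371.0  # mean Earth radius in km (assumed)
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * R * asin(sqrt(a))
```

Dividing the boundary box into equal haversine distances gives each grid point its coordinates, and hence the cell area quoted above.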

Let’s now superimpose the directions layer on top of this and see what we’re working with:



Figure 4: Nearby hospitals within 800 meters walking distance of Point A


As you can see, various hospitals are located within 800 meters; however, any user looking at this would intuitively avoid the top-side locations, since their risk scores are higher (illustrated by warmer colors). Looking at this, one can simply decide where to go by:

  • Prioritizing overall route safety
  • Choosing the closer location


Computing the risk associated with each route and finding the safest using heatmap analysis

However, when the user is under stress and needs to decide immediately, such cognitive effort becomes a luxury (akin to choosing a route with only traffic data provided). The application should complete these tasks for the user, and thus we come to the crux of the problem:

How can we determine the safety of routes?

Again, working our way from simple to more complex solutions, we can:

  • Determine overall route safety solely by the risk score associated with the destination
  • Calculate the average of the risk scores of the grid cells covered by the straight-line path from origin to destination
  • Calculate the average of the risk scores of the grid cells covered by each step of the route to the destination

Completing the first solution isn’t too complicated. In short, you have coordinates for each grid point, and finding the grid point closest to your destination, to obtain its risk score, amounts to finding the one with the shortest Euclidean distance (here we use a linearized method and assume there is no curvature between points, for simplicity).
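In code, this first solution reduces to a nearest-neighbour lookup over the flattened grid; the grid coordinates and scores below are invented for illustration:

```python
import numpy as np

def risk_at_destination(grid_coords, risk_scores, dest):
    """Risk score of the grid point nearest to dest by Euclidean distance.

    grid_coords: (N, 2) array of (lat, lon) grid-point coordinates.
    risk_scores: (N,) array of risk scores (0 = safest, 4 = most dangerous).
    """
    d2 = np.sum((grid_coords - np.asarray(dest)) ** 2, axis=1)
    return risk_scores[np.argmin(d2)]

# Invented 2x2 grid for illustration.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
scores = np.array([0, 3, 1, 4])
risk_at_destination(coords, scores, (0.9, 0.1))  # nearest point is (1, 0)
```

Since the ranking by squared distance equals the ranking by distance, the square root can be skipped entirely.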



Figure 5: Euclidean distance between points a and b


Taking on the second and third solutions requires leveraging external resources. We can reframe this problem as one of determining line of sight: how does one determine whether their vision is blocked by obstacles? In our context of finding the best route, how do we know if a route touches a grid point? Equipped with the knowledge that there is an algorithm for pretty much everything, we turn to Bresenham’s line drawing method.

Bresenham’s line algorithm is a line drawing algorithm that determines the points of an n-dimensional raster that should be selected in order to form a close approximation to a straight line between two points.

Computer graphics work in units of pixels, so we can see how this algorithm originated from the need to shade in pixels during graphical operations.

The general idea behind the algorithm is: given the starting endpoint of a line segment, the next grid point the line traverses on its way to the other endpoint is determined by evaluating where the line segment crosses relative to the midpoint of the two candidate grid points (above or below it).

As an attempt to walk through this method, we’ll study the following diagram:




Figure 6: Demonstration of Bresenham line drawing algorithm (courtesy of Rochester Institute of Technology)


  • Here, we assume the points refer to grid points (the center point of each grid box)
  • Assume we have a line segment spanning from left to right with an upward slope (from smaller (x, y) to larger (x, y) coordinates). Let us also say that the initial grid point P is shaded and has already been determined to be traversed
  • Considering point P as the previous grid point, we then need to determine the current grid point to be shaded. To do so, we consider the grid points E and NE (east and northeast of the previous grid point, respectively)
  • M is the midpoint between E and NE
  • Based on where the line segment crosses the line between E and NE relative to M (above or below it), we decide which grid point the line traverses: NE in this case, since the line intersects above M
  • The grid points E and NE (and consequently M) are then updated in reference to the current grid point, and the same method is applied repeatedly until we reach the opposing endpoint of the line segment
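The steps above translate into the classic integer-error form of Bresenham’s algorithm, which works in any octant:

```python
def bresenham(x0, y0, x1, y1):
    """Integer Bresenham line: grid points from (x0, y0) to (x1, y1)."""
    points = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy          # signed error stands in for the midpoint test
    while True:
        points.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:       # step east
            err -= dy
            x0 += sx
        if e2 < dx:        # step north
            err += dx
            y0 += sy
    return points

bresenham(1, 1, 7, 9)
```

The error term is the midpoint comparison from the walkthrough, kept in integers so no floating-point intersections are ever computed.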


When we complete the process of the pathfinding algorithm, we arrive at something like this:


Figure 7: Shaded-in grid points using Bresenham’s algorithm


Fortunately, we need not code this from scratch, as skimage already has an implementation. Here’s a small snippet of what the code looks like:

from skimage.draw import line
import matplotlib.pyplot as plt
import numpy as np

# create an array of zeros
example_arr = np.zeros((10, 10))
# set endpoints for the line segment
x0, y0 = 1, 1
x1, y1 = 7, 9
# draw the line (skimage's line takes row/column order)
rr, cc = line(y0, x0, y1, x1)
example_arr[rr, cc] = 1
# plot it out
plt.figure(figsize=(8, 8))
plt.imshow(example_arr, interpolation='nearest', cmap='viridis')
plt.plot([x0, x1], [y0, y1], linewidth=2, color='r')
plt.show()



Figure 8: Illustration of Bresenham’s line drawing method using skimage and matplotlib


As you can see, the approximations of the grid point shading are imperfect, as there are definitely grid points that the line has crossed that are not shaded in.

Of course, there is another line-coverage method that improves on this, in which every grid point the line crosses is shaded in; I won’t go into detail here, but the term to search for is finding the supercover of the line segment:



Figure 9: Method for finding the ‘supercover’ of a line segment


Zooming back out to the problem at hand, we apply these grid-coverage methods to compute the average risk score over each route’s steps and determine the best route. We prioritize route safety first, then distance (the safest route is suggested first; if two routes tie on safety, we suggest the closer destination).
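Putting the pieces together, route selection can be sketched as averaging the risk of the grid cells each route covers and sorting by (risk, distance); the routes below are invented:

```python
def choose_route(routes):
    """Pick the route with the lowest mean risk; break ties by distance.

    Each route is (name, distance_m, [risk scores of covered grid cells]).
    """
    def key(route):
        name, dist, risks = route
        return (sum(risks) / len(risks), dist)
    return min(routes, key=key)

# Invented candidate routes for illustration.
routes = [
    ("north", 650, [3, 4, 2]),  # shorter but riskier
    ("south", 780, [0, 1, 1]),
    ("east",  700, [0, 1, 1]),  # same risk as south, but closer
]
choose_route(routes)  # ("east", 700, [0, 1, 1])
```

The tuple key encodes the stated priority directly: mean risk first, walking distance as the tie-breaker.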

To top it off, we apply this method to our heatmap to illustrate how one route could look:



Figure 10: Route determined using line drawing methods between two points


More about Omdena

Omdena is an innovation platform for building AI solutions to real-world problems through the power of bottom-up collaboration.


Heatmaps and Machine Learning Based Prediction of Sexual Abuse Hotspots

This case study is part of our AI project with the award-winning NGO Safecity. Thirty-four Omdena collaborators built solutions for preventing sexual harassment, using machine-learning-driven heatmaps and pathfinding algorithms to identify safe routes with fewer sexual crime incidents in major Indian cities. A non-technical overview can be found here.

What is a heatmap?


According to the Oxford dictionary, a heatmap is “a representation of data in the form of a map or diagram in which data values are represented as colors”. One of the most effective ways to express a heatmap for mathematical modeling is through a matrix, where each cell represents a square portion of space in a given coordinate system and the color represents the intensity of the studied event within that cell.

In this case, the intensity represents the number of crimes women suffered in that area, at a specific time.

In figure 2 we can see an example of a heatmap predicted by a machine learning model, plotted on a 30-by-30 grid where each cell’s color represents the crime intensity in Delhi on August 13. The redder the cell, the riskier the place. The black dots represent the real crimes reported through the Safecity application on that date.

1. How can we use past aggregation maps to gain future insights into a specific region?

It is time for machine learning. Heatmap prediction is one of many fields studied by those who want to predict human behavior over time.



Figure 2 — Heatmap of crime intensity against women in Delhi


Heatmap prediction is a three-dimensional, or spatial-temporal, problem, as seen in figure 3. It involves a spatial component, since a heatmap in our case is a two-dimensional matrix, and a temporal component that depends on the granularity at which we choose to view it. This granularity, which aggregates the events in each cell, can be expressed in hours, days, months, and so on.



Figure 3 — Spatial and Temporal Dimensions of a Heatmap



2. Selecting the right model

Being passionate about convolutional neural networks, I searched for articles on the use of deep learning for crime forecasting that generates heatmaps.

The first paper found for the study used Spatial-Temporal Residual Networks (ST-ResNet) for crime forecasting. Bao et al. applied this technique as a regression problem to predict hourly crime counts on a grid-divided map over a region of Los Angeles; their great contribution was applying a spatial and temporal regularization technique to the dataset.

This ANN model aggregates heatmaps according to the concepts of trend, period, and closeness, where each term defines a temporal distance between heatmaps. Each has its own internal architecture inside the model, in which it learns features related to that distance, as we can see in figure 4. Residual convolutional units are used to learn the spatial dimension of each of those maps.



Figure 4 — Structure of ST-ResNet



2.1. A valuable lesson learned — Fail quickly and move forward

Grabbing a dataset available on Kaggle containing data about crimes in Los Angeles, I tried for two weeks to replicate their study with all the techniques they recommended, without achieving results that resembled the ones shown in their article, with that level of perfection. I contacted the authors asking for clarification but got only vague answers.

Finally, I gave up on this approach after talking with the author of another article that cites them, who said that even he had failed to reproduce their results.

A valuable lesson to share here: if you are not progressing with an idea, set a deadline to end it and move on to the next one. I searched for related articles that could cast some light on my problem and found alternative approaches.



2.2. Finding the best-fit model

Another deep learning model found during the study, SFTT, proposed by Panagiotis et al., uses a more complex combination of convolutional neural networks and the LSTM to predict the number and category of crimes.

The SFTT model, presented in figure 5, receives as input a sequence of heatmaps aggregated in time by class, and outputs a future heatmap with predicted hotspots (which the article defines as places where at least a single crime happened) and category probabilities.



Figure 5 — Structure of SFTT


Even though the model showed good results for the article’s authors, I couldn’t manage to reproduce them. In my implementations, either the model overfit terribly, or the number of parameters to train was astonishing, reaching some 32 million, and my machine was not capable of processing that in a reasonable time.

Unfortunately, I still had the “perfect score” mentality from the ST-ResNet article when working on this model, and I was frustrated most of the time with what I was achieving. Only later, when I read Udo Schlegel’s master’s thesis, Towards Crime Forecasting Using Deep Learning, in which he uses encoder-decoders and GANs to predict heatmaps and shows results that looked more similar to the ones I had found, did I change my mind about crime prediction.

Crime predictability, as I would discover near the challenge’s end, is about predicting trends and regions, not lone criminal minds. You can’t play Minority Report, as in the Tom Cruise movie, in this field.

Even though I failed with this model, I would consider retaking its study and implementation in the future, since it could help immensely in cases where we want a clear distinction between predicting ordinary incidents and rape. In my opinion, the kind of intervention or warning needed in the two situations should be treated differently, since the impact of the latter can be life-changing.

Finally, after a lot of study and research, I found the article that best fit the solution I was looking for. Noah et al. proposed a convolutional neural network combined with an LSTM (figure 6) to predict burglary in Sweden using heatmap analysis. They turned a regression problem into a classification one: instead of predicting the number of crimes that could occur in the future, they classified each cell with a risk probability.




Figure 6 — Structure of Conv + LSTM



The model’s input is a sequence of binary maps in time, where each cell contains one if at least one incident happened there (burglary, in the article’s case) and zero otherwise; the output is a risk probability map for the next period. Since we had little data to work with, it was easier to turn our samples into binary maps than into the aggregated count maps used for regression problems. Being loose in aggregation let the spatial dimension be explored even further, amplifying the resolution of the heatmaps, since accumulation was no longer a necessity, just the existence of a single incident inside the delimited cell.




The Solution

The model that fit our case best was the Conv + LSTM.

The dataset provided was stored in a spreadsheet containing around 11,000 rows, where each row was a reported incident that happened to a woman at a certain place on the globe, with information such as incident date, description, latitude, longitude, place, category, and so on.

From this point the following pipeline was adopted in order to build a solution:

  • Data Selection
  • Data Pre-Processing
  • Modeling
  • Evaluation
  • Prediction

1. Data Selection

Most open datasets used for crime studies contain lots of detailed samples, reaching numbers that vary from 100,000 to 10,000,000 reports, since they are generally provided by local police departments, which have good computerized systems capable of collecting, organizing, and storing this valuable information over a delimited region.

Since building a global-scale heatmap for crime prediction with a useful resolution is not feasible, we selected the cities that concentrate most of the data, Delhi and Mumbai, with approximately 3,000 samples for each.

After the selection, spatial and temporal boundaries were drawn to encompass the majority of the useful data for training the algorithm.

For the spatial part, we used an algorithm that selects the best latitude and longitude boundaries based on a threshold number of events and the grid division. The goal was to cut the number of places with an irrelevant number of occurrences. As shown in figure 7, the first dataset encompassing Delhi had many zeros near the latitude and longitude boundaries; after applying the algorithm, we reduced the space to where most of the relevant data was located, with minimal loss.



Figure 7 — Boundaries selection for spatial dimension on the Delhi dataset. Fig. (a) is the original dataset with all reports while fig. (b) is the dataset that encompasses the most valuable data.


On the temporal dimension, we had data from 2002 to 2019. Grouping it by month, it was possible to verify that not all years had a relevant or useful amount of information, so the strategy was to select only reports within the concentration range, as can be visualized in figure 8.



Figure 8 — Boundaries selection for temporal dimension on the Delhi dataset. Fig. (a) is the original dataset with all reports ranging from 2002 to 2019 while fig. (b) shows data just selected from 2013 to 2017.



Once useful spatial and temporal data were selected, the next step was to create the heatmaps.


2. Data pre-processing

In the data pre-processing step, we defined the spatial and temporal granularity at which to aggregate the data into heatmaps.

For this challenge, a 32-by-32 grid was used on a daily basis. These values may seem aggressive for a small dataset but, as stated earlier, binary maps allowed a finer grid since each cell needs just one incident, and daily granularity worked, even with lots of missing values between days, because with the data augmentation technique described below we still got good results.

Heatmaps were made using an aggregation technique: first we converted the latitude and longitude of each sample into a matrix coordinate, then summed all samples that fell in the same coordinate within the same temporal granularity, as demonstrated in figure 9.
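The aggregation step can be sketched as follows; the boundary box and samples are invented, and a real pipeline would also group the samples by day:

```python
import numpy as np

def aggregate(samples, lat_min, lat_max, lon_min, lon_max, n=32):
    """Sum incident counts into an n-by-n grid from (lat, lon) samples."""
    grid = np.zeros((n, n))
    for lat, lon in samples:
        # Convert coordinates to row/col indices inside the boundary box.
        r = min(int((lat - lat_min) / (lat_max - lat_min) * n), n - 1)
        c = min(int((lon - lon_min) / (lon_max - lon_min) * n), n - 1)
        grid[r, c] += 1
    return grid

# Three invented incidents inside an invented Delhi-area boundary box.
g = aggregate([(28.61, 77.21), (28.61, 77.21), (28.70, 77.10)],
              28.4, 28.9, 77.0, 77.4)
```

Thresholding this count grid at one or more incidents then yields the binary maps the model consumes.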



Figure 9 — Conversion between latitude and longitude into a heatmap.



After creating the heatmaps, we still had many missing daily maps, which could be a problem when stacking them into input sequences for the ConvLSTM model. One solution could be to assume zero-valued heatmaps for those days, but since we already had so many sparse matrices, that strategy would just push the model to overfit towards zero.

The strategy used to fill this gap was to upsample the missing heatmaps using linear interpolation between the previous and next heatmaps in the time sequence, with the total missing period as the division factor. With this data augmentation method, it was possible to grow the dataset from 586 binary maps to 1,546 daily maps.
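The gap-filling step can be sketched like this, assuming each gap is bounded by real maps on both sides:

```python
import numpy as np

def fill_gaps(day_maps):
    """Fill None entries in a daily sequence by linear interpolation
    between the nearest real maps before and after the gap.

    Assumes the first and last entries are real maps.
    """
    filled = list(day_maps)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            j = i
            while filled[j] is None:
                j += 1                      # find the next real map
            lo, hi, span = filled[i - 1], filled[j], j - (i - 1)
            for k in range(i, j):
                t = (k - (i - 1)) / span    # fraction through the gap
                filled[k] = lo + t * (hi - lo)
            i = j
        i += 1
    return filled

# Two missing days between a zero map and a map of threes.
seq = fill_gaps([np.zeros((2, 2)), None, None, np.ones((2, 2)) * 3])
```

Each synthetic map sits proportionally between its real neighbours, exactly the division-factor scheme described above.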

After synthetically creating the missing heatmaps, a threshold value was arbitrarily chosen to convert them into binary maps (figure 10), since the model works on risky vs. non-risky hotspot prediction.



Figure 10 — Example of conversion from a heatmap into a binary map using a 0.5 threshold.
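The conversion itself is a one-line element-wise comparison; a small sketch assuming NumPy arrays and the 0.5 threshold from figure 10:

```python
import numpy as np

def to_binary_map(heatmap, threshold=0.5):
    # Cells at or above the threshold become risky (1), the rest neutral (0).
    return (heatmap >= threshold).astype(np.uint8)
```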


3. Modeling

A deep learning model combining Convolutional Neural Networks and LSTMs, as in figure 6, was used as the main algorithm for heatmap prediction. The model's input is a sequence of temporal binary maps, and its output is the next map of the sequence, containing the risk probability of each cell.

Convolutional networks are great at learning the relation between nearby cells in a map, while LSTM networks learn the relation between maps across time. Combine them and you have a powerful algorithm that learns both the spatial and the temporal properties of our dataset.

The training and target sets were made by grouping sequences of 16 daily binary maps as input and using the following map as the target, creating 576 unique sequences. Importantly, artificially created maps were skipped as targets, since they don't reflect a real scenario.
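The sequence-building step might look like the following sketch, where `synthetic_flags` (a hypothetical name) marks the interpolated maps to skip as targets:

```python
import numpy as np

def make_sequences(daily_maps, synthetic_flags, window=16):
    """Group `window` consecutive maps as input and the next one as target.

    daily_maps: array of shape (num_days, H, W); synthetic_flags marks the
    artificially interpolated maps, which are skipped as targets.
    """
    X, y = [], []
    for i in range(len(daily_maps) - window):
        target = i + window
        if synthetic_flags[target]:
            continue  # synthetic maps don't reflect a real scenario
        X.append(daily_maps[i:target])
        y.append(daily_maps[target])
    return np.array(X), np.array(y)
```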

For the train and test split, the boundary was based on the relevant temporal window of 2013 to 2017: samples from before the second half of 2016 were selected to train the model, and those from the second semester onward to validate it.

The model was trained for 100 epochs with a learning rate of 0.0001 and a batch size of 8, using the Adam optimizer; 20% of the training data was held out to validate the model during the process.

Heatmaps are very large sparse matrices (matrices with lots of zeroes), so the loss naturally drifts toward the easy solution of predicting only zeros. To counterbalance this, we used a weighted cross-entropy function for backpropagation that puts more emphasis on the missed ones.
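A minimal NumPy sketch of such a weighted cross-entropy, with an illustrative weight on the positive class (the actual weight used in the project is not stated):

```python
import numpy as np

def weighted_bce(y_true, y_pred, one_weight=10.0, eps=1e-7):
    """Binary cross-entropy that penalizes missed ones more than missed zeros.

    one_weight > 1 counteracts the sparsity of the maps, where predicting
    all zeros would otherwise minimize the loss.
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    loss = -(one_weight * y_true * np.log(y_pred)
             + (1 - y_true) * np.log(1 - y_pred))
    return loss.mean()
```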

It is interesting to mention that in this model we are not encoding the heatmaps into a higher-dimensional space, but squashing them into a lower one, learning the most significant features with the Max Pooling 3D layers placed after the LSTM that learns the sequence features.


4. Evaluation

To evaluate the model we used a separate test set to validate its behavior on unseen data and avoid bias.

Although we are dealing with a classification problem, we cannot pick an arbitrary threshold like 0.5 and declare every value above it a risk and every value below it a neutral cell. Our goal is to predict regions with higher chances of an incident occurring, not the exact spot where a crime will happen. Besides that, we trained on large sparse matrices, so it's natural for all values to lean towards zero.

A more practical way is to pick a percentile among all cell values and classify everything above it as risky. For example, if we define the risky threshold as the 96th percentile, we gather all predicted cell values and take the value at the 96th percentile; NumPy's np.percentile function is useful here. Noah et al. recommend taking the percentile over the entire test prediction, rather than per sample, to average the value across different seasons.
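For illustration, the percentile selection over an entire (here randomly generated) test prediction could look like:

```python
import numpy as np

# Hypothetical predicted probabilities for the whole test set.
preds = np.random.RandomState(0).rand(100, 32, 32)

# Take the percentile over the entire test prediction, as Noah et al.
# recommend, rather than per sample.
threshold = np.percentile(preds, 96)
risky = preds >= threshold  # cells above the 96th percentile are risky
```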



Figure 11 — Binary predicted maps using different percentiles to define risky areas.



After deciding the percentile value and converting the analog output into a binary map, we measure the score against the true map. To avoid penalizing near misses so harshly that we end up tuning the model into a totally overfitted algorithm, we give partial credit to predicted cells in the neighboring areas around hotspots, since the prediction didn't miss by much.

Instead of the usual four classification labels, we have six. The following table shows the label for each case:



Table 1 — Labels for each prediction



With this table in mind, we count each of those labels for each predicted cell as seen in figure 12.



Figure 12 — Label prediction according to the defined percentile threshold.
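As one plausible reading of the six labels, the counting could be sketched as follows, treating "Neighbor" as any cell within a 3×3 window of a hotspot; the exact definitions used in the project may differ:

```python
import numpy as np

def has_neighbor(mask, r, c):
    """True if any cell in the 3x3 window around (r, c) is set in `mask`."""
    r0, r1 = max(r - 1, 0), min(r + 2, mask.shape[0])
    c0, c1 = max(c - 1, 0), min(c + 2, mask.shape[1])
    return mask[r0:r1, c0:c1].any()

def count_labels(pred, true):
    """Count the six labels for a predicted vs. true binary map."""
    counts = {"Correct": 0, "Neutral": 0,
              "False Positive": 0, "False Positive Neighbor": 0,
              "False Negative": 0, "False Negative Neighbor": 0}
    for r in range(pred.shape[0]):
        for c in range(pred.shape[1]):
            if pred[r, c] and true[r, c]:
                counts["Correct"] += 1
            elif not pred[r, c] and not true[r, c]:
                counts["Neutral"] += 1
            elif pred[r, c]:  # flagged risky, but no incident there
                key = ("False Positive Neighbor" if has_neighbor(true, r, c)
                       else "False Positive")
                counts[key] += 1
            else:             # incident happened, but cell not flagged
                key = ("False Negative Neighbor" if has_neighbor(pred, r, c)
                       else "False Negative")
                counts[key] += 1
    return counts
```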


For the example above we have the following results depending on the selected percentile value:



Table 2 — Prediction counts for different percentile thresholds.


The lower the percentile threshold, the higher the number of cells we classify as risky, and the higher our model accuracy. But is that good?

We have to find the perfect balance between risk classification and model accuracy to get the best end-user experience.

A conservative view with a lower threshold would lead to a high number of correct classifications, but at the same time it would create the wrong perception among users that the entire map is a danger zone. On the other side, if we raise the percentile too much, we would barely classify any place as a hotspot and incur a serious error: the FALSE NEGATIVE.

Few problems can be worse for a person's experience than trusting an application that tells them they are stepping into a safe zone, only to end up the victim of a crime. In fact, depending on the proportion it takes in the media, it could be the end of the entire system we are building to help people.

So, among all those labels, which are the most important to look at? Noah et al. state in the article that Correct, False Negative Neighbor, and False Negative should be taken into consideration when evaluating the model. The goal is to find the balance between the False Negative vs. Correct + False Negative Neighbor rate and the accuracy.



Figure 13 — Model evaluation against different percentile thresholds. Note that the higher the percentile threshold, the higher the accuracy (black line), but at the same time the more false negatives we have.


In figure 13 we can see the model evaluation for percentile thresholds ranging from the 85th to the 99th. The higher the value, the more false negatives we have, despite the increase in the model's accuracy. That happens because more neutral cells (True Negatives) are counted as correct. Those cells are also important to identify correctly, since we want to guide the user through safe space while avoiding flagging too many places as risky when they are not.


5. The prediction

Noah et al. suggest some guidelines for presenting the prediction in a way people can easily understand. The first is to display the cells classified as risky in one color while muting the others; the second is to define a group of threshold values to classify each cell into bands.

The general guideline for model prediction is as follows:

  • Make a prediction for a single sample or a whole batch.
  • Select a percentile over all cells of the output prediction to use as a threshold: values that fall above it are classified as risky, values below as neutral or another defined label.
  • Convert the analog values into discrete ones using the threshold list above.
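These steps can be sketched with NumPy's percentile and digitize functions, using illustrative percentile bands (25th, 75th, 91st, 96th) on a randomly generated output:

```python
import numpy as np

pred = np.random.RandomState(1).rand(32, 32)  # analog model output

# Percentile thresholds from lowest to highest risk band.
percentiles = [25, 75, 91, 96]
thresholds = np.percentile(pred, percentiles)

# 0 = neutral, ..., 4 = risky (above the 96th percentile).
bands = np.digitize(pred, thresholds)
```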

In figure 14 there are two examples using the 96th and 99th percentiles as the risky threshold, colored in red. The other bands use lower thresholds: the 91st, 75th, and 25th percentiles.

The two black dots on the red areas represent the crimes that occurred on that particular day, and both were captured by the model. The 99th-percentile prediction gives a more precise risky area, but at the same time it has a higher chance of missing important cells, since it aggressively flags only areas where we are almost certain something can happen. Testing should be done with different percentiles in order to find the one that best adapts to the user's needs.



Figure 14 — Heatmaps predictions using different thresholds. Figure (a) uses the 96th percentile while figure (b) uses the 99th.


An important problem to note is the border effect caused by padding in the Convolutional Neural Networks. Zero padding is not the best way to deal with borders, especially on small images like our heatmaps, and border cells should be ignored in production since they hardly translate into reality.

One way to improve the resolution of the output is to upscale the heatmap with bilinear interpolation, providing a finer resolution when presenting it to the final user, like the one shown in figure 15.
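A sketch of this upscaling using SciPy's bilinear interpolation (the zoom factor of 8 is illustrative):

```python
import numpy as np
from scipy.ndimage import zoom

coarse = np.random.RandomState(2).rand(32, 32)  # stand-in for a heatmap

# order=1 selects bilinear interpolation; factor 8 gives a 256x256 map.
fine = zoom(coarse, 8, order=1)
```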




Figure 15 — High-resolution heatmap over a region of Delhi. The redder the area, the riskier the algorithm considers it. The black dot represents the crime that occurred in that region on the specific date.



Final considerations

There are many solutions out there that use deep learning for crime forecasting with heatmaps. This article tried to contribute by showing the approach that best fitted the resources we had for this challenge, while explaining what the other solutions were and why they didn't work for us.

There is room for improvement, such as:

  • Testing different spatial dimensions (grid size)
  • Testing different temporal dimensions (time granularity), like aggregating heatmaps by week, month, and so on.
  • Testing different upsampling techniques for filling the gaps where no data is present.
  • Trying another padding technique to remove the border effect in the convolutional network.
  • Trying to convert this problem from classification to regression and using the other proposed deep learning approaches.


More About Omdena

Omdena is an innovation platform for building AI solutions to real-world problems through the power of bottom-up collaboration.

Using AI To Prevent Gang Violence by Analyzing Tweets


Applying AI and machine learning to understand gang language and detect threatening tweets related to gang violence.


The problem

“Some believe that bolstering school security will deter violence, but this reactionary measure only addresses part of the problem. Instead, we must identify threats, mitigate risk, and protect children and staff before an act of violence occurs.” — Jeniffer Peters, Founder of Voice4Impact (Project Partner)

Chicago is considered the most gang-infested city in the United States, with over 100,000 active members from nearly 60 factions. Gang warfare and retaliation are common in Chicago, which in 2020 has seen a 43% rise in killings so far compared to 2019.



The solution

It was noticed that gangs often use Twitter to communicate with fellow gang members as well as to threaten rival gangs. Gang language is a mixture of icons and gang-specific terms.


Sample Gang Language


The project team split the work into two parts:

  • Implement a machine-learning algorithm to understand gang language and detect threatening tweets related to gang violence.
  • Find the correlation between threatening tweets and actual gang violence.


Part 1: Detecting violent gang language and influential members

The goal was to classify tweets as threatening or non-threatening so that the threatening ones can be routed to intervention specialists who will then decide what action to take.


Step 1: Labeling tweets collaboratively

First, a tool was created to label tweets faster so we could train the machine learning model, since we were only provided the raw tweets. Searching the web, we found LightTag, a product designed for exactly this, but it becomes a paid product once you exceed the comically low number of free labels.

We needed a simpler solution that does everything we need, and nothing else. So we turned to a trusted old friend: Google Spreadsheets. A custom spreadsheet was made (the template is publicly available here). It features a scoreboard, so labelers get credit for their contributions, and a mechanism to have at least two people label each tweet to ensure label quality.




To ensure the quality of our labels, we decided we needed at least two labels on every tweet; if they disagree, a third label breaks the tie. Row color-coding makes it easy to see which rows are finished: a row labeled once is colored green, and a row labeled twice with conflicting labels is colored red. The scoreboard page also shows a count of how many tweets are labeled once, labeled twice with conflicting labels, and finished on each page.


Step 2: Sentiment analysis (with probability value) of tweets being violent

The sentiment analysis team built a machine learning model to predict whether tweets are threatening or non-threatening. But first, we needed to address the challenges of an imbalanced dataset, where over 90% of the tweet feed was non-threatening, and the small size of the labeled dataset. We tested multiple techniques, including loss functions specifically designed for imbalanced datasets, undersampling, transfer learning from existing word embeddings, and ensemble models. We then combined this with the reservoir of violent signal words to assign each tweet a probability value (the probability that a tweet is prone to using violent words).


Step 3: Detect influential members in the Twitter gang network

Next, we wanted to identify the influential members of the network. A network analysis resulted in a directed graph; using the Girvan-Newman algorithm, the communities in the network could also be detected, and using the PageRank value of each node, the influential members were identified.


5 steps to build an effective network analysis of tweets

1. Using Python’s networkX, a graph was created from the mentions and authors of the tweets.


Network analysis


Make sure to read this detailed article on the network analysis.

The nodes represent the mentions/authors of tweets. An edge A → B means B was mentioned in a tweet posted by A.

2. Thousands of tweets were used to create a directed graph, and using the Girvan-Newman algorithm, the communities in the network were detected. Also, using the PageRank value of each node, the influential members of the network could be identified. This value is not crucial to the network analysis but can be useful for tracking any gang member who is influential in the network.
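Steps 1 and 2 can be sketched with networkX on a toy graph (the accounts and edges are made up for illustration):

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Toy mention graph: an edge A -> B means a tweet by A mentioned B.
G = nx.DiGraph([("a", "b"), ("b", "a"), ("a", "c"),
                ("c", "d"), ("d", "e"), ("e", "d"), ("e", "f")])

# Girvan-Newman community detection on the undirected view of the graph.
communities = next(girvan_newman(G.to_undirected()))

# PageRank highlights influential members (frequently mentioned accounts).
rank = nx.pagerank(G)
```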

3. The members in the communities are either authors or mentions. So, the tweets were then tagged with the community number based on the mention or author names.

4. The total number of signal keywords in all the communities was calculated and so was the total number of signal words for individual communities.

5. The final result was a dataset of tweets with a community tag and a probability of using violent words, based on the usage of signal words within the community relative to all communities. For example, in the picture below, members of Community 1 who are authors or mentions in the tweets are more inclined to use violent keywords, so the tweets which contain authors/mentions from this community are contextually more violent.



The network analysis can also give insight into which members are more influential within a community: the greater a member's PageRank value, the more influential that member is.


Page Rank vs Gang Member


Part 2: Correlation between actual violence and tweets

Next, we wanted to understand whether there is any correlation between actual crimes and mentions of ‘gun’ in threatening tweets.

Below is the correlation between the two metrics for the same day, a 1-day shift, and a 2-day shift.


Same day


1-day shift


2-day shift


Through this analysis, we can see a correlation between the number of crimes and the use of ‘gun’ in threatening tweets with a 2-day shift. This can be very useful for authorities in preventing gang violence.
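The shifted correlation can be sketched with NumPy on made-up daily counts (the real series are the labeled tweet and crime data):

```python
import numpy as np

def lagged_corr(crimes, gun_tweets, shift):
    """Correlate daily crime counts with gun-mention counts `shift` days earlier."""
    if shift:
        crimes, gun_tweets = crimes[shift:], gun_tweets[:-shift]
    return np.corrcoef(crimes, gun_tweets)[0, 1]

# Hypothetical daily counts where tweets lead crimes by two days.
tweets = np.array([5, 1, 4, 2, 8, 3, 6, 2, 7, 1], dtype=float)
crimes = np.concatenate([[2.0, 2.0], tweets[:-2]])
```

On data like this, the 2-day shift yields the strongest correlation, matching the pattern described above.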




Omdena Delivers Technology to Reunite Families After Earthquakes





By Laura Clark Murray 



For Immediate Release

February 11, 2020

Leader in Collaborative AI Solutions Omdena Delivers Technology to Reunite Families After Earthquakes

Omdena builds on an impressive record of delivering artificial intelligence solutions for social good with a global community of more than 700 participants from more than 70 countries.

Palo Alto, California (February 11, 2020) — Omdena, the global platform for building collaborative AI-based solutions, today announced the delivery of AI technology to help parents and children reunite in the aftermath of an earthquake. The system identifies safe paths through predicted earthquake damage in Istanbul, where experts have long warned of a devastating earthquake.

A diverse team of thirty-three AI experts, engaged citizens, and data scientists worked together from remote locations around the world, in partnership with Impact Hub Istanbul. The project represents Omdena’s tenth completed project challenge since the company’s launch in May 2019, and the third in the humanitarian sector.

“At Omdena, we’re building a new model of innovation, where communities can come together to solve their problems, share their data and build solutions,” said Rudradeb Mitra, Founder of Omdena. “For this challenge, the diverse team members from around the world devoted their AI and problem-solving skills to helping the families of Istanbul.”

Social innovators at Impact Hub Istanbul set out to address the critical concerns their community will face in the immediate hours after an earthquake hits. They turned to Omdena for its proven expertise in quickly transforming broad problem definitions into targeted AI solutions. In eight weeks, the Omdena team of collaborators delivered a working AI prototype that identifies safe routes after an earthquake in one district of Istanbul, with expected expansion city-wide.

Semih Boyaci, Co-Founder of Impact Hub Istanbul, said: “The result of this stunning process with Omdena was an AI-powered tool that helps families reunite in case of an earthquake by calculating the shortest and safest roads. It was incredible to witness the productivity level when a group of talented people comes together around common motivations to solve problems for the social good.”

This Omdena Earthquake Response AI Challenge will be featured at the United Nations AI for Good Global Summit 2020 in Geneva, Switzerland in May 2020.


For media inquiries contact: Laura Clark Murray, Omdena, laura@omdena.com

About Omdena: Omdena is a platform for building AI solutions to real-world problems through global collaboration. Our partners include the UN World Food Programme and the UN Refugee Agency. Omdena is an Innovation Partner of the United Nations AI for Good Global Summit 2020. Learn more at Omdena.com.

Learn more about Omdena’s Earthquake Response AI Challenge at: omdena.com/projects/ai-earthquake/

About Impact Hub Istanbul: Impact Hub Istanbul is part of a global network of changemakers focused on social innovation. Learn more at https://istanbul.impacthub.net/

About the AI for Good Global Summit 2020: The AI for Good series is the leading action-oriented, global & inclusive United Nations platform on AI. The Summit is organized every year in Geneva by the ITU with XPRIZE Foundation in partnership with 37 sister United Nations agencies, Switzerland and ACM. The goal is to identify practical applications of AI and scale those solutions for global impact. Learn more at https://aiforgood.itu.int/
