California is unusually prone to earthquakes because it exists on the San Andreas Fault. The presence of hundreds of fault lines has led to over 10,000 earthquakes per year in California.
How can a city prepare to respond to an earthquake? There are many ways that a local city/ government can respond but protecting the safety and lives of individuals is the priority during any disaster situation. Emergency evacuation planning will help reduce confusion, minimize injuries, and ultimately save lives.
The goal of this 4-week challenge was to provide a safe and fast route planning for families to evacuate after an earthquake in the Los Angeles area.
In this article, we will focus on the overall end to end complete pipeline from risk factor collection to their use in finding safe paths in the aftermath of an earthquake. In this 4-week challenge, a group of collaborators worked on understanding different risk factors relevant for earthquakes, collecting them using OSMnx package and other sources, combining them and then using it to find a safe path for a given source and destination. The team also identified potential shelters, and parks, so a user can choose them as relevant destinations when finding safe paths in the aftermath of an earthquake. This study presents a Proof of Concept for safe paths focusing on 3 neighborhoods – Chatsworth, Northridge and North Hills which lie in the San Fernando valley in Los Angeles, California. Figure 1 below describes an overall data flow pipeline for this project.
Data Collection and Preparation:
All the risk factors were collected and derived based on the features available in the OSMnx for a geographical area except for the liquefaction risk factor.
OSMnx lets you download geospatial data from OpenStreetMap and give you access to real-world street networks. A typical OSMnx graph is a Multidigraph consisting of nodes and edges with nodes representing intersections/dead-ends and edges representing street segments that link them. OSMnx also provides access to other geospatial objects like buildings, parks, schools etc. The OSMnx graph for the three neighborhoods used in this challenge is shown in figure 2. This Multidigraph can be converted to and from GeoPandas node and edge GeoDataFrames, making them extremely versatile to manipulate and add additional information on nodes/edges in any geographical area. The below section explains the approach used for souring each risk factor.
A total of five risk factors relevant in the context of earthquake – distance, road width, building density, road speed and Liquefaction were considered as a proxy for travel safety in this challenge. Distance risk factor captures the distance of the road from the nearby buildings, and hence higher the distance the lower the risk for travelling on that road. Road width risk factor is based on the type of road and for residential roads this factor would be quite high whereas for highways the risk factor would be low. Speed of the road is very similar to road width, higher the speed of the road, the wider the roads and lower the risk. Density risk factor is based on building density in each area and hence higher the building density the higher the risk to travel on those roads. The last risk factor considered is liquefaction which considers the soil of an underlying geographical area. Below we document each risk factor in detail.
Distance risk factor
Distance risk factor is based on how far a particular road pixel is from nearby buildings. For the calculation and quantifying the distance, we need buildings/roads segmented in the underlying map. OSMnx provides a get building footprints function which was used to segment the buildings from the road. Next, the distance transformation was applied using distance_transform_edt function in the Python Scipy library. This distance transformation function calculates the Euclidean distance of a foreground pixel from background pixel. Figure 3 below shows a building footprint from OSMnx, and the image transformed after applying the distance transformation.
A sample distance transformation of the bottom left-hand corner of the footprint in figure 3 is shown in figure 4 below. The figure 4 also shows the Euclidean distance, with buildings in red and the distance from the buildings are conditionally formatted using colors. The distance is shown as integers and are conditionally formatted with red pixels the buildings whereas green/yellow pixels signify distance from the red pixels. Finally, the Euclidean distance is used to calculate an average distance for an edge from the nearby buildings in an OSMnx graph. This process is repeated for all neighborhoods and distance score is stored on all the edges, so it is available for downstream functionalities.
Density risk factor
Density risk factor is based on how sparse or dense the buildings are in a surrounding area. The density score is calculated by dividing a geographical area in a small grid and fetching the number of buildings using the OSMnx function geometries_from_bbox with tag as ‘building’. Edges/roads present in that grid inherit the building density score for that grid.
Other risk factors: Speed and width risk factors are based on the max speed and road category defined for an edge in the OSMnx graph. Higher speed was used as a proxy for wider and safer roads. Road width uses the attribute “highway” type of an edge to give a risk score to a given edge or a road. Roads of category “residential” are tagged as riskier and those of the type “motorway”. Both these risk factors give a discrete risk score for edges between 0 and 5. Liquefaction zones identify where the stability of foundation soils must be investigated, and countermeasures undertaken in the design and construction of buildings for human occupancy. Areas marked as Liquefaction zones are assumed to be more prone to danger. The liquefaction zones were sourced from the Geohub LA website https://geohub.lacity.org/datasets/liquefaction-zones/explore.
Research on safety during earthquake revealed that it is advisable to travel to open spaces if you are outside. For this scenario, parks are possible destinations that people might travel to during or aftermath of an earthquake. People might also want to seek special shelters in the aftermath of an earthquake. To cater for these possible safe destinations, the team also tagged the nodes near the park or shelters as special evacuation nodes, so they are accessible to the users while finding a path. Parks were sourced from underlying footprints data available in osmnx and shelters were sourced from this website
Once all the above data is assembled and stored on the nodes and the edges GeoDataFrame, we are ready to proceed to the next stage of integrating all the data into a single graph with a single combined risk score column. GeoDataFrames make it easy to integrate all the different risk factors and neighborhoods by merging /joining the DataFrames. Figure 5 and 6 shows the sample nodes and edges of the GeoDataFrame after integration.
After integration of risk factors and merging the neighborhoods, the different risk factors are ready for analysis and combined into a single score. Figure 9 below shows a combined risk score and its visualization on the OSMnx graph. The combined risk is ready to use for all downstream path finding tasks.
The next step in the pipeline is to use path finding algorithms to find a path from a source to a user chosen destination. Here two path finding algorithms were explored – A* search and Hierarchical A* search. In both path finding algorithms, weight needs to be specified which is used as the edge weight. For this project, the weight supplied to the algorithms is based on the combined risk score. For a particular source and destination, 3 different weight factors – length, combined risk and length * combined risks were considered, and the output analyzed. Figure 10 shows a path between a source and destination using 3 different weights using A * search. The best weight is the one which combines both length and risk, as travelling longer distances may expose users to lower risk but for longer time. Whereas focusing only on length would mean that the user can reach destination sooner but would be exposed to high risk during his/her travel.
Hierarchical A* search algorithm was also developed based on the research paper For Hierarchical A* search, a hierarchical abstract pre-computed graph is calculated offline and can be loaded on the disk. This hierarchical abstract graph is computed as follows:
- Divide the graph into blocks, in this case we use a 5 x 5 grid i.e., 25 blocks.
- Find an entry and exit border nodes between the blocks and these form the intra-edges of the graph.
- Next the inter-edges between the border nodes within the blocks are computed and stored.
All the edges are calculated based on combined risk * length as the weight using A* search.
At runtime graph path search, the path can be found by combining 3 different sub-paths:
- Path 1 – Source node s to nearest node sa in the hierarchical abstract graph using the lower level OSMnx detailed graph. (s -> sa)
- Path 2 – Destination node d to nearest node da in the hierarchical abstract graph using the lower level OSMnx detailed graph. (d -> da)
- Path 3 – Path between source sa and destination node da using the hierarchical abstract graph. (sa -> da)
Figure 11 below shows hierarchical abstract graph for the 3 neighborhoods along with a runtime path search for a source and destination address using hierarchical A* search algorithm.
Comparing the time and length/risk reported between these two algorithms reveals that Hierarchical A* search is faster when it comes to finding path for longer distances, but the risk/length reported is near optimal compared to A* search. Figure 12 has the details below. The time taken to find the path was calculated as an average of 10 runs for each different test case like same /different neighbourhood and path/shelters.
The last step in the path finding pipeline was to build a web application using python and Streamlit. Figure 13 below the web application functionality. The web application allows a user to enter the source address and choose the destination type – parks, shelters or custom destination. A safe path along with both length and risk are also displayed for that path. The web application also displays a heatmap based on combined risk and the nearest parks and shelters in the neighbourhood as seen in figure 14.
This 4-week challenge was a lot of fun and we learned the different libraries /functionalities when working with spatial data. The team has done an incredible amount of work in a short span of time and established a methodology which can be expanded to a wider geographical area or allow integration of more risk factors like building height, building age etc. Hierarchical path finding shows a lot of promise and in future can be used to deploy and integrate to find more paths faster. You can access our code on GitHub. We also have a YouTube presentation which can be accessed here.
A big thank you to all collaborators who made this project a success: Joyce Annie, Soham Sawant, Hussain Ravat, Rhey Ann Magcalas, Adarsh Rajasekaran, Jaya and Isha Terdal. A special thanks to Nishrin Kachwala, who is chapter lead for Omdena Silicon Valley Chapter.