Massachusetts Institute of Technology
GeoPandas: Static Maps and Spatial Join
July 20, 2017
The case study
My first major mapping project at the University of Chicago was a choropleth map of Sri Lankan Army (SLA) officer deaths between 1981 and 1999. The data were collected painstakingly by my colleague, Winston Berg. I am deeply grateful to Professor Paul Staniland, who made the project possible by hiring me. Any mistakes remain my own. You can see a copy of the code here and the final products on my GitHub repository. The final map is shown in Figure 1.
The data table includes the names of army officers who died as well as their place of death, which may be a village or city. In order to assign each place of death a larger geographical category — province, district, or division — we must attach a longitude and a latitude to each place in the dataset. Thankfully, GeoPandas has just the function.
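The geocoding call itself depends on an external service (GeoPandas exposes it as `geopandas.tools.geocode`), so it is not reproduced here. What follows is only a minimal sketch of the step right after it: turning a table that already has a longitude and latitude for each place into point geometries. The column names and the two sample rows are made up for illustration, and `points_from_xy` assumes a reasonably recent GeoPandas.

```python
import io

import geopandas as gpd
import pandas as pd

# Stand-in for a geocoded table; column names and rows are illustrative assumptions.
csv_text = """name,place,year,longitude,latitude
Officer A,Place A,1990,80.02,9.66
Officer B,Place B,1995,79.86,6.93
"""
deaths = pd.read_csv(io.StringIO(csv_text))  # in practice: pd.read_csv("out.csv")

# Build one point per row; longitude is x and latitude is y.
deaths_gdf = gpd.GeoDataFrame(
    deaths,
    geometry=gpd.points_from_xy(deaths["longitude"], deaths["latitude"]),
    crs="EPSG:4326",  # plain longitude/latitude (WGS 84)
)
print(deaths_gdf[["place", "geometry"]])
```

Keeping the points in the same coordinate reference system as the district polygons matters later, because the spatial join compares the two layers directly.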
"out.csv" describes deaths data in which every place name is matched with a longitude and latitude.
Figure 1: Static map with GeoPandas
Read your shapefile into a geo-dataframe:
Make a new column called coordinates, which is based on the "geometry" column:
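One way to do this, sketched on toy polygons, is to take a representative point of each polygon and store its (x, y) pair. Using `representative_point` is my assumption here; a centroid would also work for compact shapes, but a representative point is guaranteed to fall inside the polygon, which suits labels.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy districts; the "DISTRICT" column name is an assumption.
districts = gpd.GeoDataFrame(
    {"DISTRICT": ["North", "South"]},
    geometry=[box(0, 1, 1, 2), box(0, 0, 1, 1)],
)

# Store an (x, y) tuple for each polygon, taken from a point inside it.
districts["coordinates"] = districts["geometry"].apply(
    lambda geom: geom.representative_point().coords[0]
)
print(districts[["DISTRICT", "coordinates"]])
```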
Label the districts:
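A hedged sketch of the labelling step: plot the polygons, then loop over the rows and annotate each one at its stored coordinates. The toy data, the "DISTRICT" column, and the styling choices are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook

import geopandas as gpd
from shapely.geometry import box

districts = gpd.GeoDataFrame(
    {"DISTRICT": ["North", "South"]},
    geometry=[box(0, 1, 1, 2), box(0, 0, 1, 1)],
)
districts["coordinates"] = districts["geometry"].apply(
    lambda geom: geom.representative_point().coords[0]
)

# Draw the outlines, then write each district name at its label point.
ax = districts.plot(color="white", edgecolor="black")
for _, row in districts.iterrows():
    ax.annotate(row["DISTRICT"], xy=row["coordinates"], ha="center")
```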
To make a heat (choropleth) map, we have to count how many deaths there are per district. For this, we use GeoPandas' spatial join function, which matches points in the deaths dataset to the polygons in the geo-dataframe. Here, I use "within". You can also use "contains" or "intersects".
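The join can be sketched on toy data as follows. Recent GeoPandas releases name the relationship argument `predicate`; older releases called it `op`, so adjust to your version. The points, polygons, and column names are made up, and both layers are given the same coordinate reference system.

```python
import geopandas as gpd
from shapely.geometry import Point, box

districts = gpd.GeoDataFrame(
    {"DISTRICT": ["North", "South"]},
    geometry=[box(0, 1, 1, 2), box(0, 0, 1, 1)],
    crs="EPSG:4326",
)
deaths = gpd.GeoDataFrame(
    {"name": ["Officer A", "Officer B", "Officer C"]},
    geometry=[Point(0.5, 1.5), Point(0.2, 0.3), Point(0.8, 0.4)],
    crs="EPSG:4326",
)

# Each point picks up the attributes of the polygon it lies within.
joined = gpd.sjoin(deaths, districts, how="left", predicate="within")
print(joined[["name", "DISTRICT"]])
```

With `how="left"`, a point that falls outside every polygon is kept with a missing district, which makes geocoding mistakes easy to spot.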
Rename the columns of the spatially joined table so the column names make sense. Next, count the deaths in each district. Save the result as an Excel sheet and check it for formatting problems and errors:
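A minimal pandas sketch of the rename-and-count step, using a small stand-in for the joined table. The column names and the helper `save_for_review` are hypothetical, and writing Excel files needs an engine such as openpyxl installed, so the helper is defined but not called here.

```python
import pandas as pd

# Stand-in for the spatially joined table (values are made up).
joined = pd.DataFrame(
    {
        "name": ["Officer A", "Officer B", "Officer C"],
        "year": [1990, 1995, 1995],
        "index_right": [0, 1, 1],
        "DISTRICT": ["North", "South", "South"],
    }
)

# Give the join's columns clearer names.
joined = joined.rename(columns={"index_right": "district_id", "DISTRICT": "district"})

# One row per district with the number of deaths recorded there.
counts = joined.groupby("district").size().reset_index(name="deaths")
print(counts)


def save_for_review(table, path):
    """Write the table to an Excel sheet for manual checking (hypothetical helper)."""
    table.to_excel(path, index=False)
```

Calling `save_for_review(counts, "deaths_by_district.xlsx")` would produce the sheet; the file name is only an example.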
Import the Excel sheet and draw a bar graph with Pandas:
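As a sketch, the bar graph needs only a dataframe and pandas' plotting interface. The table below stands in for the reviewed sheet, which in practice would come from something like `pd.read_excel` with your own file name; the column names are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook

import pandas as pd

# Stand-in for the sheet read back from Excel (values are made up).
counts = pd.DataFrame({"district": ["North", "South"], "deaths": [1, 2]})

# One bar per district, with bar height equal to the death count.
ax = counts.plot.bar(x="district", y="deaths", legend=False)
ax.set_ylabel("Officer deaths")
```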
Rename the column in the geo-dataframe so that it matches the one in the deaths dataframe:
Merge the geo-dataframe with the deaths dataframe:
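The rename and the merge can be sketched together on toy data. The column names are assumptions; the left merge keeps every district, and districts without any recorded deaths are filled with zero so they still appear on the map.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

districts = gpd.GeoDataFrame(
    {"DISTRICT": ["North", "South", "East"]},
    geometry=[box(0, 1, 1, 2), box(0, 0, 1, 1), box(1, 0, 2, 2)],
    crs="EPSG:4326",
)
counts = pd.DataFrame({"district": ["North", "South"], "deaths": [1, 2]})

# Make the key column name identical in both tables.
districts = districts.rename(columns={"DISTRICT": "district"})

# Calling merge on the geo-dataframe keeps the result a geo-dataframe.
merged = districts.merge(counts, on="district", how="left")
merged["deaths"] = merged["deaths"].fillna(0).astype(int)
print(merged[["district", "deaths"]])
```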
Figure 2: SLA officer deaths by year
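Now we can produce the map. The "scheme" argument sets how the counts are binned into classes: "quantiles", "equal_interval", or "fisher_jenks". If you leave it out, GeoPandas colors the map on a continuous scale instead. Quantiles puts roughly the same number of districts in each class, equal_interval splits the range of values into equally wide bins, and fisher_jenks sets categories by minimizing within-class variance and maximizing between-class variance.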
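A hedged sketch of the plotting call on toy data. The color map, the number of classes, and the alpha value are illustrative choices rather than the settings behind Figure 1. Classification schemes rely on the optional mapclassify package, so the snippet falls back to a continuous scale if it is not installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook

import geopandas as gpd
from shapely.geometry import box

# Toy merged table: four square "districts" with made-up death counts.
merged = gpd.GeoDataFrame(
    {"district": ["A", "B", "C", "D"], "deaths": [0, 1, 2, 5]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2), box(1, 1, 2, 2)],
    crs="EPSG:4326",
)

# "scheme" needs mapclassify; without it, use the continuous color scale.
try:
    import mapclassify  # noqa: F401
    class_kwargs = {"scheme": "quantiles", "k": 3}
except ImportError:
    class_kwargs = {}

ax = merged.plot(
    column="deaths",    # values that drive the colors
    cmap="OrRd",        # light-to-dark red color map
    alpha=0.8,          # transparency of the fill
    edgecolor="black",
    legend=True,
    **class_kwargs,
)
ax.set_axis_off()
```

Swapping "quantiles" for "equal_interval" or "fisher_jenks" changes only where the class breaks fall, which can noticeably change how the same counts look on the map.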
Your final product should look like Figure 1. The "alpha" argument to the plot call sets the transparency of the fill colors, which makes the shades lighter or darker. Of course, you can also count the deaths by year and create a series of maps like Figure 2.