Wenyan Deng

Ph.D. Candidate

Massachusetts Institute of Technology

Matplotlib: Choropleth and Scatter Maps

August 24, 2017

In last week's post, we saw how to overlay a choropleth map (polygons) with a scatter map (points) and a road map (linestrings). In today's post, I will use a similar set of data to recreate that map as a static figure in Matplotlib. See my code and final product in my GitHub repository.

The case study

This study uses data on voter turnout and the Sri Lanka Freedom Party's (SLFP) vote share in Sri Lanka's 1970 parliamentary election, along with the police station attacks data from a previous post. It is the same data set we used last week.

Getting started

As usual, open a Jupyter notebook and import the following:

%matplotlib inline

import geopandas as gpd

import pandas as pd

import matplotlib.pyplot as plt

import numpy as np

Load and process data files

Get the polling divisions shapefile:

geo_df = gpd.read_file("data1/map.shp")

geo_df['coords'] = geo_df['geometry'].apply(lambda x: x.representative_point().coords[:])

geo_df['coords'] = [coords[0] for coords in geo_df['coords']]

geo_df.plot()

geo_df.to_csv("geo.csv")

geo_df.rename(columns = {"polling_di" : "Divisions"}, inplace = True)

Get the roads shapefile. Unlike last week's roads shapefile for the interactive map, this week's version has been simplified in QGIS so that it includes only trunk and primary roads, with all secondary and tertiary roads removed. A static map cannot be zoomed, so too many roads would cover up other important information, such as police stations and voter turnout. To learn how to simplify a shapefile this way, follow this guide here.

roads_df = gpd.read_file("dataroads/main_roads/main_roads.shp")

roads_df['coords'] = roads_df['geometry'].apply(lambda x: x.representative_point().coords[:])

roads_df['coords'] = [coords[0] for coords in roads_df['coords']]
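The road filtering described above was done in QGIS. As a rough alternative, the same kind of filter could be sketched in Python. The example below uses a plain pandas DataFrame as a stand-in for the roads attribute table (a GeoDataFrame accepts the same boolean mask). The column name `highway`, its values, and the road names are assumptions for illustration and may not match the actual shapefile.

```python
import pandas as pd

# Stand-in for a roads attribute table; all names and values here are made up.
# The "highway" column and its classes are an assumption about the data.
roads = pd.DataFrame({
    "name": ["A1", "A2", "B84", "Lane 5"],
    "highway": ["trunk", "primary", "secondary", "tertiary"],
})

# Keep only the major road classes and drop everything else.
major_classes = ["trunk", "primary"]
main_roads = roads[roads["highway"].isin(major_classes)]

print(main_roads["name"].tolist())
```

If the real shapefile stores the road class under a different column or with different labels, only `major_classes` and the column name would need to change.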

Load the 1970 electoral turnout data:

df1 = pd.read_excel("1970_SLFP.xlsx")

df1 = df1.dropna(subset = ['share'])

df1.head()

Merge the geo-dataframe with the SLFP vote share file, on electoral divisions ("Divisions"):

geo_merge = geo_df.merge(df1, on='Divisions', how='left')

col_name = geo_merge.columns[0]

geo_merge = geo_merge.dropna(subset = ['share'])

geo_merge.head()
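Because the merge is a left join followed by `dropna`, any division whose name is spelled differently in the two files silently disappears from the map. A quick way to spot such mismatches is the `indicator` option of `merge`. The sketch below uses toy data frames with made-up division names in place of `geo_df` and `df1`.

```python
import pandas as pd

# Toy stand-ins for geo_df and df1; the division names and values are made up.
shapes = pd.DataFrame({"Divisions": ["Alpha", "Beta", "Gamma"]})
votes = pd.DataFrame({
    "Divisions": ["Alpha", "Beta", "Gama"],  # "Gama" is a deliberate misspelling
    "share": [0.41, 0.55, 0.38],
})

# indicator=True adds a "_merge" column recording where each row was found.
check = shapes.merge(votes, on="Divisions", how="left", indicator=True)

# Rows marked "left_only" have no match and would be lost by dropna(subset=["share"]).
unmatched = check.loc[check["_merge"] == "left_only", "Divisions"].tolist()
print(unmatched)
```

Running the same check on the real data before calling `dropna` shows which polygons would otherwise be missing from the choropleth.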

Load the attacked police stations shapefile:

police1 = gpd.read_file('srilanka_policestations/controlled.shp')

# gpd.read_file already returns shapely geometries, so the geometry column needs no WKT parsing

And the shapefile of police stations that were not attacked:

police2 = gpd.read_file('srilanka_policestations/notcontrolled.shp')

Mapping

Now that the data is ready, start mapping with the choropleth layer:

ft = "Turnout"

plate = geo_merge.to_crs(epsg=4269)

ax = plate.plot(column = ft, scheme = "fisher_jenks", k = 9, cmap = "Blues", legend = True, alpha = 0.65, linewidth = 0.9, figsize = (60, 40))
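The `scheme = "fisher_jenks"` option groups the values into `k` classes before coloring them. This scheme is commonly described as choosing breaks that minimize the within-class sum of squared deviations from the class means. To make that idea concrete, here is a small brute-force sketch in plain Python. It is not the algorithm the library uses (which is far more efficient), the function names are hypothetical, and the turnout numbers are made up.

```python
from itertools import combinations


def within_class_ssd(groups):
    """Sum of squared deviations from each group's mean."""
    total = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        total += sum((v - mean) ** 2 for v in group)
    return total


def brute_force_breaks(values, k):
    """Try every split of the sorted values into k contiguous classes."""
    data = sorted(values)
    best_groups, best_cost = None, float("inf")
    # Each combination gives the k - 1 positions where a new class starts.
    for cuts in combinations(range(1, len(data)), k - 1):
        edges = (0,) + cuts + (len(data),)
        groups = [data[a:b] for a, b in zip(edges, edges[1:])]
        cost = within_class_ssd(groups)
        if cost < best_cost:
            best_groups, best_cost = groups, cost
    return best_groups


# Made-up turnout percentages with three visible clusters.
turnout = [61, 62, 64, 75, 76, 78, 90, 91]
classes = brute_force_breaks(turnout, k=3)
print(classes)
```

The search recovers the three clusters, which is why this kind of scheme tends to produce maps where each color covers a group of similar values rather than an arbitrary equal-width range.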

Then add the police data:

police1.plot(ax=ax, marker='o', color='red', markersize=7)

police2.plot(ax=ax, marker='o', color='gold', markersize=7)

Then add the roads:

roads_df.plot(ax=ax, color='0.30', linewidth = 1)

Add title and label the electoral divisions:

ax.set_title("Parliamentary Election Turnout (1970)", fontsize = 40)

ax.set_axis_off()

for idx, row in geo_df.iterrows():
    ax.annotate(row['Divisions'], xy=row['coords'], horizontalalignment='center')

Another example

Last week, we created a map of SLFP vote share with police station attacks. We can make a static map that presents the same information using the method outlined above.

First, import SLFP vote share data:

df2 = pd.read_excel("1970_SLFP.xlsx")

df2 = df2.dropna(subset = ['share'])

Merge vote share data with geo dataframe:

geo_merge2 = geo_df.merge(df2, on='Divisions', how='left')

col_name = geo_merge2.columns[0]

geo_merge2 = geo_merge2.dropna(subset = ['share'])

geo_merge2.head()

Map:

ft = "share"

plate = geo_merge2.to_crs(epsg=4269)

ax = plate.plot(column = ft, scheme = "fisher_jenks", k = 9, cmap = "Oranges", legend = True, alpha = 0.65, linewidth = 0.9, figsize = (60, 40))


police1.plot(ax=ax, marker='o', color='blue', markersize=7)

police2.plot(ax=ax, marker='o', color='green', markersize=7)


roads_df.plot(ax=ax, color='0.3', linewidth = 1)


ax.set_title("SLFP Vote Share (Parliamentary Election, 1970)", fontsize = 40)

ax.set_axis_off()

for idx, row in geo_df.iterrows():
    ax.annotate(row['Divisions'], xy=row['coords'], horizontalalignment='center')
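Each of the two snippets above draws its map on a separate figure. One way to place both on a single figure is to create two axes with `plt.subplots` and pass each one to the plotting calls through the `ax` argument. The sketch below shows only the layout: the scatter points are made-up placeholders standing in for the map layers, and the `Agg` backend line is there so the example runs without a display (it can be dropped in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# One figure with two axes in a single row.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(12, 6))

# Made-up placeholder points stand in for the map layers. With the GeoDataFrames
# above, each layer would instead be drawn with its own .plot(ax=ax_left, ...)
# or .plot(ax=ax_right, ...) call.
ax_left.scatter([79.9, 80.6, 81.2], [6.9, 7.3, 8.6], color="red")
ax_right.scatter([79.9, 80.6, 81.2], [6.9, 7.3, 8.6], color="blue")

ax_left.set_title("Parliamentary Election Turnout (1970)")
ax_right.set_title("SLFP Vote Share (Parliamentary Election, 1970)")
for axis in (ax_left, ax_right):
    axis.set_axis_off()

fig.tight_layout()
```

Drawing every layer of the first map onto `ax_left` and every layer of the second onto `ax_right` keeps the two maps aligned and the same size.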

Your final product, with the two maps side by side, should look something like this: