#!/usr/bin/env python
# coding: utf-8

# # Women's March and Tea Party, by the Numbers
#
# The Tea Party protests that took the country by storm in 2009 had an outsized impact on the legislative process. The recent Women's March and associated movement could potentially have a similar effect, so I was curious to see how the two compared in size and location. Below, I look at the distribution and size of the marches, compare turnout by city, then look at the portion of each state that attended protests.
#
# Overall, there were more than ten times as many Women's Marchers (**4,157,678**) as Tea Party marchers (**310,960**). Interestingly, both protests had a similar median number of marchers (**322** vs **450**), although the mean was substantially higher for the Women's March (**6673** vs. **903**). Finally, almost every state saw a larger share of its population turn out for the Women's March, with Colorado leading the way at **2.9%**. This means that although the march was more concentrated in cities, it was still a grassroots event distributed geographically throughout the 50 states.
#
# If the energy of the Women's March can be harnessed, it could have an even larger impact than the Tea Party. We may already be seeing the results in Congress and town halls.
#
#
# If you're viewing this notebook on GitHub, view it in NBViewer [here](http://nbviewer.jupyter.org/gist/psthomas/79b61a107205a90b3660bb4649fb2672) instead to see the interactive plots and tables.
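# The gap between similar medians and very different means is what a few very large marches would produce. The sketch below is only an illustration: the two totals are the figures quoted above, while the march-size lists are made-up numbers (not the real data) chosen to show how one huge event moves the mean but leaves the median alone.

```python
from statistics import mean, median

# Totals quoted in the summary above.
womens_total = 4_157_678
tea_total = 310_960
ratio = womens_total / tea_total
print(f"Women's March turnout was {ratio:.1f}x the Tea Party turnout")

# HYPOTHETICAL march sizes, invented for illustration only.
# The two lists differ in a single value: one very large march.
small_events = [50, 200, 300, 450, 900]
with_giant = [50, 200, 300, 450, 500_000]

for name, sizes in [("small events", small_events), ("with one giant march", with_giant)]:
    # The median ignores how extreme the largest value is; the mean does not.
    print(f"{name}: median={median(sizes)}, mean={mean(sizes):,.0f}")
```

# A distribution like the second list has a long right tail, which is why the boxplot further down uses a log-style y axis.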
# In[97]:


get_ipython().run_line_magic('matplotlib', 'inline')
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import statsmodels.formula.api as smf
import statsmodels.api as sm
import json
from IPython.display import HTML
matplotlib.style.use('ggplot')


# # Import the Data
#
# Jeremy Pressman, Erica Chenoweth and others recently finished compiling all the Women's March [data](https://docs.google.com/spreadsheets/d/1xa0iLqYKz8x9Yc_rfhtmSOJQ2EGgeUVjvV4A8LsIaxY/htmlview?sle=true#gid=0) and 538 compiled [data](https://fivethirtyeight.com/features/tea-parties-appear-to-draw-at-least/) on the Tea Party protests a few years ago, so I'll be using both those sources. I got the state level population [data](https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=PEP_2015_PEPANNRES&src=pt) from the US Census, and the voter turnout [data](https://docs.google.com/spreadsheets/d/133Eb4qQmOxNvtesw2hdVns073R68EZx4SfCnP4IGQf8/htmlview?sle=true#gid=19) from David Wasserman.

# The boxplot shows that the median march size was actually very similar between the two movements (**322** vs **450**). The mean, however, was roughly seven times higher for the Women's March (**6673** vs. **903**), and there are more outliers at the high end of the march size. There were also more than ten times as many Women's Marchers (**4,157,678**) as Tea Party marchers (**310,960**).

# In[126]:


# Merge dataframes on city, state
city_df = march_df.merge(tea_df, how='outer', on=['city', 'state'])

# Keep an unfilled copy for the distribution plots
unfcity_df = city_df.copy()

# Fill 0, assume cities without data had no marchers.
# Note, it's possible the 538 data is less complete than Women's March.
city_df.fillna(value=0, inplace=True)

# Boxplot
fig, ax = plt.subplots()
ax.set_yscale('symlog')
ax.set_ylim(1, 1e6)
unfcity_df.plot.box(figsize=(10,7), ax=ax, meanline=True, showmeans=True, color='gray', sym='k.')
plt.ylabel("Protesters (log)")
plt.show()

# Print total marchers
print("Total Women's March: " + '{:,.0f}'.format(city_df['march_num'].sum()))
print("Total Tea Party: " + '{:,.0f}'.format(city_df['tea_num'].sum()))

unfcity_df.describe()


# ## Cities Compared
#
# Below is an interactive scatter plot of the number of protesters in each city for each movement.

# Every state except West Virginia had a larger percentage participate in the Women's March, with Colorado leading with **2.9%** of its population. California had the largest total number of protesters, at **910,830**.

# In[130]:


# Group city_df by state, sum
state_df = city_df.groupby(by='state', as_index=False).sum()

# Merge with vote and population dataframes:
state_df = state_df.merge(vote_df, how='inner')
state_df = state_df.merge(pop_df, how='inner')

# Marchers as a percentage of each state's population
state_df['tea_pct'] = (state_df['tea_num'] / state_df['pop2016']) * 100
state_df['march_pct'] = (state_df['march_num'] / state_df['pop2016']) * 100

# Leave DC out, marchers exceed population
state_df = state_df[state_df['state'] != 'DC']

state_df = state_df.sort_values(by='march_pct', ascending=False).reset_index(drop=True)

interactive_table(state_df, width=600, height=500)


# ## How do state turnouts compare?
# In[131]:


fig, ax = plt.subplots(figsize=(10,8))

# marker='' draws no points; the state labels added below mark each position
plt.scatter(x=state_df['march_pct'], y=state_df['tea_pct'], marker='',
            alpha=0.7, color="steelblue", label='_nolegend_') #marker='o'

A = state_df['march_pct']
B = state_df['tea_pct']
C = state_df['state']
D = range(len(C))
for a,b,c,d in zip(A, B, C, D):
    #if d % 50 == 0:  #Annotate every n
    ax.annotate('%s' % c, xy=(a,b), textcoords='data')

plt.xlabel("Women's Marchers (% Population)")
plt.ylabel("Tea Party Marchers (% Population)")

# Reference lines
x = pd.DataFrame({'line': np.linspace(0, 3, 10)})
plt.plot(x, x, 'k--', alpha=0.7, label='Equal (1:1)')
# Average State Ratio = 0.008984/0.001222 = 7.35 times % of women's marchers
plt.plot(x, x/7, '--', color="gray", alpha=0.8, label='Average State Ratio (7:1)')

ax.set_xlim(0,3.0)
ax.set_ylim(0,0.5)
plt.legend()
plt.show()


# ## Did blue states have more marchers?
#
# The Democratic margin is a fairly good indicator of Women's March participation.
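# One simple way to put a number on "fairly good indicator" is a Pearson correlation between each state's Democratic margin and its march turnout. The sketch below is a minimal, standard-library illustration of that idea, not the analysis used in this notebook: `pearson_r` is a helper written here, and both lists are made-up values standing in for the real columns.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the root sum of squared deviations of each series.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)

# HYPOTHETICAL values, invented for illustration only:
# Democratic margin (percentage points) and marchers (% of population) for five states.
dem_margin = [-20.0, -10.0, 0.0, 10.0, 20.0]
march_pct = [0.2, 0.5, 0.7, 1.2, 1.6]

r = pearson_r(dem_margin, march_pct)
print(f"Pearson r = {r:.2f}")
```

# A value near 1 would mean turnout rises almost linearly with the margin; a value near 0 would mean the margin says little about turnout.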