Created by: SmirkyGraphs. Code: GitHub. Source: BOE.
$525,881.75 was raised from 992 contributions from 29 different states
This looks specifically at donations where the person was living in Rhode Island
Comparing donations based on whether they live in RI or not
489 unique Employers
$65,309 raised from Retirees
$38,680 raised from Homemakers
$16,731 from Self-Employed
$15,261 from "Info Requested" (Left Empty)
Top 5 Companies: RI Medical Imaging, Pfizer Inc, General Dynamics, Citizens Bank, Pannone Lopes & Devereaux & West LLC
All Values Included
Extras Removed
In state 37% out of state 63%
# For data
import pandas as pd
from pandas import Series,DataFrame
import numpy as np
# For visualization
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
%matplotlib inline
sns.set()
import datetime
my_color = sns.color_palette()
# loading the data
df = pd.read_csv("gina.csv", parse_dates=['receipt_dt'])
# removing personal address
df = df.drop(['address'], axis=1)
# Preview
df.head()
| | contbr_nm | first_nm | last_nm | tran_type | contb_type | receipt_dt | contb_amt | city | state | zip | employer | employ_address | employ_city | employ_state | employ_zip | weekday |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday |
1 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday |
2 | Edna Panaggio | Edna | Panaggio | Credit/Debit | Individual | 2017-07-01 | 5.0 | Cranston | RI | 02920-4529 | Homemaker | 200 Hoffman Ave | Cranston | RI | 02920-4529 | Saturday |
3 | Eve Savitzky | Eve | Savitzky | Credit/Debit | Individual | 2017-07-01 | 25.0 | Providence | RI | 2906 | Homemaker | 21 Lincoln Ave | Providence | RI | 2906 | Saturday |
4 | Anna Siegler | Anna | Siegler | Credit/Debit | Individual | 2017-07-02 | 50.0 | Chicago | IL | 60637 | Retired | 5715 S. Kenwood Ave, Apt 4N | Chicago | IL | 60637 | Sunday |
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 992 entries, 0 to 991
Data columns (total 16 columns):
contbr_nm         992 non-null object
first_nm          971 non-null object
last_nm           971 non-null object
tran_type         992 non-null object
contb_type        992 non-null object
receipt_dt        992 non-null datetime64[ns]
contb_amt         992 non-null float64
city              991 non-null object
state             991 non-null object
zip               988 non-null object
employer          965 non-null object
employ_address    930 non-null object
employ_city       931 non-null object
employ_state      931 non-null object
employ_zip        916 non-null object
weekday           992 non-null object
dtypes: datetime64[ns](1), float64(1), object(14)
memory usage: 124.1+ KB
df.shape
(992, 16)
df.contb_amt.sum()
525881.75
There were 992 donations in Q3, totaling $525,881.75.
Records missing a first/last name are PAC or party donations.
date_df = df.groupby(['receipt_dt'],as_index=False).sum()
date_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='', legend=False,
linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
# Top 5 Days
date_df.sort_values(by='contb_amt',ascending=False).head()
| | receipt_dt | contb_amt |
---|---|---|
79 | 2017-09-27 | 26710.0 |
77 | 2017-09-25 | 23230.0 |
49 | 2017-08-28 | 22805.0 |
50 | 2017-08-29 | 20925.0 |
52 | 2017-08-31 | 20032.6 |
count_df = df.groupby(['receipt_dt'],as_index=False).count()
count_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='',linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
mean_df = df.groupby(['receipt_dt'],as_index=False).mean()
mean_df.plot('receipt_dt','contb_amt',figsize=(12,6),marker='',linestyle='-',color='purple', xlim=('2017-07-01','2017-10-01'))
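The three daily aggregates above (sum, count, and mean) can also be computed in one pass. This is a minimal sketch on a few made-up rows rather than the real gina.csv; the `toy` frame and its values are assumptions for illustration. Selecting `contb_amt` before aggregating keeps the text columns out of the calculation.

```python
import pandas as pd

# Hypothetical stand-in rows; the notebook itself reads gina.csv.
toy = pd.DataFrame({
    "receipt_dt": pd.to_datetime(["2017-07-01", "2017-07-01", "2017-07-02"]),
    "contb_amt": [5.0, 25.0, 50.0],
    "city": ["Providence", "Providence", "Chicago"],
})

# One row per day, with total, number of donations, and average donation.
daily = (
    toy.groupby("receipt_dt")["contb_amt"]
       .agg(["sum", "count", "mean"])
       .reset_index()
)
print(daily)
```

Each of the resulting columns could then be plotted against `receipt_dt` in the same way as the separate frames above.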
weekday_df = df
weekday_df['weekday'] = pd.Categorical(weekday_df['weekday'],
categories=['Monday','Tuesday','Wednesday','Thursday',
'Friday','Saturday', 'Sunday'], ordered=True)
weekday_df_sum = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt',
aggfunc='sum').plot(kind='bar',rot=0,legend=False, title='Sum of Donations by Weekday')
weekday_df_count = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt',
aggfunc='count').plot(kind='bar',rot=0, legend=False, title='Count of Donations by Weekday')
weekday_df_mean = weekday_df.pivot_table(index=weekday_df['weekday'], values='contb_amt',
    aggfunc='mean').plot(kind='bar',rot=0, legend=False, title='Average Donated by Weekday')
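The `weekday` column arrives ready-made in the CSV. If it were missing, it could probably be derived from `receipt_dt`; the sketch below assumes that and uses a handful of made-up dates. It also shows why the ordered categorical matters: sorting follows the calendar week instead of the alphabet.

```python
import pandas as pd

# Hypothetical dates; 2017-07-01 was a Saturday, matching the preview above.
dates = pd.Series(pd.to_datetime(
    ["2017-07-01", "2017-07-03", "2017-07-05", "2017-07-07"]))

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# day_name() returns the English weekday name for each timestamp.
weekday = pd.Categorical(dates.dt.day_name(), categories=day_order, ordered=True)

# Sorting an ordered categorical follows day_order, not alphabetical order.
sorted_days = pd.Series(weekday).sort_values().tolist()
print(sorted_days)
```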
df['month'] = df['receipt_dt'].dt.month
df['receipt_dt'].dt.month.value_counts()
9 549 8 306 7 137 Name: receipt_dt, dtype: int64
month_sum = df.pivot_table(index=df['month'], values='contb_amt',
aggfunc='sum').plot(kind='bar',rot=0,legend=False, title='Sum of Donations by Month')
month_count = df.pivot_table(index=df['month'], values='contb_amt',
    aggfunc='count').plot(kind='bar',rot=0,legend=False, title='Count of Donations by Month')
df.tran_type.value_counts()
Credit/Debit 525 Check 448 In-Kind 10 Other 9 Name: tran_type, dtype: int64
df.tran_type.value_counts(normalize=True)
Credit/Debit 0.529234 Check 0.451613 In-Kind 0.010081 Other 0.009073 Name: tran_type, dtype: float64
# factorplot was renamed to catplot in seaborn 0.9
sns.catplot(x='tran_type', data=df, kind='count')
df['contb_amt'].sum()
525881.75
df['contb_amt'].mode()
0 1000.0 dtype: float64
df['contb_amt'].describe()
count     992.000000
mean      530.122732
std       410.813074
min         1.000000
25%       100.000000
50%       500.000000
75%      1000.000000
max      1174.250000
Name: contb_amt, dtype: float64
I was surprised that the average donation was about $530, compared with only about $100 during the presidential race.
The lowest donation was $1 and the highest was $1,174.25.
df['contb_amt'].hist(bins=25)
df['contb_amt'].value_counts().head()
1000.0 380 500.0 140 250.0 110 25.0 63 10.0 53 Name: contb_amt, dtype: int64
Surprisingly, $1,000 was the most frequent donation amount (380 of 992 donations).
The most frequent donation amounts were much higher than those during the presidential race.
# One state was entered as "Ri", so replace it with "RI"
df = df.replace(['Ri'],'RI')
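`df.replace(['Ri'], 'RI')` looks for the value in every column of the frame, not only `state`. A narrower alternative, sketched here on made-up rows (the `toy` frame is an assumption), normalizes just the `state` column:

```python
import pandas as pd

# Hypothetical rows: a mis-cased state code and a surname that is also "Ri".
toy = pd.DataFrame({
    "last_nm": ["Ri", "Smith"],
    "state": ["Ri", "NY"],
})

# Upper-case only the state column; every other column is left untouched.
toy["state"] = toy["state"].str.upper()
print(toy)
```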
df.state.nunique()
29
df.state.value_counts()
RI    600
NY     76
MA     62
CT     43
CO     29
TX     28
DC     19
CA     18
MD     17
FL     15
NJ     15
VA     10
IL      8
PA      7
OR      7
AZ      6
NH      5
WA      4
VT      4
TN      3
MI      3
NM      3
HI      2
SC      2
WI      1
AL      1
GA      1
MO      1
NC      1
Name: state, dtype: int64
where_sum = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='sum').sort_values(
by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Total Donated by State')
where_count = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='count').sort_values(
by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Count of Donations by State')
where_avg = df.pivot_table(index=df['state'], values='contb_amt', aggfunc='mean').sort_values(
by='contb_amt').plot(kind='barh', rot=0, legend=False, title='Avg Donated by State')
df.city.nunique()
225
# Top 5 Cities
city_df = df.pivot_table('contb_amt',index='city',aggfunc='sum')
city_df = city_df.sort_values(by="contb_amt",ascending=False)
city_df.head()
city | contb_amt |
---|---|
Providence | 62241.43 |
New York | 44350.00 |
Barrington | 22470.00 |
Jamestown | 21775.00 |
Denver | 21000.00 |
city_sum = df.pivot_table(index=df['city'], values='contb_amt', aggfunc='sum').sort_values(
by='contb_amt').nlargest(5, 'contb_amt').plot(kind='barh', color=my_color, legend=False, title='Total Donated')
city_count = df.pivot_table(index=df['city'], values='contb_amt', aggfunc='count').sort_values(
by='contb_amt').nlargest(5, 'contb_amt').plot(kind='barh', color=my_color, legend=False, title='Num of Donations')
# Just donations from RI
RI_df = df[df.state == 'RI']
RI_df.city.unique()
array(['Providence', 'Cranston', 'Lincoln', 'Barrington', 'Cumberland', 'Pascoag', 'Smithfield', 'East Greenwich', 'Johnston', 'Pawtucket', 'Jamestown', 'Wakefield', 'Saunderstown', 'Portsmouth', 'Newport', 'West Warwick', 'North Kingstown', 'Warwick', 'PROVIDENCE', 'Tiverton', 'Harmony', 'Bristol', 'Warren', 'West Greenwich', 'Westerly', 'Riverside', 'N Kingstown', 'Rumford', 'Narragansett', 'Exeter', 'Coventry', 'East Providence', 'South Kingstown', 'North Providence', 'Middletown', 'Charlestown', 'North Kingstownq', 'Foster', 'Block Island', 'Scituate', 'Little Compton', 'New Shoreham', 'Peace Dale', 'Central Falls', 'North Scituate', 'Glocester', 'E Greenwich', 'Woonsocket', 'Albion', 'Kingston'], dtype=object)
# Connect small towns to the City/Town they're part of
RI_df = RI_df.replace(['Pascoag'],'Burrillville')
RI_df = RI_df.replace(['Wakefield','Kingston','Peace Dale'],'South Kingstown')
RI_df = RI_df.replace(['Saunderstown','N Kingstown','North Kingstownq'],'North Kingstown')
RI_df = RI_df.replace(['E Greenwich'],'East Greenwich')
RI_df = RI_df.replace(['PROVIDENCE'],'Providence')
RI_df = RI_df.replace(['Harmony'],'Glocester')
RI_df = RI_df.replace(['Riverside','Rumford'],'East Providence')
RI_df = RI_df.replace(['Block Island'],'New Shoreham')
RI_df = RI_df.replace(['North Scituate'],'Scituate')
RI_df = RI_df.replace(['Albion'],'Lincoln')
RI_df.city.nunique()
36
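The chain of `replace` calls above could also be written as a single lookup table, which keeps every alias in one place. The dictionary below only restates the replacements already made; the sample `cities` series is a made-up assumption for illustration.

```python
import pandas as pd

# Village or misspelling -> municipality, restating the replacements above.
city_fix = {
    "Pascoag": "Burrillville",
    "Wakefield": "South Kingstown",
    "Kingston": "South Kingstown",
    "Peace Dale": "South Kingstown",
    "Saunderstown": "North Kingstown",
    "N Kingstown": "North Kingstown",
    "North Kingstownq": "North Kingstown",
    "E Greenwich": "East Greenwich",
    "PROVIDENCE": "Providence",
    "Harmony": "Glocester",
    "Riverside": "East Providence",
    "Rumford": "East Providence",
    "Block Island": "New Shoreham",
    "North Scituate": "Scituate",
    "Albion": "Lincoln",
}

# Hypothetical sample of raw city values.
cities = pd.Series(["PROVIDENCE", "Providence", "Wakefield", "Kingston", "Cranston"])

# Values found in the dictionary are swapped; everything else passes through.
clean = cities.replace(city_fix)
print(clean.tolist())
```

Applied to the real data, this would be `RI_df['city'].replace(city_fix)`, which also limits the replacement to the `city` column.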
ri_city = RI_df.pivot_table(index='city', values='contb_amt', aggfunc='count').sort_values(
    by='contb_amt', ascending=True).plot(kind='barh', figsize=(12,9), legend=False, title='Num of Donations by City')
ri_city = RI_df.pivot_table(index='city', values='contb_amt', aggfunc='sum').sort_values(
    by='contb_amt', ascending=True).plot(kind='barh', figsize=(12,9), legend=False, title='Total Donated by City')
RI_df['city'].value_counts().sum()
600
RI_df['city'].value_counts().head()
Providence 175 East Greenwich 44 Barrington 44 Jamestown 38 Cranston 35 Name: city, dtype: int64
RI_Sum = RI_df.pivot_table('contb_amt',index='city',aggfunc='sum')
RI_Sum = RI_Sum.sort_values(by='contb_amt', ascending=False)
RI_Sum.sum()
contb_amt 248681.22 dtype: float64
RI_Sum
city | contb_amt |
---|---|
Providence | 70988.43 |
Barrington | 22470.00 |
Jamestown | 21775.00 |
East Greenwich | 19170.00 |
Cranston | 13061.00 |
North Kingstown | 9585.00 |
Newport | 8390.00 |
Westerly | 8305.00 |
Lincoln | 7360.00 |
East Providence | 7170.79 |
Warwick | 6490.00 |
Narragansett | 6310.00 |
Portsmouth | 5505.00 |
Bristol | 5380.00 |
South Kingstown | 4935.00 |
Charlestown | 4100.00 |
North Providence | 3525.00 |
Pawtucket | 3130.00 |
Johnston | 3125.00 |
Exeter | 3035.00 |
Cumberland | 2500.00 |
Middletown | 2375.00 |
Warren | 2100.00 |
Scituate | 1500.00 |
West Greenwich | 1200.00 |
Coventry | 1060.00 |
Foster | 1000.00 |
Smithfield | 950.00 |
Tiverton | 550.00 |
West Warwick | 500.00 |
Burrillville | 500.00 |
Woonsocket | 275.00 |
Glocester | 225.00 |
Little Compton | 100.00 |
Central Falls | 25.00 |
New Shoreham | 11.00 |
# dictionary of RI Counties
county_map = {'Barrington': 'BRISTOL',
'Bristol': 'BRISTOL',
'Burrillville': 'PROVIDENCE',
'Central Falls': 'PROVIDENCE',
'Charlestown': 'WASHINGTON',
'Coventry': 'KENT',
'Cranston': 'PROVIDENCE',
'Cumberland': 'PROVIDENCE',
'East Greenwich': 'KENT',
'East Providence': 'PROVIDENCE',
'Exeter': 'WASHINGTON',
'Foster': 'PROVIDENCE',
'Glocester': 'PROVIDENCE',
'Hopkinton': 'WASHINGTON',
'Jamestown': 'NEWPORT',
'Johnston': 'PROVIDENCE',
'Lincoln': 'PROVIDENCE',
'Little Compton': 'NEWPORT',
'Middletown': 'NEWPORT',
'Narragansett': 'WASHINGTON',
'Newport': 'NEWPORT',
'New Shoreham': 'WASHINGTON',
'North Kingstown': 'WASHINGTON',
'North Providence': 'PROVIDENCE',
'North Smithfield': 'PROVIDENCE',
'Pawtucket': 'PROVIDENCE',
'Portsmouth': 'NEWPORT',
'Providence': 'PROVIDENCE',
'Richmond': 'WASHINGTON',
'Scituate': 'PROVIDENCE',
'Smithfield': 'PROVIDENCE',
'South Kingstown': 'WASHINGTON',
'Tiverton': 'NEWPORT',
'Warren': 'BRISTOL',
'Warwick': 'KENT',
'Westerly': 'WASHINGTON',
'West Greenwich': 'KENT',
'West Warwick': 'KENT',
'Woonsocket': 'PROVIDENCE'}
# creating a County column by mapping each city to its county
RI_df['County'] = RI_df.city.map(county_map)
RI_df['County'].value_counts()
PROVIDENCE 303 WASHINGTON 86 NEWPORT 77 KENT 70 BRISTOL 64 Name: County, dtype: int64
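The county counts above add up to 600, so every Rhode Island row found a county. That depends on the city names having been cleaned first, because `Series.map` silently returns NaN for any value that is not a dictionary key. A quick check for such gaps might look like the sketch below; the three-entry `county_excerpt` and the sample `cities` are assumptions for illustration.

```python
import pandas as pd

# Small excerpt of the city -> county dictionary above.
county_excerpt = {"Providence": "PROVIDENCE", "Newport": "NEWPORT", "Warwick": "KENT"}

# Hypothetical city values, including a village that is not a dictionary key.
cities = pd.Series(["Providence", "Newport", "Wakefield"])
county = cities.map(county_excerpt)

# map() leaves NaN where no key matches, so isna() flags the gaps.
unmapped = cities[county.isna()].unique().tolist()
print(unmapped)
```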
ri_county = RI_df.pivot_table(index='County', values='contb_amt', aggfunc='count').sort_values(
    by='contb_amt', ascending=True).plot(kind='barh', legend=False, title='Num of Donations by County')
ri_county = RI_df.pivot_table(index='County', values='contb_amt', aggfunc='sum').sort_values(
    by='contb_amt', ascending=True).plot(kind='barh', legend=False, title='Total Donated by County')
def in_ri(state):
    if state == 'RI':
        return 'in state'
    else:
        return 'out of state'
df['lives'] = df['state'].apply(in_ri)
df_lives = df
df['lives'] = pd.Categorical(df['lives'], categories=['in state','out of state'], ordered=True)
df.head()
| | contbr_nm | first_nm | last_nm | tran_type | contb_type | receipt_dt | contb_amt | city | state | zip | employer | employ_address | employ_city | employ_state | employ_zip | weekday | month | lives |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday | 7 | in state |
1 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday | 7 | in state |
2 | Edna Panaggio | Edna | Panaggio | Credit/Debit | Individual | 2017-07-01 | 5.0 | Cranston | RI | 02920-4529 | Homemaker | 200 Hoffman Ave | Cranston | RI | 02920-4529 | Saturday | 7 | in state |
3 | Eve Savitzky | Eve | Savitzky | Credit/Debit | Individual | 2017-07-01 | 25.0 | Providence | RI | 2906 | Homemaker | 21 Lincoln Ave | Providence | RI | 2906 | Saturday | 7 | in state |
4 | Anna Siegler | Anna | Siegler | Credit/Debit | Individual | 2017-07-02 | 50.0 | Chicago | IL | 60637 | Retired | 5715 S. Kenwood Ave, Apt 4N | Chicago | IL | 60637 | Sunday | 7 | out of state |
count_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='count').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Count of Donation')
print(df['lives'].value_counts())
(df['lives'].value_counts(normalize=True))
in state 600 out of state 392 Name: lives, dtype: int64
in state 0.604839 out of state 0.395161 Name: lives, dtype: float64
60% (600) were from Rhode Island
40% (392) were from another state
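One caveat: `df.info()` reports 991 non-null `state` values out of 992 rows, and the per-state counts also add up to 991. The 392 "out of state" donations therefore include one record with no state at all, because `in_ri` sends everything that is not 'RI' to the `else` branch. A hypothetical variant that keeps missing values separate is sketched below; the name `residency` and the 'unknown' label are assumptions, not part of the notebook.

```python
def residency(state):
    """Classify a state code; anything that is not a string counts as unknown."""
    if not isinstance(state, str):
        return 'unknown'
    return 'in state' if state == 'RI' else 'out of state'

# Hypothetical inputs: an RI donor, an IL donor, and two kinds of missing value.
labels = [residency(s) for s in ['RI', 'IL', None, float('nan')]]
print(labels)
```

It could be applied in the same way as the original, with `df['state'].apply(residency)`.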
mean_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='mean').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Average Donation')
sum_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='sum').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Total Donated')
percent_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='sum')
total_sum = percent_df.contb_amt.sum()
df['lives'] = df['state'].apply(in_ri)
percent_df['Percent'] = percent_df['contb_amt'] / total_sum
percent_df.head()
lives | contb_amt | Percent |
---|---|---|
in state | 248681.22 | 0.472884 |
out of state | 277200.53 | 0.527116 |
mean_df = df.pivot_table(index=df['lives'], values='contb_amt', aggfunc='mean')
mean_df.head()
lives | contb_amt |
---|---|
in state | 414.468700 |
out of state | 707.144209 |
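The shares and averages above can be cross-checked by hand from the totals and counts already shown. This is plain arithmetic on figures copied from the tables, with no new data:

```python
# Totals and counts copied from the tables above.
in_state_sum, out_state_sum = 248681.22, 277200.53
in_state_n, out_state_n = 600, 392

total = in_state_sum + out_state_sum

# Share of all money raised, by where the donor lives.
in_share = in_state_sum / total
out_share = out_state_sum / total

# Average donation is the group total divided by the number of donations.
in_mean = in_state_sum / in_state_n
out_mean = out_state_sum / out_state_n

print(f"total={total:.2f} in={in_share:.6f} out={out_share:.6f}")
print(f"mean in={in_mean:.4f} mean out={out_mean:.4f}")
```

So in-state donors made about 60% of the donations but supplied about 47% of the money, because the average out-of-state donation was roughly 1.7 times larger.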
employer_df = df.pivot_table('contb_amt',index='employer',aggfunc='sum')
# Combining Electric Boat & General Dynamics
employer_df.loc['General Dynamics'] = employer_df.loc['Electric Boat Corporation'] + employer_df.loc['General Dynamics']
employer_df.drop('Electric Boat Corporation',inplace=True)
employer_df = employer_df.sort_values(by = 'contb_amt',ascending=True)
employer_df.count()
contb_amt 489 dtype: int64
Donations came from people with 489 different employers; let's narrow it down to employers whose donations total more than $1,000.
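Electric Boat is folded into General Dynamics above by adding two rows of the pivot and dropping one. An alternative sketch, on made-up amounts, renames the employer before aggregating so nothing has to be added or dropped afterwards. The `alias` dictionary and the `toy` frame are assumptions; only the two employer names come from the notebook.

```python
import pandas as pd

# Hypothetical rows and amounts for illustration.
toy = pd.DataFrame({
    "employer": ["Electric Boat Corporation", "General Dynamics", "Retired"],
    "contb_amt": [500.0, 1000.0, 250.0],
})

# Rename the subsidiary first, then aggregate once.
alias = {"Electric Boat Corporation": "General Dynamics"}
toy["employer"] = toy["employer"].replace(alias)

by_employer = toy.groupby("employer")["contb_amt"].sum().sort_values()
print(by_employer)
```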
# Keeping only employers whose donations total more than $1000
employer_df = employer_df[employer_df['contb_amt'] > 1000]
employer_df.plot(kind='barh',figsize=(10,16))
# Graphing Only Companies
employer_df.drop('Homemaker',inplace=True)
employer_df.drop('Retired',inplace=True)
employer_df.drop('Self Employed',inplace=True)
employer_df.drop('Info Requested',inplace=True)
employer_df = employer_df.sort_values(by = 'contb_amt',ascending=True)
# Keeping only employers whose donations total more than $1000
employer_df = employer_df[employer_df['contb_amt'] > 1000]
employer_df.plot(kind='barh',figsize=(10,16))
def in_ri(employ_state):
    if employ_state == 'RI':
        return 'in state'
    else:
        return 'out of state'
df['works'] = df['employ_state'].apply(in_ri)
df.head()
| | contbr_nm | first_nm | last_nm | tran_type | contb_type | receipt_dt | contb_amt | city | state | zip | employer | employ_address | employ_city | employ_state | employ_zip | weekday | month | lives | works |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday | 7 | in state | in state |
1 | Ingrid Ardaya | Ingrid | Ardaya | Credit/Debit | Individual | 2017-07-01 | 5.0 | Providence | RI | 2906 | Disabled | 11 North Avenue | Providence | RI | 2906 | Saturday | 7 | in state | in state |
2 | Edna Panaggio | Edna | Panaggio | Credit/Debit | Individual | 2017-07-01 | 5.0 | Cranston | RI | 02920-4529 | Homemaker | 200 Hoffman Ave | Cranston | RI | 02920-4529 | Saturday | 7 | in state | in state |
3 | Eve Savitzky | Eve | Savitzky | Credit/Debit | Individual | 2017-07-01 | 25.0 | Providence | RI | 2906 | Homemaker | 21 Lincoln Ave | Providence | RI | 2906 | Saturday | 7 | in state | in state |
4 | Anna Siegler | Anna | Siegler | Credit/Debit | Individual | 2017-07-02 | 50.0 | Chicago | IL | 60637 | Retired | 5715 S. Kenwood Ave, Apt 4N | Chicago | IL | 60637 | Sunday | 7 | out of state | out of state |
# Including Extras
count_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='count').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Count of Donation')
df['works'].value_counts()
in state 557 out of state 435 Name: works, dtype: int64
# Including Extras
count_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Total Donated')
# Including Extras getting % of sum
percent_df = df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum')
total_sum = df.contb_amt.sum()
percent_df['Percent'] = percent_df['contb_amt'] / total_sum
percent_df.head()
works | contb_amt | Percent |
---|---|---|
in state | 230305.00 | 0.437941 |
out of state | 295576.75 | 0.562059 |
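# Removing Extras
# All five labels are filtered in one step. Filtering df afresh on each line
# would leave only the last condition ('Disabled') in effect, which is what
# the counts printed below reflect (992 - 975 = 17 rows removed).
extras = ['Homemaker', 'Retired', 'Self Employed', 'Info Requested', 'Disabled']
emp_df = df[~df.employer.isin(extras)]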
# Extras Removed
count_df = emp_df.pivot_table(index=df['works'], values='contb_amt', aggfunc='count').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Count of Donation')
emp_df['works'].value_counts()
in state 541 out of state 434 Name: works, dtype: int64
# Extras Removed
count_df = emp_df.pivot_table(index=df['works'], values='contb_amt', aggfunc='sum').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Total Donated')
df.contbr_nm.nunique()
904
df.first_nm.value_counts().head()
David 31 Michael 25 Susan 20 William 19 Robert 19 Name: first_nm, dtype: int64
df.last_nm.value_counts().head()
Ardaya 16 Kelly 9 Richardson 6 Watson 5 Pande 5 Name: last_nm, dtype: int64
df.contb_type.value_counts()
Individual 966 Interest Received 8 PAC 6 In-Kind - Individual 5 In-Kind - Party 4 Refund/Rebate 1 In-Kind - PAC 1 Party 1 Name: contb_type, dtype: int64
df.contb_type.value_counts(normalize=True)
Individual 0.973790 Interest Received 0.008065 PAC 0.006048 In-Kind - Individual 0.005040 In-Kind - Party 0.004032 Refund/Rebate 0.001008 In-Kind - PAC 0.001008 Party 0.001008 Name: contb_type, dtype: float64
# Donations of $1000 or more
don_1k = df[df['contb_amt'] >= 1000]
don_1k.lives.value_counts(normalize=True)
out of state 0.630208 in state 0.369792 Name: lives, dtype: float64
sum_df = don_1k.pivot_table(index=df_lives['lives'], values='contb_amt', aggfunc='sum').plot(kind='bar',
rot=0, color=my_color, legend=False, title='Total Donated')
don_df_100 = df[df.contb_amt <= 100]
don_df_250 = df[df.contb_amt <= 250]
don_df_350 = df[df.contb_amt <= 350]
don_df_500 = df[df.contb_amt <= 500]
don_df_750 = df[df.contb_amt <= 750]
don_df_1000 = df[df.contb_amt <= 1000]
# Concatenating the datasets together
frames = [don_df_100, don_df_250, don_df_350, don_df_500, don_df_750, don_df_1000]
don_concat = pd.concat(frames, keys=['100', '250', '350', '500', '750', '1000'])
# resetting the index so the concat keys become the level_0 column
don_concat = don_concat.reset_index()
# Pivoting by the amt ranges
don_concat = don_concat.pivot_table('contb_amt',index='level_0',columns = 'lives',aggfunc='sum')
new_index= ['100', '250', '350', '500', '750', '1000']
don_concat = don_concat.reindex(new_index)
don_concat.head()
lives | in state | out of state |
---|---|---|
level_0 | ||
100 | 5683.68 | 2901.53 |
250 | 33314.22 | 10850.53 |
350 | 34914.22 | 11800.53 |
500 | 86789.22 | 30700.53 |
750 | 103444.22 | 35200.53 |
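The buckets above are cumulative: each `<=` filter includes every smaller donation as well, so the 250 row already contains the 100 row. If non-overlapping ranges were wanted instead, `pd.cut` is one way to assign each donation to exactly one bin. This is a sketch on made-up amounts; the `toy` frame, `edges`, and `labels` are assumptions that mirror the thresholds used above.

```python
import pandas as pd

# Hypothetical amounts, including one above the $1000 ceiling.
toy = pd.DataFrame({"contb_amt": [5.0, 100.0, 250.0, 500.0, 1000.0, 1174.25]})

edges = [0, 100, 250, 350, 500, 750, 1000]
labels = ["100", "250", "350", "500", "750", "1000"]

# Each amount lands in exactly one (low, high] bin; values over 1000 become NaN.
toy["bucket"] = pd.cut(toy["contb_amt"], bins=edges, labels=labels)

# observed=True keeps only the bins that actually contain donations.
totals = toy.groupby("bucket", observed=True)["contb_amt"].sum().to_dict()
print(totals)
```

Which form is preferable depends on the question: cumulative totals answer "how much came from donations of at most X", while disjoint bins answer "how much came from donations between X and Y".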
don_concat[['in state','out of state']].plot(kind='bar',figsize=(12,4))
plt.xlabel('Amount')
locs, labels = plt.xticks()
plt.setp(labels, rotation=360)
plt.title('In State vs. Out of State')
# Top 4 Donated Values
don_25 = df[df.contb_amt == 25]
don_250 = df[df.contb_amt == 250]
don_500 = df[df.contb_amt == 500]
don_1000 = df[df.contb_amt == 1000]
# Concatenating the datasets together
frames = [don_25, don_250, don_500, don_1000]
don_concat = pd.concat(frames, keys=['25', '250', '500', '1000'])
# resetting the index so the concat keys become the level_0 column
don_concat = don_concat.reset_index()
# Pivoting by the donation amounts
don_concat = don_concat.pivot_table('contb_amt',index='level_0',columns = 'lives',aggfunc='sum')
don_concat.head()
lives | in state | out of state |
---|---|---|
level_0 | ||
1000 | 138000.0 | 242000.0 |
25 | 1275.0 | 300.0 |
250 | 22000.0 | 5500.0 |
500 | 51500.0 | 18500.0 |
new_index= ['25', '250', '500', '1000']
don_concat = don_concat.reindex(new_index)
don_concat.head()
lives | in state | out of state |
---|---|---|
level_0 | ||
25 | 1275.0 | 300.0 |
250 | 22000.0 | 5500.0 |
500 | 51500.0 | 18500.0 |
1000 | 138000.0 | 242000.0 |
don_concat[['in state','out of state']].plot(kind='bar',figsize=(12,4))
plt.xlabel('Amount')
locs, labels = plt.xticks()
plt.setp(labels, rotation=360)
plt.title('In State vs. Out of State')