This is a project for analyzing the gender of conference speakers over time. We implemented multiple spiders with the Scrapy package to scrape speaker names from conference websites, and used a combination of the SexMachine package and Genderize.io to infer gender.
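The text does not say how the two gender predictors were combined. One plausible arrangement, shown here only as a sketch, is to ask a primary predictor first and fall back to a second one when the answer is 'andy'. The function `infer_gender` and the two toy predictors are hypothetical stand-ins, not the SexMachine or Genderize.io APIs.

```python
def infer_gender(name, primary, fallback):
    """Return the primary label unless it is 'andy', then ask the fallback.

    `primary` and `fallback` are any callables mapping a first name to a label.
    """
    label = primary(name)
    if label != 'andy':
        return label
    return fallback(name)


# Hypothetical stand-ins for the two real predictors (illustration only).
def toy_primary(name):
    return {'Alice': 'female', 'Bob': 'male'}.get(name, 'andy')


def toy_fallback(name):
    return {'Sasha': 'female'}.get(name, 'andy')


names = ['Alice', 'Bob', 'Sasha', 'Xq']
labels = {name: infer_gender(name, toy_primary, toy_fallback) for name in names}
```

A name that neither predictor recognises stays 'andy', which matches how unidentified names are treated later in the analysis.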
Let's first load the data to see what information we have collected so far:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_palette("deep", desat=.6)
sns.set_context(rc={"figure.figsize": (11, 6)})
import matplotlib as mpl
mpl.rcParams['font.sans-serif'].insert(0, 'Arial')
mpl.rcParams['font.sans-serif'].insert(0, 'Liberation Sans')
mpl.rcParams['font.family'] = 'sans-serif'
df = pd.read_csv('./ndata.csv')
Let's see how many conferences we scraped.
df['conference'].unique().size
133
This means we have scraped 133 unique conferences. Now let's see how many unique (conference, year) pairs we have:
conf_year = df.groupby(['conference', 'year'])
len(conf_year)
262
So we have 262 (conference, year) pairs, which is on average about 2 years of data per conference (262 / 133 ≈ 1.97).
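The same average can be computed from the data instead of by hand. The sketch below uses a small made-up frame (`toy` is an assumption, not the scraped data) and shows that the mean number of distinct years per conference equals the number of pairs divided by the number of conferences.

```python
import pandas as pd

# Toy data: conference A has 2 years, B has 1, C has 2.
toy = pd.DataFrame({
    'conference': ['A', 'A', 'A', 'B', 'C', 'C'],
    'year':       [2012, 2012, 2013, 2013, 2011, 2012],
})

n_conferences = toy['conference'].nunique()
n_pairs = len(toy.drop_duplicates(['conference', 'year']))

# Distinct years per conference, then the average over conferences.
years_per_conf = toy.groupby('conference')['year'].nunique()
avg_years = years_per_conf.mean()
```

On the real `df`, `years_per_conf` would also show which conferences have long histories and which appear only once.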
Now let's see what data we have on the speakers' gender. There are 563 names marked as 'andy' (androgynous), which means that their gender could not be identified. We are going to ignore those names, on the assumption that the gender distribution among unidentified names is the same as among identified ones. In other words, we assume there is no reason for the gender prediction to work better or worse for names of a particular gender.
There are also names marked as mostly_male and mostly_female. We could merge them into the male and female classes, but for now we are going to ignore those as well.
df['gender'].describe()
count     12234
unique        5
top        male
freq       9854
Name: gender, dtype: object
df['gender'].value_counts()
male             9854
female           1094
mostly_male       577
andy              563
mostly_female     146
dtype: int64
# take only male and female, ignore the rest
df = df[(df.gender == 'male') | (df.gender == 'female')]
df['gender'].value_counts()
male      9854
female    1094
dtype: int64
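For comparison, the alternative mentioned earlier (merging mostly_male and mostly_female into the two main classes) could look roughly like the sketch below. It runs on a small made-up Series, not on the scraped data. With the counts shown above, merging would give 9854 + 577 = 10431 male and 1094 + 146 = 1240 female names.

```python
import pandas as pd

# Made-up labels covering all five classes.
genders = pd.Series(['male', 'female', 'mostly_male', 'andy', 'mostly_female', 'male'])

# Fold the "mostly" classes into their main class, then drop what is left ('andy').
merged = genders.replace({'mostly_male': 'male', 'mostly_female': 'female'})
kept = merged[merged.isin(['male', 'female'])]
counts = kept.value_counts()
```

Using `isin` also gives a slightly shorter way to write the male/female filter than chaining two comparisons with `|`.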
Now that our data is prepared, let's plot it to see what is going on.
gender_plot = df['gender'].value_counts().plot(kind='bar', title='Overall gender frequency in the collected data')
plt.savefig('gender_plot.png', bbox_inches='tight')
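The bar chart shows absolute counts. As a quick check of the overall proportion, the counts printed above can be turned into shares. The Series below is typed in from that output and is only a sketch.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
counts = pd.Series({'male': 9854, 'female': 1094})

# Share of each class among the names that were kept.
share = counts / counts.sum()
```

With these numbers, female speakers make up 1094 / 10948 of the kept names, which is just under 10%.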
Using pandas' crosstab function, we can calculate a frequency table of genders per year and then plot it.
per_year = pd.crosstab(df['year'], df['gender'])
per_year
year | female | male |
---|---|---|
2001 | 10 | 236 |
2002 | 11 | 156 |
2003 | 20 | 142 |
2004 | 11 | 184 |
2005 | 17 | 205 |
2006 | 8 | 201 |
2007 | 25 | 324 |
2008 | 30 | 444 |
2009 | 38 | 538 |
2010 | 64 | 777 |
2011 | 95 | 1154 |
2012 | 244 | 2518 |
2013 | 371 | 2001 |
2014 | 150 | 974 |
14 rows × 2 columns
per_year_plot = per_year.plot(kind='bar', stacked=True, title="Gender per year")
plt.savefig(r'gender_per_year.png', bbox_inches='tight')
However, this does not show proportions. For that, we can normalize the data per year to get the percentages of male and female speakers:
per_year_perc = per_year.div(per_year.sum(axis = 1), axis = 0)
per_year_plot = per_year_perc.plot(kind='bar', stacked=True, title="Gender frequency per year")
plt.savefig(r'gender_freq_per_year.png', bbox_inches='tight')
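Dividing each row by its row sum is what makes every year add up to 1. In recent pandas versions, `pd.crosstab` can do the same in one step with `normalize='index'`; older versions may not have this argument. The sketch below compares the two on a made-up frame.

```python
import pandas as pd

# Toy data: 2012 has 3 male and 1 female speaker, 2013 has 1 of each.
toy = pd.DataFrame({
    'year':   [2012, 2012, 2012, 2012, 2013, 2013],
    'gender': ['male', 'male', 'male', 'female', 'male', 'female'],
})

counts = pd.crosstab(toy['year'], toy['gender'])

# Manual row normalization, as in the cell above.
manual = counts.div(counts.sum(axis=1), axis=0)

# Built-in row normalization.
built_in = pd.crosstab(toy['year'], toy['gender'], normalize='index')
```

Both tables hold the same row-wise shares, so either can be passed to `.plot(kind='bar', stacked=True)`.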
As we can see, the percentage of women speakers stays below 20% in every year (it peaks at about 16% in 2013).
It would be interesting to see whether there is any conference where the percentage of women speakers is 20% or above.
conf_year = df.groupby(['conference', 'year'])
conf_year['gender'].value_counts(normalize = True).to_dict()
{('Acts as Conference', 2009L, 'male'): 1.0, ('Adhearsion Conf', 2011L, 'male'): 1.0, ('Adhearsion Conf', 2012L, 'male'): 1.0, ('Agile Roots', 2009L, 'female'): 0.15789473684210525, ('Agile Roots', 2009L, 'male'): 0.84210526315789469, ('Agile Roots', 2010L, 'female'): 0.21212121212121213, ('Agile Roots', 2010L, 'male'): 0.78787878787878785, ('Airbnb Tech Talks', 2012L, 'female'): 0.125, ('Airbnb Tech Talks', 2012L, 'male'): 0.875, ('Aloha Ruby Conf', 2012L, 'female'): 0.045454545454545456, ('Aloha Ruby Conf', 2012L, 'male'): 0.95454545454545459, ('AltDevConf', 2012L, 'female'): 0.16666666666666666, ('AltDevConf', 2012L, 'male'): 0.83333333333333337, ('Ancient City Ruby', 2013L, 'female'): 0.10000000000000001, ('Ancient City Ruby', 2013L, 'male'): 0.90000000000000002, ('ArrrrCamp', 2010L, 'female'): 0.14285714285714285, ('ArrrrCamp', 2010L, 'male'): 0.8571428571428571, ('ArrrrCamp', 2011L, 'female'): 0.15384615384615385, ('ArrrrCamp', 2011L, 'male'): 0.84615384615384615, ('ArrrrCamp', 2012L, 'female'): 0.045454545454545456, ('ArrrrCamp', 2012L, 'male'): 0.95454545454545459, ('ArrrrCamp', 2013L, 'female'): 0.052631578947368418, ('ArrrrCamp', 2013L, 'male'): 0.94736842105263153, ('Barcelona Ruby Conference', 2012L, 'female'): 0.0625, ('Barcelona Ruby Conference', 2012L, 'male'): 0.9375, ('Big Ruby', 2013L, 'female'): 0.071428571428571425, ('Big Ruby', 2013L, 'male'): 0.9285714285714286, ('Big Ruby', 2014L, 'female'): 0.26666666666666666, ('Big Ruby', 2014L, 'male'): 0.73333333333333328, ('Burlington Ruby', 2013L, 'female'): 0.16666666666666666, ('Burlington Ruby', 2013L, 'male'): 0.83333333333333337, ('Canadian University Software Engineering Conference', 2013L, 'female'): 0.23529411764705882, ('Canadian University Software Engineering Conference', 2013L, 'male'): 0.76470588235294112, ('Cascadia Ruby', 2011L, 'male'): 1.0, ('Cascadia Ruby', 2012L, 'female'): 0.35294117647058826, ('Cascadia Ruby', 2012L, 'male'): 0.6470588235294118, ('Cascadia Ruby', 2013L, 'female'): 
0.2857142857142857, ('Cascadia Ruby', 2013L, 'male'): 0.7142857142857143, ('Chef Conf', 2012L, 'female'): 0.030303030303030304, ('Chef Conf', 2012L, 'male'): 0.96969696969696972, ('ChiPy', 2010L, 'male'): 1.0, ('ChiPy', 2012L, 'male'): 1.0, ('ChiPy', 2013L, 'female'): 0.11764705882352941, ('ChiPy', 2013L, 'male'): 0.88235294117647056, ('ChiPy', 2014L, 'male'): 1.0, ('Chicago Djangonauts', 2011L, 'male'): 1.0, ('Chicago Djangonauts', 2012L, 'male'): 1.0, ('Chicago Djangonauts', 2013L, 'male'): 1.0, ('Chicago Erlang User Group', 2012L, 'male'): 1.0, ('Chicago Erlang User Group', 2013L, 'male'): 1.0, ('Chicago Freelancers', 2013L, 'male'): 1.0, ('Chicago Web Conf', 2012L, 'female'): 0.25, ('Chicago Web Conf', 2012L, 'male'): 0.75, ('ChicagoLUG', 2013L, 'male'): 1.0, ('Continuum', 2013L, 'female'): 0.088235294117647065, ('Continuum', 2013L, 'male'): 0.91176470588235292, ('Copenhagen JS', 2012L, 'male'): 1.0, ('DevCon5', 2012L, 'male'): 1.0, ('DevconTLV Feb', 2013L, 'male'): 1.0, ('DevconTLV January', 2014L, 'male'): 1.0, ('DevconTLV June', 2013L, 'male'): 1.0, ('DevconTLV October', 2013L, 'male'): 1.0, ('DjangoCon', 2009L, 'male'): 1.0, ('DjangoCon', 2010L, 'male'): 1.0, ('DjangoCon', 2011L, 'male'): 1.0, ('DjangoCon', 2012L, 'male'): 1.0, ('DjangoCon AU', 2013L, 'male'): 1.0, ('Ember Conf', 2014L, 'female'): 0.20000000000000001, ('Ember Conf', 2014L, 'male'): 0.80000000000000004, ('Emerging Languages Camp', 2010L, 'female'): 0.16666666666666666, ('Emerging Languages Camp', 2010L, 'male'): 0.83333333333333337, ('Enthought', 2012L, 'female'): 0.025974025974025976, ('Enthought', 2012L, 'male'): 0.97402597402597402, ('Eric', 2013L, 'female'): 0.48148148148148145, ('Eric', 2013L, 'male'): 0.51851851851851849, ('Erlang DC', 2013L, 'female'): 0.076923076923076927, ('Erlang DC', 2013L, 'male'): 0.92307692307692313, ('Erlang Factory', 2013L, 'female'): 0.090909090909090912, ('Erlang Factory', 2013L, 'male'): 0.90909090909090906, ('Erlang Factory Lite Krakow, Poland', 2012L, 
'male'): 1.0, ('Erlang Factory Lite Moscow, Russia', 2012L, 'male'): 1.0, ('Erlang Factory SF Bay Area', 2012L, 'female'): 0.018518518518518517, ('Erlang Factory SF Bay Area', 2012L, 'male'): 0.98148148148148151, ('Euro Clojure', 2012L, 'male'): 1.0, ('EuroPython', 2006L, 'male'): 1.0, ('EuroPython', 2008L, 'male'): 1.0, ('EuroPython', 2009L, 'male'): 1.0, ('EuroPython', 2010L, 'female'): 0.11764705882352941, ('EuroPython', 2010L, 'male'): 0.88235294117647056, ('EuroPython', 2011L, 'female'): 0.095652173913043481, ('EuroPython', 2011L, 'male'): 0.90434782608695652, ('EuroPython', 2012L, 'female'): 0.078260869565217397, ('EuroPython', 2012L, 'male'): 0.92173913043478262, ('EuroPython', 2013L, 'female'): 0.076190476190476197, ('EuroPython', 2013L, 'male'): 0.92380952380952386, ('EuroSciPy', 2008L, 'male'): 1.0, ('EuroSciPy', 2009L, 'female'): 0.066666666666666666, ('EuroSciPy', 2009L, 'male'): 0.93333333333333335, ('EuroSciPy', 2010L, 'female'): 0.05128205128205128, ('EuroSciPy', 2010L, 'male'): 0.94871794871794868, ('EuroSciPy', 2011L, 'female'): 0.042553191489361701, ('EuroSciPy', 2011L, 'male'): 0.95744680851063835, ('EuroSciPy', 2012L, 'female'): 0.035714285714285712, ('EuroSciPy', 2012L, 'male'): 0.9642857142857143, ('Farmhouse Conf', 2011L, 'female'): 0.54545454545454541, ('Farmhouse Conf', 2011L, 'male'): 0.45454545454545453, ('Flourish', 2012L, 'female'): 0.18181818181818182, ('Flourish', 2012L, 'male'): 0.81818181818181823, ('Fosdem', 2012L, 'female'): 0.072289156626506021, ('Fosdem', 2012L, 'male'): 0.92771084337349397, ('Fosdem', 2014L, 'female'): 0.055555555555555552, ('Fosdem', 2014L, 'male'): 0.94444444444444442, ('FreeGeek Chicago', 2012L, 'female'): 0.38461538461538464, ('FreeGeek Chicago', 2012L, 'male'): 0.61538461538461542, ('FreeGeek Chicago', 2013L, 'female'): 0.33333333333333331, ('FreeGeek Chicago', 2013L, 'male'): 0.66666666666666663, ('Fronteers', 2010L, 'female'): 0.17647058823529413, ('Fronteers', 2010L, 'male'): 0.82352941176470584, 
('Fronteers', 2011L, 'female'): 0.15384615384615385, ('Fronteers', 2011L, 'male'): 0.84615384615384615, ('Fronteers', 2012L, 'female'): 0.125, ('Fronteers', 2012L, 'male'): 0.875, ('Fronteers', 2013L, 'female'): 0.17647058823529413, ('Fronteers', 2013L, 'male'): 0.82352941176470584, ('GORUCO', 2008L, 'male'): 1.0, ('GORUCO', 2009L, 'female'): 0.14285714285714285, ('GORUCO', 2009L, 'male'): 0.8571428571428571, ('GORUCO', 2012L, 'female'): 0.076923076923076927, ('GORUCO', 2012L, 'male'): 0.92307692307692313, ('GORUCO', 2013L, 'female'): 0.25, ('GORUCO', 2013L, 'male'): 0.75, ('Garden City Ruby', 2014L, 'female'): 0.21428571428571427, ('Garden City Ruby', 2014L, 'male'): 0.7857142857142857, ('Golden Gate Ruby Conference', 2009L, 'male'): 1.0, ('Golden Gate Ruby Conference', 2010L, 'female'): 0.23076923076923078, ('Golden Gate Ruby Conference', 2010L, 'male'): 0.76923076923076927, ('Golden Gate Ruby Conference', 2011L, 'female'): 0.076923076923076927, ('Golden Gate Ruby Conference', 2011L, 'male'): 0.92307692307692313, ('Golden Gate Ruby Conference', 2012L, 'female'): 0.22222222222222221, ('Golden Gate Ruby Conference', 2012L, 'male'): 0.77777777777777779, ('Golden Gate Ruby Conference', 2013L, 'female'): 0.21052631578947367, ('Golden Gate Ruby Conference', 2013L, 'male'): 0.78947368421052633, ('HTML 5.tx', 2013L, 'female'): 0.16666666666666666, ('HTML 5.tx', 2013L, 'male'): 0.83333333333333337, ('Ictev', 2013L, 'female'): 0.45070422535211269, ('Ictev', 2013L, 'male'): 0.54929577464788737, ('Ignite Buffalo', 2013L, 'female'): 0.20000000000000001, ('Ignite Buffalo', 2013L, 'male'): 0.80000000000000004, ('Ignite RailsConf', 2012L, 'male'): 1.0, ('International Conference on Functional Programming', 2012L, 'male'): 1.0, ('JRuby Conference', 2009L, 'male'): 1.0, ('JSConf', 2011L, 'male'): 1.0, ('JSConf', 2012L, 'female'): 0.20454545454545456, ('JSConf', 2012L, 'male'): 0.79545454545454541, ('JSConf EU', 2013L, 'female'): 0.22448979591836735, ('JSConf EU', 2013L, 'male'): 
0.77551020408163263, ('Jax Conf', 2012L, 'male'): 1.0, ('Jenkins User Conference San Francisco', 2012L, 'male'): 1.0, ('Kiwi PyCon', 2013L, 'male'): 1.0, ('Kk', 2012L, 'female'): 0.14705882352941177, ('Kk', 2012L, 'male'): 0.8529411764705882, ('Kod.io', 2014L, 'female'): 0.23529411764705882, ('Kod.io', 2014L, 'male'): 0.76470588235294112, ('LA Ruby Conference', 2009L, 'male'): 1.0, ('LA Ruby Conference', 2010L, 'female'): 0.20000000000000001, ('LA Ruby Conference', 2010L, 'male'): 0.80000000000000004, ('LA Ruby Conference', 2011L, 'male'): 1.0, ('LA Ruby Conference', 2012L, 'male'): 1.0, ('LA Ruby Conference', 2013L, 'female'): 0.125, ('LA Ruby Conference', 2013L, 'male'): 0.875, ('LA Ruby Conference', 2014L, 'female'): 0.10000000000000001, ('LA Ruby Conference', 2014L, 'male'): 0.90000000000000002, ('LXJS', 2012L, 'female'): 0.034482758620689655, ('LXJS', 2012L, 'male'): 0.96551724137931039, ('La', 2012L, 'female'): 0.19444444444444445, ('La', 2012L, 'male'): 0.80555555555555558, ('Lca', 2013L, 'female'): 0.096385542168674704, ('Lca', 2013L, 'male'): 0.90361445783132532, ('Lone Star Ruby Conference', 2009L, 'female'): 0.11764705882352941, ('Lone Star Ruby Conference', 2009L, 'male'): 0.88235294117647056, ('Lone Star Ruby Conference', 2010L, 'female'): 0.076923076923076927, ('Lone Star Ruby Conference', 2010L, 'male'): 0.92307692307692313, ('Lone Star Ruby Conference', 2011L, 'female'): 0.064516129032258063, ('Lone Star Ruby Conference', 2011L, 'male'): 0.93548387096774188, ('Lone Star Ruby Conference', 2013L, 'female'): 0.15384615384615385, ('Lone Star Ruby Conference', 2013L, 'male'): 0.84615384615384615, ('Madison Ruby', 2011L, 'female'): 0.125, ('Madison Ruby', 2011L, 'male'): 0.875, ('Madison Ruby', 2012L, 'female'): 0.095238095238095233, ('Madison Ruby', 2012L, 'male'): 0.90476190476190477, ('Madison Ruby', 2013L, 'female'): 0.375, ('Madison Ruby', 2013L, 'male'): 0.625, ('MagRails', 2011L, 'female'): 0.125, ('MagRails', 2011L, 'male'): 0.875, ('MongoDB', 
2012L, 'male'): 1.0, ('Mountain rb', 2010L, 'male'): 1.0, ('MountainWest RubyConf', 2007L, 'male'): 1.0, ('MountainWest RubyConf', 2008L, 'female'): 0.076923076923076927, ('MountainWest RubyConf', 2008L, 'male'): 0.92307692307692313, ('MountainWest RubyConf', 2009L, 'female'): 0.076923076923076927, ('MountainWest RubyConf', 2009L, 'male'): 0.92307692307692313, ('MountainWest RubyConf', 2010L, 'female'): 0.10000000000000001, ('MountainWest RubyConf', 2010L, 'male'): 0.90000000000000002, ('MountainWest RubyConf', 2011L, 'male'): 1.0, ('MountainWest RubyConf', 2012L, 'female'): 0.11764705882352941, ('MountainWest RubyConf', 2012L, 'male'): 0.88235294117647056, ('MountainWest RubyConf', 2013L, 'female'): 0.125, ('MountainWest RubyConf', 2013L, 'male'): 0.875, ('Nickel City Ruby Conference', 2013L, 'female'): 0.20000000000000001, ('Nickel City Ruby Conference', 2013L, 'male'): 0.80000000000000004, ('Northeast Scala Symposium', 2012L, 'male'): 1.0, ('OSCON', 2001L, 'female'): 0.04065040650406504, ('OSCON', 2001L, 'male'): 0.95934959349593496, ('OSCON', 2002L, 'female'): 0.065868263473053898, ('OSCON', 2002L, 'male'): 0.93413173652694614, ('OSCON', 2003L, 'female'): 0.12345679012345678, ('OSCON', 2003L, 'male'): 0.87654320987654322, ('OSCON', 2004L, 'female'): 0.056410256410256411, ('OSCON', 2004L, 'male'): 0.94358974358974357, ('OSCON', 2005L, 'female'): 0.076576576576576572, ('OSCON', 2005L, 'male'): 0.92342342342342343, ('OSCON', 2006L, 'female'): 0.039800995024875621, ('OSCON', 2006L, 'male'): 0.96019900497512434, ('OSCON', 2007L, 'female'): 0.088607594936708861, ('OSCON', 2007L, 'male'): 0.91139240506329111, ('OSCON', 2008L, 'female'): 0.086021505376344093, ('OSCON', 2008L, 'male'): 0.91397849462365588, ('OSCON', 2009L, 'female'): 0.092307692307692313, ('OSCON', 2009L, 'male'): 0.90769230769230769, ('OSCON', 2010L, 'female'): 0.072368421052631582, ('OSCON', 2010L, 'male'): 0.92763157894736847, ('OSCON', 2011L, 'female'): 0.089403973509933773, ('OSCON', 2011L, 
'male'): 0.91059602649006621, ('OSCON', 2012L, 'female'): 0.13134328358208955, ('OSCON', 2012L, 'male'): 0.86865671641791042, ('OSCON', 2013L, 'female'): 0.20274914089347079, ('OSCON', 2013L, 'male'): 0.79725085910652926, ('OSCON', 2014L, 'female'): 0.18650793650793651, ('OSCON', 2014L, 'male'): 0.81349206349206349, ('OpenStack On Ales', 2013L, 'male'): 1.0, ('PSF', 2012L, 'female'): 0.084415584415584416, ('PSF', 2012L, 'male'): 0.91558441558441561, ('PSF', 2013L, 'female'): 0.12060301507537688, ('PSF', 2013L, 'male'): 0.87939698492462315, ('Pacific Northwest Scala', 2013L, 'male'): 1.0, ('Pumping Station: One', 2013L, 'male'): 1.0, ('PuppetConf', 2012L, 'female'): 0.063829787234042548, ('PuppetConf', 2012L, 'male'): 0.93617021276595747, ('PyCon AU', 2010L, 'male'): 1.0, ('PyCon AU', 2011L, 'female'): 0.5, ('PyCon AU', 2011L, 'male'): 0.5, ('PyCon AU', 2012L, 'male'): 1.0, ('PyCon AU', 2013L, 'male'): 1.0, ('PyCon Australia', 2013L, 'female'): 0.11904761904761904, ('PyCon Australia', 2013L, 'male'): 0.88095238095238093, ('PyCon CA', 2012L, 'female'): 0.15254237288135594, ('PyCon CA', 2012L, 'male'): 0.84745762711864403, ('PyCon CA', 2013L, 'female'): 0.17777777777777778, ('PyCon CA', 2013L, 'male'): 0.82222222222222219, ('PyCon DE', 2012L, 'female'): 0.018867924528301886, ('PyCon DE', 2012L, 'male'): 0.98113207547169812, ('PyCon DE', 2013L, 'female'): 0.016949152542372881, ('PyCon DE', 2013L, 'male'): 0.98305084745762716, ('PyCon US', 2007L, 'female'): 0.033333333333333333, ('PyCon US', 2007L, 'male'): 0.96666666666666667, ('PyCon US', 2008L, 'female'): 0.035714285714285712, ('PyCon US', 2008L, 'male'): 0.9642857142857143, ('PyCon US', 2009L, 'female'): 0.031746031746031744, ('PyCon US', 2009L, 'male'): 0.96825396825396826, ('PyCon US', 2010L, 'female'): 0.080459770114942528, ('PyCon US', 2010L, 'male'): 0.91954022988505746, ('PyCon US', 2011L, 'female'): 0.01282051282051282, ('PyCon US', 2011L, 'male'): 0.98717948717948723, ('PyCon US', 2012L, 'female'): 
0.057142857142857141, ('PyCon US', 2012L, 'male'): 0.94285714285714284, ('PyCon US', 2013L, 'female'): 0.14285714285714285, ('PyCon US', 2013L, 'male'): 0.8571428571428571, ('PyCon US', 2014L, 'female'): 0.26315789473684209, ('PyCon US', 2014L, 'male'): 0.73684210526315785, ('PyGotham', 2011L, 'female'): 0.5, ('PyGotham', 2011L, 'male'): 0.5, ('PyOhio', 2010L, 'male'): 1.0, ('PyOhio', 2011L, 'male'): 1.0, ('PyOhio', 2012L, 'female'): 0.038461538461538464, ('PyOhio', 2012L, 'male'): 0.96153846153846156, ('PyOhio', 2013L, 'female'): 0.096774193548387094, ('PyOhio', 2013L, 'male'): 0.90322580645161288, ('PyTennessee', 2014L, 'female'): 0.25925925925925924, ('PyTennessee', 2014L, 'male'): 0.7407407407407407, ('Pygotham', 2012L, 'female'): 0.15384615384615385, ('Pygotham', 2012L, 'male'): 0.84615384615384615, ('Rails Israel', 2012L, 'male'): 1.0, ('Rails Israel', 2013L, 'female'): 0.083333333333333329, ('Rails Israel', 2013L, 'male'): 0.91666666666666663, ('RailsConf', 2012L, 'female'): 0.032258064516129031, ('RailsConf', 2012L, 'male'): 0.967741935483871, ('RailsConf', 2013L, 'female'): 0.13235294117647059, ('RailsConf', 2013L, 'male'): 0.86764705882352944, ('Rest Fest', 2012L, 'female'): 0.043478260869565216, ('Rest Fest', 2012L, 'male'): 0.95652173913043481, ('Rocky Mountain Ruby', 2011L, 'female'): 0.043478260869565216, ('Rocky Mountain Ruby', 2011L, 'male'): 0.95652173913043481, ('Rocky Mountain Ruby', 2012L, 'female'): 0.041666666666666664, ('Rocky Mountain Ruby', 2012L, 'male'): 0.95833333333333337, ('Rocky Mountain Ruby', 2013L, 'female'): 0.086956521739130432, ('Rocky Mountain Ruby', 2013L, 'male'): 0.91304347826086951, ('Ruby Conf Australia', 2013L, 'female'): 0.14285714285714285, ('Ruby Conf Australia', 2013L, 'male'): 0.8571428571428571, ('Ruby Conference', 2007L, 'female'): 0.032258064516129031, ('Ruby Conference', 2007L, 'male'): 0.967741935483871, ('Ruby Conference', 2008L, 'female'): 0.027027027027027029, ('Ruby Conference', 2008L, 'male'): 
0.97297297297297303, ('Ruby Conference', 2009L, 'female'): 0.050000000000000003, ('Ruby Conference', 2009L, 'male'): 0.94999999999999996, ('Ruby Conference', 2010L, 'female'): 0.078125, ('Ruby Conference', 2010L, 'male'): 0.921875, ('Ruby Conference', 2011L, 'female'): 0.050847457627118647, ('Ruby Conference', 2011L, 'male'): 0.94915254237288138, ('Ruby Conference', 2012L, 'female'): 0.11363636363636363, ('Ruby Conference', 2012L, 'male'): 0.88636363636363635, ('Ruby Conference', 2013L, 'female'): 0.14000000000000001, ('Ruby Conference', 2013L, 'male'): 0.85999999999999999, ('Ruby Hoedown', 2007L, 'female'): 0.125, ('Ruby Hoedown', 2007L, 'male'): 0.875, ('Ruby Hoedown', 2008L, 'female'): 0.0625, ('Ruby Hoedown', 2008L, 'male'): 0.9375, ('Ruby Hoedown', 2010L, 'male'): 1.0, ('Ruby Lugdunum (RuLu)', 2012L, 'male'): 1.0, ('Ruby Midwest', 2011L, 'female'): 0.095238095238095233, ('Ruby Midwest', 2011L, 'male'): 0.90476190476190477, ('Ruby Midwest', 2013L, 'female'): 0.1875, ('Ruby Midwest', 2013L, 'male'): 0.8125, ('Ruby Nation', 2012L, 'male'): 1.0, ('Ruby On Ales', 2011L, 'male'): 1.0, ('Ruby On Ales', 2012L, 'female'): 0.16666666666666666, ('Ruby On Ales', 2012L, 'male'): 0.83333333333333337, ('Ruby On Ales', 2013L, 'female'): 0.26666666666666666, ('Ruby On Ales', 2013L, 'male'): 0.73333333333333328, ('RubyConf India', 2012L, 'female'): 0.037037037037037035, ('RubyConf India', 2012L, 'male'): 0.96296296296296291, ('RubyConf India', 2013L, 'male'): 1.0, ('RubyConf Uruguay', 2010L, 'male'): 1.0, ('RubyConf Uruguay', 2013L, 'female'): 0.10526315789473684, ('RubyConf Uruguay', 2013L, 'male'): 0.89473684210526316, ('Ruby|Web Conference', 2010L, 'male'): 1.0, ('SciPy', 2008L, 'male'): 1.0, ('SciPy', 2009L, 'male'): 1.0, ('SciPy', 2010L, 'female'): 0.046153846153846156, ('SciPy', 2010L, 'male'): 0.9538461538461539, ('SciPy', 2011L, 'female'): 0.063492063492063489, ('SciPy', 2011L, 'male'): 0.93650793650793651, ('SciPy', 2012L, 'male'): 1.0, ('SciPy', 2013L, 'female'): 
0.10000000000000001, ('SciPy', 2013L, 'male'): 0.90000000000000002, ('Scotland Ruby', 2011L, 'male'): 1.0, ('Steel City Ruby', 2012L, 'female'): 0.125, ('Steel City Ruby', 2012L, 'male'): 0.875, ('Steel City Ruby', 2013L, 'female'): 0.36363636363636365, ('Steel City Ruby', 2013L, 'male'): 0.63636363636363635, ('Sunny Conf', 2010L, 'male'): 1.0, ('The Next Web', 2012L, 'female'): 0.086956521739130432, ('The Next Web', 2012L, 'male'): 0.91304347826086951, ('Troy', 2013L, 'female'): 0.10526315789473684, ('Troy', 2013L, 'male'): 0.89473684210526316, ('Waza', 2012L, 'male'): 1.0, ('Web Directions Code', 2012L, 'female'): 0.14285714285714285, ('Web Directions Code', 2012L, 'male'): 0.8571428571428571, ('Web Directions South', 2012L, 'male'): 1.0, ('Web Rebels', 2012L, 'male'): 1.0, ('Wicked Good Ruby', 2013L, 'female'): 0.16666666666666666, ('Wicked Good Ruby', 2013L, 'male'): 0.83333333333333337, ('Windy City DB', 2010L, 'male'): 1.0, ('Windy City DB', 2011L, 'female'): 0.25, ('Windy City DB', 2011L, 'male'): 0.75, ('Windy City DB', 2012L, 'male'): 1.0, ('Windy City Go', 2011L, 'female'): 0.16666666666666666, ('Windy City Go', 2011L, 'male'): 0.83333333333333337, ('Windy City Go', 2012L, 'female'): 0.25, ('Windy City Go', 2012L, 'male'): 0.75, ('Windy City Rails', 2009L, 'male'): 1.0, ('Windy City Rails', 2010L, 'male'): 1.0, ('Windy City Rails', 2011L, 'male'): 1.0, ('Windy City Rails', 2012L, 'male'): 1.0, ('X.Org Developer Conference', 2012L, 'male'): 1.0, ('confoo.ca', 2010L, 'female'): 0.043956043956043959, ('confoo.ca', 2010L, 'male'): 0.95604395604395609, ('confoo.ca', 2011L, 'female'): 0.019230769230769232, ('confoo.ca', 2011L, 'male'): 0.98076923076923073, ('confoo.ca', 2012L, 'female'): 0.066666666666666666, ('confoo.ca', 2012L, 'male'): 0.93333333333333335, ('confoo.ca', 2013L, 'female'): 0.13186813186813187, ('confoo.ca', 2013L, 'male'): 0.86813186813186816, ('confoo.ca', 2014L, 'female'): 0.13095238095238096, ('confoo.ca', 2014L, 'male'): 
0.86904761904761907, ('curtin', 2014L, 'female'): 0.5, ('curtin', 2014L, 'male'): 0.5, ('developerweek.com', 2013L, 'female'): 0.12621359223300971, ('developerweek.com', 2013L, 'male'): 0.87378640776699024, ('developerweek.com', 2014L, 'female'): 0.13953488372093023, ('developerweek.com', 2014L, 'male'): 0.86046511627906974, ('djangocon.eu', 2011L, 'female'): 0.068965517241379309, ('djangocon.eu', 2011L, 'male'): 0.93103448275862066, ('djangocon.eu', 2013L, 'female'): 0.034482758620689655, ('djangocon.eu', 2013L, 'male'): 0.96551724137931039, ('djangocon.eu', 2014L, 'female'): 0.10344827586206896, ('djangocon.eu', 2014L, 'male'): 0.89655172413793105, ('eurUko', 2012L, 'female'): 0.071428571428571425, ('eurUko', 2012L, 'male'): 0.9285714285714286, ('jQuery Conference San Francisco', 2012L, 'female'): 0.083333333333333329, ('jQuery Conference San Francisco', 2012L, 'male'): 0.91666666666666663, ('jQuery Conference UK', 2012L, 'male'): 1.0, ('js.chi();', 2012L, 'male'): 1.0, ('meet.js SUMMIT', 2012L, 'female'): 0.0625, ('meet.js SUMMIT', 2012L, 'male'): 0.9375, ('openstack Summit Portland', 2013L, 'female'): 0.16666666666666666, ('openstack Summit Portland', 2013L, 'male'): 0.83333333333333337, ('openstack summit fall', 2012L, 'female'): 0.094736842105263161, ('openstack summit fall', 2012L, 'male'): 0.90526315789473688, ('railsberry', 2012L, 'female'): 0.10000000000000001, ('railsberry', 2012L, 'male'): 0.90000000000000002, ('rockymtnruby.com', 2010L, 'male'): 1.0, ('rockymtnruby.com', 2011L, 'female'): 0.043478260869565216, ('rockymtnruby.com', 2011L, 'male'): 0.95652173913043481, ('rockymtnruby.com', 2012L, 'female'): 0.17647058823529413, ('rockymtnruby.com', 2012L, 'male'): 0.82352941176470584, ('rockymtnruby.com', 2013L, 'female'): 0.1111111111111111, ('rockymtnruby.com', 2013L, 'male'): 0.88888888888888884, ('strangeloop.com', 2009L, 'male'): 1.0, ('strangeloop.com', 2011L, 'female'): 0.085714285714285715, ('strangeloop.com', 2011L, 'male'): 0.91428571428571426, 
('strangeloop.com', 2012L, 'female'): 0.080000000000000002, ('strangeloop.com', 2012L, 'male'): 0.92000000000000004, ('strangeloop.com', 2013L, 'female'): 0.2441860465116279, ('strangeloop.com', 2013L, 'male'): 0.7558139534883721, ('strataconf', 2011L, 'female'): 0.125, ('strataconf', 2011L, 'male'): 0.875, ('strataconf', 2012L, 'female'): 0.16107382550335569, ('strataconf', 2012L, 'male'): 0.83892617449664431, ('strataconf', 2013L, 'female'): 0.17679558011049723, ('strataconf', 2013L, 'male'): 0.82320441988950277}
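The dictionary above is hard to scan by eye. A shorter route to the same answer, sketched on made-up data, is to compute the female share per (conference, year) pair and keep only the pairs at or above 20%. The column name `is_female` and the `toy` frame are assumptions for illustration.

```python
import pandas as pd

# Toy data: A/2012 has 1 woman out of 4 speakers, B/2013 has 1 out of 6.
toy = pd.DataFrame({
    'conference': ['A'] * 4 + ['B'] * 6,
    'year':       [2012] * 4 + [2013] * 6,
    'gender':     ['female', 'male', 'male', 'male',
                   'female', 'male', 'male', 'male', 'male', 'male'],
})

# The mean of a 0/1 indicator is the share of female speakers in each group.
female_share = (
    toy.assign(is_female=(toy['gender'] == 'female').astype(float))
       .groupby(['conference', 'year'])['is_female']
       .mean()
)

# Keep only pairs with at least 20% women, highest share first.
at_least_20 = female_share[female_share >= 0.2].sort_values(ascending=False)
```

Pairs with very few speakers can reach a high share by chance, so it may be worth looking at the group sizes next to the shares.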
Let's plot a few of them:
conf_data = df[(df.conference == 'Golden Gate Ruby Conference') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked = True, title = 'Golden Gate Ruby Conference')
<matplotlib.axes.AxesSubplot at 0xbd4d8ec>
conf_data = df[(df.conference == 'strangeloop.com') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked = True, title = 'strangeloop.com')
<matplotlib.axes.AxesSubplot at 0xc0dbacc>
conf_data = df[(df.conference == 'PyCon US') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked = True, title = 'PyCon US')
<matplotlib.axes.AxesSubplot at 0xbc7396c>
conf_data = df[(df.conference == 'Cascadia Ruby') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked = True, title = 'Cascadia Ruby')
<matplotlib.axes.AxesSubplot at 0xc11928c>
conf_data = df[(df.conference == 'Farmhouse Conf') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked = True, title = 'Farmhouse Conf')
<matplotlib.axes.AxesSubplot at 0xc58396c>
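The five cells above repeat the same three lines with a different conference name. A small helper could remove the repetition. The function names `gender_by_year` and `plot_conferences` are hypothetical, and the sketch is run on made-up data.

```python
import pandas as pd


def gender_by_year(df, conference):
    """Return speaker counts per (year, gender) for one conference."""
    conf_data = df[df['conference'] == conference]
    return pd.crosstab(conf_data['year'], conf_data['gender'])


def plot_conferences(df, conferences):
    """Draw one stacked bar chart per conference name (needs matplotlib)."""
    for name in conferences:
        gender_by_year(df, name).plot(kind='bar', stacked=True, title=name)


# Toy data to show the shape of the table the helper returns.
toy = pd.DataFrame({
    'conference': ['A', 'A', 'A', 'B'],
    'year':       [2012, 2012, 2013, 2013],
    'gender':     ['male', 'female', 'male', 'male'],
})
table = gender_by_year(toy, 'A')
```

In the notebook, the five plots would then come from a single call such as `plot_conferences(df, ['Golden Gate Ruby Conference', 'strangeloop.com', 'PyCon US', 'Cascadia Ruby', 'Farmhouse Conf'])`.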
Of the conferences plotted here, Farmhouse Conf is the only one with about the same number of female and male speakers; a few other (conference, year) pairs in the output above, such as PyGotham 2011, also reach 50%. Farmhouse Conf is not strictly about coding. Besides, the equal balance of female and male speakers was enforced by the rules of the conference.