My friend Andrew Key and I are developing a series of Data Science Challenges to help each other and others become better Data Scientists by working on hands-on projects.
This notebook contains my work for the first challenge, Exploring the Meetup API in my city of choice, beautiful Vancouver, BC.
Use the Meetup API to explore meetups in your city of choice.
Guide Questions:
Below are some guide line questions to get you started:
I will focus on exploring the Tech Meetups in Vancouver. Specifically, I will look at the size and activity of these meetups.
As I explore the data, I will also raise additional questions that could be further explored as extensions to this project.
# import libraries
import datetime
import math
import meetup.api
import os
import pprint
import requests
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from mpl_toolkits.basemap import Basemap
from pandas.plotting import scatter_matrix
%matplotlib inline
Make sure you have set up MEETUP_API_KEY in your environment:
In Ubuntu:
/etc/environment
as super user - e.g. sudo vim /etc/environment
MEETUP_API_KEY="YOUR MEETUP API KEY"
env | grep -i meetup
In MacOS:
~/.bash_profile
export MEETUP_API_KEY="YOUR MEETUP API KEY"
at the end of the fileenv | grep -i meetup
assert 'MEETUP_API_KEY' in os.environ, (
"You need MEETUP_API_KEY in your environment"
)
client = meetup.api.Client()
Let's first create some helper functions.
I got the parse_response
function from my friend Andy. Below are the explanations from his notebook:
This helper function helps us parse out the two most useful different pieces:
def parse_response(response):
'''returns two dataframes, meta and results:
- meta: a vertically aligned dataframe, where each row is an element
of the response.meta dictionary
- results: a horizontally aligned dataframe, where each column is
an element of the response.results dictionary'''
meta = pd.DataFrame.from_dict(response.meta, orient='index')
results = pd.DataFrame.from_dict(response.results)
return (meta, results)
We are interested in exploring Tech Meetup groups based in Vancouver, BC. We will use the GetGroups method to get the information we need.
The method requires the following information:
Category ID
We don't know what Category ID to use. We will have to find this via the GetCategories method.
Location Information
We know the city, province and country that we are interested in. The following parameters are what we need to provide in the GetGroups
method to specify a location:
These 5 parameters limit the returned groups to within a 6th parameter, radius miles of the specified location.
Let's use the city and country parameters, but we are not sure what the API considers a valid country code. We can use the GetCities method to get this information.
Let's use the GetCategories method to discover the different categories and their IDs:
categories_resp = client.GetCategories()
29/30 (10 seconds remaining)
categories_meta_df, categories_df = parse_response(categories_resp)
categories_meta_df
0 | |
---|---|
next | |
method | Categories |
total_count | 33 |
link | https://api.meetup.com/2/categories |
count | 33 |
description | Returns a list of Meetup group categories |
lon | None |
title | Categories |
url | https://api.meetup.com/2/categories?offset=0&f... |
id | |
updated | 1450292956000 |
lat | None |
We observe there are 33 Meetup categories.
Let's use this opportunity to understand the Meetup API by looking at the next
field in the metadata.
Meetup API v2 response payloads are returned in pages, one at a time. Each page by default returns 200 records. If there is more than 1 page of results, the metadata will provide the URI for the next API call in the sequence via the next
field. We can take advantage of this to programmatically get each subsequent page until we have collected all of the records we are looking for.
In the above metadata, since there are only 33 categories (total_count
is 33), they are all returned in the first page. Therefore, the next
field is empty as there is no need for any subsequent API calls to get more records.
categories_df
id | name | shortname | sort_name | |
---|---|---|---|---|
0 | 1 | Arts & Culture | Arts | Arts & Culture |
1 | 18 | Book Clubs | Book Clubs | Book Clubs |
2 | 2 | Career & Business | Business | Career & Business |
3 | 3 | Cars & Motorcycles | Auto | Cars & Motorcycles |
4 | 4 | Community & Environment | Community | Community & Environment |
5 | 5 | Dancing | Dancing | Dancing |
6 | 6 | Education & Learning | Education | Education & Learning |
7 | 8 | Fashion & Beauty | Fashion | Fashion & Beauty |
8 | 9 | Fitness | Fitness | Fitness |
9 | 10 | Food & Drink | Food & Drink | Food & Drink |
10 | 11 | Games | Games | Games |
11 | 13 | Movements & Politics | Movements | Movements & Politics |
12 | 14 | Health & Wellbeing | Well-being | Health & Wellbeing |
13 | 15 | Hobbies & Crafts | Crafts | Hobbies & Crafts |
14 | 16 | Language & Ethnic Identity | Languages | Language & Ethnic Identity |
15 | 12 | LGBT | LGBT | LGBT |
16 | 17 | Lifestyle | Lifestyle | Lifestyle |
17 | 20 | Movies & Film | Films | Movies & Film |
18 | 21 | Music | Music | Music |
19 | 22 | New Age & Spirituality | Spirituality | New Age & Spirituality |
20 | 23 | Outdoors & Adventure | Outdoors | Outdoors & Adventure |
21 | 24 | Paranormal | Paranormal | Paranormal |
22 | 25 | Parents & Family | Moms & Dads | Parents & Family |
23 | 26 | Pets & Animals | Pets | Pets & Animals |
24 | 27 | Photography | Photography | Photography |
25 | 28 | Religion & Beliefs | Beliefs | Religion & Beliefs |
26 | 29 | Sci-Fi & Fantasy | Sci fi | Sci-Fi & Fantasy |
27 | 30 | Singles | Singles | Singles |
28 | 31 | Socializing | Social | Socializing |
29 | 32 | Sports & Recreation | Sports | Sports & Recreation |
30 | 33 | Support | Support | Support |
31 | 34 | Tech | Tech | Tech |
32 | 36 | Writing | Writing | Writing |
We see the 33 category records, where each category record has an ID, name, short name and a sort name.
Tech's category id is 34. We will use this.
cities_resp = client.GetCities(country='Canada', query='Vancouver')
28/30 (10 seconds remaining)
cities_meta_df, cities_df = parse_response(cities_resp)
cities_meta_df
0 | |
---|---|
next | |
method | Cities |
total_count | 2 |
link | https://api.meetup.com/2/cities |
count | 2 |
description | Returns Meetup cities. This method supports se... |
lon | None |
title | Cities |
url | https://api.meetup.com/2/cities?offset=0&query... |
id | |
updated | 1263132772000 |
lat | None |
cities_df
city | country | id | lat | localized_country_name | lon | member_count | name_string | ranking | state | zip | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | Vancouver | ca | 857064 | 49.279999 | Canada | -123.040001 | 39959 | Vancouver, British Columbia, Canada | 0 | BC | V5K 0A1 |
1 | Vancouver | us | 98660 | 45.639999 | USA | -122.680000 | 4699 | Vancouver, Washington, USA | 1 | WA | 98660 |
Observation:
Our query returned 2 Vancouvers - one based in Canada and another in the USA. So the country code used (Canada
) was not valid.
But from our results, we can see the correct country code is ca. Let's confirm this.
cities_resp = client.GetCities(country='ca', query='Vancouver')
cities_meta_df, cities_df = parse_response(cities_resp)
cities_df
27/30 (10 seconds remaining)
city | country | id | lat | localized_country_name | lon | member_count | name_string | ranking | state | zip | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | Vancouver | ca | 857064 | 49.279999 | Canada | -123.040001 | 39959 | Vancouver, British Columbia, Canada | 0 | BC | V5K 0A1 |
Now we have found the correct country code (ca
) and confirmed we are getting the one city that interests us.
We now have the necessary information to use the GetGroups
method to get Tech Meetup groups based in Vancouver, BC:
# searching for Vancouver, Canada tech groups
groups_resp = client.GetGroups(category_id=34, city='Vancouver', country='ca')
groups_meta_df, groups_df = parse_response(groups_resp)
groups_meta_df
26/30 (10 seconds remaining)
0 | |
---|---|
next | https://api.meetup.com/2/groups?country=ca&off... |
method | Groups |
total_count | 434 |
link | https://api.meetup.com/2/groups |
count | 200 |
description | None |
lon | -123.04 |
title | Meetup Groups v2 |
url | https://api.meetup.com/2/groups?country=ca&off... |
id | |
updated | 1551248722000 |
lat | 49.28 |
Observations
From the response's metadata, we observe that there are 433 Tech Meetup groups in Vancouver, BC. Per the earlier explanation about how the Meetup API sends back pages of results, we only received a single page of 200 group records.
Let's build a function to collect all of the group records spanning multiple pages and return them as a single Dataframe. This function was adapted from my friend Andy's notebook:
def get_all_groups_as_df(category_id=34, city='Vancouver', country='ca', radius=25):
'''returns all groups as a dataframe'''
groups_df_list = []
page_number = 0
next_page = None
while next_page != '':
groups_resp = client.GetGroups(
category_id=category_id,
city=city,
country=country,
offset=page_number,
radius=radius
)
next_page = groups_resp.meta['next']
_, group_df = parse_response(groups_resp)
groups_df_list.append(group_df)
page_number += 1
return pd.concat(groups_df_list, ignore_index=True)
groups_df = get_all_groups_as_df()
groups_df.shape
25/30 (8 seconds remaining) 24/30 (7 seconds remaining) 23/30 (6 seconds remaining)
(434, 22)
We successfully got all 433 Tech Meetup groups by checking the size of our dataframe against the total_count
from the response's metadata.
Time to explore :)
As a side bar, let's look at an example of a group in its raw JSON format. The pprint.pprint()
function is a great way of showing JSON data / Python Dictionaries:
pprint.pprint(groups_resp.results[0])
{'category': {'id': 34, 'name': 'tech', 'shortname': 'tech'}, 'city': 'Vancouver', 'country': 'CA', 'created': 1034097731000, 'description': '<p><span>The Vancouver Blogger Meetup Group is a platform ' 'where blogger and social media professionals and enthusiasts ' 'interact and network. This platform aims to be a reference ' 'and provide valuable information to graduates, professionals, ' 'small and medium sized businesses. <br><br>We are ' 'currently working on revitalizing this MeetUp Group and will ' 'soon bring up some topics for Meetups. Please, join our group ' 'and feel free to contact us at any time. Stay ' 'tuned! <br><br>Thank you and enjoy!</span></p>', 'group_photo': {'base_url': 'https://secure.meetupstatic.com', 'highres_link': 'https://secure.meetupstatic.com/photos/event/2/4/f/f/highres_438549471.jpeg', 'photo_id': 438549471, 'photo_link': 'https://secure.meetupstatic.com/photos/event/2/4/f/f/600_438549471.jpeg', 'thumb_link': 'https://secure.meetupstatic.com/photos/event/2/4/f/f/thumb_438549471.jpeg', 'type': 'event'}, 'id': 21636, 'join_mode': 'approval', 'lat': 49.279998779296875, 'link': 'https://www.meetup.com/The-Vancouver-Blogger-Meetup-Group/', 'lon': -123.12999725341797, 'members': 652, 'name': 'The Vancouver Blogger Meetup Group', 'organizer': {'member_id': 66805072, 'name': 'Fabiano Maciel', 'photo': {'base_url': 'https://secure.meetupstatic.com', 'highres_link': 'https://secure.meetupstatic.com/photos/member/c/b/3/3/highres_242992019.jpeg', 'photo_id': 242992019, 'photo_link': 'https://secure.meetupstatic.com/photos/member/c/b/3/3/member_242992019.jpeg', 'thumb_link': 'https://secure.meetupstatic.com/photos/member/c/b/3/3/thumb_242992019.jpeg', 'type': 'member'}}, 'rating': 4.32, 'state': 'BC', 'timezone': 'Canada/Pacific', 'topics': [{'id': 389, 'name': 'Small Business', 'urlkey': 'smallbiz'}, {'id': 1238, 'name': 'Marketing', 'urlkey': 'marketing'}, {'id': 4422, 'name': 'Social Networking', 'urlkey': 'socialnetwork'}, {'id': 8304, 'name': 'WordPress', 'urlkey': 'wordpress'}, {'id': 15322, 'name': 'Creative Writing', 'urlkey': 'creative-writing'}, {'id': 15401, 'name': 'Social Media', 'urlkey': 'social-media'}, {'id': 15585, 'name': 'Online Marketing', 'urlkey': 'online-marketing'}, {'id': 15720, 'name': 'Professional Networking', 'urlkey': 'professional-networking'}, {'id': 16728, 'name': 'Writing', 'urlkey': 'writing'}, {'id': 17325, 'name': 'Small Business Marketing Strategy', 'urlkey': 'small-business-marketing-strategy'}, {'id': 17742, 'name': 'Social Media Marketing', 'urlkey': 'social-media-marketing'}, {'id': 18167, 'name': 'Writing Workshops', 'urlkey': 'writing-workshops'}, {'id': 19882, 'name': 'Entrepreneurship', 'urlkey': 'entrepreneurship'}, {'id': 23199, 'name': 'SEO (Search Engine Optimization)', 'urlkey': 'seo-search-engine-optimization'}, {'id': 44508, 'name': 'Blogging', 'urlkey': 'blogs'}], 'urlname': 'The-Vancouver-Blogger-Meetup-Group', 'utc_offset': -28800000, 'visibility': 'public', 'who': 'Friends'}
Let's explore our dataframe.
groups_df.columns
Index(['category', 'city', 'country', 'created', 'description', 'group_photo', 'id', 'join_mode', 'lat', 'link', 'lon', 'members', 'name', 'organizer', 'rating', 'state', 'timezone', 'topics', 'urlname', 'utc_offset', 'visibility', 'who'], dtype='object')
groups_df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 434 entries, 0 to 433 Data columns (total 22 columns): category 434 non-null object city 434 non-null object country 434 non-null object created 434 non-null int64 description 434 non-null object group_photo 237 non-null object id 434 non-null int64 join_mode 434 non-null object lat 434 non-null float64 link 434 non-null object lon 434 non-null float64 members 434 non-null int64 name 434 non-null object organizer 431 non-null object rating 434 non-null float64 state 434 non-null object timezone 434 non-null object topics 434 non-null object urlname 434 non-null object utc_offset 434 non-null int64 visibility 434 non-null object who 434 non-null object dtypes: float64(3), int64(4), object(15) memory usage: 74.7+ KB
groups_df.head(5)
category | city | country | created | description | group_photo | id | join_mode | lat | link | ... | name | organizer | rating | state | timezone | topics | urlname | utc_offset | visibility | who | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | {'name': 'tech', 'id': 34, 'shortname': 'tech'} | Vancouver | CA | 1034097731000 | <p><span>The Vancouver Blogger Meetup Group is... | {'highres_link': 'https://secure.meetupstatic.... | 21636 | approval | 49.279999 | https://www.meetup.com/The-Vancouver-Blogger-M... | ... | The Vancouver Blogger Meetup Group | {'member_id': 66805072, 'name': 'Fabiano Macie... | 4.32 | BC | Canada/Pacific | [{'urlkey': 'smallbiz', 'name': 'Small Busines... | The-Vancouver-Blogger-Meetup-Group | -28800000 | public | Friends |
1 | {'name': 'tech', 'id': 34, 'shortname': 'tech'} | Vancouver | CA | 1036650741000 | <p>Hello Everyone,</p>\n<p>My name is Nathan a... | NaN | 71383 | open | 49.279999 | https://www.meetup.com/My-Internet-Media-Groups/ | ... | My Internet Media Groups | {'member_id': 104036462, 'name': 'Nathan W.', ... | 4.25 | BC | Canada/Pacific | [{'urlkey': 'photoshop', 'name': 'Photoshop', ... | My-Internet-Media-Groups | -28800000 | public | Future Moguls |
2 | {'name': 'tech', 'id': 34, 'shortname': 'tech'} | Vancouver | CA | 1072723179000 | <p>As the largest software development meetup ... | NaN | 144695 | open | 49.279999 | https://www.meetup.com/VanDev/ | ... | VanDev: Vancouver's Software Developers Network | {'member_id': 273611362, 'name': 'Bevan Hunt',... | 4.39 | BC | Canada/Pacific | [{'urlkey': 'softwaredev', 'name': 'Software D... | VanDev | -28800000 | public | Software Developers |
3 | {'name': 'tech', 'id': 34, 'shortname': 'tech'} | Vancouver | CA | 1142734447000 | <p>Vancouver Web Design is British Columbia's ... | {'highres_link': 'https://secure.meetupstatic.... | 229052 | open | 49.290001 | https://www.meetup.com/webdesign-395/ | ... | The Vancouver Web Design Meetup Group | {'member_id': 87486782, 'name': 'Darrell Hadde... | 4.40 | BC | Canada/Pacific | [{'urlkey': 'webstandards', 'name': 'Web Stand... | webdesign-395 | -28800000 | public | Vancouver Web Design Meetup |
4 | {'name': 'tech', 'id': 34, 'shortname': 'tech'} | Vancouver | CA | 1184711936000 | <p>Do you develop with Ruby? Make web applicat... | {'highres_link': 'https://secure.meetupstatic.... | 573925 | open | 49.279999 | https://www.meetup.com/vancouver-ruby/ | ... | Vancouver Ruby Meetup Group | {'member_id': 10374191, 'name': 'Philippe Creu... | 4.59 | BC | Canada/Pacific | [{'urlkey': 'ruby', 'name': 'Ruby', 'id': 1040... | vancouver-ruby | -28800000 | public | Ruby Developers |
5 rows × 22 columns
# explore the uniqueness of each column's values
def display_columns_unique_values(df):
'''display each columns unique values
- if only 1, will display it
- if unable to get unique values, will display data structure type
'''
for column in df.columns:
try:
unique_values = df[column].unique()
num_unique_values = len(unique_values)
result = f"Column {column.upper():>15}: {num_unique_values:}"
if num_unique_values == 1:
result += f"|Unique values: {unique_values}"
print(result)
# some columns contain dict objects which aren't hashable so they are ignored
except TypeError as tp:
print(f"Column {column.upper()}: Type: {type(df[column][0])}")
pass
display_columns_unique_values(groups_df)
Column CATEGORY: Type: <class 'dict'> Column CITY: 11 Column COUNTRY: 1|Unique values: ['CA'] Column CREATED: 434 Column DESCRIPTION: 434 Column GROUP_PHOTO: Type: <class 'dict'> Column ID: 434 Column JOIN_MODE: 2 Column LAT: 22 Column LINK: 434 Column LON: 35 Column MEMBERS: 343 Column NAME: 434 Column ORGANIZER: Type: <class 'dict'> Column RATING: 67 Column STATE: 1|Unique values: ['BC'] Column TIMEZONE: 2 Column TOPICS: Type: <class 'list'> Column URLNAME: 434 Column UTC_OFFSET: 1|Unique values: [-28800000] Column VISIBILITY: 3 Column WHO: 193
groups_df['timezone'].unique()
array(['Canada/Pacific', 'America/Vancouver'], dtype=object)
groups_df['category'][0]
{'name': 'tech', 'id': 34, 'shortname': 'tech'}
Observations:
Our dataframe has 22 columns. We could drop a few columns.
A few columns contain only 1 unique value: country
('CA'), state
('BC') and utc_offset
(representing UTC-8).
An additional column (timezone
) only has 2 values (['Canada/Pacific', 'America/Vancouver']
) which essentially represent the same information.
The category
column which contains dictionaries is also unnecessary as we are only looking at Tech meetups.
Let's go ahead and drop these columns.
columns_to_drop = ['country', 'state', 'utc_offset', 'timezone', 'category']
groups_df = groups_df.drop(columns=columns_to_drop)
groups_df.shape
(434, 17)
groups_df.columns
Index(['city', 'created', 'description', 'group_photo', 'id', 'join_mode', 'lat', 'link', 'lon', 'members', 'name', 'organizer', 'rating', 'topics', 'urlname', 'visibility', 'who'], dtype='object')
groups_df.head(5)
city | created | description | group_photo | id | join_mode | lat | link | lon | members | name | organizer | rating | topics | urlname | visibility | who | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Vancouver | 1034097731000 | <p><span>The Vancouver Blogger Meetup Group is... | {'highres_link': 'https://secure.meetupstatic.... | 21636 | approval | 49.279999 | https://www.meetup.com/The-Vancouver-Blogger-M... | -123.129997 | 652 | The Vancouver Blogger Meetup Group | {'member_id': 66805072, 'name': 'Fabiano Macie... | 4.32 | [{'urlkey': 'smallbiz', 'name': 'Small Busines... | The-Vancouver-Blogger-Meetup-Group | public | Friends |
1 | Vancouver | 1036650741000 | <p>Hello Everyone,</p>\n<p>My name is Nathan a... | NaN | 71383 | open | 49.279999 | https://www.meetup.com/My-Internet-Media-Groups/ | -123.120003 | 1866 | My Internet Media Groups | {'member_id': 104036462, 'name': 'Nathan W.', ... | 4.25 | [{'urlkey': 'photoshop', 'name': 'Photoshop', ... | My-Internet-Media-Groups | public | Future Moguls |
2 | Vancouver | 1072723179000 | <p>As the largest software development meetup ... | NaN | 144695 | open | 49.279999 | https://www.meetup.com/VanDev/ | -123.129997 | 6949 | VanDev: Vancouver's Software Developers Network | {'member_id': 273611362, 'name': 'Bevan Hunt',... | 4.39 | [{'urlkey': 'softwaredev', 'name': 'Software D... | VanDev | public | Software Developers |
3 | Vancouver | 1142734447000 | <p>Vancouver Web Design is British Columbia's ... | {'highres_link': 'https://secure.meetupstatic.... | 229052 | open | 49.290001 | https://www.meetup.com/webdesign-395/ | -123.110001 | 4996 | The Vancouver Web Design Meetup Group | {'member_id': 87486782, 'name': 'Darrell Hadde... | 4.40 | [{'urlkey': 'webstandards', 'name': 'Web Stand... | webdesign-395 | public | Vancouver Web Design Meetup |
4 | Vancouver | 1184711936000 | <p>Do you develop with Ruby? Make web applicat... | {'highres_link': 'https://secure.meetupstatic.... | 573925 | open | 49.279999 | https://www.meetup.com/vancouver-ruby/ | -123.110001 | 2624 | Vancouver Ruby Meetup Group | {'member_id': 10374191, 'name': 'Philippe Creu... | 4.59 | [{'urlkey': 'ruby', 'name': 'Ruby', 'id': 1040... | vancouver-ruby | public | Ruby Developers |
display_columns_unique_values(groups_df)
Column CITY: 11 Column CREATED: 434 Column DESCRIPTION: 434 Column GROUP_PHOTO: Type: <class 'dict'> Column ID: 434 Column JOIN_MODE: 2 Column LAT: 22 Column LINK: 434 Column LON: 35 Column MEMBERS: 343 Column NAME: 434 Column ORGANIZER: Type: <class 'dict'> Column RATING: 67 Column TOPICS: Type: <class 'list'> Column URLNAME: 434 Column VISIBILITY: 3 Column WHO: 193
Observations:
Now that we've done a little bit of clean up, let's look at the 17 remaining columns.
city
: It is interesting that there are 11 unique cities. We should look into this.
Several columns match the size of our data so these columns contain values unique to each group
created
description
id
link
name
urlname
3 columns contain either lists or dictionaries. We'd have to do some more parsing if we are to explore these columns.
group_photo
organier
topics
lat
and lon
: Short names for latitude and longitude. We could plot these on a map to see where groups tend to be based.
join_mode
: We could look into the different join modes and how many groups use one versus the other
members
& ratings
: We could look into the distribution of members and ratings across the different groups
visibility
: We could see how visible groups tend to be
who
: We could see how many Tech Groups have the same or similar target group
Looking at a portion of our dataframe, the uniqueness, data ranges and tendencies of each column, and the raw JSON format of one group, helps us better understand our data and its structure. This leads us to better questions and areas of exploration. These observations and questions lead to additional project extension ideas.
We observed 11 unique values in the city
column. I had expected only 1 city as we had specified 'Vancouver' in our GetGroups
method earlier.
Let's explore these:
groups_df['city'].unique()
array(['Vancouver', 'Langley', 'Surrey', 'North Vancouver', 'Burnaby', 'Port Moody', 'New Westminster', 'Richmond', 'West Vancouver', 'Port Coquitlam', 'Coquitlam'], dtype=object)
So meetups which are not strictly located in Vancouver are also returned. Looking more closely at the city parameter for the GetGroups
method:
Aha! I forgot that the method returns Meetup groups within a radius of 25 miles of the specified location. So the 433 Tech Meetup groups we got are actually based within 25 miles of Vancouver, BC and spread across 11 cities.
Let's explore how many Tech meetups are based in each city:
groups_df.groupby(by='city').count()
created | description | group_photo | id | join_mode | lat | link | lon | members | name | organizer | rating | topics | urlname | visibility | who | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
city | ||||||||||||||||
Burnaby | 7 | 7 | 3 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
Coquitlam | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Langley | 5 | 5 | 1 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
New Westminster | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
North Vancouver | 8 | 8 | 2 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
Port Coquitlam | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Port Moody | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
Richmond | 3 | 3 | 1 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
Surrey | 9 | 9 | 5 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
Vancouver | 394 | 394 | 223 | 394 | 394 | 394 | 394 | 394 | 394 | 394 | 391 | 394 | 394 | 394 | 394 | 394 |
West Vancouver | 3 | 3 | 0 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
Over 90% of Tech Meetups are based in the city of Vancouver. Coming in second, Surrey has 9 Tech Meetups.
Before disregarding the non-Vancouver based Tech Meetups, let's check where the largest Tech Meetups by members are based:
groups_df.sort_values('members', ascending=False).head(20)
city | created | description | group_photo | id | join_mode | lat | link | lon | members | name | organizer | rating | topics | urlname | visibility | who | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
172 | Vancouver | 1448718394000 | <p>TechVancouver brings together Vancouver's t... | {'highres_link': 'https://secure.meetupstatic.... | 19162038 | open | 49.279999 | https://www.meetup.com/TechVancouverOrg/ | -123.040001 | 8797 | TechVancouver | {'member_id': 14219538, 'name': 'Alexander Nor... | 4.78 | [{'urlkey': 'softwaredev', 'name': 'Software D... | TechVancouverOrg | public | Technologists |
2 | Vancouver | 1072723179000 | <p>As the largest software development meetup ... | NaN | 144695 | open | 49.279999 | https://www.meetup.com/VanDev/ | -123.129997 | 6949 | VanDev: Vancouver's Software Developers Network | {'member_id': 273611362, 'name': 'Bevan Hunt',... | 4.39 | [{'urlkey': 'softwaredev', 'name': 'Software D... | VanDev | public | Software Developers |
60 | Vancouver | 1347316761000 | <p><b>WHY ATTEND TECH SOCIALS?</b> Simple. To ... | {'highres_link': 'https://secure.meetupstatic.... | 4869742 | open | 49.279999 | https://www.meetup.com/HackerNestVAN/ | -123.120003 | 6715 | HackerNest Vancouver Tech Socials | {'member_id': 77652922, 'name': 'HackerNest', ... | 4.75 | [{'urlkey': 'softwaredev', 'name': 'Software D... | HackerNestVAN | public | HackerNesters |
98 | Vancouver | 1389135143000 | <p><b>What is Cowork & Coffee?</b></p>\n<p... | {'highres_link': 'https://secure.meetupstatic.... | 11908342 | open | 49.279999 | https://www.meetup.com/codecoffeeyvr/ | -123.139999 | 6299 | Code & Coffee (Vancouver) | {'member_id': 8906905, 'name': 'Jenn Cooper', ... | 4.65 | [{'urlkey': 'opensource', 'name': 'Open Source... | codecoffeeyvr | public | Developers |
137 | Vancouver | 1422291923000 | <p>Startup Founder 101 brings together aspirin... | {'highres_link': 'https://secure.meetupstatic.... | 18381330 | open | 49.279999 | https://www.meetup.com/Vancouver-Startup-Found... | -123.040001 | 5772 | Vancouver Startup Founder 101 | {'member_id': 11314237, 'name': 'Jonathan Gree... | 4.58 | [{'urlkey': 'webdesign', 'name': 'Web Design',... | Vancouver-Startup-Founder-101 | public | Aspiring Founders |
16 | Vancouver | 1263596978000 | <p>Meeting the right people is crucial. We loo... | {'highres_link': 'https://secure.meetupstatic.... | 1583286 | open | 49.279999 | https://www.meetup.com/Vancouver-Tech-Co-Found... | -123.110001 | 5120 | Vancouver Pixel Crafters | {'member_id': 11767095, 'name': 'Hussein Halla... | 4.62 | [{'urlkey': 'softwaredev', 'name': 'Software D... | Vancouver-Tech-Co-Founders | public | Entrepreneurs |
3 | Vancouver | 1142734447000 | <p>Vancouver Web Design is British Columbia's ... | {'highres_link': 'https://secure.meetupstatic.... | 229052 | open | 49.290001 | https://www.meetup.com/webdesign-395/ | -123.110001 | 4996 | The Vancouver Web Design Meetup Group | {'member_id': 87486782, 'name': 'Darrell Hadde... | 4.40 | [{'urlkey': 'webstandards', 'name': 'Web Stand... | webdesign-395 | public | Vancouver Web Design Meetup |
23 | Vancouver | 1273291070000 | <p>Vancouver is home to many game developers a... | {'highres_link': 'https://secure.meetupstatic.... | 1642334 | open | 49.270000 | https://www.meetup.com/Vancouver-Indie-Game-De... | -123.120003 | 4829 | Full Indie | {'member_id': 2874080, 'name': 'Andrew Hamilto... | 4.61 | [{'urlkey': 'gamedev', 'name': 'Game Developme... | Vancouver-Indie-Game-Developers | public | Indies |
192 | Vancouver | 1462151582000 | <p>We invite startup leaders and experts to sh... | {'highres_link': 'https://secure.meetupstatic.... | 19905099 | open | 49.259998 | https://www.meetup.com/YVR-Startups/ | -123.139999 | 4720 | YVR Startups | {'member_id': 41331852, 'name': 'Ed', 'photo':... | 4.67 | [{'urlkey': 'webdesign', 'name': 'Web Design',... | YVR-Startups | public | Entrepreneurs |
36 | Vancouver | 1300139469000 | <p>This group is for anyone interested in uniq... | {'highres_link': 'https://secure.meetupstatic.... | 1798770 | open | 49.279999 | https://www.meetup.com/Vancouver-Data-Visualiz... | -123.110001 | 3892 | Vancouver Data Visualization | {'member_id': 13904544, 'name': 'Wayne Carriga... | 4.52 | [{'urlkey': 'opensource', 'name': 'Open Source... | Vancouver-Data-Visualization | public | Vizers |
52 | Vancouver | 1328480160000 | <p>Vancouver's Bitcoin meetup is a space for p... | {'highres_link': 'https://secure.meetupstatic.... | 3238812 | open | 49.279999 | https://www.meetup.com/bitcoinvan/ | -123.099998 | 3831 | Bitcoin Vancouver | {'member_id': 63655392, 'name': 'freddie', 'ph... | 4.55 | [{'urlkey': 'opensource', 'name': 'Open Source... | bitcoinvan | public | Members |
56 | Vancouver | 1338076931000 | <p><span><span>As professionals and/or hobbyis... | {'highres_link': 'https://secure.meetupstatic.... | 3946322 | open | 49.259998 | https://www.meetup.com/PolyglotVancouver/ | -123.120003 | 3793 | PolyglotVancouver | {'member_id': 10575808, 'name': 'Tavis Rudd', ... | 4.62 | [{'urlkey': 'opensource', 'name': 'Open Source... | PolyglotVancouver | public | Polyglots |
26 | Vancouver | 1292618090000 | <p>Agile software development means that you p... | {'highres_link': 'https://secure.meetupstatic.... | 1751788 | open | 49.279999 | https://www.meetup.com/Agile-Vancouver/ | -123.120003 | 3789 | Agile Vancouver | {'member_id': 6885511, 'name': 'Susan Patch', ... | 4.43 | [{'urlkey': 'softwaredev', 'name': 'Software D... | Agile-Vancouver | public | Agilists |
144 | Vancouver | 1428133116000 | <p>Welcome to the Data Science LEARNING Group!... | {'highres_link': 'https://secure.meetupstatic.... | 18532714 | open | 49.290001 | https://www.meetup.com/LearnDataScience/ | -123.120003 | 3678 | Learn Data Science | {'member_id': 53206922, 'name': 'Charles Iliya... | 4.84 | [{'urlkey': 'ai', 'name': 'Artificial Intellig... | LearnDataScience | public | Data Scientists |
7 | Vancouver | 1203115506000 | <p>NetSquared gathers together nonprofits and ... | {'highres_link': 'https://secure.meetupstatic.... | 1013148 | open | 49.270000 | https://www.meetup.com/net2van/ | -123.099998 | 3409 | NetSquared Vancouver: Tech4Good and Nonprofits | {'member_id': 37748782, 'name': 'NetSquared', ... | 4.38 | [{'urlkey': 'databasepro', 'name': 'Database P... | net2van | public | Members |
14 | Vancouver | 1253472613000 | <p>VanJS is a meetup for web developers downto... | {'highres_link': 'https://secure.meetupstatic.... | 1526926 | open | 49.279999 | https://www.meetup.com/vancouver-javascript-de... | -123.110001 | 3327 | VanJS: Vancouver JavaScript Developers | {'member_id': 9749039, 'name': 'Allen Pike', '... | 4.63 | [{'urlkey': 'webstandards', 'name': 'Web Stand... | vancouver-javascript-developers | public | JavaScript Developers |
142 | Vancouver | 1427043156000 | <p>We host complimentary technology workshops ... | {'highres_link': 'https://secure.meetupstatic.... | 18515170 | open | 49.259998 | https://www.meetup.com/BrainStation-Vancouver/ | -123.099998 | 3291 | BrainStation Vancouver | {'member_id': 185462544, 'name': 'BrainStation... | 4.82 | [{'urlkey': 'css', 'name': 'CSS', 'id': 1973},... | BrainStation-Vancouver | public | Creators |
85 | Vancouver | 1375470268000 | <p>Style & Class is a meetup for those who... | {'highres_link': 'https://secure.meetupstatic.... | 9636222 | open | 49.259998 | https://www.meetup.com/styleandclass/ | -123.139999 | 2895 | Style & Class | {'member_id': 13411316, 'name': 'Steve Fisher'... | 4.74 | [{'urlkey': 'webstandards', 'name': 'Web Stand... | styleandclass | public | Those that create the web |
110 | Vancouver | 1398141265000 | <p>This is an IBM sponsored Big Data meetup gr... | {'highres_link': 'https://secure.meetupstatic.... | 14038802 | open | 49.279999 | https://www.meetup.com/Big-Data-Developers-in-... | -123.120003 | 2872 | Big Data Developers in Vancouver | {'member_id': 93599762, 'name': 'Nancy Berlin'... | 4.64 | [{'urlkey': 'opensource', 'name': 'Open Source... | Big-Data-Developers-in-Vancouver | public | Big Data Enthusiasts |
11 | Vancouver | 1240125181000 | Meet other Vancouver users of WordPress, the W... | {'highres_link': 'https://secure.meetupstatic.... | 1438007 | open | 49.270000 | https://www.meetup.com/Vancouver-WordPress-Mee... | -123.150002 | 2705 | The Vancouver WordPress Meetup Group | {'member_id': 72560962, 'name': 'WordPress', '... | 4.49 | [{'urlkey': 'php', 'name': 'PHP', 'id': 455}, ... | Vancouver-WordPress-Meetup-Group | public | WordPress Enthusiasts |
The top 20 largest Tech Meetups by number of members are all based in Vancouver. So we can drop the non-Vancouver based Tech Meetups without it affecting our original question of exploring group's number of members and activities.
vancouver_tech_groups = groups_df[groups_df['city'] == 'Vancouver']
vancouver_tech_groups.shape
(394, 17)
vancouver_tech_groups.head(5)
city | created | description | group_photo | id | join_mode | lat | link | lon | members | name | organizer | rating | topics | urlname | visibility | who | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Vancouver | 1034097731000 | <p><span>The Vancouver Blogger Meetup Group is... | {'highres_link': 'https://secure.meetupstatic.... | 21636 | approval | 49.279999 | https://www.meetup.com/The-Vancouver-Blogger-M... | -123.129997 | 652 | The Vancouver Blogger Meetup Group | {'member_id': 66805072, 'name': 'Fabiano Macie... | 4.32 | [{'urlkey': 'smallbiz', 'name': 'Small Busines... | The-Vancouver-Blogger-Meetup-Group | public | Friends |
1 | Vancouver | 1036650741000 | <p>Hello Everyone,</p>\n<p>My name is Nathan a... | NaN | 71383 | open | 49.279999 | https://www.meetup.com/My-Internet-Media-Groups/ | -123.120003 | 1866 | My Internet Media Groups | {'member_id': 104036462, 'name': 'Nathan W.', ... | 4.25 | [{'urlkey': 'photoshop', 'name': 'Photoshop', ... | My-Internet-Media-Groups | public | Future Moguls |
2 | Vancouver | 1072723179000 | <p>As the largest software development meetup ... | NaN | 144695 | open | 49.279999 | https://www.meetup.com/VanDev/ | -123.129997 | 6949 | VanDev: Vancouver's Software Developers Network | {'member_id': 273611362, 'name': 'Bevan Hunt',... | 4.39 | [{'urlkey': 'softwaredev', 'name': 'Software D... | VanDev | public | Software Developers |
3 | Vancouver | 1142734447000 | <p>Vancouver Web Design is British Columbia's ... | {'highres_link': 'https://secure.meetupstatic.... | 229052 | open | 49.290001 | https://www.meetup.com/webdesign-395/ | -123.110001 | 4996 | The Vancouver Web Design Meetup Group | {'member_id': 87486782, 'name': 'Darrell Hadde... | 4.40 | [{'urlkey': 'webstandards', 'name': 'Web Stand... | webdesign-395 | public | Vancouver Web Design Meetup |
4 | Vancouver | 1184711936000 | <p>Do you develop with Ruby? Make web applicat... | {'highres_link': 'https://secure.meetupstatic.... | 573925 | open | 49.279999 | https://www.meetup.com/vancouver-ruby/ | -123.110001 | 2624 | Vancouver Ruby Meetup Group | {'member_id': 10374191, 'name': 'Philippe Creu... | 4.59 | [{'urlkey': 'ruby', 'name': 'Ruby', 'id': 1040... | vancouver-ruby | public | Ruby Developers |
display_columns_unique_values(vancouver_tech_groups)
Column CITY: 1|Unique values: ['Vancouver'] Column CREATED: 394 Column DESCRIPTION: 394 Column GROUP_PHOTO: Type: <class 'dict'> Column ID: 394 Column JOIN_MODE: 2 Column LAT: 9 Column LINK: 394 Column LON: 20 Column MEMBERS: 323 Column NAME: 394 Column ORGANIZER: Type: <class 'dict'> Column RATING: 66 Column TOPICS: Type: <class 'list'> Column URLNAME: 394 Column VISIBILITY: 3 Column WHO: 184
# drop the city column
vancouver_tech_groups = vancouver_tech_groups.drop(columns=['city'])
Now that we have limited the groups to the city of Vancouver and further cleaned up our dataframe, let's limit the columns to the ones that interest us:
vancouver_tech_groups.columns
Index(['created', 'description', 'group_photo', 'id', 'join_mode', 'lat', 'link', 'lon', 'members', 'name', 'organizer', 'rating', 'topics', 'urlname', 'visibility', 'who'], dtype='object')
# columns to keep
keep_columns = ['id', 'name', 'members', 'rating', 'join_mode', 'visibility',
'who', 'lat', 'lon']
vancouver_tech_groups = vancouver_tech_groups[keep_columns]
vancouver_tech_groups.head(5)
id | name | members | rating | join_mode | visibility | who | lat | lon | |
---|---|---|---|---|---|---|---|---|---|
0 | 21636 | The Vancouver Blogger Meetup Group | 652 | 4.32 | approval | public | Friends | 49.279999 | -123.129997 |
1 | 71383 | My Internet Media Groups | 1866 | 4.25 | open | public | Future Moguls | 49.279999 | -123.120003 |
2 | 144695 | VanDev: Vancouver's Software Developers Network | 6949 | 4.39 | open | public | Software Developers | 49.279999 | -123.129997 |
3 | 229052 | The Vancouver Web Design Meetup Group | 4996 | 4.40 | open | public | Vancouver Web Design Meetup | 49.290001 | -123.110001 |
4 | 573925 | Vancouver Ruby Meetup Group | 2624 | 4.59 | open | public | Ruby Developers | 49.279999 | -123.110001 |
Let's explore the distribution of members across Tech Meetups in Vancouver.
vancouver_tech_groups['members'].describe()
count 394.000000 mean 680.951777 std 1108.259677 min 1.000000 25% 86.000000 50% 295.000000 75% 699.250000 max 8797.000000 Name: members, dtype: float64
vancouver_tech_groups['members'].plot.hist()
<matplotlib.axes._subplots.AxesSubplot at 0x7f27113cc2b0>
vancouver_tech_groups['members'].plot.box()
<matplotlib.axes._subplots.AxesSubplot at 0x7f2710d6a898>
vancouver_tech_groups['members'].value_counts(bins = 10, normalize=True).sort_index(ascending=False) * 100
(7917.4, 8797.0] 0.253807 (7037.8, 7917.4] 0.000000 (6158.2, 7037.8] 0.761421 (5278.6, 6158.2] 0.253807 (4399.0, 5278.6] 1.015228 (3519.4, 4399.0] 1.269036 (2639.8, 3519.4] 1.522843 (1760.2, 2639.8] 5.329949 (880.6, 1760.2] 11.167513 (-7.797000000000001, 880.6] 78.426396 Name: members, dtype: float64
Observations:
From the descriptive statistics table that the describe()
function provides us, we observe:
Our histogram is a very rightly skewed plot, further demonstrating how most Tech Meetups have members in the 100's and not 1000's.
The box plot shows the large number of outliers. What is interesting is that there are a few clusters of outliers around the 2000 - 4000, another cluster around ~5000 and the rest spread out a bit more until we get to out largest group of 8778 members.
Lastly, the custom frequency table shows us that:
vancouver_tech_groups[vancouver_tech_groups['members'] == 1]
id | name | members | rating | join_mode | visibility | who | lat | lon | |
---|---|---|---|---|---|---|---|---|---|
433 | 31260167 | Vancouver Amazon Seller Meetup AMZ One Step | 1 | 0.0 | open | public | Members | 49.279999 | -123.040001 |
vancouver_tech_groups[vancouver_tech_groups['id'] == 31231458]
id | name | members | rating | join_mode | visibility | who | lat | lon | |
---|---|---|---|---|---|---|---|---|---|
431 | 31231458 | Vancouver Fintech Startups Meetup | 5 | 0.0 | approval | public | Members | 49.279999 | -123.040001 |
There is one Meetup with 1 member is the Vancouver Lucene/Solr Meetup.
Members can only join the meetup by being approved, which may explain the low number of members if no one has been approved.
NOTE: When I first ran this on Sunday, February 24th 2019, there was only one Meetup with 1 member, the Vancouver FinTech Startups Meetup. When I ran this again on Monday, February 25th 2019, a different Meetup group had 1 member: Vancouver Lucene/Solr Meetup. The former Meetup now had 3 members. This may indicate some fluctuation in Meetup Groups as people are joining/leaving groups.
Let's grab the largest and smallest Tech Meetup groups and compare them.
The largest and smallest groups are defined below:
Largest Groups: Groups with members larger than 10x the median number of members.
Smallest Groups: Groups with members less than 1/10th of the median number of members.
Dealing with locations
After some back and forth research done by plotting the largest and smallest Vancouver tech groups, we observed the challenges of dealing with two separate location values (latitude
and longitude
). A new column is created which combines both latitutde
and longitude
columns. This column will allow us to know exactly how many unique locations there really are.
vancouver_tech_groups['location'] = list(
zip(vancouver_tech_groups['lat'], vancouver_tech_groups['lon'])
)
display_columns_unique_values(vancouver_tech_groups)
Column ID: 394 Column NAME: 394 Column MEMBERS: 323 Column RATING: 66 Column JOIN_MODE: 2 Column VISIBILITY: 3 Column WHO: 184 Column LAT: 9 Column LON: 20 Column LOCATION: 45
Now, we can tell that there are 45 unique locations for the 393 Tech Meetups in Vancouver.
Additional Helper Functions
def get_target_group_details(df):
'''provides information on the target group'''
unique_targets = df['who'].unique()
print(f"There are {len(unique_targets)} unique targets: {unique_targets}\n")
for target in unique_targets:
unique_target = df[df['who'] == target]
print(f"There are {unique_target.shape[0]} meetup(s) which target {target}")
median_members = vancouver_tech_groups['members'].median()
median_members
295.0
Largest Tech Groups
Let's grab the largest tech meetups:
largest_tech_groups = vancouver_tech_groups[vancouver_tech_groups['members'] > 10 * median_members]
print(largest_tech_groups.shape)
largest_tech_groups.sort_values('members', ascending=False)
(17, 10)
id | name | members | rating | join_mode | visibility | who | lat | lon | location | |
---|---|---|---|---|---|---|---|---|---|---|
172 | 19162038 | TechVancouver | 8797 | 4.78 | open | public | Technologists | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
2 | 144695 | VanDev: Vancouver's Software Developers Network | 6949 | 4.39 | open | public | Software Developers | 49.279999 | -123.129997 | (49.279998779296875, -123.12999725341797) |
60 | 4869742 | HackerNest Vancouver Tech Socials | 6715 | 4.75 | open | public | HackerNesters | 49.279999 | -123.120003 | (49.279998779296875, -123.12000274658203) |
98 | 11908342 | Code & Coffee (Vancouver) | 6299 | 4.65 | open | public | Developers | 49.279999 | -123.139999 | (49.279998779296875, -123.13999938964844) |
137 | 18381330 | Vancouver Startup Founder 101 | 5772 | 4.58 | open | public | Aspiring Founders | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
16 | 1583286 | Vancouver Pixel Crafters | 5120 | 4.62 | open | public | Entrepreneurs | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
3 | 229052 | The Vancouver Web Design Meetup Group | 4996 | 4.40 | open | public | Vancouver Web Design Meetup | 49.290001 | -123.110001 | (49.290000915527344, -123.11000061035156) |
23 | 1642334 | Full Indie | 4829 | 4.61 | open | public | Indies | 49.270000 | -123.120003 | (49.27000045776367, -123.12000274658203) |
192 | 19905099 | YVR Startups | 4720 | 4.67 | open | public | Entrepreneurs | 49.259998 | -123.139999 | (49.2599983215332, -123.13999938964844) |
36 | 1798770 | Vancouver Data Visualization | 3892 | 4.52 | open | public | Vizers | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
52 | 3238812 | Bitcoin Vancouver | 3831 | 4.55 | open | public | Members | 49.279999 | -123.099998 | (49.279998779296875, -123.0999984741211) |
56 | 3946322 | PolyglotVancouver | 3793 | 4.62 | open | public | Polyglots | 49.259998 | -123.120003 | (49.2599983215332, -123.12000274658203) |
26 | 1751788 | Agile Vancouver | 3789 | 4.43 | open | public | Agilists | 49.279999 | -123.120003 | (49.279998779296875, -123.12000274658203) |
144 | 18532714 | Learn Data Science | 3678 | 4.84 | open | public | Data Scientists | 49.290001 | -123.120003 | (49.290000915527344, -123.12000274658203) |
7 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 3409 | 4.38 | open | public | Members | 49.270000 | -123.099998 | (49.27000045776367, -123.0999984741211) |
14 | 1526926 | VanJS: Vancouver JavaScript Developers | 3327 | 4.63 | open | public | JavaScript Developers | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
142 | 18515170 | BrainStation Vancouver | 3291 | 4.82 | open | public | Creators | 49.259998 | -123.099998 | (49.2599983215332, -123.0999984741211) |
Smallest Tech Groups
Let's grab the smallest tech meetups:
smallest_tech_groups = vancouver_tech_groups[vancouver_tech_groups['members'] < (median_members/10)]
print(smallest_tech_groups.shape)
smallest_tech_groups.sort_values('members')
(30, 10)
id | name | members | rating | join_mode | visibility | who | lat | lon | location | |
---|---|---|---|---|---|---|---|---|---|---|
433 | 31260167 | Vancouver Amazon Seller Meetup AMZ One Step | 1 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
432 | 31252547 | Vancouver Lucene/Solr Meetup | 2 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
431 | 31231458 | Vancouver Fintech Startups Meetup | 5 | 0.0 | approval | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
358 | 28804607 | The Efficient Professional | 7 | 0.0 | approval | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
363 | 28978937 | ManageEngine IT Management Seminar, Vancouver | 9 | 0.0 | open | public_limited | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
349 | 28518984 | Gastown Tech Meetup | 9 | 0.0 | approval | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
423 | 30948318 | Algorand Vancouver: Blockchain Community | 11 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
280 | 26300462 | Vancouver MuleSoft Meetup Group | 12 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
347 | 28458120 | Vancouver MuleSoft User Group | 12 | 0.0 | open | public | Members | 49.270000 | -123.239998 | (49.27000045776367, -123.23999786376953) |
408 | 30478918 | Vancouver Clojure & Lisp | 13 | 0.0 | approval | public | Parens | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
391 | 29919671 | Crypto Vancity | 13 | 0.0 | open | public | Disruptors | 49.290001 | -123.120003 | (49.290000915527344, -123.12000274658203) |
385 | 29818562 | Vancouver Apache Druid Meetup by Imply | 14 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
318 | 27534165 | Enterprise Resource Planning ERP Consultants V... | 15 | 0.0 | open | public | Membres | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
419 | 30791164 | Token Engineering Vancouver | 16 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
362 | 28976518 | Hyperbridge Vancouver | 16 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
370 | 29394462 | Slack Platform Community: Vancouver | 17 | 0.0 | open | public | SPC Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
428 | 31176166 | ExpoPlaza Latina 2019 | 18 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
352 | 28638957 | The Vancouver Supply Chain Meetup | 18 | 0.0 | open | public | Nodes | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
197 | 20129895 | Kong/VANCOUVER | 18 | 0.0 | open | public | Kong Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
128 | 17958442 | Vancouver Google Apps Meetup | 21 | 0.0 | open | public | Googlers | 49.259998 | -123.180000 | (49.2599983215332, -123.18000030517578) |
329 | 27908256 | Queer Women in STEM | 21 | 0.0 | approval | public_limited | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
247 | 23674891 | The Vancouver dRofus User Group | 21 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
399 | 30324509 | Vancouver WSO2 User Group | 24 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
369 | 29371748 | Vancouver VideoCoin Developer Meetup | 24 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
232 | 22396600 | Vancouver Redis Meetup | 25 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
320 | 27735218 | Vancouver Video Technology Meetup | 27 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
231 | 22290839 | Vancouver NetSuite Users Group | 28 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
178 | 19583761 | Bitcoin Hanoi | 28 | 0.0 | open | public | Bitcoiners | 49.230000 | -123.110001 | (49.22999954223633, -123.11000061035156) |
389 | 29891207 | Eyemole Arts and Technology and Friends | 29 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
269 | 25620603 | Quantaloop | 29 | 0.0 | open | public_limited | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
Let's look at where the largest and smallest Tech Meetups are located:
display_columns_unique_values(largest_tech_groups)
Column ID: 17 Column NAME: 17 Column MEMBERS: 17 Column RATING: 16 Column JOIN_MODE: 1|Unique values: ['open'] Column VISIBILITY: 1|Unique values: ['public'] Column WHO: 15 Column LAT: 4 Column LON: 6 Column LOCATION: 13
display_columns_unique_values(smallest_tech_groups)
Column ID: 30 Column NAME: 30 Column MEMBERS: 19 Column RATING: 1|Unique values: [0.] Column JOIN_MODE: 2 Column VISIBILITY: 2 Column WHO: 9 Column LAT: 5 Column LON: 5 Column LOCATION: 5
'''
projection: the map projection.
llcrnrlat: latitude of lower left hand corner of the desired map domain
urcrnrlat: latitude of upper right hand corner of the desired map domain
llcrnrlon: longitude of lower left hand corner of the desired map domain
urcrnrlon: longitude of upper right hand corner of the desired map domain
'''
lower_left_lat = 49.2
lower_left_lon = -123.316
upper_right_lat = 49.35
upper_right_lon = -123
fig = plt.figure(figsize=(40, 40))
groups = [smallest_tech_groups, largest_tech_groups]
colors = ['Blue', 'Green']
titles = ['Smallest Tech Groups', 'Largest Tech Groups']
for sp in range(0, 2):
ax = fig.add_subplot(1, 5, sp+1)
ax.set_title(titles[sp])
m = Basemap(
projection = 'merc',
llcrnrlat = lower_left_lat,
urcrnrlat = upper_right_lat,
llcrnrlon = lower_left_lon,
urcrnrlon = upper_right_lon,
resolution = 'i',
)
m.drawmapboundary(fill_color='#85A6D9')
m.drawcoastlines(color='#6D5F47', linewidth=.4)
m.drawrivers(color='#6D5F47', linewidth=.4)
m.fillcontinents(color='white',lake_color='#85A6D9')
longitudes = groups[sp]['lon'].tolist()
latitudes = groups[sp]['lat'].tolist()
x, y = m(longitudes, latitudes)
m.scatter(x, y, s=50, zorder=2, color=colors[sp], cmap='summer')
plt.show()
Observation:
From our map plots above, we observe that:
Other observations:
Let's look at how the largest and smallest Tech Meetup groups are 'marketed' by taking into consideration who they cater to.
# who do they target
get_target_group_details(largest_tech_groups)
There are 15 unique targets: ['Software Developers' 'Vancouver Web Design Meetup' 'Members' 'JavaScript Developers' 'Entrepreneurs' 'Indies' 'Agilists' 'Vizers' 'Polyglots' 'HackerNesters' 'Developers' 'Aspiring Founders' 'Creators' 'Data Scientists' 'Technologists'] There are 1 meetup(s) which target Software Developers There are 1 meetup(s) which target Vancouver Web Design Meetup There are 2 meetup(s) which target Members There are 1 meetup(s) which target JavaScript Developers There are 2 meetup(s) which target Entrepreneurs There are 1 meetup(s) which target Indies There are 1 meetup(s) which target Agilists There are 1 meetup(s) which target Vizers There are 1 meetup(s) which target Polyglots There are 1 meetup(s) which target HackerNesters There are 1 meetup(s) which target Developers There are 1 meetup(s) which target Aspiring Founders There are 1 meetup(s) which target Creators There are 1 meetup(s) which target Data Scientists There are 1 meetup(s) which target Technologists
# who do they target
get_target_group_details(smallest_tech_groups)
There are 9 unique targets: ['Googlers' 'Bitcoiners' 'Kong Members' 'Members' 'Membres' 'Nodes' 'SPC Members' 'Disruptors' 'Parens'] There are 1 meetup(s) which target Googlers There are 1 meetup(s) which target Bitcoiners There are 1 meetup(s) which target Kong Members There are 22 meetup(s) which target Members There are 1 meetup(s) which target Membres There are 1 meetup(s) which target Nodes There are 1 meetup(s) which target SPC Members There are 1 meetup(s) which target Disruptors There are 1 meetup(s) which target Parens
Observation:
Around 75% (21/29) of the smallest groups are aimed at 'Members'. 1 group is aimed at 'Membres' which I take as a misspelling for 'Members'.
In contrast, less than 12% (2/17) of the largest groups are aimed at 'Members'.
Generally, the largest groups cater to specific audiences compared to the smallest groups.
As we already saw, the smallest groups have ratings of 0. So we aren't able to compare them against the largest group.
A couple of question we could look into is:
groups_with_ratings = vancouver_tech_groups[vancouver_tech_groups['rating'] != 0.0]
print(groups_with_ratings.shape[0])
groups_with_ratings.head(5)
251
id | name | members | rating | join_mode | visibility | who | lat | lon | location | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 21636 | The Vancouver Blogger Meetup Group | 652 | 4.32 | approval | public | Friends | 49.279999 | -123.129997 | (49.279998779296875, -123.12999725341797) |
1 | 71383 | My Internet Media Groups | 1866 | 4.25 | open | public | Future Moguls | 49.279999 | -123.120003 | (49.279998779296875, -123.12000274658203) |
2 | 144695 | VanDev: Vancouver's Software Developers Network | 6949 | 4.39 | open | public | Software Developers | 49.279999 | -123.129997 | (49.279998779296875, -123.12999725341797) |
3 | 229052 | The Vancouver Web Design Meetup Group | 4996 | 4.40 | open | public | Vancouver Web Design Meetup | 49.290001 | -123.110001 | (49.290000915527344, -123.11000061035156) |
4 | 573925 | Vancouver Ruby Meetup Group | 2624 | 4.59 | open | public | Ruby Developers | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
So 251/393 groups have ratings. Let's further separate out the non-largest groups:
non_largest_groups_with_ratings = groups_with_ratings[groups_with_ratings['members'] < 10 * median_members]
print(non_largest_groups_with_ratings.shape[0])
non_largest_groups_with_ratings.sort_values('members', ascending=False).head(5)
234
id | name | members | rating | join_mode | visibility | who | lat | lon | location | |
---|---|---|---|---|---|---|---|---|---|---|
85 | 9636222 | Style & Class | 2895 | 4.74 | open | public | Those that create the web | 49.259998 | -123.139999 | (49.2599983215332, -123.13999938964844) |
110 | 14038802 | Big Data Developers in Vancouver | 2872 | 4.64 | open | public | Big Data Enthusiasts | 49.279999 | -123.120003 | (49.279998779296875, -123.12000274658203) |
11 | 1438007 | The Vancouver WordPress Meetup Group | 2705 | 4.49 | open | public | WordPress Enthusiasts | 49.270000 | -123.150002 | (49.27000045776367, -123.1500015258789) |
4 | 573925 | Vancouver Ruby Meetup Group | 2624 | 4.59 | open | public | Ruby Developers | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
20 | 1618572 | VanUE | 2620 | 4.45 | open | public | Members | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
cb_dark_blue = (0/255, 107/255, 164/255)
cb_orange = (255/255, 128/255, 14/255)
# largest tech groups
plt.scatter(largest_tech_groups['members'],
largest_tech_groups['rating'],
color=cb_dark_blue,
alpha=0.5,
label='largest')
# non-largest tech groups
plt.scatter(non_largest_groups_with_ratings['members'],
non_largest_groups_with_ratings['rating'],
color=cb_orange,
alpha=0.5,
label='non-largest')
plt.title('Members vs Ratings')
plt.xlabel('Members')
plt.ylabel('Ratings')
plt.legend()
plt.show()
Observations:
Groups with no Ratings
Let's look into groups with no ratings:
groups_with_no_ratings = vancouver_tech_groups[vancouver_tech_groups['rating'] == 0.0]
print(groups_with_no_ratings.shape[0])
143
groups_with_no_ratings['members'].describe()
count 143.000000 mean 111.062937 std 140.850296 min 1.000000 25% 35.000000 50% 60.000000 75% 135.500000 max 1084.000000 Name: members, dtype: float64
groups_with_no_ratings.sort_values('members', ascending=False).head(10)
id | name | members | rating | join_mode | visibility | who | lat | lon | location | |
---|---|---|---|---|---|---|---|---|---|---|
357 | 28775204 | Vancouver AI Tech Talk | 1084 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
371 | 29424450 | Devhub Vancouver | 646 | 0.0 | open | public | Members | 49.279999 | -123.120003 | (49.279998779296875, -123.12000274658203) |
131 | 18221599 | Vancouver ACM SIGGRAPH | 571 | 0.0 | approval | public | Computer Graphics Professionals | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
223 | 21464522 | Social After Tech | 559 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
189 | 19878077 | Futures of Learning | 449 | 0.0 | open | public | Learning Practitioners | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
380 | 29739350 | Node&JS Community Hangouts | 392 | 0.0 | open | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
297 | 26682494 | Women in Emerging Tech | 377 | 0.0 | open | public | Emerging Techies | 49.279999 | -123.110001 | (49.279998779296875, -123.11000061035156) |
324 | 27819281 | ProductTank Vancouver | 344 | 0.0 | open | public | Product People | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
356 | 28719901 | Design Systems Vancouver | 332 | 0.0 | approval | public | Members | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
326 | 27851504 | Vancity Vue.js | 326 | 0.0 | open | public | Developers | 49.279999 | -123.040001 | (49.279998779296875, -123.04000091552734) |
Observations:
# first we need the id of the groups
group_ids = largest_tech_groups['id'].tolist()
ids = ','.join(str(x) for x in group_ids)
ids
'144695,229052,1013148,1526926,1583286,1642334,1751788,1798770,3238812,3946322,4869742,11908342,18381330,18515170,18532714,19162038,19905099'
events_resp = client.GetEvents(group_id=ids, status='past')
22/30 (2 seconds remaining)
events_meta, events_df = parse_response(events_resp)
events_meta
0 | |
---|---|
next | https://api.meetup.com/2/events?offset=1&forma... |
method | Events |
total_count | 3283 |
link | https://api.meetup.com/2/events |
count | 200 |
description | Access Meetup events using a group, member, or... |
lon | |
title | Meetup Events v2 |
url | https://api.meetup.com/2/events?offset=0&forma... |
id | |
updated | 1551244894000 |
lat |
Observations
From the response's metadata, we observe that there are 3279 events for our largest tech meetups. As we specified past events, this may have given us all the past events.
Let's limit it to the past 6 months. All time data is represented as epoch milliseconds. So we need to represent the date range between now and 6 months ago in this format.
def to_millis(dt):
val = pd.to_datetime(dt)
print(f"Value:{val}|{val.value}")
return int(pd.to_datetime(dt).value / 1000000)
#right_now = to_millis(datetime.datetime.now())
right_now = 1550990827123 # represents 2019-02-24 06:47:07 AM PT
six_months_ago = int(right_now -183 * 24 * 60 * 60 * 1000)
print(f"Now: {right_now}\t Six months ago: {six_months_ago}")
Now: 1550990827123 Six months ago: 1535179627123
The original above cell was run on Sunday, February 24th 2019 around 06:47:07 AM PT. In order to keep the analysis below valid, the microsecond value that the to_millis()
function returned was hard coded in. Otherwise the date would change each time the above cell is run as you can see from the commented out line. I will consider this in future projects and find a better way of keeping the original proper date format.
events_resp = client.GetEvents(group_id=ids, status='past',
time=f"{six_months_ago},{right_now}")
events_meta, events_df = parse_response(events_resp)
events_meta
29/30 (10 seconds remaining)
0 | |
---|---|
next | https://api.meetup.com/2/events?offset=1&forma... |
method | Events |
total_count | 231 |
link | https://api.meetup.com/2/events |
count | 200 |
description | Access Meetup events using a group, member, or... |
lon | |
title | Meetup Events v2 |
url | https://api.meetup.com/2/events?offset=0&forma... |
id | |
updated | 1550814756000 |
lat |
Now we get a more reasonable amount of events: 231
Let's build a function to get all events that interest us:
def get_all_events_as_df(group_ids, event_status, time_range):
'''returns all events as a dataframe'''
print(f"Getting {event_status} events for groups '{group_ids}' in time range "
f" '{time_range}'...")
events_df_list = []
page_number = 0
next_page = None
while next_page != '':
events_resp = client.GetEvents(
group_id=group_ids,
status=event_status,
time=time_range,
offset=page_number
)
#pprint.pprint(events_resp.meta)
next_page = events_resp.meta['next']
_, event_df = parse_response(events_resp)
events_df_list.append(event_df)
page_number += 1
return pd.concat(events_df_list, ignore_index=True)
%%capture --no-display
event_status = 'past'
time_range = f"{six_months_ago},{right_now}"
events_df = get_all_events_as_df(ids, event_status, time_range)
Now we got all 231 events as originally found. Let's explore our data:
events_df.shape[0]
231
events_df.columns
Index(['created', 'description', 'duration', 'event_url', 'fee', 'group', 'headcount', 'how_to_find_us', 'id', 'maybe_rsvp_count', 'name', 'photo_url', 'rating', 'rsvp_limit', 'status', 'time', 'updated', 'utc_offset', 'venue', 'visibility', 'waitlist_count', 'yes_rsvp_count'], dtype='object')
# transposing our data makes it easier to see values across all the columns
events_df.head(5).transpose()
0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|
created | 1527617768000 | 1533071518000 | 1529329339000 | 1531123424000 | 1534528323000 |
description | <p>Join us for an epic, highly interactive and... | <p>This evening will be about creating geograp... | <p>Looking to tackle big data challenges? Expe... | <p>"Blockchain is the next step in the evoluti... | <p>>>>> HackerNest Tech Socials (<... |
duration | 12600000 | 7200000 | 5400000 | 7200000 | 7200000 |
event_url | https://www.meetup.com/YVR-Startups/events/251... | https://www.meetup.com/Vancouver-Data-Visualiz... | https://www.meetup.com/BrainStation-Vancouver/... | https://www.meetup.com/Vancouver-Tech-Co-Found... | https://www.meetup.com/HackerNestVAN/events/25... |
fee | NaN | NaN | {'amount': 20, 'currency': 'CAD', 'label': 'Pr... | NaN | NaN |
group | {'join_mode': 'open', 'created': 1462151582000... | {'join_mode': 'open', 'created': 1300139469000... | {'join_mode': 'open', 'created': 1427043156000... | {'join_mode': 'open', 'created': 1263596978000... | {'join_mode': 'open', 'created': 1347316761000... |
headcount | 0 | 0 | 0 | 0 | 0 |
how_to_find_us | Private dining room and/or ask for Nathan! | NaN | NaN | NaN | NaN |
id | bqfbtpyxlbkc | 253351123 | 251882094 | gjjzxpyxlbkc | 253878418 |
maybe_rsvp_count | 0 | 0 | 0 | 0 | 0 |
name | YVR Startups' Huge Facilitated Business Networ... | Visualizing large geospatial data - hands-on MapD | Intro to Data Science | Intro to Blockchain | Why it's Important, How ... | HackerNest Vancouver August Tech Social |
photo_url | https://secure.meetupstatic.com/photos/event/1... | https://secure.meetupstatic.com/photos/event/a... | https://secure.meetupstatic.com/photos/event/3... | https://secure.meetupstatic.com/photos/event/3... | https://secure.meetupstatic.com/photos/event/e... |
rating | {'count': 0, 'average': 0} | {'count': 0, 'average': 0} | {'count': 0, 'average': 0} | {'count': 0, 'average': 0} | {'count': 0, 'average': 0} |
rsvp_limit | NaN | 35 | NaN | NaN | NaN |
status | past | past | past | past | past |
time | 1535418000000 | 1535418000000 | 1535418000000 | 1535418000000 | 1535421600000 |
updated | 1535476642000 | 1535476697000 | 1535476644000 | 1535476684000 | 1535493825000 |
utc_offset | -25200000 | -25200000 | -25200000 | -25200000 | -25200000 |
venue | {'country': 'ca', 'localized_country_name': 'C... | {'zip': 'V5T 1G8', 'country': 'ca', 'localized... | {'country': 'ca', 'localized_country_name': 'C... | {'country': 'ca', 'localized_country_name': 'C... | {'zip': 'V6K 3E6', 'country': 'ca', 'localized... |
visibility | public | public | public | public | public |
waitlist_count | 0 | 45 | 0 | 0 | 0 |
yes_rsvp_count | 96 | 35 | 2 | 52 | 65 |
display_columns_unique_values(events_df)
Column CREATED: 167 Column DESCRIPTION: 181 Column DURATION: 13 Column EVENT_URL: 231 Column FEE: Type: <class 'float'> Column GROUP: Type: <class 'dict'> Column HEADCOUNT: 1|Unique values: [0] Column HOW_TO_FIND_US: 33 Column ID: 231 Column MAYBE_RSVP_COUNT: 1|Unique values: [0] Column NAME: 148 Column PHOTO_URL: 126 Column RATING: Type: <class 'dict'> Column RSVP_LIMIT: 20 Column STATUS: 1|Unique values: ['past'] Column TIME: 177 Column UPDATED: 227 Column UTC_OFFSET: 2 Column VENUE: Type: <class 'dict'> Column VISIBILITY: 1|Unique values: ['public'] Column WAITLIST_COUNT: 14 Column YES_RSVP_COUNT: 83
Filter out unneeded columns
columns_to_keep = ['group', 'time', 'duration', 'yes_rsvp_count', 'waitlist_count']
past_events_df = events_df.copy()
past_events_df = past_events_df[columns_to_keep]
past_events_df.head(5)
group | time | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|
0 | {'join_mode': 'open', 'created': 1462151582000... | 1535418000000 | 12600000 | 96 | 0 |
1 | {'join_mode': 'open', 'created': 1300139469000... | 1535418000000 | 7200000 | 35 | 45 |
2 | {'join_mode': 'open', 'created': 1427043156000... | 1535418000000 | 5400000 | 2 | 0 |
3 | {'join_mode': 'open', 'created': 1263596978000... | 1535418000000 | 7200000 | 52 | 0 |
4 | {'join_mode': 'open', 'created': 1347316761000... | 1535421600000 | 7200000 | 65 | 0 |
Clean up Group Column
# Let's look at the group column which consists of JSON data
past_events_df['group'][0]
{'join_mode': 'open', 'created': 1462151582000, 'name': 'YVR Startups', 'group_lon': -123.13999938964844, 'id': 19905099, 'urlname': 'YVR-Startups', 'group_lat': 49.2599983215332, 'who': 'Entrepreneurs'}
We only need the group id, so let's extract it into its own column and filter out this column:
# Let's extract the group id
def extract_group_id(group_json):
'''extracts the group id from group json'''
return group_json['id']
past_events_df['group_id'] = past_events_df['group'].apply(extract_group_id)
columns_to_keep = ['group_id', 'time', 'duration', 'yes_rsvp_count', 'waitlist_count']
past_events_df = past_events_df[columns_to_keep]
past_events_df.head(5)
group_id | time | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|
0 | 19905099 | 1535418000000 | 12600000 | 96 | 0 |
1 | 1798770 | 1535418000000 | 7200000 | 35 | 45 |
2 | 18515170 | 1535418000000 | 5400000 | 2 | 0 |
3 | 1583286 | 1535418000000 | 7200000 | 52 | 0 |
4 | 4869742 | 1535421600000 | 7200000 | 65 | 0 |
Change time format
The time and duration columns are in epoch milliseconds format, which is great for the computer but not so great for people.
Let's add human readable columns for these 2:
date_format = '%m/%d/%y %H:%M'
past_events_df['time_human'] = pd.to_datetime(
past_events_df['time'], unit='ms').dt.strftime(date_format)
past_events_df.head(1)
group_id | time | duration | yes_rsvp_count | waitlist_count | time_human | |
---|---|---|---|---|---|---|
0 | 19905099 | 1535418000000 | 12600000 | 96 | 0 | 08/28/18 01:00 |
def millis_to_hours_and_minutes(ms):
'''converts milliseconds to hours and minutes'''
seconds = ms / 1000
minutes, seconds = divmod(seconds, 60)
hours, minutes = divmod(minutes, 60)
return f"{int(hours)} hours, {int(minutes)} minutes"
past_events_df['duration_human'] = events_df['duration'].apply(millis_to_hours_and_minutes)
past_events_df.head(1)
group_id | time | duration | yes_rsvp_count | waitlist_count | time_human | duration_human | |
---|---|---|---|---|---|---|---|
0 | 19905099 | 1535418000000 | 12600000 | 96 | 0 | 08/28/18 01:00 | 3 hours, 30 minutes |
# rename columns
rename_map = {
'group_id': 'id', #need to change this so we can later merge with the groups df!
'time': 'time_ms',
'time_human': 'time',
'duration': 'duration_ms',
'duration_human': 'duration'
}
past_events_df = past_events_df.rename(rename_map, axis='columns')
past_events_df.head(1)
id | time_ms | duration_ms | yes_rsvp_count | waitlist_count | time | duration | |
---|---|---|---|---|---|---|---|
0 | 19905099 | 1535418000000 | 12600000 | 96 | 0 | 08/28/18 01:00 | 3 hours, 30 minutes |
# re-order columns
new_column_order = ['id', 'time', 'time_ms', 'duration', 'duration_ms',
'yes_rsvp_count', 'waitlist_count']
past_events_df = past_events_df[new_column_order]
past_events_df.head(1)
id | time | time_ms | duration | duration_ms | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
0 | 19905099 | 08/28/18 01:00 | 1535418000000 | 3 hours, 30 minutes | 12600000 | 96 | 0 |
To properly analyze the events, we should match them back up with their respective group.
We will merge the two dataframes (largest_tech_groups
and past_events_df
) in a operation which is similar to a left join in SQL.
Left Join is picked as we want to retain all rows from our past events dataframe, while getting only the rows from our groups dataframe. Although in this case, as the events were derived from the group and assuming each group hosted at least one event in the past 6 months, we should end up retaining all events and all groups data.
We will filter out unnecessary columns and sort the ouput by group name ascending and then time descending.
print(f"Largest Tech Groups Columns:\n{largest_tech_groups.columns}")
print(f"\nPast Events Columns:\n{past_events_df.columns}")
Largest Tech Groups Columns: Index(['id', 'name', 'members', 'rating', 'join_mode', 'visibility', 'who', 'lat', 'lon', 'location'], dtype='object') Past Events Columns: Index(['id', 'time', 'time_ms', 'duration', 'duration_ms', 'yes_rsvp_count', 'waitlist_count'], dtype='object')
merged_df = pd.merge(
past_events_df,
largest_tech_groups[['id', 'name']],
on='id',
how='left'
)
merged_df.head(3)
id | time | time_ms | duration | duration_ms | yes_rsvp_count | waitlist_count | name | |
---|---|---|---|---|---|---|---|---|
0 | 19905099 | 08/28/18 01:00 | 1535418000000 | 3 hours, 30 minutes | 12600000 | 96 | 0 | YVR Startups |
1 | 1798770 | 08/28/18 01:00 | 1535418000000 | 2 hours, 0 minutes | 7200000 | 35 | 45 | Vancouver Data Visualization |
2 | 18515170 | 08/28/18 01:00 | 1535418000000 | 1 hours, 30 minutes | 5400000 | 2 | 0 | BrainStation Vancouver |
columns_to_keep = ['id', 'name', 'time', 'time_ms',
'duration', 'yes_rsvp_count', 'waitlist_count']
final_df = merged_df[columns_to_keep]
# Sort output by name and time
final_df = final_df.sort_values(by=['name', 'time_ms'], ascending=[True, False])
final_df.head(5)
id | name | time | time_ms | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
195 | 1751788 | Agile Vancouver | 01/23/19 02:00 | 1548208800000 | 3 hours, 0 minutes | 100 | 39 |
123 | 1751788 | Agile Vancouver | 11/21/18 02:00 | 1542765600000 | 2 hours, 0 minutes | 120 | 14 |
70 | 1751788 | Agile Vancouver | 10/17/18 01:00 | 1539738000000 | 2 hours, 0 minutes | 75 | 0 |
34 | 1751788 | Agile Vancouver | 09/19/18 01:00 | 1537318800000 | 2 hours, 0 minutes | 66 | 0 |
109 | 3238812 | Bitcoin Vancouver | 11/11/18 01:30 | 1541899800000 | 4 hours, 30 minutes | 14 | 0 |
final_df.shape
(231, 7)
g = sns.lmplot(x="time_ms", y="yes_rsvp_count",
data=final_df, hue="name",
height=6, aspect=2,
order=1, ci=None)
plt.title("Yes RSVP's over Time")
Text(0.5, 1.0, "Yes RSVP's over Time")
Observations:
One group in particular has the largest downward trending line. I think it's the YVR Startups group. Let's confirm:
downward_group = final_df[final_df['name'] == 'YVR Startups']
downward_group
id | name | time | time_ms | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
227 | 19905099 | YVR Startups | 02/21/19 03:00 | 1550718000000 | 2 hours, 0 minutes | 52 | 0 |
217 | 19905099 | YVR Startups | 02/08/19 02:00 | 1549591200000 | 2 hours, 30 minutes | 10 | 0 |
213 | 19905099 | YVR Startups | 02/01/19 01:30 | 1548984600000 | 3 hours, 30 minutes | 30 | 0 |
176 | 19905099 | YVR Startups | 01/08/19 02:00 | 1546912800000 | 3 hours, 0 minutes | 8 | 0 |
173 | 19905099 | YVR Startups | 12/23/18 01:00 | 1545526800000 | 5 hours, 0 minutes | 6 | 0 |
156 | 19905099 | YVR Startups | 12/11/18 01:30 | 1544491800000 | 4 hours, 30 minutes | 107 | 0 |
141 | 19905099 | YVR Startups | 11/29/18 21:00 | 1543525200000 | 29 hours, 30 minutes | 8 | 0 |
132 | 19905099 | YVR Startups | 11/27/18 03:00 | 1543287600000 | 2 hours, 0 minutes | 31 | 0 |
129 | 19905099 | YVR Startups | 11/22/18 02:00 | 1542852000000 | 3 hours, 0 minutes | 106 | 0 |
121 | 19905099 | YVR Startups | 11/20/18 02:00 | 1542679200000 | 3 hours, 0 minutes | 15 | 0 |
86 | 19905099 | YVR Startups | 10/25/18 01:00 | 1540429200000 | 3 hours, 30 minutes | 114 | 0 |
78 | 19905099 | YVR Startups | 10/23/18 02:00 | 1540260000000 | 2 hours, 0 minutes | 48 | 0 |
69 | 19905099 | YVR Startups | 10/16/18 02:00 | 1539655200000 | 2 hours, 30 minutes | 12 | 0 |
47 | 19905099 | YVR Startups | 09/27/18 01:00 | 1538010000000 | 3 hours, 30 minutes | 103 | 0 |
0 | 19905099 | YVR Startups | 08/28/18 01:00 | 1535418000000 | 3 hours, 30 minutes | 96 | 0 |
This group has events with over 100 attendees and others with only a handful but there isn't necessarily a downward spiral. The extreme slumps (less than 10 attendees) may be attributed to holidays - 2 events on December 23rd and January 8th had less than 10 attendees. Another event on November 29th also had only 8 attendees, though not sure what could have caused that.
Let's try removing this group and see the plot again:
without_yvr_startups = final_df[final_df['name'] != 'YVR Startups']
without_yvr_startups.shape
(216, 7)
without_yvr_startups['name'].unique()
array(['Agile Vancouver', 'Bitcoin Vancouver', 'BrainStation Vancouver', 'Code & Coffee (Vancouver)', 'Full Indie', 'HackerNest Vancouver Tech Socials', 'Learn Data Science', 'NetSquared Vancouver: Tech4Good and Nonprofits', 'TechVancouver', 'The Vancouver Web Design Meetup Group', "VanDev: Vancouver's Software Developers Network", 'VanJS: Vancouver JavaScript Developers', 'Vancouver Data Visualization', 'Vancouver Pixel Crafters', 'Vancouver Startup Founder 101'], dtype=object)
g = sns.lmplot(x="time_ms", y="yes_rsvp_count",
data=without_yvr_startups, hue="name",
height=6, aspect=2,
order=1, ci=None)
plt.title("Yes RSVP's over Time")
Text(0.5, 1.0, "Yes RSVP's over Time")
Observations:
We are able to confirm the largest downward trending group was the The YVR Startups group. It was quite an outlier.
We get a slightly closer view of the remaining plots. But it isn't quite easy to figure out which groups to pick.
Let's look at average RSVPs and frequency of events:
Average RSVPs
rsvp_df = final_df.groupby(['name'], as_index=False).mean()
rsvp_df = rsvp_df.sort_values(by=['yes_rsvp_count', 'name'], ascending=[True, True])
rsvp_df
name | id | time_ms | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|
2 | BrainStation Vancouver | 18515170.0 | 1.542418e+12 | 5.360656 | 0.000000 |
1 | Bitcoin Vancouver | 3238812.0 | 1.540088e+12 | 11.400000 | 0.000000 |
13 | Vancouver Pixel Crafters | 1583286.0 | 1.542226e+12 | 14.481481 | 0.000000 |
10 | VanDev: Vancouver's Software Developers Network | 144695.0 | 1.542207e+12 | 19.000000 | 2.600000 |
14 | Vancouver Startup Founder 101 | 18381330.0 | 1.543088e+12 | 19.500000 | 0.000000 |
3 | Code & Coffee (Vancouver) | 11908342.0 | 1.542799e+12 | 21.730769 | 0.076923 |
12 | Vancouver Data Visualization | 1798770.0 | 1.539124e+12 | 29.333333 | 15.000000 |
15 | YVR Startups | 19905099.0 | 1.543489e+12 | 49.733333 | 0.000000 |
6 | Learn Data Science | 18532714.0 | 1.542407e+12 | 57.258065 | 4.225806 |
7 | NetSquared Vancouver: Tech4Good and Nonprofits | 1013148.0 | 1.543106e+12 | 66.000000 | 0.166667 |
4 | Full Indie | 1642334.0 | 1.542010e+12 | 86.250000 | 3.500000 |
0 | Agile Vancouver | 1751788.0 | 1.542008e+12 | 90.250000 | 13.250000 |
5 | HackerNest Vancouver Tech Socials | 4869742.0 | 1.541788e+12 | 93.333333 | 0.000000 |
11 | VanJS: Vancouver JavaScript Developers | 1526926.0 | 1.538098e+12 | 119.000000 | 0.000000 |
8 | TechVancouver | 19162038.0 | 1.543225e+12 | 140.166667 | 0.000000 |
9 | The Vancouver Web Design Meetup Group | 229052.0 | 1.544362e+12 | 174.500000 | 3.000000 |
Observations:
There are 7 groups that get less than 30 YES RSVPs on average. As these groups have members of over 3000, 30 RSVPs represents only 1% of all members of these groups.
Out of these 3 had a non 0 average for their waitlist count:
Which leads to another question:
Let's focus on YES RSVPs for now and drop these groups.
groups_to_drop = [
'BrainStation Vancouver', 'Bitcoin Vancouver',
'Code & Coffee (Vancouver)', 'Vancouver Startup Founder 101',
"VanDev: Vancouver's Software Developers Network",
'Vancouver Data Visualization', 'Vancouver Pixel Crafters',
]
largest_groups_events = final_df.copy()
largest_groups_events = largest_groups_events[~largest_groups_events['name'].isin(
groups_to_drop)]
print(largest_groups_events.shape)
print(f"\nRemaining Groups:\n{largest_groups_events['name'].unique()}")
(75, 7) Remaining Groups: ['Agile Vancouver' 'Full Indie' 'HackerNest Vancouver Tech Socials' 'Learn Data Science' 'NetSquared Vancouver: Tech4Good and Nonprofits' 'TechVancouver' 'The Vancouver Web Design Meetup Group' 'VanJS: Vancouver JavaScript Developers' 'YVR Startups']
Frequency of Events
event_count_df = largest_groups_events.groupby('name', as_index=False).count()
event_count_df = event_count_df.sort_values(by='name', ascending=True)
event_count_df
name | id | time | time_ms | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
0 | Agile Vancouver | 4 | 4 | 4 | 4 | 4 | 4 |
1 | Full Indie | 4 | 4 | 4 | 4 | 4 | 4 |
2 | HackerNest Vancouver Tech Socials | 6 | 6 | 6 | 6 | 6 | 6 |
3 | Learn Data Science | 31 | 31 | 31 | 31 | 31 | 31 |
4 | NetSquared Vancouver: Tech4Good and Nonprofits | 6 | 6 | 6 | 6 | 6 | 6 |
5 | TechVancouver | 6 | 6 | 6 | 6 | 6 | 6 |
6 | The Vancouver Web Design Meetup Group | 2 | 2 | 2 | 2 | 2 | 2 |
7 | VanJS: Vancouver JavaScript Developers | 1 | 1 | 1 | 1 | 1 | 1 |
8 | YVR Startups | 15 | 15 | 15 | 15 | 15 | 15 |
Observation:
Focus on Infrequently meeting groups
Let's filter out Learn Data Science and YVR Startups and see when was the last time the other meetups met.
groups_to_drop = ['Learn Data Science', 'YVR Startups']
infrequent_groups = largest_groups_events[~largest_groups_events['name'].isin(
groups_to_drop)]
print(infrequent_groups.shape)
print(f"\nRemaining Groups:\n{infrequent_groups['name'].unique()}")
(29, 7) Remaining Groups: ['Agile Vancouver' 'Full Indie' 'HackerNest Vancouver Tech Socials' 'NetSquared Vancouver: Tech4Good and Nonprofits' 'TechVancouver' 'The Vancouver Web Design Meetup Group' 'VanJS: Vancouver JavaScript Developers']
infrequent_groups.sort_values(by=['name', 'time_ms'], ascending=[True, True])
id | name | time | time_ms | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
34 | 1751788 | Agile Vancouver | 09/19/18 01:00 | 1537318800000 | 2 hours, 0 minutes | 66 | 0 |
70 | 1751788 | Agile Vancouver | 10/17/18 01:00 | 1539738000000 | 2 hours, 0 minutes | 75 | 0 |
123 | 1751788 | Agile Vancouver | 11/21/18 02:00 | 1542765600000 | 2 hours, 0 minutes | 120 | 14 |
195 | 1751788 | Agile Vancouver | 01/23/19 02:00 | 1548208800000 | 3 hours, 0 minutes | 100 | 39 |
8 | 1642334 | Full Indie | 08/29/18 01:00 | 1535504400000 | 3 hours, 0 minutes | 20 | 14 |
83 | 1642334 | Full Indie | 10/24/18 02:00 | 1540346400000 | 2 hours, 0 minutes | 111 | 0 |
140 | 1642334 | Full Indie | 11/29/18 03:00 | 1543460400000 | 2 hours, 0 minutes | 104 | 0 |
206 | 1642334 | Full Indie | 01/29/19 03:00 | 1548730800000 | 2 hours, 0 minutes | 110 | 0 |
4 | 4869742 | HackerNest Vancouver Tech Socials | 08/28/18 02:00 | 1535421600000 | 2 hours, 0 minutes | 65 | 0 |
43 | 4869742 | HackerNest Vancouver Tech Socials | 09/25/18 02:00 | 1537840800000 | 2 hours, 0 minutes | 75 | 0 |
95 | 4869742 | HackerNest Vancouver Tech Socials | 10/30/18 02:00 | 1540864800000 | 2 hours, 0 minutes | 142 | 0 |
133 | 4869742 | HackerNest Vancouver Tech Socials | 11/27/18 03:00 | 1543287600000 | 2 hours, 0 minutes | 118 | 0 |
158 | 4869742 | HackerNest Vancouver Tech Socials | 12/12/18 02:00 | 1544580000000 | 3 hours, 0 minutes | 8 | 0 |
205 | 4869742 | HackerNest Vancouver Tech Socials | 01/29/19 03:00 | 1548730800000 | 2 hours, 0 minutes | 152 | 0 |
25 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 09/12/18 00:30 | 1536712200000 | 2 hours, 30 minutes | 68 | 0 |
66 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 10/11/18 00:30 | 1539217800000 | 2 hours, 0 minutes | 50 | 0 |
101 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 11/07/18 01:30 | 1541554200000 | 2 hours, 30 minutes | 76 | 0 |
144 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 11/30/18 01:30 | 1543541400000 | 2 hours, 0 minutes | 37 | 0 |
178 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 01/09/19 02:00 | 1546999200000 | 2 hours, 0 minutes | 105 | 0 |
225 | 1013148 | NetSquared Vancouver: Tech4Good and Nonprofits | 02/19/19 21:00 | 1550610000000 | 3 hours, 0 minutes | 60 | 1 |
26 | 19162038 | TechVancouver | 09/12/18 01:00 | 1536714000000 | 3 hours, 0 minutes | 147 | 0 |
62 | 19162038 | TechVancouver | 10/10/18 01:00 | 1539133200000 | 3 hours, 0 minutes | 112 | 0 |
116 | 19162038 | TechVancouver | 11/15/18 02:00 | 1542247200000 | 2 hours, 0 minutes | 143 | 0 |
160 | 19162038 | TechVancouver | 12/12/18 02:00 | 1544580000000 | 2 hours, 0 minutes | 143 | 0 |
182 | 19162038 | TechVancouver | 01/10/19 02:00 | 1547085600000 | 3 hours, 0 minutes | 133 | 0 |
218 | 19162038 | TechVancouver | 02/08/19 02:00 | 1549591200000 | 3 hours, 0 minutes | 163 | 0 |
81 | 229052 | The Vancouver Web Design Meetup Group | 10/24/18 01:00 | 1540342800000 | 2 hours, 0 minutes | 150 | 0 |
203 | 229052 | The Vancouver Web Design Meetup Group | 01/25/19 02:00 | 1548381600000 | 2 hours, 0 minutes | 199 | 6 |
52 | 1526926 | VanJS: Vancouver JavaScript Developers | 09/28/18 01:30 | 1538098200000 | 2 hours, 30 minutes | 119 | 0 |
Observations:
We should drop the 2 infrequently meeting groups.
Focus on Frequently meeting groups
Let's also look at the frequency of events for the two meetups which meet the most often:
groups_to_keep = ['Learn Data Science', 'YVR Startups']
frequent_groups = largest_groups_events[largest_groups_events['name'].isin(
groups_to_drop)]
print(frequent_groups.shape)
print(f"\nRemaining Groups:\n{frequent_groups['name'].unique()}")
(46, 7) Remaining Groups: ['Learn Data Science' 'YVR Startups']
frequent_groups.sort_values(by=['name', 'time_ms'], ascending=[True, True])
id | name | time | time_ms | duration | yes_rsvp_count | waitlist_count | |
---|---|---|---|---|---|---|---|
7 | 18532714 | Learn Data Science | 08/29/18 01:00 | 1535504400000 | 2 hours, 30 minutes | 2 | 40 |
13 | 18532714 | Learn Data Science | 08/30/18 01:00 | 1535590800000 | 3 hours, 0 minutes | 46 | 0 |
20 | 18532714 | Learn Data Science | 09/07/18 01:00 | 1536282000000 | 2 hours, 0 minutes | 60 | 0 |
29 | 18532714 | Learn Data Science | 09/13/18 01:00 | 1536800400000 | 3 hours, 0 minutes | 52 | 0 |
38 | 18532714 | Learn Data Science | 09/21/18 01:00 | 1537491600000 | 2 hours, 0 minutes | 56 | 0 |
48 | 18532714 | Learn Data Science | 09/27/18 01:00 | 1538010000000 | 3 hours, 0 minutes | 50 | 0 |
56 | 18532714 | Learn Data Science | 10/03/18 00:00 | 1538524800000 | 3 hours, 0 minutes | 1 | 0 |
60 | 18532714 | Learn Data Science | 10/05/18 01:00 | 1538701200000 | 2 hours, 0 minutes | 92 | 0 |
67 | 18532714 | Learn Data Science | 10/11/18 01:00 | 1539219600000 | 3 hours, 0 minutes | 59 | 0 |
73 | 18532714 | Learn Data Science | 10/18/18 00:00 | 1539820800000 | 2 hours, 0 minutes | 50 | 50 |
75 | 18532714 | Learn Data Science | 10/19/18 01:00 | 1539910800000 | 2 hours, 0 minutes | 80 | 0 |
88 | 18532714 | Learn Data Science | 10/25/18 01:00 | 1540429200000 | 3 hours, 0 minutes | 48 | 0 |
98 | 18532714 | Learn Data Science | 11/02/18 01:00 | 1541120400000 | 2 hours, 0 minutes | 70 | 0 |
103 | 18532714 | Learn Data Science | 11/07/18 02:00 | 1541556000000 | 2 hours, 0 minutes | 120 | 25 |
106 | 18532714 | Learn Data Science | 11/08/18 02:00 | 1541642400000 | 3 hours, 0 minutes | 41 | 0 |
112 | 18532714 | Learn Data Science | 11/14/18 02:00 | 1542160800000 | 1 hours, 0 minutes | 2 | 0 |
115 | 18532714 | Learn Data Science | 11/15/18 02:00 | 1542247200000 | 2 hours, 0 minutes | 83 | 0 |
117 | 18532714 | Learn Data Science | 11/16/18 02:00 | 1542333600000 | 2 hours, 0 minutes | 59 | 0 |
130 | 18532714 | Learn Data Science | 11/22/18 02:00 | 1542852000000 | 3 hours, 0 minutes | 62 | 0 |
145 | 18532714 | Learn Data Science | 11/30/18 02:00 | 1543543200000 | 2 hours, 0 minutes | 68 | 0 |
153 | 18532714 | Learn Data Science | 12/06/18 02:00 | 1544061600000 | 3 hours, 0 minutes | 44 | 0 |
154 | 18532714 | Learn Data Science | 12/07/18 02:00 | 1544148000000 | 2 hours, 0 minutes | 86 | 13 |
164 | 18532714 | Learn Data Science | 12/14/18 02:00 | 1544752800000 | 2 hours, 0 minutes | 58 | 0 |
172 | 18532714 | Learn Data Science | 12/20/18 02:00 | 1545271200000 | 3 hours, 0 minutes | 44 | 0 |
185 | 18532714 | Learn Data Science | 01/11/19 02:00 | 1547172000000 | 2 hours, 0 minutes | 91 | 0 |
190 | 18532714 | Learn Data Science | 01/17/19 02:00 | 1547690400000 | 3 hours, 0 minutes | 65 | 3 |
201 | 18532714 | Learn Data Science | 01/25/19 02:00 | 1548381600000 | 2 hours, 0 minutes | 73 | 0 |
212 | 18532714 | Learn Data Science | 01/31/19 02:00 | 1548900000000 | 3 hours, 0 minutes | 45 | 0 |
219 | 18532714 | Learn Data Science | 02/08/19 02:00 | 1549591200000 | 2 hours, 0 minutes | 60 | 0 |
224 | 18532714 | Learn Data Science | 02/14/19 02:00 | 1550109600000 | 3 hours, 0 minutes | 42 | 0 |
230 | 18532714 | Learn Data Science | 02/22/19 02:00 | 1550800800000 | 2 hours, 0 minutes | 66 | 0 |
0 | 19905099 | YVR Startups | 08/28/18 01:00 | 1535418000000 | 3 hours, 30 minutes | 96 | 0 |
47 | 19905099 | YVR Startups | 09/27/18 01:00 | 1538010000000 | 3 hours, 30 minutes | 103 | 0 |
69 | 19905099 | YVR Startups | 10/16/18 02:00 | 1539655200000 | 2 hours, 30 minutes | 12 | 0 |
78 | 19905099 | YVR Startups | 10/23/18 02:00 | 1540260000000 | 2 hours, 0 minutes | 48 | 0 |
86 | 19905099 | YVR Startups | 10/25/18 01:00 | 1540429200000 | 3 hours, 30 minutes | 114 | 0 |
121 | 19905099 | YVR Startups | 11/20/18 02:00 | 1542679200000 | 3 hours, 0 minutes | 15 | 0 |
129 | 19905099 | YVR Startups | 11/22/18 02:00 | 1542852000000 | 3 hours, 0 minutes | 106 | 0 |
132 | 19905099 | YVR Startups | 11/27/18 03:00 | 1543287600000 | 2 hours, 0 minutes | 31 | 0 |
141 | 19905099 | YVR Startups | 11/29/18 21:00 | 1543525200000 | 29 hours, 30 minutes | 8 | 0 |
156 | 19905099 | YVR Startups | 12/11/18 01:30 | 1544491800000 | 4 hours, 30 minutes | 107 | 0 |
173 | 19905099 | YVR Startups | 12/23/18 01:00 | 1545526800000 | 5 hours, 0 minutes | 6 | 0 |
176 | 19905099 | YVR Startups | 01/08/19 02:00 | 1546912800000 | 3 hours, 0 minutes | 8 | 0 |
213 | 19905099 | YVR Startups | 02/01/19 01:30 | 1548984600000 | 3 hours, 30 minutes | 30 | 0 |
217 | 19905099 | YVR Startups | 02/08/19 02:00 | 1549591200000 | 2 hours, 30 minutes | 10 | 0 |
227 | 19905099 | YVR Startups | 02/21/19 03:00 | 1550718000000 | 2 hours, 0 minutes | 52 | 0 |
Observations:
YVR Startups
Similar to its YES RSVPs, YVR Startups isn't consistent in the frequency of their events. Sometimes they meet once a month such as in August, September and January. Other months, they meet multiple times a month.
Learn Data Science
Learn Data Science meets at least once a week and most times multiple times a week. It has met 31 times in the past 6 months (~26 weeks). Pretty impressive!
In terms of YES RSVPs, it fluctuates and has a few extreme cases such as 120 RSVPS on November 11th 2018 and 2 on November 14th 2018. On average, they had YES RSVPS of 57 people.
groups_to_drop = ['VanJS: Vancouver JavaScript Developers',
'The Vancouver Web Design Meetup Group']
largest_groups_events = largest_groups_events[~largest_groups_events['name'].isin(
groups_to_drop)]
print(largest_groups_events.shape)
print(f"\nRemaining Groups:\n{largest_groups_events['name'].unique()}")
(72, 7) Remaining Groups: ['Agile Vancouver' 'Full Indie' 'HackerNest Vancouver Tech Socials' 'Learn Data Science' 'NetSquared Vancouver: Tech4Good and Nonprofits' 'TechVancouver' 'YVR Startups']
Back to our plot
Let's plot our earlier plot of YES RSVPs over Time for the Meetup groups we decided to keep:
%%capture --no-display
g = sns.lmplot(
x="time_ms", y="yes_rsvp_count",
data=largest_groups_events, hue="name",
height=6, aspect=2, order=1,
ci=68,
robust=True, legend=False)
#plt.setp(g._legend.get_texts(), fontsize='13') # for legend text
#plt.setp(g._legend.get_title(), fontsize='14') # for legend title
plt.legend(loc='upper center',
bbox_to_anchor=(0.5, -0.1),
prop={'size': 12},
fancybox=True,
shadow=True,
ncol=4
)
plt.title("Yes RSVP's over Time")
Text(0.5, 1.0, "Yes RSVP's over Time")
Observations:
With the exception of YVR Startups with its inconsistent attendee count, all other Meetups are generally increasing or maintaining their attendance.
Also more generally, some Meetup groups attract a lot of RSVPs but meet very infrequently - once every few months.
Let's list our discoveries for Tech Meetup Groups in Vancouver, BC:
Groups
Focusing exclusively on events for the largest Tech Meetups, we discovered:
Events
4th Place
If you are into Web Development and Design and don't mind joining a group that meets once every few months, but that has a large number of attendees, you should join the following groups:
3rd Place
If you want to join a Tech Meetup Group that meets at least once a month and has a lots of attendees, you should join the following groups:
2nd Place
If you want to join an Entrepreneurship and Startups focused Meetup Group that meets several times a month and has a decent but sometimes inconsistent attendance, you should join the following group:
1st Place
If you are into Data Science and want to join a Meetup Group that meets several times a week and has lots of attendees, you should join the following group:
We set out to explore Tech Meetups in Vancouver, BC and made some interesting discoveries and concluded with a few recommendations for folks looking to join Meetup groups.
To get here, we: