Investigate JCPenney store closings$^1$ by:
Final dashboard: https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed
Install cartoframes (which is currently in beta). I recommend installing in a virtual environment to keep things clean and sandboxed.
Download the JCPenney store location data from here:
Pull JCPenney locations from my CARTO account into cartoframes
import pandas as pd
import cartoframes
import json
import warnings
warnings.filterwarnings("ignore")
USERNAME = '' # <-- Put your carto username here
APIKEY = '' # <-- Put your carto api key here
# use cartoframes.credentials.set_creds() to save credentials for future use
cc = cartoframes.CartoContext(api_key=APIKEY,
                              base_url='https://{}.carto.com/'.format(USERNAME))
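Hard-coded credentials are easy to leak when a notebook is shared. As a minimal sketch, the same two values could be read from environment variables instead. The variable names `CARTO_USERNAME` and `CARTO_API_KEY` and the helper `carto_settings` are assumptions made for this example, not cartoframes conventions.

```python
import os


def carto_settings(environ=None):
    """Return (base_url, api_key) built from environment variables.

    CARTO_USERNAME and CARTO_API_KEY are hypothetical names chosen for
    this sketch; cartoframes itself does not read them.
    """
    environ = os.environ if environ is None else environ
    username = environ.get('CARTO_USERNAME', '')
    api_key = environ.get('CARTO_API_KEY', '')
    if not username or not api_key:
        raise ValueError('Set CARTO_USERNAME and CARTO_API_KEY first')
    # Same URL pattern as the CartoContext call above
    base_url = 'https://{}.carto.com/'.format(username)
    return base_url, api_key
```

The returned pair can then be passed as the `base_url` and `api_key` arguments of `cartoframes.CartoContext`, exactly as in the cell above.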
table_name = 'jc_penny_stores'
# load JCPenney locations into a DataFrame
df = cc.read(table_name)
df.head()
from cartoframes import Layer
from cartoframes.styling import vivid
cc.map(layers=Layer(table_name,
                    color={'column': 'status', 'scheme': vivid(10, 'category')}),
       interactive=False)
# get population density (total population normalized by area) for each store location
# More info about this Data Observatory measure here:
# https://cartodb.github.io/bigmetadata/united_states/age_gender.html#total-population
df = cc.data_augment(table_name, [{'numer_id': 'us.census.acs.B01003001',
                                   'normalization': 'area',
                                   'numer_timespan': '2011 - 2015'}])
df.head()
df.describe()
Create a derivative table whose geometries are 10-minute walk or drive isochrones around the store locations. If population density is above 5,000 people per sq. km, assume the area is walkable. Otherwise, assume cars are the primary mode of transit.
Note: This functionality is a planned cartoframes method.
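The walk-or-drive rule can be sketched in plain Python before it is expressed in SQL. This is only an illustration of the decision logic: the function name `travel_mode` and the sample densities are made up, and the actual selection in this notebook happens in the `CASE` expression of the query.

```python
WALKABLE_DENSITY = 5000  # people per sq. km, the threshold used in this notebook


def travel_mode(pop_density, threshold=WALKABLE_DENSITY):
    """Pick the isochrone mode for a store from its population density."""
    # Missing densities (None) fall back to 'car', mirroring the SQL CASE,
    # where a comparison against NULL is not true and the ELSE branch is used.
    if pop_density is not None and pop_density > threshold:
        return 'walk'
    return 'car'


# Hypothetical densities, for illustration only
densities = {'dense_store': 27000.0, 'suburban_store': 850.0, 'unknown': None}
modes = {name: travel_mode(d) for name, d in densities.items()}
print(modes)
```

Note that a store sitting exactly at the threshold is treated as car-oriented, because the comparison is strictly greater than.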
%%time
df = cc.query('''
    SELECT
        CASE WHEN total_pop_area_2011_2015 > 5000
             THEN (cdb_isochrone(the_geom, 'walk', Array[600])).the_geom
             ELSE (cdb_isochrone(the_geom, 'car', Array[600])).the_geom
        END as the_geom,
        {keep_columns}
    FROM
        {table_name}
    '''.format(table_name=table_name,
               keep_columns=', '.join(set(df.columns) - {'the_geom', 'the_geom_webmercator'})))
iso_table_name = table_name + '_isochrones'
If the query above times out, note that the repo already has an issue for introducing Batch API queries to avoid timeouts: https://github.com/CartoDB/cartoframes/issues/85
Bonus points for finding bugs and opening issues!
cc.write(df, iso_table_name)
If this fails because of a lack of credits (i.e., reaching quota), then replace the (cdb_isochrone(the_geom, 'walk', Array[600])).the_geom
pieces with ST_Buffer(the_geom::geography, 800)::geometry
for an approximate 10 minute walk ('crow flies' distance), and ST_Buffer(the_geom::geography, 12000)::geometry
for an approximate 10 minute drive (assuming 45 mph on average for 10 minutes).
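The two buffer radii follow from simple speed-times-time arithmetic. The sketch below checks them; the helper `distance_m` is a name introduced here for illustration.

```python
METERS_PER_MILE = 1609.344


def distance_m(speed_kmh, minutes):
    """Straight-line distance in meters covered at a constant speed."""
    return speed_kmh * 1000.0 * minutes / 60.0


# 45 mph for 10 minutes, as assumed in the text
drive_kmh = 45 * METERS_PER_MILE / 1000.0
drive_m = distance_m(drive_kmh, 10)

# An 800 m buffer for a 10 minute walk implies this walking speed
walk_kmh = 800 / 1000.0 / (10 / 60.0)

print(round(drive_m))      # 12070, close to the 12000 m buffer
print(round(walk_kmh, 1))  # 4.8 km/h
```

Both buffers are circles, so they overestimate the reachable area compared with isochrones that follow the street network.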
df.head()
from cartoframes import BaseMap
cc.map(layers=[BaseMap('light'),
               Layer(iso_table_name),
               Layer(table_name)],
       zoom=12, lng=-73.9668, lat=40.7306,
       interactive=False)
# show choropleth of isochrones by pop density
from cartoframes.styling import vivid
cc.map(layers=[Layer(iso_table_name,
                     color='total_pop_area_2011_2015'),
               Layer(table_name, size=6, color={'column': 'status', 'scheme': vivid(2)})],
       zoom=8, lng=-74.7729, lat=39.9771,
       interactive=False)
# Data Observatory measures: median income, male age 30-34 (both ACS)
# Male age 30-34: https://cartodb.github.io/bigmetadata/united_states/age_gender.html#male-age-30-to-34
# Median Income: https://cartodb.github.io/bigmetadata/united_states/income.html#median-household-income-in-the-past-12-months
# Note: this may take a minute or two because all the measures are being calculated based on the custom geographies
# that are passed in using spatially interpolated calculations (area-weighted measures)
data_obs_measures = [{'numer_id': 'us.census.acs.B01001012'},
                     {'numer_id': 'us.census.acs.B19013001'}]
df = cc.data_augment(iso_table_name, data_obs_measures)
df.head()
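The comments above mention spatially interpolated, area-weighted measures. As a rough sketch of that idea (not the Data Observatory's actual implementation), a count such as population for a custom polygon can be estimated by summing each overlapping census geometry's value, scaled by the fraction of that geometry's area that falls inside the polygon. The function name and the tract numbers below are made up for illustration.

```python
def area_weighted_count(overlaps):
    """Estimate a count inside a custom polygon.

    overlaps: iterable of (value, source_area, overlap_area) tuples, one per
    census geometry that intersects the polygon. Areas share one unit.
    """
    total = 0.0
    for value, source_area, overlap_area in overlaps:
        if source_area <= 0:
            continue  # skip degenerate geometries
        # Assume the value is spread evenly over the source geometry
        total += value * (overlap_area / source_area)
    return total


# Three hypothetical tracts intersecting one isochrone
tracts = [
    (4000, 2.0, 2.0),  # fully inside
    (3000, 3.0, 1.5),  # half inside
    (1000, 4.0, 1.0),  # a quarter inside
]
print(area_weighted_count(tracts))  # 5750.0
```

This scheme suits counts. A measure that is not a count, such as median income, cannot be summed this way and is only approximated by any area-based weighting.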
As you might have already heard, the Data Observatory just launched to provide CartoDB users with a universe of data. One of the reasons we built the Data Observatory is that getting the third-party data you need is often the hardest part of analyzing your own data. Data wrangling shouldn't be such a big roadblock to mapping and analyzing your world.
cc.map(layers=Layer(iso_table_name,
                    color='median_income_prenormalized_2011_2015'),
       zoom=8, lng=-74.3115, lat=40.1621,
       interactive=False)
from IPython.display import HTML
HTML('<iframe width="100%" height="520" frameborder="0" src="https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed" allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')