Basic cartoframes usage

cartoframes lets you use CARTO in a Python environment so that you can do all of your analysis and mapping in, for example, a Jupyter notebook. cartoframes allows you to use CARTO's functionality for data analysis, storage, location services like routing and geocoding, and visualization.

This notebook is best viewed on nbviewer: https://nbviewer.jupyter.org/github/CartoDB/cartoframes/blob/master/examples/Basic%20Usage.ipynb To explore the functionality of cartoframes more easily, we recommend downloading the notebook and running it on your own computer.

To get started, let's load the required packages and set our credentials.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import cartoframes
from cartoframes import Credentials
import pandas as pd

USERNAME = 'eschbacher'  # <-- replace with your username 
APIKEY = 'abcdefg'       # <-- your CARTO API key
creds = Credentials(username=USERNAME, 
                    key=APIKEY)
cc = cartoframes.CartoContext(creds=creds)

cc.read

CartoContext has several methods for interacting with CARTO in a Python environment. CartoContext.read allows you to pull a dataset stored on CARTO into a pandas DataFrame. In the cells below, we first write the example dataset returned by read_brooklyn_poverty to a CARTO account as the table brooklyn_poverty_example, and then use cc.read to pull that table back into a DataFrame.

In [2]:
from cartoframes.examples import read_brooklyn_poverty
cc.write(read_brooklyn_poverty(), 'brooklyn_poverty_example', overwrite=True)
Table successfully written to CARTO: https://eschbacher.carto.com/dataset/brooklyn_poverty_example
In [3]:
# Get a CARTO table as a pandas DataFrame
df = cc.read('brooklyn_poverty_example')
df.head()
Out[3]:
cartodb_id | commuters_16_over_2011_2015 | geoid | pop_determined_poverty_status_2011_2015 | poverty_count | poverty_per_pop | the_geom | total_pop_2011_2015 | total_population | walked_to_work_2011_2015_per_pop
1606 | 0.0 | 360470702031 | 0.0 | NaN | NaN | 0106000020E61000000800000001030000000100000013... | 0.0 | 0 | NaN
2052 | NaN | 360479901000 | NaN | NaN | NaN | None | NaN | 0 | NaN
111 | 0.0 | 360470666001 | 0.0 | NaN | NaN | 0106000020E6100000030000000103000000010000006B... | 0.0 | 0 | 1.553393e-12
116 | NaN | 360470702030 | NaN | NaN | NaN | None | NaN | 0 | NaN
91 | 15928.0 | 360470080002 | 31367.0 | 225.0 | 0.17201 | 0106000020E61000000100000001030000000100000007... | 39471.0 | 1309 | 1.505213e-02

Notice that:

  • The index of the DataFrame is the same as the index of the CARTO table (cartodb_id).
  • The the_geom column stores the geometry. It can be decoded by setting the decode_geom=True flag in cc.read, which requires the shapely library.
  • There are several numeric columns.
  • SQL null values are represented as numpy.nan in numeric columns and, as the output above shows, as None in object columns such as the_geom.
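
The the_geom values shown above are hex strings. Their common prefix, 0106000020E6100000, is consistent with PostGIS extended well-known binary (EWKB), and under that assumption the header can be inspected with the standard library alone. The sketch below is only an illustration of what the column holds: ewkb_header is a hypothetical helper, not part of cartoframes, and it is not how decode_geom=True works (that path relies on shapely).

```python
import struct

# EWKB stores optional flags in the high bits of the 32-bit type word.
EWKB_SRID_FLAG = 0x20000000
EWKB_TYPE_MASK = 0x1FFFFFFF  # clears the Z, M and SRID flag bits

GEOM_TYPES = {
    1: 'Point', 2: 'LineString', 3: 'Polygon', 4: 'MultiPoint',
    5: 'MultiLineString', 6: 'MultiPolygon', 7: 'GeometryCollection',
}


def ewkb_header(hex_str):
    """Read the geometry type and SRID from the start of a hex EWKB string.

    Hypothetical helper for illustration; it parses only the header and
    ignores the coordinates that follow.
    """
    raw = bytes.fromhex(hex_str)
    # First byte is the byte-order marker: 1 = little-endian, 0 = big-endian
    endian = '<' if raw[0] == 1 else '>'
    (type_word,) = struct.unpack_from(endian + 'I', raw, 1)
    srid = None
    if type_word & EWKB_SRID_FLAG:
        # When the SRID flag is set, the SRID follows the type word
        (srid,) = struct.unpack_from(endian + 'I', raw, 5)
    geom_type = GEOM_TYPES.get(type_word & EWKB_TYPE_MASK, 'Unknown')
    return {'geom_type': geom_type, 'srid': srid}


# Prefix shared by the non-null the_geom values in the output above
header = ewkb_header('0106000020E6100000')
print(header)  # → {'geom_type': 'MultiPolygon', 'srid': 4326}
```

Read this way, the rows above hold multipolygons in SRID 4326 (longitude/latitude), which is why a geometry library is needed to turn them into usable objects.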

Other things to notice:

In [4]:
df.dtypes
Out[4]:
commuters_16_over_2011_2015                float64
geoid                                       object
pop_determined_poverty_status_2011_2015    float64
poverty_count                              float64
poverty_per_pop                            float64
the_geom                                    object
total_pop_2011_2015                        float64
total_population                             int64
walked_to_work_2011_2015_per_pop           float64
dtype: object

The dtype of each column is mapped from the column's type on CARTO. For example, numeric maps to float64, text maps to object (the dtype pandas uses for strings), timestamp maps to datetime64[ns], and so on. The reverse mapping is applied when a DataFrame is sent to CARTO.
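
The type pairs named in this section can be written down as a small lookup table. This is a sketch for illustration only: the dictionaries and the helper pandas_dtype_for are hypothetical names, they cover just the three pairs mentioned here, and they are not how cartoframes implements the conversion.

```python
# CARTO column type -> pandas dtype, limited to the pairs named in the text
CARTO_TO_PANDAS = {
    'numeric': 'float64',
    'text': 'object',
    'timestamp': 'datetime64[ns]',
}

# Reverse direction, for when a DataFrame is sent to CARTO
PANDAS_TO_CARTO = {dtype: carto for carto, dtype in CARTO_TO_PANDAS.items()}


def pandas_dtype_for(carto_type):
    """Return the pandas dtype for a CARTO type, or None if it is not listed here."""
    return CARTO_TO_PANDAS.get(carto_type)


for carto_type, dtype in CARTO_TO_PANDAS.items():
    print(f'{carto_type:>9} -> {dtype}')
```

Types that appear in the output above but are not named in the text, such as the int64 total_population column, are deliberately left out of this sketch.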

cc.map

Now that we can inspect the data, we can map it to see how the values change over the geography. We can use the cc.map method for this purpose.

cc.map takes a layers argument that specifies the data layers to visualize. The layer classes can be imported from cartoframes, as shown below.

There are different types of layers:

  • Layer for visualizing CARTO tables
  • QueryLayer for visualizing arbitrary queries against tables in the user's CARTO account
  • BaseMap for specifying the base map to be used

Each layer type has its own styling options. Layer and QueryLayer take the same styling arguments, while BaseMap accepts a light or dark style and options for label placement.

Maps can be interactive or static. Set the interactive argument to True or False to choose between them. A static (non-interactive) map is embedded in the notebook as either a matplotlib axis or an IPython.Image, so in both cases the image travels with the notebook. Interactive maps are embedded as maps that can be zoomed and panned.

In [4]:
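from cartoframes import Layer, styling, BaseMap

# Color each geometry by poverty rate using a 7-bin sunset color scheme
poverty_layer = Layer('brooklyn_poverty_example',
                      color={'column': 'poverty_per_pop',
                             'scheme': styling.sunset(7)})

# interactive=False returns a static map instead of a zoomable one
cc.map(layers=poverty_layer,
       interactive=False)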
Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x10630a320>

Multiple variables together

In [5]:
table = 'brooklyn_poverty_example'
cols = [
    'pop_determined_poverty_status_2011_2015',
    'poverty_per_pop',
    'walked_to_work_2011_2015_per_pop',
    'total_pop_2011_2015'
]

fig, axs = plt.subplots(2, 2, figsize=(12, 12))

# Draw one static map per column, filling the 2x2 grid row by row
for idx, col in enumerate(cols):
    ax = axs[idx // 2][idx % 2]
    cc.map(layers=[BaseMap('dark'),
                   Layer(table,
                         color={'column': col,
                                'scheme': styling.sunset(7, 'quantiles')})],
           ax=ax,
           zoom=11, lng=-73.9476, lat=40.6437,
           interactive=False,
           size=(432, 432))
    ax.set_title(col)
fig.tight_layout()
plt.show()
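
The loop above places map number idx on the axis at axs[idx // 2][idx % 2]. The following minimal sketch shows that index arithmetic on its own, without CARTO or matplotlib. The helper grid_position is a hypothetical name introduced here for illustration.

```python
def grid_position(idx, ncols=2):
    """Return the (row, column) of flat index idx in a grid with ncols columns."""
    # divmod(idx, ncols) is the same as (idx // ncols, idx % ncols)
    return divmod(idx, ncols)


cols = [
    'pop_determined_poverty_status_2011_2015',
    'poverty_per_pop',
    'walked_to_work_2011_2015_per_pop',
    'total_pop_2011_2015',
]

# Map each column name to the subplot cell it is drawn in
positions = {col: grid_position(idx) for idx, col in enumerate(cols)}
for col, (row, column) in positions.items():
    print(f'{col}: row {row}, column {column}')
```

The columns fill the grid row by row, so the first two variables land on the top row and the last two on the bottom row.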