In this guide you will learn how to load different data files into DataFrames and how to interact with the CARTO platform to upload DataFrames into tables and download tables or SQL queries into DataFrames.
CARTOframes is built on top of Pandas and GeoPandas. Therefore, it's compatible with all the data formats supported in those projects like GeoJSON, Shapefile, CSV, etc.
There are two main concepts we should know before continuing with the guide:
Every time we manage geographic data, a GeoDataFrame should be used. In case a DataFrame with an encoded geometry column is used (WKB, WKT, etc.), every method contains a geom_col
param to provide the name of that column and decode the geometry internally.
For further learning you can checkout the Data Management examples.
from geopandas import read_file
gdf = read_file('https://libs.cartocdn.com/cartoframes/samples/starbucks_brooklyn_geocoded.geojson')
gdf.head()
cartodb_id | field_1 | name | address | revenue | geometry | |
---|---|---|---|---|---|---|
0 | 1 | 0 | Franklin Ave & Eastern Pkwy | 341 Eastern Pkwy,Brooklyn, NY 11238 | 1321040.772 | POINT (-73.95901 40.67109) |
1 | 2 | 1 | 607 Brighton Beach Ave | 607 Brighton Beach Avenue,Brooklyn, NY 11235 | 1268080.418 | POINT (-73.96122 40.57796) |
2 | 3 | 2 | 65th St & 18th Ave | 6423 18th Avenue,Brooklyn, NY 11204 | 1248133.699 | POINT (-73.98976 40.61912) |
3 | 4 | 3 | Bay Ridge Pkwy & 3rd Ave | 7419 3rd Avenue,Brooklyn, NY 11209 | 1185702.676 | POINT (-74.02744 40.63152) |
4 | 5 | 4 | Caesar's Bay Shopping Center | 8973 Bay Parkway,Brooklyn, NY 11214 | 1148427.411 | POINT (-74.00098 40.59321) |
Shapefile is a complex format, compared to CSV or GeoJSON. To learn more about this format check GeoPandas documentation.
from geopandas import read_file
gdf = read_file('https://libs.cartocdn.com/cartoframes/samples/starbucks_brooklyn_geocoded.zip')
gdf.head()
cartodb_id | field_1 | name | address | revenue | geometry | |
---|---|---|---|---|---|---|
0 | 1 | 0 | Franklin Ave & Eastern Pkwy | 341 Eastern Pkwy,Brooklyn, NY 11238 | 1321040.772 | POINT (-73.95901 40.67109) |
1 | 2 | 1 | 607 Brighton Beach Ave | 607 Brighton Beach Avenue,Brooklyn, NY 11235 | 1268080.418 | POINT (-73.96122 40.57796) |
2 | 3 | 2 | 65th St & 18th Ave | 6423 18th Avenue,Brooklyn, NY 11204 | 1248133.699 | POINT (-73.98976 40.61912) |
3 | 4 | 3 | Bay Ridge Pkwy & 3rd Ave | 7419 3rd Avenue,Brooklyn, NY 11209 | 1185702.676 | POINT (-74.02744 40.63152) |
4 | 5 | 4 | Caesar's Bay Shopping Center | 8973 Bay Parkway,Brooklyn, NY 11214 | 1148427.411 | POINT (-74.00098 40.59321) |
Compute geometry from longitude and latitude
from pandas import read_csv
from geopandas import GeoDataFrame, points_from_xy
df = read_csv('https://libs.cartocdn.com/cartoframes/samples/sf_incidents.csv')
gdf = GeoDataFrame(df, geometry=points_from_xy(df['longitude'], df['latitude']))
gdf.head()
incident_datetime | incident_date | incident_time | incident_year | incident_day_of_week | report_datetime | row_id | incident_id | incident_number | cad_number | ... | :@computed_region_qgnn_b9vv | :@computed_region_26cr_cadq | :@computed_region_ajp5_b2md | :@computed_region_nqbw_i6c3 | :@computed_region_2dwj_jsy4 | :@computed_region_h4ep_8xdi | :@computed_region_y6ts_4iup | :@computed_region_jg9y_a9du | :@computed_region_6pnf_4xz7 | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2020-08-16T03:13:00.000 | 2020-08-16T00:00:00.000 | 03:13 | 2020 | Sunday | 2020-08-16T03:14:00.000 | 95319604083 | 953196 | 200491669 | 202290313.0 | ... | 2.0 | 9 | 26.0 | NaN | NaN | NaN | NaN | NaN | 2.0 | POINT (-122.39773 37.75483) |
1 | 2020-08-16T03:38:00.000 | 2020-08-16T00:00:00.000 | 03:38 | 2020 | Sunday | 2020-08-16T04:56:00.000 | 95326228100 | 953262 | 200491738 | 202290404.0 | ... | 3.0 | 2 | 20.0 | 3.0 | NaN | NaN | NaN | NaN | 2.0 | POINT (-122.42204 37.76654) |
2 | 2020-08-16T13:40:00.000 | 2020-08-16T00:00:00.000 | 13:40 | 2020 | Sunday | 2020-08-16T13:56:00.000 | 95336264020 | 953362 | 200492463 | 202291631.0 | ... | 1.0 | 10 | 8.0 | NaN | NaN | NaN | NaN | NaN | 1.0 | POINT (-122.40371 37.78404) |
3 | 2020-08-16T16:18:00.000 | 2020-08-16T00:00:00.000 | 16:18 | 2020 | Sunday | 2020-08-16T16:18:00.000 | 95335012010 | 953350 | 200492792 | 202292091.0 | ... | 10.0 | 7 | 35.0 | NaN | NaN | NaN | NaN | NaN | 1.0 | POINT (-122.50742 37.75100) |
4 | 2020-08-12T22:00:00.000 | 2020-08-12T00:00:00.000 | 22:00 | 2020 | Wednesday | 2020-08-15T08:30:00.000 | 95300674000 | 953006 | 200489880 | 202280827.0 | ... | 4.0 | 11 | 39.0 | NaN | NaN | NaN | NaN | NaN | 2.0 | POINT (-122.43214 37.78050) |
5 rows × 37 columns
Compute geometry from WKT/WKB
from pandas import read_csv
from geopandas import GeoDataFrame
from cartoframes.utils import decode_geometry
df = read_csv('https://libs.cartocdn.com/cartoframes/samples/starbucks_brooklyn_geocoded.csv')
gdf = GeoDataFrame(df, geometry=decode_geometry(df['the_geom']))
gdf.head()
the_geom | cartodb_id | field_1 | name | address | revenue | id_store | geometry | |
---|---|---|---|---|---|---|---|---|
0 | 0101000020E61000005EA27A6B607D52C01956F146E655... | 1 | 0 | Franklin Ave & Eastern Pkwy | 341 Eastern Pkwy,Brooklyn, NY 11238 | 1321040.772 | A | POINT (-73.95901 40.67109) |
1 | 0101000020E6100000B610E4A0847D52C0B532E197FA49... | 2 | 1 | 607 Brighton Beach Ave | 607 Brighton Beach Avenue,Brooklyn, NY 11235 | 1268080.418 | B | POINT (-73.96122 40.57796) |
2 | 0101000020E6100000E5B8533A587F52C05726FC523F4F... | 3 | 2 | 65th St & 18th Ave | 6423 18th Avenue,Brooklyn, NY 11204 | 1248133.699 | C | POINT (-73.98976 40.61912) |
3 | 0101000020E61000008BA6B393C18152C08D62B9A5D550... | 4 | 3 | Bay Ridge Pkwy & 3rd Ave | 7419 3rd Avenue,Brooklyn, NY 11209 | 1185702.676 | D | POINT (-74.02744 40.63152) |
4 | 0101000020E6100000CEFC6A0E108052C080D4264EEE4B... | 5 | 4 | Caesar's Bay Shopping Center | 8973 Bay Parkway,Brooklyn, NY 11214 | 1148427.411 | E | POINT (-74.00098 40.59321) |
Note: You'll need your CARTO Account credentials to perform this action.
from cartoframes.auth import set_default_credentials
set_default_credentials('cartoframes')
from cartoframes import read_carto
gdf = read_carto('starbucks_brooklyn')
gdf.head()
cartodb_id | the_geom | field_1 | name | address | revenue | |
---|---|---|---|---|---|---|
0 | 1 | POINT (-73.95901 40.67109) | 0 | Franklin Ave & Eastern Pkwy | 341 Eastern Pkwy,Brooklyn, NY 11238 | 1321040.772 |
1 | 2 | POINT (-73.96122 40.57796) | 1 | 607 Brighton Beach Ave | 607 Brighton Beach Avenue,Brooklyn, NY 11235 | 1268080.418 |
2 | 3 | POINT (-73.98976 40.61912) | 2 | 65th St & 18th Ave | 6423 18th Avenue,Brooklyn, NY 11204 | 1248133.699 |
3 | 4 | POINT (-74.02744 40.63152) | 3 | Bay Ridge Pkwy & 3rd Ave | 7419 3rd Avenue,Brooklyn, NY 11209 | 1185702.676 |
4 | 5 | POINT (-74.00098 40.59321) | 4 | Caesar's Bay Shopping Center | 8973 Bay Parkway,Brooklyn, NY 11214 | 1148427.411 |
Note: You'll need your CARTO Account credentials to perform this action.
from cartoframes.auth import set_default_credentials
set_default_credentials('cartoframes')
from cartoframes import read_carto
gdf = read_carto("SELECT * FROM starbucks_brooklyn WHERE revenue > 1200000")
gdf.head()
cartodb_id | the_geom | field_1 | name | address | revenue | |
---|---|---|---|---|---|---|
0 | 1 | POINT (-73.95901 40.67109) | 0 | Franklin Ave & Eastern Pkwy | 341 Eastern Pkwy,Brooklyn, NY 11238 | 1321040.772 |
1 | 2 | POINT (-73.96122 40.57796) | 1 | 607 Brighton Beach Ave | 607 Brighton Beach Avenue,Brooklyn, NY 11235 | 1268080.418 |
2 | 3 | POINT (-73.98976 40.61912) | 2 | 65th St & 18th Ave | 6423 18th Avenue,Brooklyn, NY 11204 | 1248133.699 |
Note: You'll need your CARTO Account credentials to perform this action.
from cartoframes.auth import set_default_credentials
set_default_credentials('creds.json')
from cartoframes import to_carto
to_carto(gdf, 'starbucks_brooklyn_filtered', if_exists='replace')
Success! Data uploaded to table "starbucks_brooklyn_filtered" correctly
'starbucks_brooklyn_filtered'