#!/usr/bin/env python
# coding: utf-8

# ## Data Services
# 
# You can connect to the [CARTO Data Services API](https://carto.com/developers/data-services-api/) directly from CARTOframes. This API consists of a set of location-based functions that can be applied to your data to perform geospatial analyses without leaving the context of your notebook. For instance, you can **geocode** a pandas DataFrame with addresses on the fly, and then perform a trade area analysis by computing **isodistances** or **isochrones** programmatically.
# 
# Using Data Services requires authentication. For more information about how to authenticate, read the [Authentication guide](/developers/cartoframes/guides/Authentication/). For further learning, you can also check out the [Data Services examples](/developers/cartoframes/examples/#example-data-services).

# In[1]:


from cartoframes.auth import set_default_credentials

set_default_credentials('creds.json')


# > Depending on your CARTO account plan, some of these data services are subject to different [quota limitations](https://carto.com/developers/data-services-api/support/quota-information/).

# ### Geocoding
# 
# To get started, let's read in and explore the Starbucks location data we have. With the Starbucks store data in a DataFrame, we can see that there are two columns that can be used in the **geocoding** service: `name` and `address`. There's also a third column that reflects the annual revenue of the store.

# In[2]:


import pandas as pd

df = pd.read_csv('http://libs.cartocdn.com/cartoframes/samples/starbucks_brooklyn.csv')
df.head()


# #### Quota consumption
# 
# Each time you run Data Services, quota is consumed. For this reason, you can check in advance **how many credits** an operation will consume by passing the `dry_run` parameter to the service function.
# 
# You can also check your available quota by calling the `available_quota` method.

# In[3]:


from cartoframes.data.services import Geocoding

geo_service = Geocoding()

city_ny = {'value': 'New York'}
country_usa = {'value': 'USA'}

_, geo_dry_metadata = geo_service.geocode(df, street='address', city=city_ny, country=country_usa, dry_run=True)


# In[4]:


geo_dry_metadata


# In[5]:


geo_service.available_quota()
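

# A dry-run estimate and the available quota can be combined into a simple guard before spending credits. The cell below is a minimal sketch: `has_enough_quota` is a hypothetical helper (not part of CARTOframes), and it assumes the metadata dictionary exposes a `required_quota` key, as used later in this guide.

# In[ ]:


def has_enough_quota(dry_run_metadata, available_quota):
    """Return True when the dry-run estimate fits in the available quota."""
    # Treat a missing estimate as zero required credits (an assumption).
    required = dry_run_metadata.get('required_quota') or 0
    return required <= available_quota


# Example usage with the objects defined above:
# has_enough_quota(geo_dry_metadata, geo_service.available_quota())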


# In[6]:


geo_gdf, geo_metadata = geo_service.geocode(df, street='address', city=city_ny, country=country_usa)


# Let's compare `geo_dry_metadata` and `geo_metadata` to see how the information returned with and without the `dry_run` option differs. The metadata from the real run shows that all the locations have been geocoded successfully and that the operation consumed 10 credits of quota.

# In[7]:


geo_metadata
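

# One way to make the comparison explicit is to list only the keys whose values differ. This is a sketch that assumes both metadata objects are plain dictionaries; `metadata_diff` is a hypothetical helper, not a CARTOframes function.

# In[ ]:


def metadata_diff(dry_run_metadata, run_metadata):
    """Map each differing key to a (dry_run_value, run_value) pair."""
    keys = sorted(set(dry_run_metadata) | set(run_metadata))
    return {
        key: (dry_run_metadata.get(key), run_metadata.get(key))
        for key in keys
        if dry_run_metadata.get(key) != run_metadata.get(key)
    }


# Example usage with the objects defined above:
# metadata_diff(geo_dry_metadata, geo_metadata)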


# In[8]:


geo_service.available_quota()


# If the input data file ever changes, cached results will only be applied to unmodified records, and new geocoding will be performed only on _new or changed records_. In order to use cached results, we have to save the results to a CARTO table using the `table_name` and `cached=True` parameters.
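# 
# As a sketch, the call could look like the cell below. The `table_name` and `cached` parameters are the ones named above; `geocode_with_cache` and the table name `starbucks_geocoded` are hypothetical, and the wrapper only forwards its arguments to `geocode`, so running it against a real service consumes quota for any new or changed records.

# In[ ]:


def geocode_with_cache(service, source, table_name, **geocode_args):
    """Geocode `source`, persisting results to `table_name` for reuse."""
    return service.geocode(source, table_name=table_name, cached=True, **geocode_args)


# Example usage with the objects defined above (the table name is hypothetical):
# cached_gdf, cached_metadata = geocode_with_cache(
#     geo_service, df, 'starbucks_geocoded',
#     street='address', city=city_ny, country=country_usa)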

# The resulting data is a `GeoDataFrame` that contains three new columns:
# 
# * `geometry`: The resulting geometry
# * `gc_status_rel`: The percentage of accuracy of each location
# * `carto_geocode_hash`: A hash of the geocoded input, used to detect records that have already been geocoded

# In[9]:


geo_gdf.head()


# In addition, to avoid geocoding records that have been **previously geocoded**, and thus spending quota **unnecessarily**, you should always preserve the geometry column (`geometry` in a GeoDataFrame, `the_geom` in a CARTO table) and the `carto_geocode_hash` column generated by the geocoding process.
# 
# This will happen **automatically** in these cases:
# 
# 1. Your input is a **table** from CARTO processed in place (without a `table_name` parameter).
# 2. You save your results to a CARTO table using the `table_name` parameter, and only use the resulting table for any further geocoding.
# 
# If you try to geocode this DataFrame now that it contains both the geometry and the `carto_geocode_hash` columns, you will see that the required quota is 0 because it has already been geocoded.
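# 
# A rough local pre-check is to look for those columns before calling the service. This is only a sketch: `has_geocoding_columns` is a hypothetical helper, and a dry run remains the reliable way to learn the required quota, since new or changed records are geocoded again.

# In[ ]:


def has_geocoding_columns(columns):
    """Return True if a geometry column and the geocode hash are both present."""
    names = set(columns)
    # Accept either the GeoDataFrame name or the CARTO table name.
    has_geometry = 'geometry' in names or 'the_geom' in names
    return has_geometry and 'carto_geocode_hash' in names


# Example usage with the objects defined above:
# has_geocoding_columns(geo_gdf.columns)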

# In[10]:


_, geo_metadata = geo_service.geocode(geo_gdf, street='address', city=city_ny, country=country_usa, dry_run=True)


# In[11]:


geo_metadata.get('required_quota')


# #### Precision
# 
# The `address` column is more complete than the `name` column, so the coordinates the service calculates from it will be more accurate. We can verify this by geocoding with the `name` column instead: the resulting accuracy values are lower than the ones we get by using the `address` column.

# In[12]:


geo_name_gdf, geo_name_metadata = geo_service.geocode(df, street='name', city=city_ny, country=country_usa)


# In[13]:


geo_name_gdf.gc_status_rel.unique()


# In[14]:


geo_gdf.gc_status_rel.unique()
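

# To compare the two runs with a few numbers each, the accuracy values can be summarized. A minimal sketch, assuming `gc_status_rel` holds numeric values with no missing entries; `summarize_precision` is a hypothetical helper.

# In[ ]:


from statistics import mean


def summarize_precision(values):
    """Return the minimum, mean and maximum of an iterable of accuracy values."""
    values = [float(v) for v in values]
    return {'min': min(values), 'mean': mean(values), 'max': max(values)}


# Example usage with the objects defined above:
# summarize_precision(geo_name_gdf.gc_status_rel)
# summarize_precision(geo_gdf.gc_status_rel)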


# #### Visualize the results
# 
# Finally, we can visualize the precision of the geocoded results using a CARTOframes [visualization layer](/developers/cartoframes/examples/#example-color-bins-layer).

# In[15]:


from cartoframes.viz import Layer, color_bins_style, popup_element

Layer(
    geo_gdf,
    color_bins_style('gc_status_rel', method='equal', bins=geo_gdf.gc_status_rel.unique().size),
    popup_hover=[popup_element('address', 'Address'), popup_element('gc_status_rel', 'Precision')],
    title='Geocoding Precision'
)


# ### Isolines
# 
# There are two **Isoline** functions: **isochrones** and **isodistances**. In this guide we will use the **isochrones** function to calculate walking areas _by time_ for each Starbucks store and the **isodistances** function to calculate the walking area _by distance_.
# 
# Isolines are concentric polygons around an origin point. Each polygon encloses the area that can be reached from the origin within a given limit, measured by:
# 
# * **Time** in the case of **isochrones**
# * **Distance** in the case of **isodistances**

# #### Isochrones
# 
# For isochrones, let's calculate the time ranges of 5, 15 and 30 minutes. These ranges are input in `seconds`, so they will be **300**, **900**, and **1800** respectively.
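# 
# The conversion can also be done in code to avoid arithmetic slips. A small sketch; `minutes_to_seconds` is a hypothetical helper, not part of CARTOframes.

# In[ ]:


def minutes_to_seconds(minutes):
    """Convert an iterable of minute values to whole seconds."""
    return [int(m * 60) for m in minutes]


minutes_to_seconds([5, 15, 30])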

# In[16]:


from cartoframes.data.services import Isolines

iso_service = Isolines()

_, isochrones_dry_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk', dry_run=True)


# Remember to always **check the quota** using the `dry_run` parameter and the `available_quota` method before running the service!

# In[17]:


print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isochrones_dry_metadata.get('required_quota')
))


# In[18]:


isochrones_gdf, isochrones_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk')


# In[19]:


isochrones_gdf.head()


# In[20]:


from cartoframes.viz import Layer, basic_style, basic_legend

Layer(isochrones_gdf, basic_style(opacity=0.5), basic_legend('Isochrones'))


# #### Isodistances
# 
# For isodistances, let's calculate the distance ranges of 100, 500 and 1000 meters. These ranges are input in `meters`, so they can be passed as they are: **100**, **500**, and **1000**.

# In[21]:


_, isodistances_dry_metadata = iso_service.isodistances(geo_gdf, [100, 500, 1000], mode='walk', dry_run=True)


# In[22]:


print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isodistances_dry_metadata.get('required_quota')
))


# In[23]:


isodistances_gdf, isodistances_metadata = iso_service.isodistances(geo_gdf, [100, 500, 1000], mode='walk')


# In[24]:


isodistances_gdf.head()


# In[25]:


from cartoframes.viz import Layer, basic_style, basic_legend

Layer(isodistances_gdf, basic_style(opacity=0.5), basic_legend('Isodistances'))