Restaurant Quest using TomTom API

About this document

Table of contents

1. Introduction: The Challenge
2. Data Requirements
3. Analyze Demographic Data
4. View Candidate Neighborhoods on a Map
5. Explore the surroundings
6. In-depth analysis of one neighborhood
7. Conclusion and future work



1. Introduction: The Challenge

A good friend of mine, Linda, dreams of opening a Chinese restaurant one day to share the joy of good food with others. Now that everything is ready, she can finally realize her dream. She has chosen Amsterdam as the place to start, not only because Amsterdam is one of the most populous and most visited cities in Europe, but also because of the diverse culture the city embraces.

My challenge as a data scientist is to help her find the ideal location in Amsterdam using whatever data I can access.

1.1. A Cooperative Iterative Approach

It's important to emphasize that even though I might know how to deal with data better, Linda definitely knows the food business a lot better. Throughout this project, we will work closely together to take full advantage of each other's expertise.

The process is also iterative. Not all decisions made at the beginning of the project remain the best ones. We both fully understand that as we learn more about Amsterdam and its residents, we need to continuously re-examine and fine-tune our decisions.

1.2. About Amsterdam

Amsterdam is the capital city and most populous municipality of the Netherlands. Here are some facts about the city and its residents:

  • Amsterdam has a population of 854,047 within the city proper
  • Amsterdam city proper has 4,457 inhabitants per square kilometer and 2,275 households per square kilometer.
  • Amsterdam has more than 100 kilometers (60 miles) of canals, most of which are navigable by boat.

As for tourism, Amsterdam is one of the most popular tourist destinations in Europe:

  • Number of international tourists per year: 20.63 million.
  • Of these, the number of day-trippers is 16 million.

Reference: https://en.wikipedia.org/wiki/Amsterdam

1.3. Business Questions

To find the ideal location for the restaurant, we must first seek answers to a few questions.

Question 1: How many restaurants already exist?

If the new restaurant were the only one in a neighborhood, Linda would face no local competition and could expect more profit. So, the number of existing restaurants in each neighborhood must be taken into consideration. This question can hopefully be answered using the TomTom API.

Question 2: How are existing Chinese restaurants perceived?

For Linda, it's important to serve traditional Chinese food the way she knows it. Even though Chinese food is widely loved, it makes sense to double-check how existing Chinese (or Asian) restaurants are perceived. This question can also hopefully be answered using the TomTom API.

Question 3: Who are the target customers and where do they live?

It is going to be a small restaurant (5 to 7 tables) due to the limited investment. The primary income would come from takeout and online orders. From past experience, Linda knows that people who live alone are more likely to buy takeout or use online food ordering apps such as Uber Eats. They are the ideal target customers for her new restaurant. So, we will look for an area with a relatively high density of one-person households. We need demographic information to answer this question.




2. Data Requirements

We have the big question: where to open a Chinese restaurant in Amsterdam? Now we need to collect data that can help us answer it. We need data from at least two sources:

  • Data of the surroundings (density of similar restaurants nearby)
  • Demographic data (per area in Amsterdam)

2.1. Data of Surroundings

2.2 Demographic Data

To know which neighborhoods are worth investigating further, we need to look into demographic data and pick the neighborhoods that have more target customers in terms of both quantity and density.

The Central Bureau of Statistics of the Netherlands (CBS for short) provides a large amount of demographic data about the people who live and work in the Netherlands.
On the CBS page, you can choose whatever features you need to solve the problem. The list of features is quite comprehensive. It is important, therefore, to define exactly what the data will be used for.

Features that can be used to answer the question

  • Regional specifics (Regioaanduiding): This information helps me link the relevant details to a specific area.
  • Total Households (Particulier huishouden): Number of households in a neighborhood.
  • One-person Households (Eenpersoonshuishoudens): Number of households that consist of only one person.
  • Population density (Bevolkingsdichtheid): A more densely populated area means more customers for a restaurant. The unit of population density is number of people per square kilometer.
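
Because the CBS feature names are Dutch and the rest of this notebook uses English column names, a small translation table can keep the two in sync. The sketch below is only an illustration: the Dutch names come from the list above, the English names match the columns of the CSV used in chapter 3, and the pairing of "Regioaanduiding" with "Neighborhood" as well as the helper name translate_row are my own assumptions.

```python
# Dutch CBS feature names (from the list above) mapped to the English column
# names used later in this notebook. The pairing itself is an assumption.
CBS_TO_ENGLISH = {
    "Regioaanduiding": "Neighborhood",
    "Particulier huishouden": "Total Households",
    "Eenpersoonshuishoudens": "One-person Households",
    "Bevolkingsdichtheid": "Population Density",
}


def translate_row(row):
    """Return a copy of one CBS record with English keys; unknown keys are kept as-is."""
    return {CBS_TO_ENGLISH.get(key, key): value for key, value in row.items()}


example = translate_row({"Regioaanduiding": "Jordaan", "Eenpersoonshuishoudens": 8625})
print(example)
```

Keeping the mapping in one dictionary means a renamed CBS feature only has to be updated in one place.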

Information that isn't available

  • Origin of birth (Personen met een migratieachtergrond): This information could help us determine the types of food the restaurant should offer.
    • Unfortunately, the categorization is not detailed enough. I am not able to single out people who come from China in the data.
  • Income per household (Inkomen van huishoudens): Restaurants are usually quite expensive compared to home cooking, so only people with sufficient income can afford to go to restaurants often.
    • Unfortunately, this information is consistently missing from the database.

Features deliberately excluded

  • Civil status: Since we can distinguish one-person households, it's not necessary to understand why people live alone (single or divorced does not seem to link to food strategy directly).
  • Gender: In the Netherlands, there isn't a big difference between men and women in terms of the likelihood of cooking at home.
  • Type of house: This has no direct correlation to people's choice of food.




3. Analyze Demographic Data

I selected necessary data from CBS, as mentioned in chapter 2.2.
The data is in CSV format.

3.1 Load Data to a Dataframe

Load necessary libraries

In [1]:
# library to handle data in a vectorized manner
import numpy as np 
# library to load dataframe
import pandas as pd

# Matplotlib and associated plotting modules
import matplotlib.colors as colors
import matplotlib.pyplot as plt

Load the CSV file

In [2]:
df = pd.read_csv('https://github.com/xding78/Sharing/raw/master/RestaurantQuest/Amsterdam.csv')
df.head()
Out[2]:
    Neighborhood             ID        Total Residences  Total Households  One-person Households  Population Density        Lat       Lon
 0  Burgwallen-Oude Zijde    WK036300              4305              3090                   2180               12323  52.371946  4.896103
 1  Burgwallen-Nieuwe Zijde  WK036301              3930              2835                   2000                6881  52.373706  4.889922
 2  Grachtengordel-West      WK036302              6385              4110                   2570               14261  52.370837  4.885478
 3  Grachtengordel-Zuid      WK036303              5350              3410                   2140               10303  52.364422  4.894243
 4  Nieuwmarkt               WK036304              9765              6485                   4285               13741  52.372160  4.900096
In [3]:
df.shape
Out[3]:
(65, 8)

In total 65 neighborhoods

There are definitely more neighborhoods in the Amsterdam municipality. However, for the sake of this challenge, we decided to focus on the neighborhoods that are within or connected to Amsterdam city proper.

Sort the neighborhoods by the population density

In [4]:
df.sort_values(["Population Density"], axis=0, ascending=False, inplace=True)
df.head()
Out[4]:
    Neighborhood         ID        Total Residences  Total Households  One-person Households  Population Density        Lat       Lon
14  Staatsliedenbuurt    WK036314             13315              8105                   4860               28139  52.380287  4.870951
19  Van Lennepbuurt      WK036319              6990              4535                   3005               28005  52.365144  4.867845
31  Indische Buurt West  WK036331             12640              7060                   3930               26985  52.361625  4.938813
21  Overtoomse Sluis     WK036321              7890              4840                   2910               26482  52.359468  4.860689
18  Kinkerbuurt          WK036318              6590              3950                   2460               26135  52.369167  4.866649

3.2 Drop Unnecessary Data

As mentioned earlier, we want to know the population density together with how many households consist of only one person. The total number of residents (the Total Residences column) is not needed to answer any of the questions, so we remove it from the data from now on.

In [5]:
df.drop("Total Residences", axis=1, inplace=True)
df.head()
Out[5]:
    Neighborhood         ID        Total Households  One-person Households  Population Density        Lat       Lon
14  Staatsliedenbuurt    WK036314              8105                   4860               28139  52.380287  4.870951
19  Van Lennepbuurt      WK036319              4535                   3005               28005  52.365144  4.867845
31  Indische Buurt West  WK036331              7060                   3930               26985  52.361625  4.938813
21  Overtoomse Sluis     WK036321              4840                   2910               26482  52.359468  4.860689
18  Kinkerbuurt          WK036318              3950                   2460               26135  52.369167  4.866649

3.3. Observe Data using a Bar Chart

To better decide what to do with the data, I want to take a good look at it first, and visualizing the data helps a lot. I chose a horizontal bar chart because I want the neighborhood names to be easy to read. With this many neighborhoods (65), a vertical bar chart might not offer enough room to show all the bars.

In [6]:
# step 1: Extract only necessary data
df_visualize = df[["Neighborhood", "Total Households", "One-person Households", "Population Density"]]
df_visualize.head()
Out[6]:
    Neighborhood         Total Households  One-person Households  Population Density
14  Staatsliedenbuurt                8105                   4860               28139
19  Van Lennepbuurt                  4535                   3005               28005
31  Indische Buurt West              7060                   3930               26985
21  Overtoomse Sluis                 4840                   2910               26482
18  Kinkerbuurt                      3950                   2460               26135
In [7]:
ax = df_visualize.plot(kind='barh', figsize=(14,20))
#ax.set_title('Population and Households in Amsterdam')
ax.set_xlabel('Population and Households in Amsterdam')
ax.set_ylabel('Neighborhood')
ax.invert_yaxis()
ax.set_yticklabels(df['Neighborhood'].values)

rects = ax.patches

# use axvline to mark the average population density
mean = df["Population Density"].mean()
ax.axvline(mean, color='#2B9B2A') #Green

# use axvline to mark the average total households
mean2 = df["Total Households"].mean()
ax.axvline(mean2, color='#1E77B4') #Blue

# use axvline to mark the average one-person households
mean3 = df["One-person Households"].mean()
ax.axvline(mean3, color='#FF7F0F') #Orange
Out[7]:
<matplotlib.lines.Line2D at 0x11896e518>

3.3.1. Learnings from the Above Bar Chart

  1. One thing that becomes evident is that many neighborhoods in Amsterdam have a much lower population density than the average. These neighborhoods are very unlikely to be ideal locations for the restaurant. Therefore, we should remove them and focus on the neighborhoods that are more densely populated. We will do that in chapter 3.4.
  2. The second learning is that, among the more densely populated neighborhoods, not all have an above-average number of total households and one-person households. We will do further analysis in chapter 3.5 to filter out the neighborhoods that do not have enough one-person households.

3.4. Remove Neighborhoods that have below average population density

In [8]:
# First calculate the average population density across all neighborhoods
average_density = int(df["Population Density"].mean())
print("Average population density of Amsterdam city proper is: ", average_density)
Average population density of Amsterdam city proper is:  13233
In [9]:
# Keep only the neighborhoods whose population density is above the average
df = df[df["Population Density"] > average_density]
df.head()
Out[9]:
    Neighborhood         ID        Total Households  One-person Households  Population Density        Lat       Lon
14  Staatsliedenbuurt    WK036314              8105                   4860               28139  52.380287  4.870951
19  Van Lennepbuurt      WK036319              4535                   3005               28005  52.365144  4.867845
31  Indische Buurt West  WK036331              7060                   3930               26985  52.361625  4.938813
21  Overtoomse Sluis     WK036321              4840                   2910               26482  52.359468  4.860689
18  Kinkerbuurt          WK036318              3950                   2460               26135  52.369167  4.866649
In [10]:
print("Number of neighborhoods that have higher than average population density: ", df.shape[0])
Number of neighborhoods that have higher than average population density:  29

3.5. Neighborhoods that Have More One-person Households

In [11]:
#calculate the percentage of one-person households of every neighborhood
df["Percentage of One-person Households"] = round(df["One-person Households"] / df["Total Households"]*10000)/100
df.head()
Out[11]:
    Neighborhood         ID        Total Households  One-person Households  Population Density        Lat       Lon  Percentage of One-person Households
14  Staatsliedenbuurt    WK036314              8105                   4860               28139  52.380287  4.870951                                59.96
19  Van Lennepbuurt      WK036319              4535                   3005               28005  52.365144  4.867845                                66.26
31  Indische Buurt West  WK036331              7060                   3930               26985  52.361625  4.938813                                55.67
21  Overtoomse Sluis     WK036321              4840                   2910               26482  52.359468  4.860689                                60.12
18  Kinkerbuurt          WK036318              3950                   2460               26135  52.369167  4.866649                                62.28
In [12]:
#sorting data frame by Percentage of One-person Households
df.sort_values(["Percentage of One-person Households"], axis=0, ascending=False, inplace=True) 
df.head()
Out[12]:
    Neighborhood     ID        Total Households  One-person Households  Population Density        Lat       Lon  Percentage of One-person Households
 6  Jordaan          WK036306             12985                   8625               23289  52.374500  4.879491                                66.42
19  Van Lennepbuurt  WK036319              4535                   3005               28005  52.365144  4.867845                                66.26
 4  Nieuwmarkt       WK036304              6485                   4285               13741  52.372160  4.900096                                66.08
24  Oude Pijp        WK036324              9875                   6510               23353  52.355216  4.894574                                65.92
25  Nieuwe Pijp      WK036325              7905                   5015               23998  52.351856  4.897728                                63.44
In [13]:
# step 1: Extract only necessary data
# oph stands for One-person Households
df_oph = df[["Neighborhood", "Percentage of One-person Households"]]
df_oph.head()
Out[13]:
    Neighborhood     Percentage of One-person Households
 6  Jordaan                                        66.42
19  Van Lennepbuurt                                66.26
 4  Nieuwmarkt                                     66.08
24  Oude Pijp                                      65.92
25  Nieuwe Pijp                                    63.44
In [14]:
# step 2: plot data
ax = df_oph.plot(kind='barh', figsize=(14,10))
ax.set_title('Percentage of one-person households per neighborhood in Amsterdam')
ax.set_xlabel('Percentage of One-person Households')
ax.set_ylabel('Neighborhood')
ax.invert_yaxis()
ax.set_yticklabels(df_oph['Neighborhood'].values)

rects = ax.patches

ax.axvline(61, color='r')
Out[14]:
<matplotlib.lines.Line2D at 0x1190a89b0>

3.5.1. Learnings from the Above Bar Chart

When we examined the data in the first bar chart (chapter 3.3.1), it seemed that the percentage of one-person households (relative to the total number of households) is rather consistent across the neighborhoods that have above-average population density.

Now, in the above bar chart, the consistency becomes quite clear.

However, we observe there are roughly 3 ranges of the percentage:

  1. High: 66% ~ 67%. The first four neighborhoods.
  2. Medium: 59% ~ 63.5%.
  3. Low: 53% ~ 59%. The last 10 neighborhoods.

We decided to focus on the top 10: the neighborhoods in which more than 61 percent of households are one-person households (marked by the red line in the above bar chart).
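
To make the three ranges and the 61 percent cut-off concrete, here is a minimal sketch that classifies a few of the percentages computed earlier. The function name share_tier is hypothetical, and treating everything from 59 up to 66 as "medium" is my own simplification of the ranges listed above.

```python
def share_tier(percentage):
    """Classify a one-person-household percentage into a rough tier.

    The cut-offs (66 and 59) are read off the ranges listed above; treating
    everything from 59 up to 66 as "medium" is an assumption.
    """
    if percentage >= 66.0:
        return "high"
    if percentage >= 59.0:
        return "medium"
    return "low"


THRESHOLD = 61.0  # the red line in the bar chart

# Percentages taken from the tables in this chapter
samples = {
    "Jordaan": 66.42,
    "Kinkerbuurt": 62.28,
    "Staatsliedenbuurt": 59.96,
    "Indische Buurt West": 55.67,
}
for name, pct in samples.items():
    print(name, share_tier(pct), "kept" if pct > THRESHOLD else "dropped")
```

Note that the threshold cuts through the medium tier: Kinkerbuurt is kept while Staatsliedenbuurt is dropped, even though both fall in the same range.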

In [15]:
df = df[df["Percentage of One-person Households"] > 61.0]
df.shape
Out[15]:
(10, 8)




4. View candidate neighborhoods on a map

I use folium, with TomTom map tiles as the base layer, to visualize the information I gathered above.

Recap the 10 remaining neighborhoods

In [16]:
df
Out[16]:
    Neighborhood           ID        Total Households  One-person Households  Population Density        Lat       Lon  Percentage of One-person Households
 6  Jordaan                WK036306             12985                   8625               23289  52.374500  4.879491                                66.42
19  Van Lennepbuurt        WK036319              4535                   3005               28005  52.365144  4.867845                                66.26
 4  Nieuwmarkt             WK036304              6485                   4285               13741  52.372160  4.900096                                66.08
24  Oude Pijp              WK036324              9875                   6510               23353  52.355216  4.894574                                65.92
25  Nieuwe Pijp            WK036325              7905                   5015               23998  52.351856  4.897728                                63.44
27  Weesperzijde           WK036327              3470                   2180               14984  52.357900  4.906300                                62.82
 2  Grachtengordel-West    WK036302              4110                   2570               14261  52.370837  4.885478                                62.53
18  Kinkerbuurt            WK036318              3950                   2460               26135  52.369167  4.866649                                62.28
20  Helmersbuurt           WK036320              4580                   2835               22124  52.363360  4.871285                                61.90
16  Frederik Hendrikbuurt  WK036316              5160                   3165               23520  52.376956  4.874085                                61.34

Install and import folium

In [17]:
#!conda install -c conda-forge folium=0.5.0 --yes #install folium
import folium # map rendering library

4.1 Use TomTom Search API

Get an API key

Click the "Get Your Key" button on this page to get an API key.

Load the TomTom API

TomTom offers multiple APIs, including the Search API. There is no need to load each API separately.

In [18]:
import requests
tomtom_api_keys = ["qTI9oA80m7X6TeWf4qKDjA2UvCy6p5mA"] # max 2500 calls/day
api_key = tomtom_api_keys[0]
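
The cell above keeps the key directly in the notebook. One possible alternative, sketched here under the assumption that the key has been exported in an environment variable called TOMTOM_API_KEY (a name I made up for this example, as is the helper load_api_key), is to read it at runtime so that the key never ends up in a shared notebook:

```python
import os


def load_api_key(env=os.environ, name="TOMTOM_API_KEY"):
    """Return the TomTom API key from the environment, or raise a clear error.

    Both the variable name and this helper are hypothetical.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError("Set the {} environment variable first".format(name))
    return key


# In the notebook this would replace the hard-coded list:
# api_key = load_api_key()
print(load_api_key({"TOMTOM_API_KEY": "example-key"}))
```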

Establishing the map

First, I define a function that uses the geocoding feature of the Search API to get the lat/lon of the center of a city. In this case, I retrieve the center of Amsterdam so that the map is properly centered in the view.

In [19]:
# Search for city: 
def SearchCity(api_key,City,Country):
    
    url = 'https://api.tomtom.com/search/2/search/'
    url += City + ', ' + Country
    url += '.json?limit=1&idxSet=Geo&key=' + api_key
    
    result = requests.get(url).json()
    
    GeoID = result['results'][0]['dataSources']['geometry']['id']
    position = result['results'][0]['position']
    
    return GeoID,position
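
SearchCity assumes that the response always contains at least one result and that this result carries a geometry ID. A minimal, more defensive sketch of just the parsing step could look like the following. It runs against a hand-written stand-in for the response rather than a real API call; the helper name first_result and the sample geometry ID are hypothetical.

```python
def first_result(response):
    """Return (geometry_id, position) for the first hit, or None if there is no hit.

    geometry_id is None when the hit carries no 'dataSources' geometry.
    """
    results = response.get("results") or []
    if not results:
        return None
    top = results[0]
    geometry_id = top.get("dataSources", {}).get("geometry", {}).get("id")
    return geometry_id, top["position"]


# Hand-written stand-in for a Search API response (the geometry ID is made up).
fake = {
    "results": [
        {
            "position": {"lat": 52.37317, "lon": 4.89066},
            "dataSources": {"geometry": {"id": "example-geometry-id"}},
        }
    ]
}
print(first_result(fake))
print(first_result({"results": []}))
```

Returning None for an empty result list lets the caller decide how to react, instead of failing with an IndexError inside the helper.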

Get the center location of Amsterdam

In [20]:
Amsterdam_position = SearchCity(api_key, "Amsterdam", "Netherlands")
In [21]:
lat_amsterdam = Amsterdam_position[1]['lat']
lon_amsterdam = Amsterdam_position[1]['lon']
print(lat_amsterdam, lon_amsterdam)
52.37317 4.89066
In [22]:
#!conda install -c conda-forge geopy --yes # uncomment this line if geopy is not installed yet
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
In [23]:
address = 'Amsterdam, The Netherlands'

geolocator = Nominatim(user_agent="amsterdam")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Amsterdam are {}, {}.'.format(latitude, longitude))
The geographical coordinates of Amsterdam are 52.3727598, 4.8936041.

First Impression of the Candidate Neighborhoods

Now, let’s instantiate the visual component, the TomTom map itself, so I can begin displaying neighborhoods.

In [24]:
#Define a function to initialize any map using TomTom map.
def init_map(api_key=api_key, latitude=0, longitude=0, zoom=14, layer = "basic", style = "main"):
    """
    Initialize a clean folium map that uses TomTom raster tiles as its base layer.
    """
    
    maps_url = "http://{s}.api.tomtom.com/map/1/tile/"+layer+"/"+style+"/{z}/{x}/{y}.png?tileSize=512&key="
    TomTom_map = folium.Map(
        location = [latitude, longitude],  # on what coordinates [lat, lon] to initialise our map
        zoom_start = zoom,  # with what zoom level to initialize the map, from 0 to 22
        tiles = str(maps_url + api_key),
        attr = 'TomTom')
    
    return TomTom_map

4.2. Visualize one feature on the map

Let's start by visualizing the number of one-person households on the map to get an impression of the 10 candidate neighborhoods.

In [25]:
#Visualize one feature (number of one-person households) to get an impression of the 10 candidate neighborhoods.
TomTom_map = init_map(latitude=lat_amsterdam, longitude=lon_amsterdam, zoom=13, layer = "basic")

# add markers to map
for lat, lon, neighborhood, oph in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['One-person Households']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=oph/25,
        popup=label, 
        color='#FF7F0F', # Orange
        fill=True,
        fill_color='#FF7F0F',
        fill_opacity=0.3).add_to(TomTom_map)
TomTom_map.save('01_demographic.html')
TomTom_map
Out[25]:

4.3. Visualize more features on the map

The above map gives us an impression of how many one-person households actually exist in each neighborhood.

Now, let's add two more features to the map, so there are three features in total:

  1. Orange circles represent the number of one-person households.
  2. Blue circles represent the number of households in total.
  3. Green circles represent the population density.

Important notes about these circles:

  • The center of the orange, green, and blue circles is the center of the neighborhood. Click the center of the circles to see the name of the neighborhood.
  • The radius of each circle is proportional to the value of the feature it represents.
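
As a worked example of the scaling: folium draws a Circle with its radius in meters, and the notebook divides household counts by 25 and population density by 100 to get radii that fit on a city map. The sketch below applies those divisors to the Jordaan figures from the table above; the helper name circle_radius_m is hypothetical.

```python
def circle_radius_m(value, divisor):
    """Scale a feature value down to a circle radius in meters."""
    return value / divisor


# Jordaan, using the figures from the table in this chapter
print(circle_radius_m(8625, 25))    # one-person households -> 345.0
print(circle_radius_m(12985, 25))   # total households -> 519.4
print(circle_radius_m(23289, 100))  # population density -> 232.89
```

Because the two household counts share the same divisor, the orange and blue circles can be compared directly. The green circle uses a different divisor, so its size is only meaningful relative to other green circles.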

In order to show a more zoomed in map view, I re-adjust the center of the map.

Re-adjust the center of the map using an address

Based on the previous map visualization, I can see a better center for further analysis is the address: Prinsengracht 745A Amsterdam.

In [26]:
url = "https://api.tomtom.com/search/2/geocode/Prinsengracht 745A Amsterdam.json?countrySet=NL&key=" + api_key
result = requests.get(url).json()
In [27]:
lat_center = result['results'][0]['position']['lat']
lon_center = result['results'][0]['position']['lon']
print(lat_center, lon_center)
52.36425 4.88628

Draw the Map with One-person Households, Total Households, and Population Density.

In [28]:
TomTom_map = init_map(latitude=lat_center, longitude=lon_center, zoom=14, layer = "basic")

# add markers that represent one-person households to the map
for lat, lon, neighborhood, oph in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['One-person Households']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=oph/25,
        popup=label,
        color='#FF7F0F', # Orange
        fill=True,
        fill_color='#FF7F0F', 
        fill_opacity=0.3
    ).add_to(TomTom_map)

# add markers that represent total households to the map
for lat, lon, neighborhood, households in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['Total Households']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=households/25,
        popup=label,
        color='#1E77B4', # Blue
        fill=False
    ).add_to(TomTom_map)
    
# add markers that represent population density to the map
for lat, lon, neighborhood, density in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['Population Density']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=density/100,
        popup=label,
        color='#2A9E2A', # Green
        fill=False
    ).add_to(TomTom_map)
TomTom_map.save('02_demographic.html')
TomTom_map
Out[28]:

Learnings from the above data visualization

As you can see, when choosing an ideal location to open the Chinese restaurant:

  • The bigger the green circle, the better.
  • The smaller the difference in size between the blue circle and the orange circle, the better.
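
The three marker loops in cell In [28] differ only in the column they read, the divisor, the color, and whether the circle is filled. As a possible refactoring, not the approach used in this notebook, the sketch below describes each layer as data and builds the keyword arguments for folium.Circle from it. It runs without folium, so the actual drawing call is only shown in a comment; the names LAYERS, circle_kwargs, and all_circles are hypothetical.

```python
LAYERS = [
    # (column, divisor, color, filled)
    ("One-person Households", 25, "#FF7F0F", True),   # orange
    ("Total Households", 25, "#1E77B4", False),       # blue
    ("Population Density", 100, "#2A9E2A", False),    # green
]


def circle_kwargs(row, column, divisor, color, filled):
    """Build the keyword arguments for one folium.Circle from one table row."""
    kwargs = {
        "location": [row["Lat"], row["Lon"]],
        "radius": row[column] / divisor,
        "popup": row["Neighborhood"],
        "color": color,
        "fill": filled,
    }
    if filled:
        kwargs.update(fill_color=color, fill_opacity=0.3)
    return kwargs


def all_circles(rows):
    """Yield one kwargs dict per (row, layer) pair."""
    for row in rows:
        for layer in LAYERS:
            yield circle_kwargs(row, *layer)


rows = [{"Neighborhood": "Jordaan", "Lat": 52.374500, "Lon": 4.879491,
         "One-person Households": 8625, "Total Households": 12985,
         "Population Density": 23289}]
circles = list(all_circles(rows))
print(len(circles), circles[0]["radius"])

# In the notebook this could be used as follows (assumes folium and TomTom_map exist):
# for kw in all_circles(df.to_dict("records")):
#     folium.Circle(**kw).add_to(TomTom_map)
```

With this layout, adding a fourth feature to the map means adding one line to LAYERS rather than copying a whole loop.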

4.4. Conclusion and Next Step

Based on the above analysis, I have chosen 10 out of 65 neighborhoods in Amsterdam city proper as the candidate neighborhoods for us to investigate further.

The next step, covered in the next chapter, will be to further analyze the 10 neighborhoods by looking into the density of Chinese restaurants in each. This will help me narrow down Linda’s choices for the best exact location for her new restaurant.




5. Explore the surroundings

Now I know where the remaining 10 neighborhoods are located and how they relate to each other geographically. It's time to explore the surroundings. Within the scope of this project, I will focus on only one question to demonstrate the methodology:

  • How many Chinese restaurants are already available in each neighborhood?

5.1. How many Chinese restaurants are already available in each neighborhood?

Use the Search API explorer to get the URL. I store all search results as parsed JSON in the variable result.

Some key variables:

  • Search radius: radius
  • Maximum number of search results: limit
In [29]:
search_radius = 3000
search_limit = 2000
In [30]:
url = ('https://api.tomtom.com/search/2/categorySearch/Chinese restaurant.json?countrySet=NL'
       + '&lat={}&lon={}&limit={}&radius={}&key='.format(lat_center, lon_center, search_limit, search_radius)
       + api_key)
result = requests.get(url).json()
#result
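
The request URL can also be assembled from named parameters instead of being typed out by hand, which keeps the query text properly encoded and makes the search radius and limit easy to change. This is a minimal sketch using only the standard library; the function name category_search_url is hypothetical and "YOUR_KEY" is a placeholder, while the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def category_search_url(query, lat, lon, radius, limit, api_key, country="NL"):
    """Assemble a categorySearch URL with an encoded query and parameters."""
    base = "https://api.tomtom.com/search/2/categorySearch/{}.json".format(quote(query))
    params = {
        "countrySet": country,
        "lat": lat,
        "lon": lon,
        "limit": limit,
        "radius": radius,
        "key": api_key,
    }
    return base + "?" + urlencode(params)


url = category_search_url("Chinese restaurant", 52.36425, 4.88628, 3000, 2000, "YOUR_KEY")
print(url)
# Reading a parameter back out is a quick sanity check
print(parse_qs(urlsplit(url).query)["radius"])
```

The resulting string can be passed to requests.get in the same way as the hand-written URL.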

Examine one of the search results in the JSON file

In [31]:
{'type': 'POI',
   'id': 'NL/POI/p0/109857',
   'score': 5.14904,
   'dist': 150.32529954911772,
   'info': 'search:ta:528009005857203-NL',
   'poi': {'name': 'Taste Of Culture',
    'phone': '+(31)-(20)-4271136',
    'categorySet': [{'id': 7315012}],
    'url': 'www.tasteofculture.net',
    'categories': ['chinese', 'restaurant'],
    'classifications': [{'code': 'RESTAURANT',
      'names': [{'nameLocale': 'en-US', 'name': 'chinese'},
       {'nameLocale': 'en-US', 'name': 'restaurant'}]}]},
   'address': {'streetNumber': '139HS',
    'streetName': 'Korte Leidsedwarsstraat',
    'municipalitySubdivision': 'Amsterdam',
    'municipality': 'Amsterdam',
    'countrySubdivision': 'North Holland',
    'postalCode': '1017',
    'extendedPostalCode': '1017PZ',
    'countryCode': 'NL',
    'country': 'Netherlands',
    'countryCodeISO3': 'NLD',
    'freeformAddress': 'Korte Leidsedwarsstraat 139HS, 1017PZ, Amsterdam',
    'localName': 'Amsterdam'},
   'position': {'lat': 52.36311, 'lon': 4.88509},
   'viewport': {'topLeftPoint': {'lat': 52.36401, 'lon': 4.88362},
    'btmRightPoint': {'lat': 52.36221, 'lon': 4.88656}},
   'entryPoints': [{'type': 'main',
     'position': {'lat': 52.36305, 'lon': 4.885}}]},

From the above JSON, I can see that the following information is essential to show Chinese restaurants on the map:

  • Get lat/lon from: 'position': {'lat': 52.36311, 'lon': 4.88509},
  • Get name from: 'poi': {'name': 'Taste Of Culture',
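
The two lookups above can be wrapped in a small helper. The sketch below runs on a trimmed copy of the search result shown earlier; the helper name poi_summary is hypothetical. Reading 'lat' and 'lon' by key, rather than relying on the order of the dictionary values, guarantees that the coordinates come out in the order folium expects.

```python
def poi_summary(poi):
    """Return (name, lat, lon) for one entry of result['results']."""
    position = poi["position"]
    return poi["poi"]["name"], position["lat"], position["lon"]


# Trimmed copy of the search result shown above
sample = {
    "poi": {"name": "Taste Of Culture"},
    "position": {"lat": 52.36311, "lon": 4.88509},
}
print(poi_summary(sample))
```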

Now, let's plot these restaurants on the map.

5.2. Show Chinese restaurants on the map

Use the position and name information extracted from the JSON file to show POIs on the map.

In [32]:
# add a grey circle to represent the search radius
folium.Circle(
    [lat_center, lon_center],
    radius=search_radius,
    color='#004B7F', # Navy
    opacity=0.3,
    fill = False
).add_to(TomTom_map)

# Add POIs one by one to the map
for poi in result['results']:
    folium.Marker(location=(poi['position']['lat'], poi['position']['lon']),  # explicit order: lat, lon
                  popup=str(poi['poi']['name']), 
                  icon=folium.Icon(color='blue', icon='glyphicon-star')
             ).add_to(TomTom_map)
TomTom_map.save('03_ChineseRestaurants.html')
TomTom_map
Out[32]:

Legend of the above map

  1. Blue markers: Chinese restaurants.
  2. Orange circles: the number of one-person households.
  3. Blue circles: the number of households in total.
  4. Green circles: the population density.
  5. Grey circle: the search radius.

5.3. Cluster the POIs

What I would really like is a clearer visual indication of the number of Chinese restaurants in the area. Clustering the POIs (points of interest) might help with this.

First, let's import the necessary plugins: folium.plugins.MarkerCluster

In [33]:
from folium.plugins import MarkerCluster

Show POIs as clusters on the map

In [34]:
#------IMPORTANT: Reinitialize TomTom_map so that the POI pins from the previous cell do not remain on the map--------
TomTom_map = init_map(latitude=lat_center, longitude=lon_center, zoom=14, layer = "basic")

# add markers that represent one-person households to the map
for lat, lon, neighborhood, oph in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['One-person Households']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=oph/25,
        popup=label,
        color='#FF7F0F', # Orange
        fill=True,
        fill_color='#FF7F0F', 
        fill_opacity=0.3
    ).add_to(TomTom_map)

# add markers that represent total households to the map
for lat, lon, neighborhood, households in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['Total Households']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=households/25,
        popup=label,
        color='#1E77B4', # Blue
        fill=False
    ).add_to(TomTom_map)
    
# add markers that represent population density to the map
for lat, lon, neighborhood, density in zip(df['Lat'], df['Lon'], df['Neighborhood'], df['Population Density']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=density/100,
        popup=label,
        color='#2A9E2A', # Green
        fill=False
    ).add_to(TomTom_map)
    
#----------------END of the reinitialization of TomTom_map------------



#--------------Show POIs in Clusters rather than POI pins on the map------------

# Define the marker cluster
mc = MarkerCluster()

# add a grey circle to represent the search radius
folium.Circle(
    [lat_center, lon_center],
    radius=search_radius,
    color='#004B7F', # Navy
    opacity=0.3,
    fill = False
).add_to(TomTom_map)

# Add POIs one by one to the map
for poi in result['results']:
    mc.add_child(
        folium.Marker(
            location=(poi['position']['lat'], poi['position']['lon']),  # explicit order: lat, lon
            popup=str(poi['poi']['name'])
    ))

TomTom_map.add_child(mc)
TomTom_map.save('04_POI_Clustered.html')
TomTom_map
Out[34]:

Zoom in and out on the map above to observe how the clusters react.

Key Takeaways

Now that I have investigated options for Linda through multiple filters and criteria, I can set two criteria for the final selection:

  • There must be at least one existing Chinese restaurant in or near the neighborhood.
    If there isn’t at least one Chinese restaurant, it might mean that there is not enough demand.
    Opening a Chinese restaurant without understanding why there are none could be a risk for Linda.
  • There cannot be more than 10 existing Chinese restaurants in the neighborhood, in order to limit the competition she faces.
    Linda wants to stand out!

If I apply the criteria, from the above map, I can exclude these neighborhoods:

  • Too many existing Chinese restaurants
    • Nieuwmarkt
  • No existing Chinese restaurant in or near the neighborhood
    • Weesperzijde
    • Frederik Hendrikbuurt
    • Grachtengordel-West
    • Nieuwe Pijp

The remaining neighborhoods for Linda to choose from:

  • Jordaan
  • Van Lennepbuurt
  • Oude Pijp
  • Kinkerbuurt
  • Helmersbuurt
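
The exclusions above were read off the map by eye. The same two criteria could also be checked numerically by counting the restaurants within some distance of each neighborhood center. The sketch below shows one way to do that with the haversine formula. It is only an illustration: the 1500 m radius is my own assumption for "in or near", the second POI is made up, and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    earth_radius_m = 6371000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def count_nearby(center, pois, radius_m):
    """Count POIs whose position lies within radius_m of center (lat, lon)."""
    return sum(
        haversine_m(center[0], center[1], p["position"]["lat"], p["position"]["lon"]) <= radius_m
        for p in pois
    )


def meets_criteria(count, minimum=1, maximum=10):
    """At least one and at most ten existing Chinese restaurants."""
    return minimum <= count <= maximum


# Jordaan center from the table in chapter 4. The first POI is the sample
# search result from chapter 5; the second is a made-up, far-away point.
jordaan = (52.374500, 4.879491)
pois = [
    {"position": {"lat": 52.36311, "lon": 4.88509}},
    {"position": {"lat": 52.30000, "lon": 4.95000}},
]
n = count_nearby(jordaan, pois, radius_m=1500)  # 1500 m is an assumed radius
print(n, meets_criteria(n))
```

A fixed radius around the center is a rough stand-in for the real neighborhood boundary; the polygon retrieved in chapter 6 would give a more precise test.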




6. In-depth analysis of one neighborhood

Let's use Jordaan as an example to show how I look into one particular candidate neighborhood.

6.1. Draw the area of the neighborhood on the map

In [35]:
area_name = 'Jordaan'

Define a function to get the polygon of a given GeoID.

In [36]:
# Get the polygon (GeoJSON geometry) of a given GeoID
def getPolygon(api_key, GeoID, zoomLevel):
    """Return the geometry of GeoID from the TomTom Additional Data API."""
    url = 'https://api.tomtom.com/search/2/additionalData.json'
    params = {'geometries': GeoID, 'geometriesZoom': zoomLevel, 'key': api_key}

    result = requests.get(url, params=params).json()
    # Only one geometry was requested, so take the first entry
    GeoJson = result['additionalData'][0]['geometryData']

    return GeoJson
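
If the request fails or the GeoID has no geometry, the indexing in `getPolygon` raises a bare `KeyError` or `IndexError`. A minimal sketch of a more defensive extraction step follows. `extract_geometry` is a hypothetical helper, and the sample response is hand-written to match only the keys that `getPolygon` reads, so it is not a real API response.

```python
def extract_geometry(result):
    """Return the geometry of the first entry in an Additional Data response.

    Raises ValueError with a readable message when the expected keys are missing.
    """
    entries = result.get('additionalData') or []
    if not entries or 'geometryData' not in entries[0]:
        raise ValueError('No geometry found in response: %r' % (result,))
    return entries[0]['geometryData']


# Hand-written stand-in with the same shape that getPolygon reads
sample = {'additionalData': [{'geometryData': {'type': 'FeatureCollection', 'features': []}}]}
geometry = extract_geometry(sample)
print(geometry['type'])
```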

Search for the neighborhood within Amsterdam

In [37]:
# Look up the neighborhood's GeoID and center position
GeoID, position = SearchCity(api_key, area_name, 'Amsterdam')

Find out the center location of the neighborhood

In [38]:
lat_area = position['lat']
lon_area = position['lon']
print("The center of the neighborhood is: (", lat_area, ", ", lon_area, ")")
The center of the neighborhood is: ( 52.37329 ,  4.87992 )

Create a polygon and add it to the map

In [39]:
# Get the polygon of the neighborhood
Polygon = getPolygon(api_key, GeoID, 22)

# Use HTTPS so the API key is not sent in clear text
map_url = 'https://{s}.api.tomtom.com/map/1/tile/basic/main/{z}/{x}/{y}.png?view=Unified&key=' + api_key

TomTom_map = folium.Map(
    location=[lat_area, lon_area],
    zoom_start=14,
    tiles=map_url,
    attr='TomTom')

# Add the polygon to the map
folium.GeoJson(Polygon).add_to(TomTom_map)

TomTom_map.save('05_Area.html')
TomTom_map
Out[39]:

6.2. Show Chinese restaurants in the neighborhood

Search for Chinese restaurants using the Search API

Set the search radius to 1.2 km to cover the entire neighborhood.

In [40]:
url = ('https://api.tomtom.com/search/2/categorySearch/Chinese restaurant.json?countrySet=NL'
       + '&lat=' + str(lat_area) + '&lon=' + str(lon_area)  # center found in In [38]
       + '&limit=2000&radius=1200&key=' + api_key)
result = requests.get(url).json()
#result
In [41]:
# Add a faint navy circle to represent the search radius
folium.Circle(
    [lat_area, lon_area],
    radius=1200,
    color='#004B7F', # Navy
    opacity=0.3,
    fill = False
).add_to(TomTom_map)

# Add POIs one by one to the map
for poi in result['results']:
    # Read lat and lon by key rather than relying on the order of the dict values
    folium.Marker(location=[poi['position']['lat'], poi['position']['lon']],
                  popup=str(poi['poi']['name']),
                  icon=folium.Icon(color='blue', icon='glyphicon-star')
                  ).add_to(TomTom_map)
TomTom_map.save('06_Area_POI.html')
TomTom_map
Out[41]:

Legend for the map above

  1. Blue markers: Chinese restaurants.
  2. Blue area: the shape of the neighborhood.
  3. Faint navy circle: the search radius.

Takeaways from this map

  1. According to the TomTom Search API, there are more than 20 Chinese restaurants within the 1.2 km search radius around the center of Jordaan.
  2. The map shows that the western part of Jordaan appears to have no Chinese restaurants. If Linda opens one there, she will likely have enough customers.
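
Both takeaways are read off the map by eye. A rough numeric check is sketched below: it counts the POIs within a given distance of the center and splits them into a western and an eastern half by longitude. `haversine_m` and `summarize_pois` are hypothetical helpers, the east/west split at the center longitude is a simplification of "the west of Jordaan", and the positions in the example are made up for illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def summarize_pois(pois, lat_center, lon_center, radius_m):
    """Count POIs within radius_m of the center, split into west and east halves."""
    west = east = 0
    for poi in pois:
        lat, lon = poi['position']['lat'], poi['position']['lon']
        if haversine_m(lat_center, lon_center, lat, lon) <= radius_m:
            if lon < lon_center:
                west += 1
            else:
                east += 1
    return {'west': west, 'east': east}


# Made-up positions in the same shape as the entries of result['results']
example_pois = [
    {'position': {'lat': 52.3733, 'lon': 4.8850}},
    {'position': {'lat': 52.3740, 'lon': 4.8900}},
    {'position': {'lat': 52.4500, 'lon': 4.8800}},  # far outside the radius
]
summary = summarize_pois(example_pois, 52.37329, 4.87992, 1200)
print(summary)
```

In the notebook, the call would be `summarize_pois(result['results'], lat_area, lon_area, 1200)`.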

6.3. Repeat!

At this point, I would advise Linda to repeat this in-depth examination for each neighborhood she is considering. I could also adjust the details, for example by including other venues: counting the cafes, snack bars, and similar places in an area in addition to regular restaurants.
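
Repeating the analysis amounts to a loop over the candidate names. The sketch below keeps the network calls out of the loop by passing them in as functions, so it runs here with stand-ins. `analyze_neighborhoods`, `locate`, and `search_pois` are hypothetical names, and using the same 1200 m radius for every neighborhood is an assumption that should be checked per area.

```python
def analyze_neighborhoods(names, locate, search_pois, radius_m=1200):
    """Run the same search for each neighborhood and collect the counts.

    locate(name)             -> (geo_id, {'lat': ..., 'lon': ...})
    search_pois(lat, lon, r) -> list of POI dicts
    """
    summary = {}
    for name in names:
        geo_id, position = locate(name)
        pois = search_pois(position['lat'], position['lon'], radius_m)
        summary[name] = {'geo_id': geo_id, 'restaurants': len(pois)}
    return summary


# Stand-in functions so the sketch runs without network access
def fake_locate(name):
    return 'id-' + name, {'lat': 0.0, 'lon': 0.0}


def fake_search(lat, lon, radius_m):
    return [{'poi': {'name': 'placeholder'}}]


overview = analyze_neighborhoods(['Neighborhood A', 'Neighborhood B'], fake_locate, fake_search)
print(overview)
```

In the notebook, `locate` could be `lambda name: SearchCity(api_key, name, 'Amsterdam')`, and `search_pois` a thin wrapper around the categorySearch request from section 6.2.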




7. Conclusion and future work

Limitations of this project

Focus on residential information only

This project is limited by a lack of crucial information. So far I have focused mainly on residential data, in particular one-person households. However, customers can also come from nearby businesses. I cannot validate assumptions or answer questions about those customers, because information about business venues in Amsterdam is not as readily available as demographic information.

Rent of a venue is not taken into consideration

Due to a lack of information, I cannot include rental prices in the analysis. Cost could be a big factor for Linda, and predicting the potential profit requires an estimate of the rent.

Explore more POI categories

Other facilities in the neighborhood may also influence the restaurant's revenue. For instance:

  • How easy is it to reach the place via public transportation?
    Search for nearby bus stops, tram stations, train stations, etc.
  • How easy is it to park your car in the neighborhood?
    Search for nearby parking garages or open parking places.
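
These searches can reuse the categorySearch request from section 6.2 with a different category name. The sketch below only builds the request URLs and does not call the API. `category_search_url` is a hypothetical helper, the two category strings are assumptions that should be checked against the TomTom category list, and the default `limit` of 100 is an assumption as well (the request in section 6.2 uses 2000).

```python
from urllib.parse import quote, urlencode


def category_search_url(category, lat, lon, radius_m, api_key, limit=100):
    """Build a categorySearch URL in the same form as the request in section 6.2."""
    base = 'https://api.tomtom.com/search/2/categorySearch/' + quote(category) + '.json'
    params = {'countrySet': 'NL', 'lat': lat, 'lon': lon,
              'limit': limit, 'radius': radius_m, 'key': api_key}
    return base + '?' + urlencode(params)


# Assumed category names; 'YOUR_API_KEY' is a placeholder
urls = [category_search_url(category, 52.37329, 4.87992, 1200, 'YOUR_API_KEY')
        for category in ['parking garage', 'public transport stop']]
for u in urls:
    print(u)
```

Each URL can then be passed to `requests.get(...).json()` and the results plotted in the same way as the restaurant markers.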

Next steps for Linda

  • Repeat the in-depth analysis from section 6, In-depth analysis of one neighborhood, for each of the remaining neighborhoods.
  • Include the rental price of each neighborhood in future analysis, so that she knows her cost-to-profit ratio.