I will first describe the data that seem necessary and suggest how they can be obtained using the ArcGIS platform and other sources. I will also define some new variables, synthesized from basic ones.
Data from platforms such as Google Analytics (GA), Facebook Analytics and similar tools can serve as a proxy for the profile of typical customers. In particular, the Audience section of GA provides demographic information and customer location data.
When buying a house, buyers look for proximity to facilities such as grocery stores, pharmacies, emergency services, parks, etc. These make up the location attributes of a house.
The geocoding module of the ArcGIS API for Python can be used to search for facilities (hospitals, restaurants, pharmacies, parks, etc.) within a specified distance around a chosen point on the map.
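To make the idea of "within a specified distance around a chosen point" concrete, here is a minimal sketch that does not call ArcGIS at all: it filters a small, made-up list of places by great-circle (haversine) distance. The function names, the sample coordinates and the 1 km radius are illustrative assumptions, not part of the original analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def places_within(center, places, radius_km):
    """Return the names of places whose distance from `center` is at most `radius_km`."""
    lat0, lon0 = center
    return [name for name, lat, lon in places
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]

# Hypothetical sample data: (name, latitude, longitude)
sample_places = [
    ("pharmacy", 45.5235, -122.6765),
    ("grocery", 45.5300, -122.6800),
    ("park", 45.6000, -122.6000),
]
home = (45.5231, -122.6765)  # hypothetical point of interest
nearby = places_within(home, sample_places, radius_km=1.0)
print(nearby)
```

A geocoding service does the same kind of filtering on its side, with the added benefit of a real catalogue of places.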
Another important aspect is the time it takes to commute to and from work. ArcGIS also provides tools for computing travel time on chosen days and hours. We can add stops, for example a gym that the customer visits along the way.
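Routing requests usually take the departure time as milliseconds since the Unix epoch. The sketch below shows one way to build such a value; the helper name `to_epoch_ms` and the fixed UTC-7 offset are assumptions made for illustration. The chosen date, Monday 4 June 1990 at 8:00 AM Pacific Daylight Time, reproduces the value `644511600000` passed as `start_time` in the routing calls further down. How the service interprets the time zone of this number depends on its settings, so treat the conversion as an illustration only.

```python
from datetime import datetime, timedelta, timezone

def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to integer milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return round(dt.timestamp() * 1000)

# Fixed UTC-7 offset standing in for Pacific Daylight Time (assumption for this example).
PDT = timezone(timedelta(hours=-7), "PDT")
monday_8am = datetime(1990, 6, 4, 8, 0, tzinfo=PDT)

start_time_ms = to_epoch_ms(monday_8am)
print(monday_8am.strftime("%A"), start_time_ms)
```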
The code below performs a similar calculation using a dataset of houses in the USA.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
pd.set_option('display.max_columns', None)
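import os
import warnings
warnings.filterwarnings('ignore')
from arcgis.gis import GIS
# The password is read from an environment variable instead of being hard-coded
# (ARCGIS_PASSWORD is an assumed variable name).
gis = GIS(username="marcotav", password=os.environ["ARCGIS_PASSWORD"])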
from arcgis.geocoding import geocode, batch_geocode
from arcgis.features import Feature, FeatureLayer, FeatureSet, GeoAccessor, GeoSeriesAccessor, SpatialDataFrame
from arcgis.geometry import Geometry, Point
from arcgis.geometry.functions import buffer
from arcgis.network import RouteLayer
from arcgis.geocoding import reverse_geocode
PATH = 'resources/houses_oregon_lito.csv'
df = pd.read_csv(PATH)
df.drop(columns=['Unnamed: 0'], inplace=True)
df = pd.DataFrame.spatial.from_xy(df, 'LONGITUDE','LATITUDE')
map_df = gis.map('Sao Paulo, Brazil')
map_df.basemap = 'streets-navigation-vector'
(beds, baths, hoa_per_month, year_built, square_feet, price) = (3, 2, 50,
2000, 2800, 660000)
sl_df = df[(df['BEDS']>=beds) & (df['BATHS']>baths) &
(df['HOA PER MONTH']<=hoa_per_month) &
(df['YEAR BUILT']>=year_built) &
(df['SQUARE FEET'] > square_feet) &
(df['PRICE']<=price)]
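def rev_geo(row, df):
    # Reverse-geocode the point stored at positional row `row` of the SHAPE column.
    # .iloc is used because the filtered frame no longer has a 0..n-1 index.
    return reverse_geocode(Point(df['SHAPE'].iloc[row]))
# The reverse-geocoding branch is disabled (s = 'no'); the listings with
# addresses are loaded from a previously saved CSV instead.
s = 'no'
if s == 'yes':
    sl_df['ADDRESS'] = [rev_geo(i, sl_df)['address']['LongLabel'] for i in range(sl_df.shape[0])]
    cols = ['SALE TYPE', 'PROPERTY TYPE', 'ADDRESS', 'CITY', 'STATE', 'ZIP',
            'PRICE', 'BEDS', 'BATHS', 'LOCATION', 'SQUARE FEET',
            'LOT SIZE', 'YEAR BUILT', 'DAYS ON MARKET', 'PRICE PER SQFT', 'HOA PER MONTH',
            'STATUS', 'SOURCE', 'MLS', 'LATITUDE', 'LONGITUDE', 'SHAPE']
    sl_df = sl_df[cols]
else:
    sl_df = pd.read_csv('listings_with_address.csv', index_col=0)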
prop1 = sl_df[sl_df['MLS']==18389440]
paddress = prop1.ADDRESS + ", " + prop1.CITY + ", " + prop1.STATE
prop_geom_fset = geocode(paddress.values[0],
as_featureset=True)
prop_geom = prop_geom_fset.features[0]
prop_buffer = buffer([prop_geom.geometry],
in_sr = 102100, buffer_sr=102100,
distances=0.05, unit=9001)[0]
prop_buffer_f = Feature(geometry=prop_buffer)
prop_buffer_fset = FeatureSet([prop_buffer_f])
# Collect the names of nearby facilities, one list per category.
# The `popup` dicts are built for map display and are not used further in this section.
neighborhood_data_dict = {}
groceries = geocode('groceries', search_extent=prop_buffer.extent,
                    max_locations=20, as_featureset=True)
neighborhood_data_dict['groceries'] = []
for place in groceries:
    popup={"title" : place.attributes['PlaceName'],
           "content" : place.attributes['Place_addr']}
    neighborhood_data_dict['groceries'].append(place.attributes['PlaceName'])
restaurants = geocode('restaurant', search_extent=prop_buffer.extent, max_locations=200)
neighborhood_data_dict['restauruants'] = []
for place in restaurants:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['restauruants'].append(place['attributes']['PlaceName'])
hospitals = geocode('hospital', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['hospitals'] = []
for place in hospitals:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['hospitals'].append(place['attributes']['PlaceName'])
coffees = geocode('coffee', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['coffees'] = []
for place in coffees:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['coffees'].append(place['attributes']['PlaceName'])
bars = geocode('bar', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['bars'] = []
for place in bars:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['bars'].append(place['attributes']['PlaceName'])
gas = geocode('gas station', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['gas'] = []
for place in gas:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['gas'].append(place['attributes']['PlaceName'])
shops_service = geocode("",category='shops and service', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['shops'] = []
for place in shops_service:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['shops'].append(place['attributes']['PlaceName'])
transport = geocode("",category='travel and transport', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['transport'] = []
for place in transport:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['transport'].append(place['attributes']['PlaceName'])
parks = geocode("",category='parks and outdoors', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['parks'] = []
for place in parks:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['parks'].append(place['attributes']['PlaceName'])
education = geocode("",category='education', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['education'] = []
for place in education:
    popup={"title" : place['attributes']['PlaceName'],
           "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['education'].append(place['attributes']['PlaceName'])
neighborhood_df = pd.DataFrame.from_dict(neighborhood_data_dict, orient='index')
neighborhood_df = neighborhood_df.transpose()
neighborhood_df.shape
neighborhood_df.head()
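The per-category blocks above all follow the same pattern: run a search, then keep the `PlaceName` of each hit. As a sketch of how that repetition could be folded into one helper, the example below uses a hypothetical function `collect_place_names` and a stub `fake_search` that stands in for the geocoding call, so it runs without network access. In real use one would presumably pass a small wrapper around `geocode(query, search_extent=..., max_locations=...)` instead of the stub.

```python
def collect_place_names(search, queries):
    """Run `search(query)` for each label and keep the 'PlaceName' of every hit.

    `search` is any callable returning a list of dicts shaped like
    {'attributes': {'PlaceName': ...}}; `queries` maps a column label to a query string.
    """
    return {label: [hit['attributes']['PlaceName'] for hit in search(query)]
            for label, query in queries.items()}

# Stub standing in for a geocoding call (hypothetical data, no network access).
def fake_search(query):
    catalog = {
        'coffee': [{'attributes': {'PlaceName': 'Cafe A'}},
                   {'attributes': {'PlaceName': 'Cafe B'}}],
        'hospital': [{'attributes': {'PlaceName': 'Clinic C'}}],
    }
    return catalog.get(query, [])

names = collect_place_names(fake_search, {'coffees': 'coffee',
                                          'hospitals': 'hospital',
                                          'bars': 'bar'})
print(names)
```

The resulting dictionary has the same shape as `neighborhood_data_dict`, so it could be passed to `pd.DataFrame.from_dict(..., orient='index')` in the same way.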
## Commute to work duration!
# - Set start time to `8:00 AM` on Mondays
# - `ArcGIS` routing service uses historic averages.
route_service_url = gis.properties.helperServices.route.url
route_service = RouteLayer(route_service_url, gis=gis)
stops = [paddress.values[0],
'309 SW 6th Ave #600, Portland, OR 97204']
stops_geocoded = batch_geocode(stops)
stops_geocoded = [item['location'] for item in stops_geocoded]
stops_geocoded2 = '{},{};{},{}'.format(stops_geocoded[0]['x'],stops_geocoded[0]['y'],
stops_geocoded[1]['x'],stops_geocoded[1]['y'])
modes = route_service.retrieve_travel_modes()['supportedTravelModes']
route_service.properties.impedance;
route_result = route_service.solve(stops_geocoded2, return_routes=True,
return_stops=True, return_directions=True,
impedance_attribute_name='TravelTime',
start_time=644511600000,
return_barriers=False, return_polygon_barriers=False,
return_polyline_barriers=False)
route_length = route_result['directions'][0]['summary']['totalLength']
route_duration = route_result['directions'][0]['summary']['totalTime']
route_duration_str = "{}m, {}s".format(int(route_duration),
round((route_duration %1)*60,2))
print("route length: {} miles, route duration: {}".format(round(route_length,3),
route_duration_str))
route_features = route_result['routes']['features']
route_fset = FeatureSet(route_features)
stop_features = route_result['stops']['features']
stop_fset = FeatureSet(stop_features)
route_pop_up = {'title':'Name',
'content':'Total_Miles'}
route_service_url = gis.properties.helperServices.route.url
route_service = RouteLayer(route_service_url, gis=gis)
prop_list_df = sl_df.copy()
destination_address = '309 SW 6th Ave #600, Portland, OR 97204'
prop_list_df.head()
prop_list_df = prop_list_df.iloc[:5,:]
prop_list_df
## Loop through each property and build the neighborhood facility table
groceries_count = []
restaurants_count = []
hospitals_count = []
coffee_count = []
bars_count = []
gas_count = []
shops_service_count = []
travel_transport_count = []
parks_count = []
education_count = []
route_length = []
route_duration = []
count=0
for index, prop in prop_list_df.iterrows():
    count+=1
    paddress = prop['ADDRESS'] + ", " + prop['CITY'] + ", " + prop['STATE']
    # NOTE: the next line overwrites the full street address with "CITY, STATE",
    # so every listing in the same city is geocoded to the same point and gets
    # identical facility counts and commute values (as seen in the output table).
    paddress = prop['CITY'] + ", " + prop['STATE']
    prop_geom_fset = geocode(paddress, as_featureset=True)
    prop_geom = prop_geom_fset.features[0]
    # create a buffer around the property (the original note says 5 miles, while the
    # call passes distances=0.05 with unit=9001, the EPSG code for metres)
    prop_buffer = buffer([prop_geom.geometry],
                         in_sr = 102100, buffer_sr=102100,
                         distances=0.05, unit=9001)[0]
    prop_buffer_f = Feature(geometry=prop_buffer)
    prop_buffer_fset = FeatureSet([prop_buffer_f])
    # count the facilities of each type inside the buffer extent
    groceries = geocode('groceries', search_extent=prop_buffer.extent,
                        max_locations=20, as_featureset=True)
    groceries_count.append(len(groceries.features))
    restaurants = geocode('restaurant', search_extent=prop_buffer.extent, max_locations=200)
    restaurants_count.append(len(restaurants))
    hospitals = geocode('hospital', search_extent=prop_buffer.extent, max_locations=50)
    hospitals_count.append(len(hospitals))
    coffees = geocode('coffee', search_extent=prop_buffer.extent, max_locations=50)
    coffee_count.append(len(coffees))
    bars = geocode('bar', search_extent=prop_buffer.extent, max_locations=50)
    bars_count.append(len(bars))
    gas = geocode('gas station', search_extent=prop_buffer.extent, max_locations=50)
    gas_count.append(len(gas))
    shops_service = geocode("",category='shops and service',
                            search_extent=prop_buffer.extent, max_locations=50)
    shops_service_count.append(len(shops_service))
    parks = geocode("",category='parks and outdoors',
                    search_extent=prop_buffer.extent, max_locations=50)
    parks_count.append(len(parks))
    education = geocode("",category='education', search_extent=prop_buffer.extent,
                        max_locations=50)
    education_count.append(len(education))
    # route from the property to the work address
    stops = [paddress, destination_address]
    stops_geocoded = batch_geocode(stops)
    stops_geocoded = [item['location'] for item in stops_geocoded]
    stops_geocoded2 = '{},{};{},{}'.format(stops_geocoded[0]['x'],stops_geocoded[0]['y'],
                                           stops_geocoded[1]['x'],stops_geocoded[1]['y'])
    route_result = route_service.solve(stops_geocoded2, return_routes=True,
                                       return_stops=False, return_directions=True,
                                       impedance_attribute_name='TravelTime',
                                       start_time=644511600000,
                                       return_barriers=False, return_polygon_barriers=False,
                                       return_polyline_barriers=False)
    route_length.append(route_result['directions'][0]['summary']['totalLength'])
    route_duration.append(route_result['directions'][0]['summary']['totalTime'])
    print("Route")
prop_list_df['grocery_count'] = groceries_count
prop_list_df['restaurant_count']= restaurants_count
prop_list_df['hospitals_count']= hospitals_count
prop_list_df['coffee_count']= coffee_count
prop_list_df['bars_count']=bars_count
prop_list_df['gas_count']=gas_count
prop_list_df['shops_count']=shops_service_count
prop_list_df['parks_count']=parks_count
prop_list_df['edu_count']=education_count
prop_list_df['commute_length']=route_length
prop_list_df['commute_duration']=route_duration
facility_list = ['grocery_count', 'restaurant_count', 'hospitals_count', 'coffee_count',
'bars_count', 'gas_count', 'shops_count', 'parks_count',
'edu_count', 'commute_length', 'commute_duration']
def set_scores(row):
    # Weighted sum of raw attribute values: negative weights penalise price,
    # HOA fees and commute; positive weights reward size and nearby facilities.
    score = ((row['PRICE']*-1.5) +
             (row['BEDS']*1)+
             (row['BATHS']*1)+
             (row['SQUARE FEET']*1)+
             (row['LOT SIZE']*1)+
             (row['YEAR BUILT']*1)+
             (row['HOA PER MONTH']*-1)+
             (row['grocery_count']*1)+
             (row['restaurant_count']*1)+
             (row['hospitals_count']*1.5)+
             (row['coffee_count']*1)+
             (row['bars_count']*1)+
             (row['shops_count']*1)+
             (row['parks_count']*1)+
             (row['edu_count']*1)+
             (row['commute_length']*-1)+
             (row['commute_duration']*-2)
            )
    return score
prop_list_df['scores'] = prop_list_df.apply(set_scores, axis=1)
prop_list_df.head()
print('Ok!')
(50, 10)
| | groceries | restauruants | hospitals | coffees | bars | gas | shops | transport | parks | education |
---|---|---|---|---|---|---|---|---|---|---|
0 | Bales Market Place | Coffee. Cup | Providence St Vincent Medical Center-ER | Coffee. Cup | None | Shell | Powell Paint Center | MAX-Elmonica & SW 170th Ave | Jqay House Park | Oregon College of Art & Craft |
1 | Safeway | Papa Murphy's | Providence St Vincent Medical Center | Starbucks | None | ARCO | Retied | Powder Lodging | Mitchell Park | Cedar Mill Elementary School |
2 | QFC | Tilly's Gelato | None | Starbucks | None | 76 | Chrisman's Picture Frame & Gallery | Homestead Studio Suites-Beaverton | Jackie Husen Park | French American School |
3 | QFC | Oak Hills Brew Pub | None | Poppa's Haven | None | Costco | Team Uniforms | MAX-Sunset TC | The Bluffs | Goddard School |
4 | Dinihanian's Farm Market | Starbucks | None | Tazza Cafe | None | 76 | T-Mobile | Rodeway Inn & Suites-Portland | Bonny Slope Park | St Pius X Elementary School |
'TravelTime'
route length: 10.693 miles, route duration: 29m, 17.75s
| | SALE TYPE | PROPERTY TYPE | ADDRESS | CITY | STATE | ZIP | PRICE | BEDS | BATHS | LOCATION | SQUARE FEET | LOT SIZE | YEAR BUILT | DAYS ON MARKET | PRICE PER SQFT | HOA PER MONTH | STATUS | SOURCE | MLS | LATITUDE | LONGITUDE | SHAPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1414 | MLS Listing | Single Family Residential | 6917 SE 155th Ave, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 454900.0 | 4.0 | 3.0 | Portland Southeast | 3126.0 | 6098.0 | 2003.0 | 109.0 | 146.0 | 0.0 | Active | RMLS | 18352241 | 45.472435 | -122.504185 | {'x': -122.50418476054743, 'y': 45.47243463945... |
1427 | MLS Listing | Multi-Family (2-4 Unit) | 14719 NE Couch St, Portland, OR, 97230, USA | Portland | OR | 97236.0 | 456789.0 | 6.0 | 6.0 | POWELLHURST-GILBERT | 2812.0 | 6969.0 | 2006.0 | 28.0 | 162.0 | 0.0 | Active | RMLS | 18415529 | 45.523889 | -122.511421 | {'x': -122.5114205262322, 'y': 45.523888573767... |
1644 | MLS Listing | Single Family Residential | 17104 SE Kelly St, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 499000.0 | 3.0 | 2.5 | Portland Southeast | 3350.0 | 8276.0 | 2000.0 | 62.0 | 149.0 | 10.0 | Active | RMLS | 18613304 | 45.499326 | -122.487188 | {'x': -122.48718775214779, 'y': 45.49932634785... |
1650 | MLS Listing | Single Family Residential | 15701-15999 NE Glisan St, Portland, OR, 97230,... | Portland | OR | 97233.0 | 499000.0 | 4.0 | 2.5 | Portland Southeast | 2843.0 | 6969.0 | 2014.0 | 1.0 | 176.0 | 0.0 | Active | RMLS | 18035240 | 45.526420 | -122.500293 | {'x': -122.50029325523987, 'y': 45.52642034476... |
1669 | MLS Listing | Single Family Residential | Lents, Portland, OR, USA | Portland | OR | 97266.0 | 499700.0 | 5.0 | 3.5 | PLEASANT VALLEY | 3662.0 | 21344.0 | 2004.0 | 48.0 | 136.0 | 0.0 | Active | RMLS | 18134679 | 45.463353 | -122.549360 | {'x': -122.54935952325859, 'y': 45.46335337674... |
Route Route Route Route Route
| | SALE TYPE | PROPERTY TYPE | ADDRESS | CITY | STATE | ZIP | PRICE | BEDS | BATHS | LOCATION | SQUARE FEET | LOT SIZE | YEAR BUILT | DAYS ON MARKET | PRICE PER SQFT | HOA PER MONTH | STATUS | SOURCE | MLS | LATITUDE | LONGITUDE | SHAPE | grocery_count | restaurant_count | hospitals_count | coffee_count | bars_count | gas_count | shops_count | parks_count | edu_count | commute_length | commute_duration | scores |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1414 | MLS Listing | Single Family Residential | 6917 SE 155th Ave, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 454900.0 | 4.0 | 3.0 | Portland Southeast | 3126.0 | 6098.0 | 2003.0 | 109.0 | 146.0 | 0.0 | Active | RMLS | 18352241 | 45.472435 | -122.504185 | {'x': -122.50418476054743, 'y': 45.47243463945... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -670851.053875 |
1427 | MLS Listing | Multi-Family (2-4 Unit) | 14719 NE Couch St, Portland, OR, 97230, USA | Portland | OR | 97236.0 | 456789.0 | 6.0 | 6.0 | POWELLHURST-GILBERT | 2812.0 | 6969.0 | 2006.0 | 28.0 | 162.0 | 0.0 | Active | RMLS | 18415529 | 45.523889 | -122.511421 | {'x': -122.5114205262322, 'y': 45.523888573767... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -673119.553875 |
1644 | MLS Listing | Single Family Residential | 17104 SE Kelly St, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 499000.0 | 3.0 | 2.5 | Portland Southeast | 3350.0 | 8276.0 | 2000.0 | 62.0 | 149.0 | 10.0 | Active | RMLS | 18613304 | 45.499326 | -122.487188 | {'x': -122.48718775214779, 'y': 45.49932634785... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -734613.553875 |
1650 | MLS Listing | Single Family Residential | 15701-15999 NE Glisan St, Portland, OR, 97230,... | Portland | OR | 97233.0 | 499000.0 | 4.0 | 2.5 | Portland Southeast | 2843.0 | 6969.0 | 2014.0 | 1.0 | 176.0 | 0.0 | Active | RMLS | 18035240 | 45.526420 | -122.500293 | {'x': -122.50029325523987, 'y': 45.52642034476... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -736402.553875 |
1669 | MLS Listing | Single Family Residential | Lents, Portland, OR, USA | Portland | OR | 97266.0 | 499700.0 | 5.0 | 3.5 | PLEASANT VALLEY | 3662.0 | 21344.0 | 2004.0 | 48.0 | 136.0 | 0.0 | Active | RMLS | 18134679 | 45.463353 | -122.549360 | {'x': -122.54935952325859, 'y': 45.46335337674... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -722266.553875 |
Ok!
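In `set_scores` the raw values are summed directly, so the `PRICE` term (hundreds of thousands) swamps the facility counts, which is why every score in the table sits around -700,000. One common alternative, offered here only as a sketch, is to min-max scale each column to the range 0 to 1 before applying the weights. The helper `normalized_scores`, the tiny demo table and the reduced set of weights are all illustrative assumptions.

```python
import pandas as pd

def normalized_scores(df, weights):
    """Weighted sum of min-max scaled columns; constant columns contribute 0."""
    total = pd.Series(0.0, index=df.index)
    for col, weight in weights.items():
        span = df[col].max() - df[col].min()
        if span == 0:
            continue  # no variation, so the column cannot discriminate between rows
        total += weight * (df[col] - df[col].min()) / span
    return total

# Tiny made-up table, reusing a few column names from the listing data.
demo = pd.DataFrame({
    'PRICE': [450000, 500000, 475000],
    'SQUARE FEET': [3000, 3500, 2800],
    'parks_count': [5, 5, 5],
})
weights = {'PRICE': -1.5, 'SQUARE FEET': 1.0, 'parks_count': 1.0}
demo['score'] = normalized_scores(demo, weights)
print(demo)
```

With scaled inputs the weights express relative importance directly, instead of being entangled with the units of each column.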