Intro

Finding a good flat that is close to your workplace and also close to, e.g., your kids' kindergarten or school, your favorite park, etc. can be very difficult. Unfortunately, the existing apartment search engines in Germany, such as Immoscout, Immowelt and Immonet, cannot compute the travel time from an apartment to destinations of your choice. Here I want to show you how to use Immospider to do that.

Immospider

Immospider is a Python program that crawls the Immoscout24 website. It is based on ideas from http://mfcabrera.com/data_science/2015/01/17/ichbineinberliner.html and https://github.com/balzer82/immoscraper , but it is faster and more flexible.

Installation

Immospider uses the popular Python framework https://scrapy.org/ . To install it you need Python 3. Then clone this repository and install the requirements via

pip3 install -r requirements.txt

This should install scrapy and the googlemaps package for you. To use Immospider you also need an API key for the Google Maps API. Follow the instructions at https://github.com/googlemaps/google-maps-services-python#api-keys to get your API key.

Usage

Let's assume you want to move to Berlin. You will work at some fancy startup near Alexanderplatz, but your partner likes to go shopping at the KaDeWe. You are searching for a flat with 2-3 rooms and more than 60 m^2 that costs no more than 1000 Euro. Enter these requirements on the Immoscout24 website and search. If you search the whole of Berlin you will probably find more than 500 results. Next, copy the URL of your Immoscout search, because Immospider will use it. For the example given here the URL is https://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Berlin/Berlin/-/2,50-/60,00-/EURO--1000,00 . With this information you can now start Immospider like this:

scrapy crawl immoscout -o apartments.csv -s GM_KEY=<Google Maps API Key> -a url=https://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Berlin/Berlin/-/2,50-/60,00-/EURO--1000,00 -a dest="Alexanderplatz, Berlin" -a mode=transit -a dest2="KaDeWe, Berlin" -L INFO

The option -o apartments.csv specifies the output file. The parameter -s GM_KEY=<Google Maps API Key> sets your Google Maps API key. The arguments -a dest="Alexanderplatz, Berlin" -a mode=transit tell Immospider to calculate the travel time from each apartment to Alexanderplatz using public transport. The argument -a dest2="KaDeWe, Berlin" additionally computes the travel time by car (the default mode) to KaDeWe. You can have up to three destinations (dest, dest2, dest3) and specify the mode for each destination (mode, mode2, mode3). The argument -a url=... must hold the search URL from Immoscout. The optional parameter -L INFO can be added to generate more log output.
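If you run the crawler repeatedly with different searches, it can help to assemble the command in a small script. The following is only a sketch and not part of Immospider: the helper name build_crawl_command is hypothetical, and it assumes that the first destination uses the unnumbered arguments dest and mode (as in the command above) while later ones are numbered.

```python
import shlex


def build_crawl_command(url, gm_key, destinations, output="apartments.csv"):
    """Assemble the scrapy command line for an Immospider run.

    destinations is a list of (address, mode) pairs; mode may be None to
    fall back to the crawler's default mode.
    Hypothetical helper, not part of Immospider.
    """
    if len(destinations) > 3:
        raise ValueError("at most three destinations are supported")
    cmd = ["scrapy", "crawl", "immoscout", "-o", output,
           "-s", "GM_KEY=" + gm_key, "-a", "url=" + url]
    for i, (address, mode) in enumerate(destinations, start=1):
        # Assumption: the first destination is unnumbered (dest, mode),
        # later ones are numbered (dest2/mode2, dest3/mode3).
        suffix = "" if i == 1 else str(i)
        cmd += ["-a", "dest{}={}".format(suffix, address)]
        if mode is not None:
            cmd += ["-a", "mode{}={}".format(suffix, mode)]
    return cmd


command = build_crawl_command(
    url="https://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Berlin/Berlin/-/2,50-/60,00-/EURO--1000,00",
    gm_key="YOUR_API_KEY",  # placeholder, replace with your own key
    destinations=[("Alexanderplatz, Berlin", "transit"), ("KaDeWe, Berlin", None)],
)
# Print a copy-pasteable shell command with proper quoting.
print(" ".join(shlex.quote(part) for part in command))
```

Passing the list to subprocess.run(command) from the repository directory would start the crawl without any shell quoting issues.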

If you start Immospider with the parameters given here it might run for up to 20 minutes, not because the crawler is slow, but because the Google Maps API takes some time to compute the travel time for each of the more than 500 apartments. If that is too slow for you, narrow your search on Immoscout (and copy the new URL again) so that the number of search results is lower. If your result set is about 50 apartments, Immospider will only need 1-2 minutes to compute all the travel times.
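To get a feeling for what one such travel time lookup involves, here is a minimal sketch using the googlemaps package. It is an assumption about one way to do it, not a description of Immospider's internals: the function names are hypothetical, the sample response contains made-up numbers, and it uses the Distance Matrix endpoint, which reports durations in seconds.

```python
def duration_minutes(response):
    """Extract the travel time in minutes from a Distance Matrix response.

    Returns None when the API found no route for the origin/destination pair.
    """
    element = response["rows"][0]["elements"][0]
    if element.get("status") != "OK":
        return None
    return element["duration"]["value"] / 60.0  # "value" is given in seconds


def fetch_travel_time(api_key, origin, destination, mode="driving"):
    """Query the Google Maps Distance Matrix API for a single apartment."""
    # Imported here so that duration_minutes can be tried without the package.
    import googlemaps
    client = googlemaps.Client(key=api_key)
    response = client.distance_matrix(origin, destination, mode=mode)
    return duration_minutes(response)


# Illustrative response fragment (made-up numbers) in the Distance Matrix format.
sample = {"rows": [{"elements": [{"status": "OK",
                                   "duration": {"value": 2630, "text": "44 mins"}}]}]}
print(duration_minutes(sample))
```

One request per apartment and destination adds up quickly, which is why a smaller result set finishes so much faster.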

Data Science

After Immospider has finished it is time for some data science. In the following we will use https://jupyter.org/ to analyze the result.

In [47]:
import pandas as pd
In [48]:
df = pd.read_csv('apartments.csv')

Data Cleansing

We remove all the results without location (latitude, longitude).

In [49]:
df.dropna(subset=["lng", "lat"], inplace=True)
df.head(n=10)
Out[49]:
|   | city | media_count | immo_id | district | title | url | time_dest2 | time_dest3 | time_dest | rent | sqm | address | lat | contact_name | zip_code | lng | rooms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Berlin | 13 | 91655265 | Köpenick (Köpenick) | Besichtigung am Sonntag, den 05.02. um 16:00 U... | https://www.immobilienscout24.de/expose/91655265 | 30.383333 | NaN | 43.833333 | 746 | 78,83 | Grünauer Straße 129, Köpenick (Köpenick), Berlin | 52.43238 | Herr Clemens Teske | 12557 | 13.57151 | 3 |
| 1 | Berlin | 10 | 92662753 | Spandau (Spandau) | Spandauer Arkaden schöne Altbauwohnung mit gro... | https://www.immobilienscout24.de/expose/92662753 | 27.116667 | NaN | 32.100000 | 530 | 70 | Pichelsdorfer Straße 139, Spandau (Spandau), B... | 52.52729 | Pierre Olbort | 13595 | 13.19551 | 3 |
| 2 | Berlin | 8 | 92662740 | Wedding (Wedding) | FREI AB SOFORT * SANIERT * BALKON VORHANDEN * ... | https://www.immobilienscout24.de/expose/92662740 | 25.500000 | NaN | 26.366667 | 753 | 75 | Steegerstraße 61, Wedding (Wedding), Berlin | 52.56246 | Klaudia Jantsch | 13359 | 13.39491 | 3 |
| 3 | Berlin | 8 | 92662699 | Wedding (Wedding) | FREI AB SOFORT * SEHR GEPFLEGT * SANIERT * WAN... | https://www.immobilienscout24.de/expose/92662699 | 25.550000 | NaN | 26.600000 | 749 | 75 | Steegerstraße 60, Wedding (Wedding), Berlin | 52.56228 | Klaudia Jantsch | 13359 | 13.39506 | 3 |
| 4 | Berlin | 10 | 93084855 | Spandau (Spandau) | Hoch hinaus mit toller Aussicht! | https://www.immobilienscout24.de/expose/93084855 | 25.833333 | NaN | 39.416667 | 780 | 103 | Falkenseer Chaussee 275b, Spandau (Spandau), B... | 52.54620 | Heike Rohrbach | 13583 | 13.18929 | 3 |
| 5 | Berlin | 2 | 92998370 | Müggelheim (Köpenick) | FREI AB MÄRZ 2017 * WANNENBAD * BALKON * RUHIG... | https://www.immobilienscout24.de/expose/92998370 | 41.533333 | NaN | 54.500000 | 561 | 70 | Philipp-Jacob-Rauch-Straße 72, Müggelheim (Köp... | 52.41571 | Klaudia Jantsch | 12559 | 13.64953 | 2,5 |
| 6 | Berlin | 6 | 33037112 | Marienfelde (Tempelhof) | Wohnen im Grünen Nähe Namitzer Damm | https://www.immobilienscout24.de/expose/33037112 | 32.733333 | NaN | 46.116667 | 640 | 75 | Marienfelder Allee 172a, Marienfelde (Tempelho... | 52.41102 | Frau S, Rahmlow | 12279 | 13.36040 | 3 |
| 7 | Berlin | 14 | 92712979 | Tiergarten (Tiergarten) | Besichtigung: Donnerstag den 02.02.17 um 17.00... | https://www.immobilienscout24.de/expose/92712979 | 9.800000 | NaN | 23.483333 | 998 | 117,41 | Berlichingenstraße 3, Tiergarten (Tiergarten),... | 52.52857 | Herr Methner | 10553 | 13.32520 | 3 |
| 8 | Berlin | 9 | 92869379 | Alt-Hohenschönhausen (Hohenschönhausen) | "Weiße Taube" Schöne 3 Zimmer Wohnung in ruhig... | https://www.immobilienscout24.de/expose/92869379 | 31.316667 | NaN | 34.416667 | 677 | 79,73 | Plauener Str. 89b, Alt-Hohenschönhausen (Hohen... | 52.53910 | Herr Werk | 13055 | 13.50934 | 3 |
| 9 | Berlin | 16 | 91448921 | Spandau (Spandau) | ++ Großzügige, lichtdurchflutete 3-Zimmer-Wohn... | https://www.immobilienscout24.de/expose/91448921 | 30.300000 | NaN | 61.700000 | 999 | 139,18 | Hakenfelder Straße 10a, Spandau (Spandau), Berlin | 52.56592 | Herr Oliver Müller | 13587 | 13.20028 | 3 |
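Note that columns such as sqm and rooms contain German decimal commas (78,83 or 2,5), so pandas reads them as strings rather than numbers. If you want to compute with them, a conversion along the following lines may help. This is a sketch on a small stand-in DataFrame, assuming the values contain no thousands separators; the helper name to_float is hypothetical.

```python
import pandas as pd

# Small stand-in for the crawler output, using values from the table above.
sample = pd.DataFrame({
    "sqm": ["78,83", "70"],
    "rooms": ["3", "2,5"],
})


def to_float(column):
    """Convert a column with German decimal commas to floats."""
    as_text = column.astype(str).str.replace(",", ".", regex=False)
    return pd.to_numeric(as_text).astype(float)


# The rent column can be added to this list if it contains decimal commas too.
for name in ["sqm", "rooms"]:
    sample[name] = to_float(sample[name])

print(sample.dtypes)
```

The same loop can be applied to df after reading apartments.csv.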

Extracting a top10 list of apartments

We are looking for the apartments with the lowest average travel time to our two destinations (Alexanderplatz and KaDeWe). To find them we compute the average travel time (avg_time) for each apartment and sort the list by this value. Then we take the first ten rows as our top 10 list.

In [50]:
df["avg_time"] = 0.5*(df.time_dest + df.time_dest2)
df.sort_values("avg_time", inplace=True)
top10=df.head(n=10)
top10
Out[50]:
|   | city | media_count | immo_id | district | title | url | time_dest2 | time_dest3 | time_dest | rent | sqm | address | lat | contact_name | zip_code | lng | rooms | avg_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 732 | Berlin | 13 | 46991344 | Mitte (Mitte) | Helle, sanierte 3-Raum Wohnung mit Ausblick au... | https://www.immobilienscout24.de/expose/46991344 | 16.516667 | NaN | 4.183333 | 770 | 70,03 | Köpenicker Str. 103, Mitte (Mitte), Berlin | 52.51087 | Frau Marta Stellmach | 10179 | 13.41647 | 3 | 10.350000 |
| 626 | Berlin | 10 | 91193506 | Kreuzberg (Kreuzberg) | ***Top sanierte 2,5-Zimmerwohnung mitten im Ki... | https://www.immobilienscout24.de/expose/91193506 | 14.283333 | NaN | 9.366667 | 980 | 66 | Reichenberger Straße 3, Kreuzberg (Kreuzberg),... | 52.49981 | Provisionsfrei vom Eigentümer | 10999 | 13.41527 | 2,5 | 11.825000 |
| 568 | Berlin | 9 | 91799446 | Kreuzberg (Kreuzberg) | *** Top sanierte 2,5-Zimmerwohnung mit Südbalk... | https://www.immobilienscout24.de/expose/91799446 | 14.283333 | NaN | 9.366667 | 965 | 64,5 | Reichenbergerstraße 3, Kreuzberg (Kreuzberg), ... | 52.49981 | Provisionsfrei vom Eigentümer | 10999 | 13.41527 | 2,5 | 11.825000 |
| 340 | Berlin | 7 | 93003687 | Kreuzberg (Kreuzberg) | Hell Und Sonnig In Kreuzberg! | https://www.immobilienscout24.de/expose/93003687 | 15.550000 | NaN | 8.366667 | 759,59 | 114,88 | Admiralstr. 37, Kreuzberg (Kreuzberg), Berlin | 52.49807 | Herr Robin Cramer | 10999 | 13.41752 | 4 | 11.958333 |
| 682 | Berlin | 3 | 88469173 | Schöneberg (Schöneberg) | Sonnige 3-Zimmer mit Balkon / sozialer Wohnung... | https://www.immobilienscout24.de/expose/88469173 | 4.983333 | NaN | 20.200000 | 592,30 | 87,95 | Schwerinstr. 18, Schöneberg (Schöneberg), Berlin | 52.49718 | Frau Ayten Hennig | 10783 | 13.35897 | 3 | 12.591667 |
| 31 | Berlin | 9 | 92804746 | Charlottenburg (Charlottenburg) | Schöne drei Zimmer Wohnung in Berlin, Charlott... | https://www.immobilienscout24.de/expose/92804746 | 7.400000 | NaN | 17.816667 | 940 | 89 | Bleibtreustraße 51, Charlottenburg (Charlotten... | 52.50641 | Herr Wolfgang Dr. Groß | 10623 | 13.32039 | 3,5 | 12.608333 |
| 350 | Berlin | 2 | 92992442 | Kreuzberg (Kreuzberg) | Familienaltbauwohnung am Heinrichplatz | https://www.immobilienscout24.de/expose/92992442 | 16.000000 | NaN | 10.083333 | 602,21 | 74,79 | Oranienstr. 30, Kreuzberg (Kreuzberg), Berlin | 52.50150 | Herr Paul Herrmann | 10999 | 13.41907 | 3 | 13.041667 |
| 151 | Berlin | 7 | 93067156 | Mitte (Mitte) | Helle 3-Zim. Wohnung in Berlin-Mitte plus Stel... | https://www.immobilienscout24.de/expose/93067156 | 19.766667 | NaN | 7.183333 | 849 | 74 | Kleine Alexanderstr 5-7, Mitte (Mitte), Berlin | 52.52510 | Frau T. Orth | 10178 | 13.41106 | 3 | 13.475000 |
| 525 | Berlin | 7 | 92399166 | Kreuzberg (Kreuzberg) | Ihr neues zu Hause in der Nähe vom Potsdamer P... | https://www.immobilienscout24.de/expose/92399166 | 8.966667 | NaN | 18.183333 | 871,97 | 79,27 | Schöneberger Str. 6, Kreuzberg (Kreuzberg), Be... | 52.50412 | Frau Diana Wilhelm | 10963 | 13.37929 | 3 | 13.575000 |
| 465 | Berlin | 20 | 92720306 | Kreuzberg (Kreuzberg) | BSI***MITTENDRIN UND VOLL SIXTIES* RETRO* SONN... | https://www.immobilienscout24.de/expose/92720306 | 12.066667 | NaN | 15.166667 | 780 | 67 | Gitschiner Straße 0, Kreuzberg (Kreuzberg), Be... | 52.49829 | Herr Bernd Sajdok | 10969 | 13.40204 | 3 | 13.616667 |

Showing the results on a map

For a better overview we show the results on a map, using the folium package.

In [51]:
# see https://nbviewer.jupyter.org/github/python-visualization/folium/blob/master/examples/Quickstart.ipynb
import folium
print(folium.__version__)
0.2.1

We create a map of Berlin and add a marker cluster to it. Then we add our top 10 results as markers to the marker cluster. Each marker gets an HTML popup showing the average travel time and a link to the expose.

In [52]:
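# Center the map on central Berlin; the name berlin_map avoids shadowing the built-in map()
berlin_map = folium.Map(location=[52.520645, 13.409779])
# see http://deparkes.co.uk/2016/06/24/folium-marker-clusters/
# Note: this is the folium 0.2.1 API; newer folium releases provide
# folium.plugins.MarkerCluster and folium.IFrame instead.
marker_cluster = folium.MarkerCluster("apartments").add_to(berlin_map)

# Number the markers 1..10 in order of increasing average travel time
for rank, row in enumerate(top10.itertuples(), start=1):
    html = '''{0}. <a target="_blank" href="{1}">{2}</a> <br>
    {3} <br>
    Average travel time: {4:.2f} min '''.format(rank, row.url, row.title, row.address, row.avg_time)
    # Escape non-ASCII characters (e.g. umlauts) as XML character references (Python 3)
    html = html.encode('ascii', 'xmlcharrefreplace').decode('ascii')
    iframe = folium.element.IFrame(html=html, width=300, height=100)
    popup = folium.Popup(iframe, max_width=300)
    folium.Marker([row.lat, row.lng], popup=popup).add_to(marker_cluster)
In [53]:
# see https://nbviewer.jupyter.org/github/ocefpaf/folium_notebooks/blob/master/test_fit_bounds.ipynb
# Zoom the map so that all markers are visible
berlin_map.fit_bounds(berlin_map.get_bounds())
berlin_map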