This project cleans and analyzes a dataset of used car listings from eBay Kleinanzeigen, the classifieds section of the German eBay website. The dataset is a sample of 50,000 data points, which keeps processing fast. The data is available at https://data.world/data-society/used-cars-data
import pandas as pd
import numpy as np
autos = pd.read_csv("autos.csv", encoding = "Latin-1")
autos # A cell ending with autos makes Jupyter Notebook display the first and last rows of the DataFrame
| | dateCrawled | name | seller | offerType | price | abtest | vehicleType | yearOfRegistration | gearbox | powerPS | model | odometer | monthOfRegistration | fuelType | brand | notRepairedDamage | dateCreated | nrOfPictures | postalCode | lastSeen |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2016-03-26 17:47:46 | Peugeot_807_160_NAVTECH_ON_BOARD | privat | Angebot | $5,000 | control | bus | 2004 | manuell | 158 | andere | 150,000km | 3 | lpg | peugeot | nein | 2016-03-26 00:00:00 | 0 | 79588 | 2016-04-06 06:45:54 |
1 | 2016-04-04 13:38:56 | BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik | privat | Angebot | $8,500 | control | limousine | 1997 | automatik | 286 | 7er | 150,000km | 6 | benzin | bmw | nein | 2016-04-04 00:00:00 | 0 | 71034 | 2016-04-06 14:45:08 |
2 | 2016-03-26 18:57:24 | Volkswagen_Golf_1.6_United | privat | Angebot | $8,990 | test | limousine | 2009 | manuell | 102 | golf | 70,000km | 7 | benzin | volkswagen | nein | 2016-03-26 00:00:00 | 0 | 35394 | 2016-04-06 20:15:37 |
3 | 2016-03-12 16:58:10 | Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan... | privat | Angebot | $4,350 | control | kleinwagen | 2007 | automatik | 71 | fortwo | 70,000km | 6 | benzin | smart | nein | 2016-03-12 00:00:00 | 0 | 33729 | 2016-03-15 03:16:28 |
4 | 2016-04-01 14:38:50 | Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg... | privat | Angebot | $1,350 | test | kombi | 2003 | manuell | 0 | focus | 150,000km | 7 | benzin | ford | nein | 2016-04-01 00:00:00 | 0 | 39218 | 2016-04-01 14:38:50 |
5 | 2016-03-21 13:47:45 | Chrysler_Grand_Voyager_2.8_CRD_Aut.Limited_Sto... | privat | Angebot | $7,900 | test | bus | 2006 | automatik | 150 | voyager | 150,000km | 4 | diesel | chrysler | NaN | 2016-03-21 00:00:00 | 0 | 22962 | 2016-04-06 09:45:21 |
6 | 2016-03-20 17:55:21 | VW_Golf_III_GT_Special_Electronic_Green_Metall... | privat | Angebot | $300 | test | limousine | 1995 | manuell | 90 | golf | 150,000km | 8 | benzin | volkswagen | NaN | 2016-03-20 00:00:00 | 0 | 31535 | 2016-03-23 02:48:59 |
7 | 2016-03-16 18:55:19 | Golf_IV_1.9_TDI_90PS | privat | Angebot | $1,990 | control | limousine | 1998 | manuell | 90 | golf | 150,000km | 12 | diesel | volkswagen | nein | 2016-03-16 00:00:00 | 0 | 53474 | 2016-04-07 03:17:32 |
8 | 2016-03-22 16:51:34 | Seat_Arosa | privat | Angebot | $250 | test | NaN | 2000 | manuell | 0 | arosa | 150,000km | 10 | NaN | seat | nein | 2016-03-22 00:00:00 | 0 | 7426 | 2016-03-26 18:18:10 |
9 | 2016-03-16 13:47:02 | Renault_Megane_Scenic_1.6e_RT_Klimaanlage | privat | Angebot | $590 | control | bus | 1997 | manuell | 90 | megane | 150,000km | 7 | benzin | renault | nein | 2016-03-16 00:00:00 | 0 | 15749 | 2016-04-06 10:46:35 |
10 | 2016-03-15 01:41:36 | VW_Golf_Tuning_in_siber/grau | privat | Angebot | $999 | test | NaN | 2017 | manuell | 90 | NaN | 150,000km | 4 | benzin | volkswagen | nein | 2016-03-14 00:00:00 | 0 | 86157 | 2016-04-07 03:16:21 |
11 | 2016-03-16 18:45:34 | Mercedes_A140_Motorschaden | privat | Angebot | $350 | control | NaN | 2000 | NaN | 0 | NaN | 150,000km | 0 | benzin | mercedes_benz | NaN | 2016-03-16 00:00:00 | 0 | 17498 | 2016-03-16 18:45:34 |
12 | 2016-03-31 19:48:22 | Smart_smart_fortwo_coupe_softouch_pure_MHD_Pan... | privat | Angebot | $5,299 | control | kleinwagen | 2010 | automatik | 71 | fortwo | 50,000km | 9 | benzin | smart | nein | 2016-03-31 00:00:00 | 0 | 34590 | 2016-04-06 14:17:52 |
13 | 2016-03-23 10:48:32 | Audi_A3_1.6_tuning | privat | Angebot | $1,350 | control | limousine | 1999 | manuell | 101 | a3 | 150,000km | 11 | benzin | audi | nein | 2016-03-23 00:00:00 | 0 | 12043 | 2016-04-01 14:17:13 |
14 | 2016-03-23 11:50:46 | Renault_Clio_3__Dynamique_1.2__16_V;_viele_Ver... | privat | Angebot | $3,999 | test | kleinwagen | 2007 | manuell | 75 | clio | 150,000km | 9 | benzin | renault | NaN | 2016-03-23 00:00:00 | 0 | 81737 | 2016-04-01 15:46:47 |
15 | 2016-04-01 12:06:20 | Corvette_C3_Coupe_T_Top_Crossfire_Injection | privat | Angebot | $18,900 | test | coupe | 1982 | automatik | 203 | NaN | 80,000km | 6 | benzin | sonstige_autos | nein | 2016-04-01 00:00:00 | 0 | 61276 | 2016-04-02 21:10:48 |
16 | 2016-03-16 14:59:02 | Opel_Vectra_B_Kombi | privat | Angebot | $350 | test | kombi | 1999 | manuell | 101 | vectra | 150,000km | 5 | benzin | opel | nein | 2016-03-16 00:00:00 | 0 | 57299 | 2016-03-18 05:29:37 |
17 | 2016-03-29 11:46:22 | Volkswagen_Scirocco_2_G60 | privat | Angebot | $5,500 | test | coupe | 1990 | manuell | 205 | scirocco | 150,000km | 6 | benzin | volkswagen | nein | 2016-03-29 00:00:00 | 0 | 74821 | 2016-04-05 20:46:26 |
18 | 2016-03-26 19:57:44 | Verkaufen_mein_bmw_e36_320_i_touring | privat | Angebot | $300 | control | bus | 1995 | manuell | 150 | 3er | 150,000km | 0 | benzin | bmw | NaN | 2016-03-26 00:00:00 | 0 | 54329 | 2016-04-02 12:16:41 |
19 | 2016-03-17 13:36:21 | mazda_tribute_2.0_mit_gas_und_tuev_neu_2018 | privat | Angebot | $4,150 | control | suv | 2004 | manuell | 124 | andere | 150,000km | 2 | lpg | mazda | nein | 2016-03-17 00:00:00 | 0 | 40878 | 2016-03-17 14:45:58 |
20 | 2016-03-05 19:57:31 | Audi_A4_Avant_1.9_TDI_*6_Gang*AHK*Klimatronik*... | privat | Angebot | $3,500 | test | kombi | 2003 | manuell | 131 | a4 | 150,000km | 5 | diesel | audi | NaN | 2016-03-05 00:00:00 | 0 | 53913 | 2016-03-07 05:46:46 |
21 | 2016-03-06 19:07:10 | Porsche_911_Carrera_4S_Cabrio | privat | Angebot | $41,500 | test | cabrio | 2004 | manuell | 320 | 911 | 150,000km | 4 | benzin | porsche | nein | 2016-03-06 00:00:00 | 0 | 65428 | 2016-04-05 23:46:19 |
22 | 2016-03-28 20:50:54 | MINI_Cooper_S_Cabrio | privat | Angebot | $25,450 | control | cabrio | 2015 | manuell | 184 | cooper | 10,000km | 1 | benzin | mini | nein | 2016-03-28 00:00:00 | 0 | 44789 | 2016-04-01 06:45:30 |
23 | 2016-03-10 19:55:34 | Peugeot_Boxer_2_2_HDi_120_Ps_9_Sitzer_inkl_Klima | privat | Angebot | $7,999 | control | bus | 2010 | manuell | 120 | NaN | 150,000km | 2 | diesel | peugeot | nein | 2016-03-10 00:00:00 | 0 | 30900 | 2016-03-17 08:45:17 |
24 | 2016-04-03 11:57:02 | BMW_535i_xDrive_Sport_Aut. | privat | Angebot | $48,500 | control | limousine | 2014 | automatik | 306 | 5er | 30,000km | 12 | benzin | bmw | nein | 2016-04-03 00:00:00 | 0 | 22547 | 2016-04-07 13:16:50 |
25 | 2016-03-21 21:56:18 | Ford_escort_kombi_an_bastler_mit_ghia_ausstattung | privat | Angebot | $90 | control | kombi | 1996 | manuell | 116 | NaN | 150,000km | 4 | benzin | ford | ja | 2016-03-21 00:00:00 | 0 | 27574 | 2016-04-01 05:16:49 |
26 | 2016-04-03 22:46:28 | Volkswagen_Polo_Fox | privat | Angebot | $777 | control | kleinwagen | 1992 | manuell | 54 | polo | 125,000km | 2 | benzin | volkswagen | nein | 2016-04-03 00:00:00 | 0 | 38110 | 2016-04-05 23:46:48 |
27 | 2016-03-27 18:45:01 | Hat_einer_Ahnung_mit_Ford_Galaxy_HILFE | privat | Angebot | $0 | control | NaN | 2005 | NaN | 0 | NaN | 150,000km | 0 | NaN | ford | NaN | 2016-03-27 00:00:00 | 0 | 66701 | 2016-03-27 18:45:01 |
28 | 2016-03-19 21:56:19 | MINI_Cooper_D | privat | Angebot | $5,250 | control | kleinwagen | 2007 | manuell | 110 | cooper | 150,000km | 7 | diesel | mini | ja | 2016-03-19 00:00:00 | 0 | 15745 | 2016-04-07 14:58:48 |
29 | 2016-04-02 12:45:44 | Mercedes_Benz_E_320_T_CDI_Avantgarde_DPF7_Sitz... | privat | Angebot | $4,999 | test | kombi | 2004 | automatik | 204 | e_klasse | 150,000km | 10 | diesel | mercedes_benz | nein | 2016-04-02 00:00:00 | 0 | 47638 | 2016-04-02 12:45:44 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
49970 | 2016-03-21 22:47:37 | c4_Grand_Picasso_mit_Automatik_Leder_Navi_Temp... | privat | Angebot | $15,800 | control | bus | 2010 | automatik | 136 | c4 | 60,000km | 4 | diesel | citroen | nein | 2016-03-21 00:00:00 | 0 | 14947 | 2016-04-07 04:17:34 |
49971 | 2016-03-29 14:54:12 | W.Lupo_1.0 | privat | Angebot | $950 | test | kleinwagen | 2001 | manuell | 50 | lupo | 150,000km | 4 | benzin | volkswagen | nein | 2016-03-29 00:00:00 | 0 | 65197 | 2016-03-29 20:41:51 |
49972 | 2016-03-26 22:25:23 | Mercedes_Benz_Vito_115_CDI_Extralang_Aut. | privat | Angebot | $3,300 | control | bus | 2004 | automatik | 150 | vito | 150,000km | 10 | diesel | mercedes_benz | ja | 2016-03-26 00:00:00 | 0 | 65326 | 2016-03-28 11:28:18 |
49973 | 2016-03-27 05:32:39 | Mercedes_Benz_SLK_200_Kompressor | privat | Angebot | $6,000 | control | cabrio | 2004 | manuell | 163 | slk | 150,000km | 11 | benzin | mercedes_benz | nein | 2016-03-27 00:00:00 | 0 | 53567 | 2016-03-27 08:25:24 |
49974 | 2016-03-20 10:52:31 | Golf_1_Cabrio_Tuev_Neu_viele_Extras_alles_eing... | privat | Angebot | $0 | control | cabrio | 1983 | manuell | 70 | golf | 150,000km | 2 | benzin | volkswagen | nein | 2016-03-20 00:00:00 | 0 | 8209 | 2016-03-27 19:48:16 |
49975 | 2016-03-27 20:51:39 | Honda_Jazz_1.3_DSi_i_VTEC_IMA_CVT_Comfort | privat | Angebot | $9,700 | control | kleinwagen | 2012 | automatik | 88 | jazz | 100,000km | 11 | hybrid | honda | nein | 2016-03-27 00:00:00 | 0 | 84385 | 2016-04-05 19:45:34 |
49976 | 2016-03-19 18:56:05 | Audi_80_Avant_2.6_E__Vollausstattung!!_Einziga... | privat | Angebot | $5,900 | test | kombi | 1992 | automatik | 150 | 80 | 150,000km | 12 | benzin | audi | nein | 2016-03-19 00:00:00 | 0 | 36100 | 2016-04-07 06:16:44 |
49977 | 2016-03-31 18:37:18 | Mercedes_Benz_C200_Cdi_W203 | privat | Angebot | $5,500 | control | limousine | 2003 | manuell | 116 | c_klasse | 150,000km | 2 | diesel | mercedes_benz | nein | 2016-03-31 00:00:00 | 0 | 33739 | 2016-04-06 12:16:11 |
49978 | 2016-04-04 10:37:14 | Mercedes_Benz_E_200_Classic | privat | Angebot | $900 | control | limousine | 1996 | automatik | 136 | e_klasse | 150,000km | 9 | benzin | mercedes_benz | ja | 2016-04-04 00:00:00 | 0 | 24405 | 2016-04-06 12:44:20 |
49979 | 2016-03-20 18:38:40 | Volkswagen_Polo_1.6_TDI_Style | privat | Angebot | $11,000 | test | kleinwagen | 2011 | manuell | 90 | polo | 70,000km | 11 | diesel | volkswagen | nein | 2016-03-20 00:00:00 | 0 | 48455 | 2016-04-07 01:45:12 |
49980 | 2016-03-12 10:55:54 | Ford_Escort_Turnier_16V | privat | Angebot | $400 | control | kombi | 1995 | manuell | 105 | escort | 125,000km | 3 | benzin | ford | NaN | 2016-03-12 00:00:00 | 0 | 56218 | 2016-04-06 17:16:49 |
49981 | 2016-03-15 09:38:21 | Opel_Astra_Kombi_mit_Anhaengerkupplung | privat | Angebot | $2,000 | control | kombi | 1998 | manuell | 115 | astra | 150,000km | 12 | benzin | opel | nein | 2016-03-15 00:00:00 | 0 | 86859 | 2016-04-05 17:21:46 |
49982 | 2016-03-29 18:51:08 | Skoda_Fabia_4_Tuerer_Bj:2004__85.000Tkm | privat | Angebot | $1,950 | control | kleinwagen | 2004 | manuell | 0 | fabia | 90,000km | 7 | benzin | skoda | NaN | 2016-03-29 00:00:00 | 0 | 45884 | 2016-03-29 18:51:08 |
49983 | 2016-03-06 12:43:04 | Ford_focus_99 | privat | Angebot | $600 | test | kleinwagen | 1999 | manuell | 101 | focus | 150,000km | 4 | benzin | ford | NaN | 2016-03-06 00:00:00 | 0 | 52477 | 2016-03-09 06:16:08 |
49984 | 2016-03-31 22:48:48 | Student_sucht_ein__Anfaengerauto___ab_2000_BJ_... | privat | Angebot | $0 | test | NaN | 2000 | NaN | 0 | NaN | 150,000km | 0 | NaN | sonstige_autos | NaN | 2016-03-31 00:00:00 | 0 | 12103 | 2016-04-02 19:44:53 |
49985 | 2016-04-02 16:38:23 | Verkaufe_meinen_vw_vento! | privat | Angebot | $1,000 | control | NaN | 1995 | automatik | 0 | NaN | 150,000km | 0 | benzin | volkswagen | NaN | 2016-04-02 00:00:00 | 0 | 30900 | 2016-04-06 15:17:52 |
49986 | 2016-04-04 20:46:02 | Chrysler_300C_3.0_CRD_DPF_Automatik_Voll_Ausst... | privat | Angebot | $15,900 | control | limousine | 2010 | automatik | 218 | 300c | 125,000km | 11 | diesel | chrysler | nein | 2016-04-04 00:00:00 | 0 | 73527 | 2016-04-06 23:16:00 |
49987 | 2016-03-22 20:47:27 | Audi_A3_Limousine_2.0_TDI_DPF_Ambition__NAVI__... | privat | Angebot | $21,990 | control | limousine | 2013 | manuell | 150 | a3 | 50,000km | 11 | diesel | audi | nein | 2016-03-22 00:00:00 | 0 | 94362 | 2016-03-26 22:46:06 |
49988 | 2016-03-28 19:49:51 | BMW_330_Ci | privat | Angebot | $9,550 | control | coupe | 2001 | manuell | 231 | 3er | 150,000km | 10 | benzin | bmw | nein | 2016-03-28 00:00:00 | 0 | 83646 | 2016-04-07 02:17:40 |
49989 | 2016-03-11 19:50:37 | VW_Polo_zum_Ausschlachten_oder_Wiederaufbau | privat | Angebot | $150 | test | kleinwagen | 1997 | manuell | 0 | polo | 150,000km | 5 | benzin | volkswagen | ja | 2016-03-11 00:00:00 | 0 | 21244 | 2016-03-12 10:17:55 |
49990 | 2016-03-21 19:54:19 | Mercedes_Benz_A_200__BlueEFFICIENCY__Urban | privat | Angebot | $17,500 | test | limousine | 2012 | manuell | 156 | a_klasse | 30,000km | 12 | benzin | mercedes_benz | nein | 2016-03-21 00:00:00 | 0 | 58239 | 2016-04-06 22:46:57 |
49991 | 2016-03-06 15:25:19 | Kleinwagen | privat | Angebot | $500 | control | NaN | 2016 | manuell | 0 | twingo | 150,000km | 0 | benzin | renault | NaN | 2016-03-06 00:00:00 | 0 | 61350 | 2016-03-06 18:24:19 |
49992 | 2016-03-10 19:37:38 | Fiat_Grande_Punto_1.4_T_Jet_16V_Sport | privat | Angebot | $4,800 | control | kleinwagen | 2009 | manuell | 120 | andere | 125,000km | 9 | lpg | fiat | nein | 2016-03-10 00:00:00 | 0 | 68642 | 2016-03-13 01:44:51 |
49993 | 2016-03-15 18:47:35 | Audi_A3__1_8l__Silber;_schoenes_Fahrzeug | privat | Angebot | $1,650 | control | kleinwagen | 1997 | manuell | 0 | NaN | 150,000km | 7 | benzin | audi | NaN | 2016-03-15 00:00:00 | 0 | 65203 | 2016-04-06 19:46:53 |
49994 | 2016-03-22 17:36:42 | Audi_A6__S6__Avant_4.2_quattro_eventuell_Tausc... | privat | Angebot | $5,000 | control | kombi | 2001 | automatik | 299 | a6 | 150,000km | 1 | benzin | audi | nein | 2016-03-22 00:00:00 | 0 | 46537 | 2016-04-06 08:16:39 |
49995 | 2016-03-27 14:38:19 | Audi_Q5_3.0_TDI_qu._S_tr.__Navi__Panorama__Xenon | privat | Angebot | $24,900 | control | limousine | 2011 | automatik | 239 | q5 | 100,000km | 1 | diesel | audi | nein | 2016-03-27 00:00:00 | 0 | 82131 | 2016-04-01 13:47:40 |
49996 | 2016-03-28 10:50:25 | Opel_Astra_F_Cabrio_Bertone_Edition___TÜV_neu+... | privat | Angebot | $1,980 | control | cabrio | 1996 | manuell | 75 | astra | 150,000km | 5 | benzin | opel | nein | 2016-03-28 00:00:00 | 0 | 44807 | 2016-04-02 14:18:02 |
49997 | 2016-04-02 14:44:48 | Fiat_500_C_1.2_Dualogic_Lounge | privat | Angebot | $13,200 | test | cabrio | 2014 | automatik | 69 | 500 | 5,000km | 11 | benzin | fiat | nein | 2016-04-02 00:00:00 | 0 | 73430 | 2016-04-04 11:47:27 |
49998 | 2016-03-08 19:25:42 | Audi_A3_2.0_TDI_Sportback_Ambition | privat | Angebot | $22,900 | control | kombi | 2013 | manuell | 150 | a3 | 40,000km | 11 | diesel | audi | nein | 2016-03-08 00:00:00 | 0 | 35683 | 2016-04-05 16:45:07 |
49999 | 2016-03-14 00:42:12 | Opel_Vectra_1.6_16V | privat | Angebot | $1,250 | control | limousine | 1996 | manuell | 101 | vectra | 150,000km | 1 | benzin | opel | nein | 2016-03-13 00:00:00 | 0 | 45897 | 2016-04-06 21:18:48 |
50000 rows × 20 columns
print(autos.info()) # To access the column dtypes and non-null counts; info() prints directly and returns None, so print() also emits a trailing None
print(autos.head()) # To show the first five rows
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50000 entries, 0 to 49999
Data columns (total 20 columns):
dateCrawled 50000 non-null object
name 50000 non-null object
seller 50000 non-null object
offerType 50000 non-null object
price 50000 non-null object
abtest 50000 non-null object
vehicleType 44905 non-null object
yearOfRegistration 50000 non-null int64
gearbox 47320 non-null object
powerPS 50000 non-null int64
model 47242 non-null object
odometer 50000 non-null object
monthOfRegistration 50000 non-null int64
fuelType 45518 non-null object
brand 50000 non-null object
notRepairedDamage 40171 non-null object
dateCreated 50000 non-null object
nrOfPictures 50000 non-null int64
postalCode 50000 non-null int64
lastSeen 50000 non-null object
dtypes: int64(5), object(15)
memory usage: 7.6+ MB
None

dateCrawled name \
0 2016-03-26 17:47:46 Peugeot_807_160_NAVTECH_ON_BOARD
1 2016-04-04 13:38:56 BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik
2 2016-03-26 18:57:24 Volkswagen_Golf_1.6_United
3 2016-03-12 16:58:10 Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan...
4 2016-04-01 14:38:50 Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg...

seller offerType price abtest vehicleType yearOfRegistration \
0 privat Angebot $5,000 control bus 2004
1 privat Angebot $8,500 control limousine 1997
2 privat Angebot $8,990 test limousine 2009
3 privat Angebot $4,350 control kleinwagen 2007
4 privat Angebot $1,350 test kombi 2003

gearbox powerPS model odometer monthOfRegistration fuelType \
0 manuell 158 andere 150,000km 3 lpg
1 automatik 286 7er 150,000km 6 benzin
2 manuell 102 golf 70,000km 7 benzin
3 automatik 71 fortwo 70,000km 6 benzin
4 manuell 0 focus 150,000km 7 benzin

brand notRepairedDamage dateCreated nrOfPictures \
0 peugeot nein 2016-03-26 00:00:00 0
1 bmw nein 2016-04-04 00:00:00 0
2 volkswagen nein 2016-03-26 00:00:00 0
3 smart nein 2016-03-12 00:00:00 0
4 ford nein 2016-04-01 00:00:00 0

postalCode lastSeen
0 79588 2016-04-06 06:45:54
1 71034 2016-04-06 14:45:08
2 35394 2016-04-06 20:15:37
3 33729 2016-03-15 03:16:28
4 39218 2016-04-01 14:38:50
Based on my observations, the dataset has 20 columns and 50,000 rows. However, 15 of the columns are stored as the object type, meaning their values are strings rather than numbers. Most columns have no null values, but five do: vehicleType, gearbox, model, fuelType, and notRepairedDamage. Nonetheless, the column names use camelCase and need to be converted to Python's snake_case style. The naming of the fuelType and notRepairedDamage columns is also inconsistent.
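The non-null counts reported by `info()` can also be turned into explicit missing-value counts per column. The following is a minimal sketch, assuming a small hand-made frame named `sample` that stands in for `autos`; its values are invented for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the autos DataFrame (values invented for illustration)
sample = pd.DataFrame({
    "vehicleType": ["bus", np.nan, "kombi", "limousine"],
    "gearbox": ["manuell", "automatik", np.nan, np.nan],
    "brand": ["peugeot", "bmw", "ford", "opel"],
})

# Count missing values per column and express them as a percentage of all rows
null_counts = sample.isnull().sum()
null_share = (null_counts / len(sample) * 100).round(1)

# Show only the columns that actually have missing values
summary = pd.DataFrame({"missing": null_counts, "percent": null_share})
print(summary[summary["missing"] > 0])
```

Running the same two lines on `autos` instead of `sample` should list the columns whose non-null count is below 50,000.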
autos.columns # To check info about column names
Index(['dateCrawled', 'name', 'seller', 'offerType', 'price', 'abtest', 'vehicleType', 'yearOfRegistration', 'gearbox', 'powerPS', 'model', 'odometer', 'monthOfRegistration', 'fuelType', 'brand', 'notRepairedDamage', 'dateCreated', 'nrOfPictures', 'postalCode', 'lastSeen'], dtype='object')
# DataFrame.rename() is used to rename the columns in place
autos.rename({'dateCrawled': 'date_crawled',
'offerType': 'offer_type',
'vehicleType': 'vehicle_type',
'yearOfRegistration': 'registration_year',
'monthOfRegistration': 'registration_month',
'powerPS': 'power_ps',
'fuelType': 'fuel_type',
'notRepairedDamage': 'unrepaired_damage',
'dateCreated': 'ad_created',
'nrOfPictures': 'num_pics',
'postalCode': 'postal_code',
'lastSeen': 'last_seen_date',
'gearbox': 'gear_box'}, axis=1, inplace=True) # inplace=True returns None, so the result is not assigned
autos.head()
| | date_crawled | name | seller | offer_type | price | abtest | vehicle_type | registration_year | gear_box | power_ps | model | odometer | registration_month | fuel_type | brand | unrepaired_damage | ad_created | num_pics | postal_code | last_seen_date |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2016-03-26 17:47:46 | Peugeot_807_160_NAVTECH_ON_BOARD | privat | Angebot | $5,000 | control | bus | 2004 | manuell | 158 | andere | 150,000km | 3 | lpg | peugeot | nein | 2016-03-26 00:00:00 | 0 | 79588 | 2016-04-06 06:45:54 |
1 | 2016-04-04 13:38:56 | BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik | privat | Angebot | $8,500 | control | limousine | 1997 | automatik | 286 | 7er | 150,000km | 6 | benzin | bmw | nein | 2016-04-04 00:00:00 | 0 | 71034 | 2016-04-06 14:45:08 |
2 | 2016-03-26 18:57:24 | Volkswagen_Golf_1.6_United | privat | Angebot | $8,990 | test | limousine | 2009 | manuell | 102 | golf | 70,000km | 7 | benzin | volkswagen | nein | 2016-03-26 00:00:00 | 0 | 35394 | 2016-04-06 20:15:37 |
3 | 2016-03-12 16:58:10 | Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan... | privat | Angebot | $4,350 | control | kleinwagen | 2007 | automatik | 71 | fortwo | 70,000km | 6 | benzin | smart | nein | 2016-03-12 00:00:00 | 0 | 33729 | 2016-03-15 03:16:28 |
4 | 2016-04-01 14:38:50 | Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg... | privat | Angebot | $1,350 | test | kombi | 2003 | manuell | 0 | focus | 150,000km | 7 | benzin | ford | nein | 2016-04-01 00:00:00 | 0 | 39218 | 2016-04-01 14:38:50 |
Because the column names were in camelCase and several needed new wording, a simple string replacement would not do, so I used the DataFrame.rename() method instead. The new names are supplied as a dictionary that maps each old name to its replacement, and axis=1 indicates that the mapping applies to columns. Renaming converts the column names to snake_case, which is the standard style for Python code.
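For names that only need a mechanical camelCase-to-snake_case conversion, a regular expression can do the work. The sketch below is an alternative to the dictionary approach, not the method used in this notebook; `camel_to_snake` is a hypothetical helper introduced for illustration.

```python
import re

def camel_to_snake(name):
    """Convert a camelCase identifier to snake_case."""
    # Insert an underscore wherever a lowercase letter or digit is followed by an uppercase letter
    with_underscores = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return with_underscores.lower()

columns = ["dateCrawled", "powerPS", "nrOfPictures", "abtest"]
print([camel_to_snake(c) for c in columns])
# ['date_crawled', 'power_ps', 'nr_of_pictures', 'abtest']
```

This produces names such as `year_of_registration` rather than the hand-picked `registration_year`, so an explicit dictionary is still needed wherever the wording itself changes.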
autos.describe(include="all") # To get descriptive statistics for every column
| | date_crawled | name | seller | offer_type | price | abtest | vehicle_type | registration_year | gear_box | power_ps | model | odometer | registration_month | fuel_type | brand | unrepaired_damage | ad_created | num_pics | postal_code | last_seen_date |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 50000 | 50000 | 50000 | 50000 | 50000 | 50000 | 44905 | 50000.000000 | 47320 | 50000.000000 | 47242 | 50000 | 50000.000000 | 45518 | 50000 | 40171 | 50000 | 50000.0 | 50000.000000 | 50000 |
unique | 48213 | 38754 | 2 | 2 | 2357 | 2 | 8 | NaN | 2 | NaN | 245 | 13 | NaN | 7 | 40 | 2 | 76 | NaN | NaN | 39481 |
top | 2016-03-12 16:06:22 | Ford_Fiesta | privat | Angebot | $0 | test | limousine | NaN | manuell | NaN | golf | 150,000km | NaN | benzin | volkswagen | nein | 2016-04-03 00:00:00 | NaN | NaN | 2016-04-07 06:17:27 |
freq | 3 | 78 | 49999 | 49999 | 1421 | 25756 | 12859 | NaN | 36993 | NaN | 4024 | 32424 | NaN | 30107 | 10687 | 35232 | 1946 | NaN | NaN | 8 |
mean | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2005.073280 | NaN | 116.355920 | NaN | NaN | 5.723360 | NaN | NaN | NaN | NaN | 0.0 | 50813.627300 | NaN |
std | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 105.712813 | NaN | 209.216627 | NaN | NaN | 3.711984 | NaN | NaN | NaN | NaN | 0.0 | 25779.747957 | NaN |
min | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1000.000000 | NaN | 0.000000 | NaN | NaN | 0.000000 | NaN | NaN | NaN | NaN | 0.0 | 1067.000000 | NaN |
25% | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1999.000000 | NaN | 70.000000 | NaN | NaN | 3.000000 | NaN | NaN | NaN | NaN | 0.0 | 30451.000000 | NaN |
50% | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2003.000000 | NaN | 105.000000 | NaN | NaN | 6.000000 | NaN | NaN | NaN | NaN | 0.0 | 49577.000000 | NaN |
75% | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2008.000000 | NaN | 150.000000 | NaN | NaN | 9.000000 | NaN | NaN | NaN | NaN | 0.0 | 71540.000000 | NaN |
max | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9999.000000 | NaN | 17700.000000 | NaN | NaN | 12.000000 | NaN | NaN | NaN | NaN | 0.0 | 99998.000000 | NaN |
autos["price"].value_counts() # To count values in price column
$0 1421
$500 781
$1,500 734
$2,500 643
$1,000 639
$1,200 639
$600 531
$3,500 498
$800 498
$2,000 460
$999 434
$750 433
$900 420
$650 419
$850 410
$700 395
$4,500 394
$300 384
$2,200 382
$950 379
$1,100 376
$1,300 371
$3,000 365
$550 356
$1,800 355
$5,500 340
$350 335
$1,250 335
$1,600 327
$1,999 322
...
$1,935 1
$9,989 1
$3,996 1
$31,200 1
$28,400 1
$34,890 1
$75,900 1
$11,270 1
$2,549 1
$18,090 1
$40,800 1
$31,400 1
$29,777 1
$6,410 1
$10,488 1
$45,800 1
$7,085 1
$20,980 1
$16,666 1
$7,825 1
$3,220 1
$6,340 1
$17,695 1
$2,870 1
$28,700 1
$6,969 1
$4,222 1
$277 1
$16,650 1
$5,634 1
Name: price, Length: 2357, dtype: int64
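# Strip the currency symbol and thousands separator, then convert to integers
autos['price'] = autos['price'].str.replace("$", '', regex=False) # regex=False treats "$" as a literal character, not an end-of-string anchor
autos['price'] = autos['price'].str.replace(",", '', regex=False)
autos['price'] = autos['price'].astype(int)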
autos['price'].head()
0 5000
1 8500
2 8990
3 4350
4 1350
Name: price, dtype: int64
autos["odometer"].value_counts() # To count for values in odometer column
150,000km 32424
125,000km 5170
100,000km 2169
90,000km 1757
80,000km 1436
70,000km 1230
60,000km 1164
50,000km 1027
5,000km 967
40,000km 819
30,000km 789
20,000km 784
10,000km 264
Name: odometer, dtype: int64
# Strip the unit and thousands separator, then convert to integers
autos['odometer'] = autos['odometer'].str.replace("km", '', regex=False)
autos['odometer'] = autos['odometer'].str.replace(",", '', regex=False)
autos['odometer'] = autos['odometer'].astype(int)
autos['odometer'].head()
0 150000
1 150000
2 70000
3 70000
4 150000
Name: odometer, dtype: int64
autos.rename({"odometer": "odometer_km"}, axis=1, inplace=True) # To rename the column
autos.columns # To explore column names only
Index(['date_crawled', 'name', 'seller', 'offer_type', 'price', 'abtest', 'vehicle_type', 'registration_year', 'gear_box', 'power_ps', 'model', 'odometer_km', 'registration_month', 'fuel_type', 'brand', 'unrepaired_damage', 'ad_created', 'num_pics', 'postal_code', 'last_seen_date'], dtype='object')
The DataFrame.describe(), Series.head(), and Series.value_counts() methods were used to explore the dataset. Based on my observations, the price and odometer columns hold numeric data stored as text and therefore require cleaning. Furthermore, the seller and offer_type columns are likely candidates to be dropped, since 49,999 of the 50,000 rows in each share the same value. I used the Series.str.replace() method to remove the non-numeric characters and the Series.astype() method to convert the "price" and "odometer" columns to a numeric data type. Finally, the DataFrame.rename() method was used to rename the "odometer" column to "odometer_km", as shown above.
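Columns dominated by a single value, like seller and offer_type here, can also be flagged programmatically instead of being read off the `describe()` table. This is a minimal sketch under stated assumptions: `sample`, `dominant_share`, and the 0.9 cutoff are all hypothetical, and the sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical miniature frame (values invented for illustration)
sample = pd.DataFrame({
    "seller": ["privat"] * 9 + ["gewerblich"],
    "offer_type": ["Angebot"] * 10,
    "brand": ["bmw", "ford", "opel", "audi", "seat"] * 2,
})

def dominant_share(series):
    """Return the fraction of non-null rows taken by the most frequent value."""
    # value_counts sorts by frequency, so the first entry is the most common value
    return series.value_counts(normalize=True).iloc[0]

shares = sample.apply(dominant_share)

# Hypothetical cutoff: flag columns where one value covers at least 90% of the rows
candidates = shares[shares >= 0.9].index.tolist()
print(candidates)
```

Whether a flagged column should actually be dropped remains a judgment call; the cutoff only narrows down which columns to inspect.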
autos["odometer_km"].unique() # To explore the unique values in the column
array([150000, 70000, 50000, 80000, 10000, 30000, 125000, 90000, 20000, 60000, 5000, 100000, 40000])
autos["odometer_km"].describe() # To get summary statistics for the column
count 50000.000000
mean 125732.700000
std 40042.211706
min 5000.000000
25% 125000.000000
50% 150000.000000
75% 150000.000000
max 150000.000000
Name: odometer_km, dtype: float64
autos["odometer_km"].value_counts().head(20)
150000 32424
125000 5170
100000 2169
90000 1757
80000 1436
70000 1230
60000 1164
50000 1027
5000 967
40000 819
30000 789
20000 784
10000 264
Name: odometer_km, dtype: int64
autos[autos["odometer_km"].between(0,5000)]
| | date_crawled | name | seller | offer_type | price | abtest | vehicle_type | registration_year | gear_box | power_ps | model | odometer_km | registration_month | fuel_type | brand | unrepaired_damage | ad_created | num_pics | postal_code | last_seen_date |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
52 | 2016-03-25 18:50:03 | Senator_A_3.0E_Karosserie_restauriert_m._viele... | privat | Angebot | 3500 | test | limousine | 1985 | NaN | 0 | andere | 5000 | 0 | benzin | opel | nein | 2016-03-25 00:00:00 | 0 | 63500 | 2016-04-07 00:46:00 |
71 | 2016-03-28 19:39:35 | Suche_Opel_Astra_F__Corsa_oder_Kadett_E_mit_Re... | privat | Angebot | 0 | control | NaN | 1990 | manuell | 0 | NaN | 5000 | 0 | benzin | opel | NaN | 2016-03-28 00:00:00 | 0 | 4552 | 2016-04-07 01:45:48 |
76 | 2016-03-22 14:52:57 | BMW_318i_neustes_Model_0Km | privat | Angebot | 31999 | control | limousine | 2016 | manuell | 136 | 3er | 5000 | 2 | benzin | bmw | NaN | 2016-03-22 00:00:00 | 0 | 45149 | 2016-04-06 05:15:42 |
102 | 2016-03-22 11:57:49 | Ford_Ka_dunkel_blau | privat | Angebot | 320 | control | kleinwagen | 2004 | manuell | 0 | ka | 5000 | 6 | benzin | ford | ja | 2016-03-22 00:00:00 | 0 | 24109 | 2016-04-02 01:47:21 |
106 | 2016-03-26 02:57:29 | Opel_Tigra_A_1.6_16V_Schlachtfahrzeug__GSI_C20... | privat | Angebot | 150 | test | coupe | 1996 | manuell | 106 | tigra | 5000 | 10 | NaN | opel | ja | 2016-03-26 00:00:00 | 0 | 67269 | 2016-04-05 19:47:10 |
121 | 2016-03-15 09:48:41 | verkaufe_Ford_focus_1_8_Diesel | privat | Angebot | 1100 | control | kombi | 2002 | manuell | 127 | focus | 5000 | 8 | diesel | ford | NaN | 2016-03-15 00:00:00 | 0 | 27568 | 2016-03-22 08:47:10 |
167 | 2016-04-02 19:43:45 | Suche_VW_Multivan_Innenausstattung_Set_oder_TE... | privat | Angebot | 0 | control | NaN | 2011 | NaN | 0 | transporter | 5000 | 0 | NaN | volkswagen | NaN | 2016-04-02 00:00:00 | 0 | 64739 | 2016-04-06 19:45:08 |
226 | 2016-03-25 23:52:12 | Porsche_911_S_Targa__67er_SWB | privat | Angebot | 0 | control | cabrio | 1967 | manuell | 160 | 911 | 5000 | 12 | benzin | porsche | nein | 2016-03-25 00:00:00 | 0 | 44575 | 2016-04-05 14:46:39 |
259 | 2016-04-03 23:49:58 | guenstiges_Auto_/_auch_defekt | privat | Angebot | 0 | control | NaN | 2000 | NaN | 0 | NaN | 5000 | 6 | NaN | sonstige_autos | NaN | 2016-04-03 00:00:00 | 0 | 89269 | 2016-04-06 07:16:22 |
269 | 2016-03-27 10:39:16 | Audi_a3_8l_1_9l_TDI_in_Ebonyschwarz | privat | Angebot | 1470 | control | kleinwagen | 1999 | manuell | 110 | a3 | 5000 | 1 | diesel | audi | nein | 2016-03-27 00:00:00 | 0 | 23715 | 2016-04-07 05:17:10 |
301 | 2016-03-08 20:37:59 | Kaufe_alle_Autos_bietet_an | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-08 00:00:00 | 0 | 13589 | 2016-04-05 18:44:43 |
344 | 2016-03-22 11:25:18 | Verkaufe_hier_mein_mein_schoener_honda_crx_in_... | privat | Angebot | 1 | test | NaN | 2000 | manuell | 125 | andere | 5000 | 0 | NaN | honda | NaN | 2016-03-22 00:00:00 | 0 | 65391 | 2016-04-06 18:15:57 |
351 | 2016-03-22 12:47:31 | Alles_gut_bei_dese_Auto | privat | Angebot | 950 | control | kombi | 1996 | manuell | 74 | passat | 5000 | 6 | NaN | volkswagen | NaN | 2016-03-22 00:00:00 | 0 | 53819 | 2016-03-29 15:46:52 |
403 | 2016-03-28 10:53:59 | MINI_One_Pepper___PDC__hinten_ | privat | Angebot | 17695 | control | kleinwagen | 2015 | manuell | 102 | one | 5000 | 4 | benzin | mini | nein | 2016-03-28 00:00:00 | 0 | 7318 | 2016-04-06 13:44:50 |
418 | 2016-03-29 14:43:24 | Fiat_SCUDO_8_Sitzer_Bus__Diesel_JTD__80_KW | privat | Angebot | 0 | test | bus | 2003 | manuell | 80 | andere | 5000 | 5 | diesel | fiat | nein | 2016-03-29 00:00:00 | 0 | 35315 | 2016-04-05 23:47:14 |
428 | 2016-03-21 19:55:57 | Audi_A3_98 | privat | Angebot | 1000 | test | kleinwagen | 1999 | manuell | 101 | a3 | 5000 | 0 | benzin | audi | ja | 2016-03-21 00:00:00 | 0 | 6258 | 2016-03-28 17:18:38 |
430 | 2016-03-18 23:52:40 | Winterraeder_FORD | privat | Angebot | 0 | test | NaN | 2007 | NaN | 0 | focus | 5000 | 0 | NaN | ford | NaN | 2016-03-18 00:00:00 | 0 | 40549 | 2016-03-19 06:47:06 |
435 | 2016-03-21 20:38:33 | Ford_Puma_zu_verkaufen | privat | Angebot | 550 | control | coupe | 2002 | manuell | 125 | NaN | 5000 | 1 | benzin | ford | NaN | 2016-03-21 00:00:00 | 0 | 32657 | 2016-04-04 21:17:01 |
453 | 2016-03-28 13:51:12 | Armee_Jeep | privat | Angebot | 9800 | test | NaN | 4500 | manuell | 0 | andere | 5000 | 0 | NaN | jeep | NaN | 2016-03-28 00:00:00 | 0 | 7545 | 2016-04-06 17:45:49 |
796 | 2016-03-19 01:54:39 | Top_ford_focus_2003_nur_60_000km!!!!! | privat | Angebot | 3100 | control | limousine | 2003 | manuell | 0 | focus | 5000 | 10 | benzin | ford | NaN | 2016-03-19 00:00:00 | 0 | 41238 | 2016-04-06 05:44:40 |
811 | 2016-03-06 16:41:41 | Toyota_Yaris_sehr_gut_erhalten_2618_KM!! | privat | Angebot | 8690 | test | kleinwagen | 2009 | automatik | 74 | yaris | 5000 | 6 | benzin | toyota | nein | 2016-03-06 00:00:00 | 0 | 70376 | 2016-03-11 23:16:31 |
844 | 2016-03-22 13:38:51 | Golf_Cabrio_zum_Schlachten_ohne_Motor | privat | Angebot | 300 | control | cabrio | 1997 | manuell | 1 | golf | 5000 | 4 | diesel | volkswagen | ja | 2016-03-22 00:00:00 | 0 | 83395 | 2016-04-04 01:15:36 |
1011 | 2016-03-09 19:36:43 | Ford_top_Zustand | privat | Angebot | 1100 | test | limousine | 1999 | manuell | 101 | focus | 5000 | 1 | benzin | ford | nein | 2016-03-09 00:00:00 | 0 | 53474 | 2016-03-12 20:16:11 |
1028 | 2016-03-09 12:52:54 | Polo_86c_Coupe | privat | Angebot | 150 | control | kleinwagen | 1994 | NaN | 45 | polo | 5000 | 3 | benzin | volkswagen | NaN | 2016-03-09 00:00:00 | 0 | 57250 | 2016-03-09 21:46:41 |
1039 | 2016-03-31 11:48:48 | Vw_polo_classic | privat | Angebot | 150 | control | NaN | 2016 | manuell | 0 | NaN | 5000 | 7 | benzin | volkswagen | nein | 2016-03-31 00:00:00 | 0 | 25767 | 2016-04-04 03:44:36 |
1171 | 2016-03-29 17:53:03 | Seat_Leon_Spielzeug_Auto | privat | Angebot | 2 | control | limousine | 1950 | automatik | 5 | leon | 5000 | 0 | diesel | seat | NaN | 2016-03-29 00:00:00 | 0 | 26919 | 2016-04-06 03:45:23 |
1179 | 2016-03-18 09:39:30 | Andere_russ._LuAZ_967M_Schwimmwagen_NVA_GSSD_a... | privat | Angebot | 3999 | control | suv | 1987 | manuell | 71 | NaN | 5000 | 6 | benzin | sonstige_autos | nein | 2016-03-18 00:00:00 | 0 | 1589 | 2016-03-22 10:44:44 |
1183 | 2016-04-01 15:52:39 | Renault_Twingo_Tuning_in_Blau_1000 | privat | Angebot | 1000 | test | kleinwagen | 2000 | manuell | 58 | twingo | 5000 | 12 | benzin | renault | nein | 2016-04-01 00:00:00 | 0 | 34123 | 2016-04-07 14:56:01 |
1198 | 2016-03-25 10:38:18 | BMW_e34_520i | privat | Angebot | 1300 | test | limousine | 1992 | automatik | 150 | 5er | 5000 | 5 | benzin | bmw | NaN | 2016-03-25 00:00:00 | 0 | 86573 | 2016-04-06 09:45:36 |
1358 | 2016-03-29 14:36:59 | Trabant_601_S | privat | Angebot | 750 | test | limousine | 1989 | manuell | 26 | 601 | 5000 | 5 | benzin | trabant | ja | 2016-03-29 00:00:00 | 0 | 12524 | 2016-03-31 07:15:32 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
48347 | 2016-04-01 19:56:37 | OPEL_CORSA_1.2 | privat | Angebot | 300 | control | NaN | 2016 | manuell | 65 | corsa | 5000 | 9 | benzin | opel | NaN | 2016-04-01 00:00:00 | 0 | 59555 | 2016-04-01 19:56:37 |
48391 | 2016-04-01 15:48:17 | VW_Polo_BastlerAuto | privat | Angebot | 800 | test | kleinwagen | 1999 | manuell | 60 | polo | 5000 | 11 | benzin | volkswagen | NaN | 2016-04-01 00:00:00 | 0 | 46519 | 2016-04-07 13:50:01 |
48409 | 2016-03-14 22:55:21 | Bmw_330d_coupe_e92 | privat | Angebot | 13000 | control | coupe | 2007 | automatik | 300 | 3er | 5000 | 1 | diesel | bmw | nein | 2016-03-14 00:00:00 | 0 | 65203 | 2016-04-01 03:45:48 |
48425 | 2016-03-29 15:59:17 | Opel_Adam_1.4_Turbo_ecoFLEX_S | privat | Angebot | 18200 | control | kleinwagen | 2015 | manuell | 150 | andere | 5000 | 7 | benzin | opel | nein | 2016-03-29 00:00:00 | 0 | 85221 | 2016-04-06 02:15:45 |
48483 | 2016-03-14 19:36:16 | Corsa_D_Energy_neuwertig_mit_2900_km | privat | Angebot | 8999 | control | kleinwagen | 2014 | manuell | 70 | corsa | 5000 | 11 | benzin | opel | nein | 2016-03-14 00:00:00 | 0 | 57462 | 2016-04-06 04:17:20 |
48545 | 2016-04-03 21:58:29 | Volkswagen_Golf_Variant_1.6_Automatik_. | privat | Angebot | 1450 | test | kombi | 2000 | automatik | 101 | golf | 5000 | 8 | benzin | volkswagen | nein | 2016-04-03 00:00:00 | 0 | 53773 | 2016-04-05 23:44:51 |
48597 | 2016-03-14 16:48:08 | Skoda_Fabia___Tuev_neu___8_mal_bereift___4_Tue... | privat | Angebot | 1799 | test | limousine | 2000 | manuell | 75 | fabia | 5000 | 7 | benzin | skoda | nein | 2016-03-14 00:00:00 | 0 | 24536 | 2016-03-20 04:44:40 |
48616 | 2016-03-15 23:55:53 | Seat_Leon_1.8_TSI_Start&Stop_FR_NAVI+_Garantie | privat | Angebot | 21999 | control | limousine | 2015 | manuell | 179 | leon | 5000 | 0 | benzin | seat | nein | 2016-03-15 00:00:00 | 0 | 41539 | 2016-03-28 01:17:06 |
48763 | 2016-03-14 11:08:01 | Tausche_oder_Verkaufe__Bmw_740_v8_6_Schalter_. | privat | Angebot | 4500 | control | NaN | 2017 | manuell | 0 | NaN | 5000 | 0 | benzin | bmw | nein | 2016-03-13 00:00:00 | 0 | 35428 | 2016-03-18 01:17:30 |
48773 | 2016-04-01 17:37:51 | Smart_For_Two | privat | Angebot | 1111 | test | kleinwagen | 1999 | manuell | 45 | fortwo | 5000 | 6 | benzin | smart | ja | 2016-04-01 00:00:00 | 0 | 76149 | 2016-04-05 12:46:09 |
48832 | 2016-03-23 17:59:06 | Renault_Clio_Verhandlungsbasis | privat | Angebot | 1500 | control | kleinwagen | 2004 | manuell | 75 | clio | 5000 | 4 | benzin | renault | nein | 2016-03-23 00:00:00 | 0 | 31832 | 2016-03-27 23:16:06 |
48985 | 2016-03-27 08:55:25 | Stockcar_Autos_Peugeot | privat | Angebot | 0 | test | NaN | 2007 | NaN | 60 | NaN | 5000 | 0 | NaN | peugeot | NaN | 2016-03-27 00:00:00 | 0 | 17153 | 2016-04-07 05:17:18 |
49153 | 2016-03-12 01:36:59 | Corsa_c20xe | privat | Angebot | 2500 | test | NaN | 5000 | NaN | 0 | corsa | 5000 | 0 | NaN | opel | NaN | 2016-03-12 00:00:00 | 0 | 88214 | 2016-03-12 22:15:17 |
49189 | 2016-03-24 17:37:23 | Skoda_Fabia_Combi_1.2_TSI_Ambition | privat | Angebot | 11995 | control | kombi | 2016 | manuell | 90 | fabia | 5000 | 3 | benzin | skoda | nein | 2016-03-24 00:00:00 | 0 | 82229 | 2016-04-07 10:17:21 |
49230 | 2016-03-11 18:38:43 | Vw_polo_sparsam_mit_faltdach_muss_weg. | privat | Angebot | 400 | test | kleinwagen | 1999 | NaN | 0 | polo | 5000 | 10 | benzin | volkswagen | NaN | 2016-03-11 00:00:00 | 0 | 66333 | 2016-03-21 05:46:29 |
49263 | 2016-04-02 15:55:11 | VW_Passat_1.9TDI_Kombi | privat | Angebot | 1250 | control | kombi | 2003 | automatik | 1998 | passat | 5000 | 12 | diesel | volkswagen | nein | 2016-04-02 00:00:00 | 0 | 33719 | 2016-04-06 14:46:09 |
49283 | 2016-03-15 18:38:53 | Citroen_HY | privat | Angebot | 7750 | control | NaN | 1001 | NaN | 0 | andere | 5000 | 0 | NaN | citroen | NaN | 2016-03-15 00:00:00 | 0 | 66706 | 2016-04-06 18:47:20 |
49318 | 2016-03-17 18:39:13 | Verkaufe_mein_Polo | privat | Angebot | 400 | control | kleinwagen | 1995 | NaN | 0 | polo | 5000 | 4 | NaN | volkswagen | NaN | 2016-03-17 00:00:00 | 0 | 28832 | 2016-04-01 08:44:50 |
49324 | 2016-04-01 16:36:34 | Mercedes_Benz_Vito_111_BlueTEC_Tourer_Kompakt_PRO | privat | Angebot | 29500 | control | kombi | 2016 | manuell | 114 | vito | 5000 | 2 | diesel | mercedes_benz | nein | 2016-04-01 00:00:00 | 0 | 35037 | 2016-04-07 12:44:36 |
49334 | 2016-04-03 16:37:23 | Passat_2.0_turbo | privat | Angebot | 4950 | control | limousine | 2007 | manuell | 200 | passat | 5000 | 3 | benzin | volkswagen | nein | 2016-04-03 00:00:00 | 0 | 10317 | 2016-04-03 16:37:23 |
49340 | 2016-03-05 19:52:28 | Nagelneuer_Adam__viele_Extras__aus_Gewinn._NIE... | privat | Angebot | 12990 | test | kleinwagen | 2016 | manuell | 87 | andere | 5000 | 3 | benzin | opel | nein | 2016-03-03 00:00:00 | 0 | 48249 | 2016-03-22 19:18:46 |
49437 | 2016-03-31 17:47:40 | Mazda_CX_5_2.2_SKYACTIV_D_AWD___LEDER___NAVI | privat | Angebot | 30700 | control | suv | 2015 | manuell | 175 | cx_reihe | 5000 | 2 | diesel | mazda | nein | 2016-03-31 00:00:00 | 0 | 94209 | 2016-04-06 11:17:39 |
49484 | 2016-03-12 15:57:10 | ich_biete_einen_mercedes_w_203_sport_coupe_230 | privat | Angebot | 2000 | control | limousine | 2001 | automatik | 197 | c_klasse | 5000 | 3 | benzin | mercedes_benz | ja | 2016-03-12 00:00:00 | 0 | 66333 | 2016-03-28 22:16:03 |
49496 | 2016-03-26 13:55:28 | Bmw_e39_520 | privat | Angebot | 0 | control | limousine | 1998 | manuell | 0 | NaN | 5000 | 0 | NaN | bmw | NaN | 2016-03-26 00:00:00 | 0 | 26188 | 2016-03-26 13:55:28 |
49581 | 2016-03-15 10:58:06 | Suzuki_Baleno_1_3_TÜV_neu | privat | Angebot | 790 | control | kleinwagen | 1995 | manuell | 0 | andere | 5000 | 6 | benzin | suzuki | NaN | 2016-03-15 00:00:00 | 0 | 1594 | 2016-03-22 10:17:49 |
49722 | 2016-03-29 10:37:17 | fghgfhfghfgh | privat | Angebot | 200 | control | NaN | 1960 | NaN | 0 | 145 | 5000 | 0 | NaN | alfa_romeo | NaN | 2016-03-29 00:00:00 | 0 | 24960 | 2016-03-29 11:39:37 |
49844 | 2016-03-26 19:43:51 | Fahrzeug_Ankauf!!!! | privat | Angebot | 22222 | control | NaN | 2005 | NaN | 0 | NaN | 5000 | 0 | NaN | opel | NaN | 2016-03-26 00:00:00 | 0 | 72160 | 2016-03-26 19:43:51 |
49845 | 2016-03-21 00:50:15 | Schlachte_VW_Sharan_vr6_Automatik___no_GTI_16V... | privat | Angebot | 1 | test | NaN | 2000 | automatik | 174 | sharan | 5000 | 6 | benzin | volkswagen | nein | 2016-03-20 00:00:00 | 0 | 14621 | 2016-04-05 21:16:16 |
49865 | 2016-03-10 01:37:42 | Mercedes_CLK_270_CDI_TÜV_bis_11_2017_klimaauto... | privat | Angebot | 5200 | test | coupe | 2004 | automatik | 0 | clk | 5000 | 12 | diesel | mercedes_benz | NaN | 2016-03-09 00:00:00 | 0 | 80686 | 2016-03-31 07:46:31 |
49997 | 2016-04-02 14:44:48 | Fiat_500_C_1.2_Dualogic_Lounge | privat | Angebot | 13200 | test | cabrio | 2014 | automatik | 69 | 500 | 5000 | 11 | benzin | fiat | nein | 2016-04-02 00:00:00 | 0 | 73430 | 2016-04-04 11:47:27 |
967 rows × 20 columns
print(autos['price'].unique().shape)
print(autos['price'].describe())
autos["price"].value_counts().head(20)
(2357,)
count    5.000000e+04
mean     9.840044e+03
std      4.811044e+05
min      0.000000e+00
25%      1.100000e+03
50%      2.950000e+03
75%      7.200000e+03
max      1.000000e+08
Name: price, dtype: float64
0       1421
500      781
1500     734
2500     643
1000     639
1200     639
600      531
800      498
3500     498
2000     460
999      434
750      433
900      420
650      419
850      410
700      395
4500     394
300      384
2200     382
950      379
Name: price, dtype: int64
autos[(autos["odometer_km"]<=5000) & (autos["price"]==0)]
date_crawled | name | seller | offer_type | price | abtest | vehicle_type | registration_year | gear_box | power_ps | model | odometer_km | registration_month | fuel_type | brand | unrepaired_damage | ad_created | num_pics | postal_code | last_seen_date | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
71 | 2016-03-28 19:39:35 | Suche_Opel_Astra_F__Corsa_oder_Kadett_E_mit_Re... | privat | Angebot | 0 | control | NaN | 1990 | manuell | 0 | NaN | 5000 | 0 | benzin | opel | NaN | 2016-03-28 00:00:00 | 0 | 4552 | 2016-04-07 01:45:48 |
167 | 2016-04-02 19:43:45 | Suche_VW_Multivan_Innenausstattung_Set_oder_TE... | privat | Angebot | 0 | control | NaN | 2011 | NaN | 0 | transporter | 5000 | 0 | NaN | volkswagen | NaN | 2016-04-02 00:00:00 | 0 | 64739 | 2016-04-06 19:45:08 |
226 | 2016-03-25 23:52:12 | Porsche_911_S_Targa__67er_SWB | privat | Angebot | 0 | control | cabrio | 1967 | manuell | 160 | 911 | 5000 | 12 | benzin | porsche | nein | 2016-03-25 00:00:00 | 0 | 44575 | 2016-04-05 14:46:39 |
259 | 2016-04-03 23:49:58 | guenstiges_Auto_/_auch_defekt | privat | Angebot | 0 | control | NaN | 2000 | NaN | 0 | NaN | 5000 | 6 | NaN | sonstige_autos | NaN | 2016-04-03 00:00:00 | 0 | 89269 | 2016-04-06 07:16:22 |
301 | 2016-03-08 20:37:59 | Kaufe_alle_Autos_bietet_an | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-08 00:00:00 | 0 | 13589 | 2016-04-05 18:44:43 |
418 | 2016-03-29 14:43:24 | Fiat_SCUDO_8_Sitzer_Bus__Diesel_JTD__80_KW | privat | Angebot | 0 | test | bus | 2003 | manuell | 80 | andere | 5000 | 5 | diesel | fiat | nein | 2016-03-29 00:00:00 | 0 | 35315 | 2016-04-05 23:47:14 |
430 | 2016-03-18 23:52:40 | Winterraeder_FORD | privat | Angebot | 0 | test | NaN | 2007 | NaN | 0 | focus | 5000 | 0 | NaN | ford | NaN | 2016-03-18 00:00:00 | 0 | 40549 | 2016-03-19 06:47:06 |
1937 | 2016-03-19 08:51:48 | Vw_polo_1_9tdi | privat | Angebot | 0 | test | kombi | 2001 | manuell | 120 | polo | 5000 | 0 | NaN | volkswagen | NaN | 2016-03-19 00:00:00 | 0 | 4720 | 2016-04-06 07:46:03 |
2360 | 2016-04-04 16:44:31 | Polo_86c_3f__g40_g60__vr6 | privat | Angebot | 0 | test | NaN | 1995 | NaN | 0 | NaN | 5000 | 0 | NaN | volkswagen | NaN | 2016-04-04 00:00:00 | 0 | 26529 | 2016-04-06 18:17:16 |
2466 | 2016-03-20 17:57:49 | Auto_Haus_Kanaan_An&Verkauf_Gebrauchtwagen | privat | Angebot | 0 | test | NaN | 2015 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-20 00:00:00 | 0 | 55234 | 2016-03-21 21:17:30 |
2813 | 2016-03-20 09:51:34 | Verkaufe_Fiat_Punto_Felgen | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | punto | 5000 | 0 | NaN | fiat | NaN | 2016-03-20 00:00:00 | 0 | 80999 | 2016-04-06 05:45:13 |
2875 | 2016-04-04 08:53:00 | VW_Polo_3F_Motorsport | privat | Angebot | 0 | control | coupe | 1991 | manuell | 75 | polo | 5000 | 1 | benzin | volkswagen | nein | 2016-04-04 00:00:00 | 0 | 54516 | 2016-04-06 10:45:56 |
3308 | 2016-03-28 20:58:50 | Suche_dringend_ein_Kleinwagen | privat | Angebot | 0 | control | NaN | 2000 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-28 00:00:00 | 0 | 44532 | 2016-04-03 02:20:29 |
3394 | 2016-04-01 18:52:13 | Opel_omega_b_schlachtauto_zu_verschenken | privat | Angebot | 0 | control | NaN | 2000 | automatik | 211 | NaN | 5000 | 0 | benzin | opel | NaN | 2016-04-01 00:00:00 | 0 | 84189 | 2016-04-03 15:46:14 |
3422 | 2016-03-28 18:56:23 | Audi_A6_3.0Tdi_Gewinde___20zoll | privat | Angebot | 0 | control | NaN | 2006 | NaN | 0 | a6 | 5000 | 0 | NaN | audi | NaN | 2016-03-28 00:00:00 | 0 | 26632 | 2016-04-04 22:18:49 |
3643 | 2016-03-27 00:49:35 | Single_frame_golf_4 | privat | Angebot | 0 | test | NaN | 2000 | NaN | 0 | golf | 5000 | 0 | NaN | volkswagen | NaN | 2016-03-26 00:00:00 | 0 | 6528 | 2016-04-07 03:15:23 |
3928 | 2016-03-07 21:45:42 | Oldtimer_GAZ_M_21_Wolga_viele_Ersatzteile_Youn... | privat | Angebot | 0 | control | limousine | 1960 | manuell | 0 | NaN | 5000 | 1 | benzin | sonstige_autos | NaN | 2016-03-07 00:00:00 | 0 | 98630 | 2016-04-06 01:15:19 |
4079 | 2016-03-07 18:38:53 | Opel_Zafira_CNG_zu_verkaufen! | privat | Angebot | 0 | control | NaN | 2016 | NaN | 0 | zafira | 5000 | 0 | cng | opel | ja | 2016-03-07 00:00:00 | 0 | 44135 | 2016-03-08 16:46:52 |
4111 | 2016-03-12 08:53:03 | Brauch_ein_neues_Autos_habt_ihr_was_zum_anbiet... | privat | Angebot | 0 | control | NaN | 2000 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-12 00:00:00 | 0 | 80809 | 2016-03-21 06:17:14 |
4509 | 2016-03-22 01:00:15 | Fahrzeuglackierer | privat | Angebot | 0 | control | NaN | 2015 | NaN | 0 | NaN | 5000 | 0 | benzin | sonstige_autos | NaN | 2016-03-22 00:00:00 | 0 | 38440 | 2016-03-26 08:18:27 |
7172 | 2016-03-14 20:46:25 | 2_Ford_Escort_Cabrio_Paket_Preis | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | escort | 5000 | 3 | NaN | ford | NaN | 2016-03-14 00:00:00 | 0 | 47804 | 2016-03-14 20:46:25 |
7266 | 2016-03-17 19:53:31 | Tauschen_golf3 | privat | Angebot | 0 | control | NaN | 2016 | NaN | 0 | golf | 5000 | 0 | benzin | volkswagen | nein | 2016-03-17 00:00:00 | 0 | 16515 | 2016-03-20 22:44:57 |
7499 | 2016-03-13 16:37:44 | Karosserie__opel_Kadett_c_Limo | privat | Angebot | 0 | test | limousine | 1978 | NaN | 0 | kadett | 5000 | 0 | NaN | opel | NaN | 2016-03-13 00:00:00 | 0 | 93413 | 2016-03-31 21:47:16 |
7512 | 2016-03-28 16:50:43 | Bmw_750i_Tausch_oder_Angebot | privat | Angebot | 0 | test | limousine | 1995 | automatik | 326 | 7er | 5000 | 0 | NaN | bmw | nein | 2016-03-28 00:00:00 | 0 | 23826 | 2016-04-06 22:17:41 |
7672 | 2016-03-13 01:57:22 | FORD_FOCUS_WENIG_KILOMETER | privat | Angebot | 0 | control | NaN | 2018 | NaN | 0 | focus | 5000 | 12 | diesel | ford | nein | 2016-03-13 00:00:00 | 0 | 52385 | 2016-03-14 21:46:28 |
8438 | 2016-03-14 21:37:21 | Jaguar_xj_40_Daimler_Original_Baujahr_1994 | privat | Angebot | 0 | test | NaN | 1995 | automatik | 0 | NaN | 5000 | 3 | NaN | jaguar | nein | 2016-03-14 00:00:00 | 0 | 12099 | 2016-03-25 11:45:38 |
8908 | 2016-03-21 08:57:09 | BMW_E46_320d | privat | Angebot | 0 | control | NaN | 2000 | manuell | 0 | NaN | 5000 | 12 | diesel | bmw | ja | 2016-03-21 00:00:00 | 0 | 96367 | 2016-03-28 03:16:00 |
8955 | 2016-03-28 16:58:49 | Tausche/Verkaufe | privat | Angebot | 0 | test | limousine | 2012 | manuell | 143 | 3er | 5000 | 2 | diesel | bmw | nein | 2016-03-28 00:00:00 | 0 | 8525 | 2016-03-30 08:17:16 |
9061 | 2016-03-09 09:42:55 | Rote_nummer_dringend | privat | Angebot | 0 | test | NaN | 1995 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-09 00:00:00 | 0 | 4318 | 2016-03-11 13:45:55 |
9673 | 2016-03-20 22:40:27 | Suche_PKW_geschenkt_bitte_alles_anbieten_danke | privat | Angebot | 0 | test | NaN | 1990 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-20 00:00:00 | 0 | 30655 | 2016-04-07 07:46:23 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
39015 | 2016-03-21 10:43:29 | Opel_Astra_f | privat | Angebot | 0 | test | kleinwagen | 1996 | manuell | 150 | astra | 5000 | 4 | benzin | opel | NaN | 2016-03-21 00:00:00 | 0 | 38364 | 2016-03-29 17:45:43 |
39371 | 2016-04-02 15:39:51 | biete_unikat_mit_kunstwert | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | golf | 5000 | 0 | NaN | volkswagen | NaN | 2016-04-02 00:00:00 | 0 | 10585 | 2016-04-06 14:16:35 |
39594 | 2016-03-11 23:40:30 | fiat_scudo__fuer_Bastler__laeuft_auf_3_zylinde... | privat | Angebot | 0 | test | bus | 2000 | manuell | 69 | andere | 5000 | 5 | diesel | fiat | NaN | 2016-03-11 00:00:00 | 0 | 21107 | 2016-03-17 15:17:51 |
39604 | 2016-03-23 21:51:40 | Kaefer_1303__Projekt_mit_Subaru_EJ22_Motor_eve... | privat | Angebot | 0 | test | limousine | 1973 | manuell | 136 | kaefer | 5000 | 0 | benzin | volkswagen | NaN | 2016-03-23 00:00:00 | 0 | 9131 | 2016-04-06 04:15:27 |
39635 | 2016-03-24 13:51:55 | SUCHE_fuer_TT8N | privat | Angebot | 0 | control | coupe | 2000 | manuell | 0 | tt | 5000 | 12 | benzin | audi | nein | 2016-03-24 00:00:00 | 0 | 61169 | 2016-04-05 23:46:41 |
40264 | 2016-03-27 11:49:32 | Bmw_316_i_Schlachtfesst_austauschmotor | privat | Angebot | 0 | test | NaN | 2000 | NaN | 0 | 3er | 5000 | 0 | NaN | bmw | NaN | 2016-03-27 00:00:00 | 0 | 87766 | 2016-04-07 07:16:08 |
40324 | 2016-03-29 00:57:04 | Golf_5_Bj.2007 | privat | Angebot | 0 | test | limousine | 2007 | manuell | 140 | golf | 5000 | 6 | benzin | volkswagen | NaN | 2016-03-28 00:00:00 | 0 | 12435 | 2016-04-05 13:45:15 |
40568 | 2016-03-25 22:53:28 | Opel_Vectra_B | privat | Angebot | 0 | control | NaN | 2017 | automatik | 136 | vectra | 5000 | 12 | NaN | opel | nein | 2016-03-25 00:00:00 | 0 | 25554 | 2016-04-05 15:45:46 |
40955 | 2016-03-06 00:54:53 | Zer_Guten_Auto_800_ | privat | Angebot | 0 | control | NaN | 2016 | NaN | 0 | NaN | 5000 | 0 | NaN | chevrolet | NaN | 2016-03-06 00:00:00 | 0 | 28325 | 2016-03-06 00:54:53 |
41251 | 2016-03-24 20:00:29 | Tauschen_e36_cabrio | privat | Angebot | 0 | control | cabrio | 1995 | manuell | 116 | NaN | 5000 | 3 | lpg | bmw | nein | 2016-03-24 00:00:00 | 0 | 26897 | 2016-03-29 06:47:12 |
42181 | 2016-03-27 19:50:53 | SAMSUNG_55_3D_Tv_und_Soundbar_gegen_Auto | privat | Angebot | 0 | test | NaN | 1910 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-27 00:00:00 | 0 | 57080 | 2016-04-06 01:15:30 |
42851 | 2016-03-26 22:54:00 | Mercedes_Benz._____________________Kabine_actr... | privat | Angebot | 0 | test | NaN | 2005 | NaN | 0 | andere | 5000 | 0 | NaN | mercedes_benz | NaN | 2016-03-26 00:00:00 | 0 | 46240 | 2016-04-07 01:45:36 |
43224 | 2016-03-18 23:48:59 | Golf_3_in_Teile__..._einfach_Anfragen__4_Tuerer | privat | Angebot | 0 | test | NaN | 2000 | NaN | 60 | NaN | 5000 | 0 | NaN | volkswagen | NaN | 2016-03-18 00:00:00 | 0 | 32312 | 2016-04-06 00:46:08 |
43481 | 2016-03-20 10:59:19 | Holzkoepfe/Arschmaden | privat | Angebot | 0 | control | NaN | 2000 | NaN | 0 | NaN | 5000 | 8 | NaN | sonstige_autos | NaN | 2016-03-20 00:00:00 | 0 | 23845 | 2016-03-20 10:59:19 |
43492 | 2016-04-03 13:52:33 | Opel_Astra_G_mit_Klima_elektrische_fenster | privat | Angebot | 0 | control | NaN | 2000 | NaN | 101 | astra | 5000 | 0 | NaN | opel | NaN | 2016-04-03 00:00:00 | 0 | 47169 | 2016-04-05 12:46:36 |
44109 | 2016-03-13 19:37:21 | BMW_e30_Sammlungsaufloesung | privat | Angebot | 0 | control | NaN | 1990 | NaN | 0 | 3er | 5000 | 0 | NaN | bmw | NaN | 2016-03-13 00:00:00 | 0 | 97456 | 2016-03-13 19:37:21 |
44195 | 2016-03-11 12:50:39 | whatsapp_gruppe_Ford_focus | privat | Angebot | 0 | test | NaN | 2015 | NaN | 0 | focus | 5000 | 0 | NaN | ford | NaN | 2016-03-11 00:00:00 | 0 | 72770 | 2016-04-07 06:45:56 |
44415 | 2016-03-10 22:55:34 | Auffahrrampe_/_Ausstellungsrampe | privat | Angebot | 0 | control | NaN | 2013 | NaN | 0 | NaN | 5000 | 0 | NaN | sonstige_autos | NaN | 2016-03-10 00:00:00 | 0 | 31789 | 2016-03-18 04:15:19 |
44624 | 2016-03-20 13:44:18 | Opel_zafira__selection!_2.2_benzin__lpg._Defect | privat | Angebot | 0 | test | bus | 2001 | automatik | 150 | NaN | 5000 | 10 | lpg | opel | ja | 2016-03-20 00:00:00 | 0 | 63452 | 2016-03-26 22:47:07 |
46213 | 2016-04-02 13:47:16 | Bellier_Vario | privat | Angebot | 0 | test | kleinwagen | 1910 | NaN | 0 | NaN | 5000 | 1 | andere | sonstige_autos | NaN | 2016-04-02 00:00:00 | 0 | 93105 | 2016-04-04 11:16:30 |
46220 | 2016-03-29 04:03:36 | Tausch_oder_Verkauf_Renault_twingo | privat | Angebot | 0 | test | NaN | 2000 | manuell | 0 | twingo | 5000 | 0 | benzin | renault | NaN | 2016-03-29 00:00:00 | 0 | 71159 | 2016-04-05 17:25:59 |
46665 | 2016-03-24 13:50:22 | FIAT_COUPE_175_ERSATZTEILE | privat | Angebot | 0 | control | coupe | 1994 | manuell | 139 | andere | 5000 | 0 | benzin | fiat | NaN | 2016-03-24 00:00:00 | 0 | 58553 | 2016-04-07 05:17:26 |
47142 | 2016-03-19 08:37:44 | GESUCHT_WIRD_UNBEDINGT_EIN_VOLKSWAGEN_TOURAN_S... | privat | Angebot | 0 | test | bus | 2003 | manuell | 101 | touran | 5000 | 0 | diesel | volkswagen | ja | 2016-03-19 00:00:00 | 0 | 27753 | 2016-04-06 07:17:47 |
47280 | 2016-04-05 11:43:12 | PKW_gesucht___01793917553 | privat | Angebot | 0 | test | NaN | 2005 | manuell | 0 | NaN | 5000 | 0 | benzin | volkswagen | NaN | 2016-04-05 00:00:00 | 0 | 8066 | 2016-04-05 11:43:12 |
47310 | 2016-03-14 04:55:34 | Vw_golf_3_gt | privat | Angebot | 0 | control | NaN | 2016 | manuell | 90 | golf | 5000 | 12 | NaN | volkswagen | NaN | 2016-03-14 00:00:00 | 0 | 99955 | 2016-04-07 06:45:52 |
47368 | 2016-04-01 23:44:14 | ZU_VERSCHENKEN_AUTO | privat | Angebot | 0 | control | kleinwagen | 1998 | manuell | 60 | polo | 5000 | 8 | benzin | volkswagen | NaN | 2016-04-01 00:00:00 | 0 | 69469 | 2016-04-01 23:44:14 |
48193 | 2016-04-04 01:47:49 | Mein_Auto_fuer_Ihre_Werbung | privat | Angebot | 0 | control | NaN | 2005 | NaN | 0 | NaN | 5000 | 0 | NaN | volkswagen | NaN | 2016-04-03 00:00:00 | 0 | 7549 | 2016-04-06 08:16:17 |
48290 | 2016-04-03 17:41:30 | Suche_mk2_Fahrer | privat | Angebot | 0 | test | NaN | 2009 | NaN | 0 | NaN | 5000 | 0 | NaN | ford | NaN | 2016-04-03 00:00:00 | 0 | 45711 | 2016-04-05 17:26:42 |
48985 | 2016-03-27 08:55:25 | Stockcar_Autos_Peugeot | privat | Angebot | 0 | test | NaN | 2007 | NaN | 60 | NaN | 5000 | 0 | NaN | peugeot | NaN | 2016-03-27 00:00:00 | 0 | 17153 | 2016-04-07 05:17:18 |
49496 | 2016-03-26 13:55:28 | Bmw_e39_520 | privat | Angebot | 0 | control | limousine | 1998 | manuell | 0 | NaN | 5000 | 0 | NaN | bmw | NaN | 2016-03-26 00:00:00 | 0 | 26188 | 2016-03-26 13:55:28 |
130 rows × 20 columns
For the odometer_km and price columns, I started by exploring the unique values and summary statistics using Series.unique().shape and Series.describe(). To identify outliers, I filtered the dataset with df[(df["col"] > x) & (df["col"] < y)] or df[df["col"].between(x, y)], as shown above. For instance, I used an odometer value of 5000 km and a price of 0 as the limits: 130 listings combine an odometer reading of at most 5000 km with a price of 0. In total, 1421 listings have a price of 0 and need to be removed from the dataset. Furthermore, the date columns (date_crawled, ad_created, last_seen_date) are stored as strings and need to be converted to a numerical representation, while registration_month and registration_year are already numerical.
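The two filtering forms mentioned above can be sketched on a small made-up frame. This is only an illustration: the `toy` data and the price bounds of 1 and 350,000 are assumptions chosen for the example, not values taken from the analysis.

```python
import pandas as pd

# Toy data standing in for the real `autos` frame (values are made up).
toy = pd.DataFrame({
    "price": [0, 500, 1500, 100_000_000, 2950],
    "odometer_km": [5000, 150000, 70000, 5000, 125000],
})

# Form 1: Series.between keeps rows inside the range (both bounds inclusive).
# The bounds 1 and 350_000 are assumed limits, used only for illustration.
plausible = toy[toy["price"].between(1, 350_000)]

# Form 2: the equivalent explicit boolean mask.
same = toy[(toy["price"] >= 1) & (toy["price"] <= 350_000)]

# Number of rows dropped by the filter.
removed = len(toy) - len(plausible)
print(removed)
```

Note that between() includes both endpoints by default, whereas the strict comparisons > and < exclude them, so the two forms only match when the comparison operators are chosen accordingly.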
To better understand the string date columns (date_crawled, ad_created, last_seen_date) quantitatively, I extract the date part (the first 10 characters) of each value. Series.value_counts() with normalize=True then gives the share of listings per date, while Series.sort_index() and Series.sort_values() sort the result in ascending order by date and by share, respectively, as seen below.
(autos['date_crawled']
.str[:10]
.value_counts(normalize = True, dropna = True)
.sort_index()
)
2016-03-05    0.02538
2016-03-06    0.01394
2016-03-07    0.03596
2016-03-08    0.03330
2016-03-09    0.03322
2016-03-10    0.03212
2016-03-11    0.03248
2016-03-12    0.03678
2016-03-13    0.01556
2016-03-14    0.03662
2016-03-15    0.03398
2016-03-16    0.02950
2016-03-17    0.03152
2016-03-18    0.01306
2016-03-19    0.03490
2016-03-20    0.03782
2016-03-21    0.03752
2016-03-22    0.03294
2016-03-23    0.03238
2016-03-24    0.02910
2016-03-25    0.03174
2016-03-26    0.03248
2016-03-27    0.03104
2016-03-28    0.03484
2016-03-29    0.03418
2016-03-30    0.03362
2016-03-31    0.03192
2016-04-01    0.03380
2016-04-02    0.03540
2016-04-03    0.03868
2016-04-04    0.03652
2016-04-05    0.01310
2016-04-06    0.00318
2016-04-07    0.00142
Name: date_crawled, dtype: float64
(autos["date_crawled"]
.str[:10]
.value_counts(normalize=True, dropna=False)
.sort_values()
)
2016-04-07    0.00142
2016-04-06    0.00318
2016-03-18    0.01306
2016-04-05    0.01310
2016-03-06    0.01394
2016-03-13    0.01556
2016-03-05    0.02538
2016-03-24    0.02910
2016-03-16    0.02950
2016-03-27    0.03104
2016-03-17    0.03152
2016-03-25    0.03174
2016-03-31    0.03192
2016-03-10    0.03212
2016-03-23    0.03238
2016-03-11    0.03248
2016-03-26    0.03248
2016-03-22    0.03294
2016-03-09    0.03322
2016-03-08    0.03330
2016-03-30    0.03362
2016-04-01    0.03380
2016-03-15    0.03398
2016-03-29    0.03418
2016-03-28    0.03484
2016-03-19    0.03490
2016-04-02    0.03540
2016-03-07    0.03596
2016-04-04    0.03652
2016-03-14    0.03662
2016-03-12    0.03678
2016-03-21    0.03752
2016-03-20    0.03782
2016-04-03    0.03868
Name: date_crawled, dtype: float64
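Based on my observation, listings were crawled every day from 2016-03-05 to 2016-04-07. On most days from 2016-03-07 to 2016-04-04 the share of listings is roughly uniform at about 0.029 to 0.039, peaking at 0.03868 on 2016-04-03, with dips on 2016-03-13 (0.01556) and 2016-03-18 (0.01306). The lowest shares fall on the last three days, 2016-04-05 to 2016-04-07 (0.01310, 0.00318 and 0.00142). In general, the share rises at the beginning of March, stays roughly level, and drops off in the first week of April.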
(autos["ad_created"]
.str[:10]
.value_counts(normalize = True, dropna = True)
.sort_index()
)
2015-06-11    0.00002
2015-08-10    0.00002
2015-09-09    0.00002
2015-11-10    0.00002
2015-12-05    0.00002
2015-12-30    0.00002
2016-01-03    0.00002
2016-01-07    0.00002
2016-01-10    0.00004
2016-01-13    0.00002
2016-01-14    0.00002
2016-01-16    0.00002
2016-01-22    0.00002
2016-01-27    0.00006
2016-01-29    0.00002
2016-02-01    0.00002
2016-02-02    0.00004
2016-02-05    0.00004
2016-02-07    0.00002
2016-02-08    0.00002
2016-02-09    0.00004
2016-02-11    0.00002
2016-02-12    0.00006
2016-02-14    0.00004
2016-02-16    0.00002
2016-02-17    0.00002
2016-02-18    0.00004
2016-02-19    0.00006
2016-02-20    0.00004
2016-02-21    0.00006
...
2016-03-09    0.03324
2016-03-10    0.03186
2016-03-11    0.03278
2016-03-12    0.03662
2016-03-13    0.01692
2016-03-14    0.03522
2016-03-15    0.03374
2016-03-16    0.03000
2016-03-17    0.03120
2016-03-18    0.01372
2016-03-19    0.03384
2016-03-20    0.03786
2016-03-21    0.03772
2016-03-22    0.03280
2016-03-23    0.03218
2016-03-24    0.02908
2016-03-25    0.03188
2016-03-26    0.03256
2016-03-27    0.03090
2016-03-28    0.03496
2016-03-29    0.03414
2016-03-30    0.03344
2016-03-31    0.03192
2016-04-01    0.03380
2016-04-02    0.03508
2016-04-03    0.03892
2016-04-04    0.03688
2016-04-05    0.01184
2016-04-06    0.00326
2016-04-07    0.00128
Name: ad_created, Length: 76, dtype: float64
(autos["ad_created"]
.str[:10]
.value_counts(normalize = True, dropna = True)
.sort_values()
)
2016-01-03    0.00002
2015-09-09    0.00002
2015-11-10    0.00002
2016-02-08    0.00002
2016-01-22    0.00002
2016-01-29    0.00002
2015-12-30    0.00002
2015-08-10    0.00002
2016-01-14    0.00002
2016-02-07    0.00002
2016-02-01    0.00002
2016-02-11    0.00002
2015-12-05    0.00002
2016-02-17    0.00002
2016-02-22    0.00002
2015-06-11    0.00002
2016-02-16    0.00002
2016-01-16    0.00002
2016-01-13    0.00002
2016-01-07    0.00002
2016-01-10    0.00004
2016-02-02    0.00004
2016-02-18    0.00004
2016-02-09    0.00004
2016-02-20    0.00004
2016-02-26    0.00004
2016-02-14    0.00004
2016-02-24    0.00004
2016-02-05    0.00004
2016-02-19    0.00006
...
2016-03-06    0.01512
2016-03-13    0.01692
2016-03-05    0.02304
2016-03-24    0.02908
2016-03-16    0.03000
2016-03-27    0.03090
2016-03-17    0.03120
2016-03-10    0.03186
2016-03-25    0.03188
2016-03-31    0.03192
2016-03-23    0.03218
2016-03-26    0.03256
2016-03-11    0.03278
2016-03-22    0.03280
2016-03-09    0.03324
2016-03-08    0.03334
2016-03-30    0.03344
2016-03-15    0.03374
2016-04-01    0.03380
2016-03-19    0.03384
2016-03-29    0.03414
2016-03-07    0.03474
2016-03-28    0.03496
2016-04-02    0.03508
2016-03-14    0.03522
2016-03-12    0.03662
2016-04-04    0.03688
2016-03-21    0.03772
2016-03-20    0.03786
2016-04-03    0.03892
Name: ad_created, Length: 76, dtype: float64
Based on these observations, ads created between June 2015 and February 2016 are rare: each of those dates shown accounts for only 0.00002 to 0.00006 of the listings. Most ads were created in March and early April 2016, with the share peaking at 0.03892 on 2016-04-03.
(autos["last_seen_date"]
.str[:10]
.value_counts(normalize = True, dropna = True)
.sort_index()
)
2016-03-05    0.00108
2016-03-06    0.00442
2016-03-07    0.00536
2016-03-08    0.00760
2016-03-09    0.00986
2016-03-10    0.01076
2016-03-11    0.01252
2016-03-12    0.02382
2016-03-13    0.00898
2016-03-14    0.01280
2016-03-15    0.01588
2016-03-16    0.01644
2016-03-17    0.02792
2016-03-18    0.00742
2016-03-19    0.01574
2016-03-20    0.02070
2016-03-21    0.02074
2016-03-22    0.02158
2016-03-23    0.01858
2016-03-24    0.01956
2016-03-25    0.01920
2016-03-26    0.01696
2016-03-27    0.01602
2016-03-28    0.02086
2016-03-29    0.02234
2016-03-30    0.02484
2016-03-31    0.02384
2016-04-01    0.02310
2016-04-02    0.02490
2016-04-03    0.02536
2016-04-04    0.02462
2016-04-05    0.12428
2016-04-06    0.22100
2016-04-07    0.13092
Name: last_seen_date, dtype: float64
(autos["last_seen_date"]
.str[:10]
.value_counts(normalize = True, dropna = False)
.sort_values()
)
2016-03-05    0.00108
2016-03-06    0.00442
2016-03-07    0.00536
2016-03-18    0.00742
2016-03-08    0.00760
2016-03-13    0.00898
2016-03-09    0.00986
2016-03-10    0.01076
2016-03-11    0.01252
2016-03-14    0.01280
2016-03-19    0.01574
2016-03-15    0.01588
2016-03-27    0.01602
2016-03-16    0.01644
2016-03-26    0.01696
2016-03-23    0.01858
2016-03-25    0.01920
2016-03-24    0.01956
2016-03-20    0.02070
2016-03-21    0.02074
2016-03-28    0.02086
2016-03-22    0.02158
2016-03-29    0.02234
2016-04-01    0.02310
2016-03-12    0.02382
2016-03-31    0.02384
2016-04-04    0.02462
2016-03-30    0.02484
2016-04-02    0.02490
2016-04-03    0.02536
2016-03-17    0.02792
2016-04-05    0.12428
2016-04-07    0.13092
2016-04-06    0.22100
Name: last_seen_date, dtype: float64
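Between 2016-03-05 and 2016-04-04, the share of listings last seen on each day trends upward from 0.00108 to about 0.025, with some day-to-day fluctuation (for example 0.02792 on 2016-03-17 followed by 0.00742 on 2016-03-18). On the last three days the share jumps to 0.12428 (2016-04-05), 0.22100 (2016-04-06) and 0.13092 (2016-04-07). Since these are the final days of crawling, the jump likely reflects the end of the crawl period rather than a real spike in listings being taken down.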
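The date strings explored above could also be turned into real numeric or datetime values instead of being sliced as text. The following is a minimal sketch on three sample timestamps copied from the table at the top of the notebook; the variable names `sample`, `as_int`, `as_dt` and `daily_share` are hypothetical and not part of the original analysis.

```python
import pandas as pd

# Three timestamps in the same "YYYY-MM-DD HH:MM:SS" layout as date_crawled.
sample = pd.Series(
    ["2016-03-26 17:47:46", "2016-04-04 13:38:56", "2016-03-26 18:57:24"],
    name="date_crawled",
)

# Option 1: keep the date part and cast YYYYMMDD to an integer.
as_int = sample.str[:10].str.replace("-", "", regex=False).astype(int)

# Option 2: parse to datetime64, then count listings per calendar day.
as_dt = pd.to_datetime(sample, format="%Y-%m-%d %H:%M:%S")
daily_share = (
    as_dt.dt.normalize()            # drop the time of day, keep the date
    .value_counts(normalize=True)   # share of rows per day
    .sort_index()                   # chronological order
)
print(as_int.tolist())
print(daily_share)
```

The integer form sorts correctly and is cheap to compare, while the datetime form additionally allows date arithmetic and resampling, so which one fits better depends on what is done with the column afterwards.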
autos["registration_year"].describe() # To get info about statistics of registration year
count    50000.000000
mean      2005.073280
std        105.712813
min       1000.000000
25%       1999.000000
50%       2003.000000
75%       2008.000000
max       9999.000000
Name: registration_year, dtype: float64
The registration year ranges from a minimum of 1000 to a maximum of 9999, both of which are impossible values for a car. The mean is 2005.1 and the standard deviation is 105.7.
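Before dropping implausible years, it can help to check how much of the data would be lost. Here is a minimal sketch on made-up values; the `years` series is invented for illustration, and the 1900 to 2016 range is used because the listings were crawled in 2016, so a later registration year cannot be correct.

```python
import pandas as pd

# Made-up registration years, including impossible values like 1000 and 9999.
years = pd.Series([1000, 1997, 2004, 2016, 2017, 9999], name="registration_year")

# Boolean mask: True where the year lies in [1900, 2016] (inclusive bounds).
in_range = years.between(1900, 2016)

# Share of rows that fall outside the plausible range.
share_outside = (~in_range).mean()

# Keep only the plausible rows and re-check the summary statistics.
cleaned = years[in_range]
print(share_outside)
print(cleaned.describe())
```

If the share outside the range is small, removing those rows should have little effect on the rest of the analysis.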
autos[autos["registration_year"].between(1900, 2016)]  # Select only rows whose registration year is between 1900 and 2016 (inclusive)
date_crawled | name | seller | offer_type | price | abtest | vehicle_type | registration_year | gear_box | power_ps | model | odometer_km | registration_month | fuel_type | brand | unrepaired_damage | ad_created | num_pics | postal_code | last_seen_date | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2016-03-26 17:47:46 | Peugeot_807_160_NAVTECH_ON_BOARD | privat | Angebot | 5000 | control | bus | 2004 | manuell | 158 | andere | 150000 | 3 | lpg | peugeot | nein | 2016-03-26 00:00:00 | 0 | 79588 | 2016-04-06 06:45:54 |
1 | 2016-04-04 13:38:56 | BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik | privat | Angebot | 8500 | control | limousine | 1997 | automatik | 286 | 7er | 150000 | 6 | benzin | bmw | nein | 2016-04-04 00:00:00 | 0 | 71034 | 2016-04-06 14:45:08 |
2 | 2016-03-26 18:57:24 | Volkswagen_Golf_1.6_United | privat | Angebot | 8990 | test | limousine | 2009 | manuell | 102 | golf | 70000 | 7 | benzin | volkswagen | nein | 2016-03-26 00:00:00 | 0 | 35394 | 2016-04-06 20:15:37 |
3 | 2016-03-12 16:58:10 | Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan... | privat | Angebot | 4350 | control | kleinwagen | 2007 | automatik | 71 | fortwo | 70000 | 6 | benzin | smart | nein | 2016-03-12 00:00:00 | 0 | 33729 | 2016-03-15 03:16:28 |
4 | 2016-04-01 14:38:50 | Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg... | privat | Angebot | 1350 | test | kombi | 2003 | manuell | 0 | focus | 150000 | 7 | benzin | ford | nein | 2016-04-01 00:00:00 | 0 | 39218 | 2016-04-01 14:38:50 |
5 | 2016-03-21 13:47:45 | Chrysler_Grand_Voyager_2.8_CRD_Aut.Limited_Sto... | privat | Angebot | 7900 | test | bus | 2006 | automatik | 150 | voyager | 150000 | 4 | diesel | chrysler | NaN | 2016-03-21 00:00:00 | 0 | 22962 | 2016-04-06 09:45:21 |
6 | 2016-03-20 17:55:21 | VW_Golf_III_GT_Special_Electronic_Green_Metall... | privat | Angebot | 300 | test | limousine | 1995 | manuell | 90 | golf | 150000 | 8 | benzin | volkswagen | NaN | 2016-03-20 00:00:00 | 0 | 31535 | 2016-03-23 02:48:59 |
7 | 2016-03-16 18:55:19 | Golf_IV_1.9_TDI_90PS | privat | Angebot | 1990 | control | limousine | 1998 | manuell | 90 | golf | 150000 | 12 | diesel | volkswagen | nein | 2016-03-16 00:00:00 | 0 | 53474 | 2016-04-07 03:17:32 |
8 | 2016-03-22 16:51:34 | Seat_Arosa | privat | Angebot | 250 | test | NaN | 2000 | manuell | 0 | arosa | 150000 | 10 | NaN | seat | nein | 2016-03-22 00:00:00 | 0 | 7426 | 2016-03-26 18:18:10 |
9 | 2016-03-16 13:47:02 | Renault_Megane_Scenic_1.6e_RT_Klimaanlage | privat | Angebot | 590 | control | bus | 1997 | manuell | 90 | megane | 150000 | 7 | benzin | renault | nein | 2016-03-16 00:00:00 | 0 | 15749 | 2016-04-06 10:46:35 |
11 | 2016-03-16 18:45:34 | Mercedes_A140_Motorschaden | privat | Angebot | 350 | control | NaN | 2000 | NaN | 0 | NaN | 150000 | 0 | benzin | mercedes_benz | NaN | 2016-03-16 00:00:00 | 0 | 17498 | 2016-03-16 18:45:34 |
12 | 2016-03-31 19:48:22 | Smart_smart_fortwo_coupe_softouch_pure_MHD_Pan... | privat | Angebot | 5299 | control | kleinwagen | 2010 | automatik | 71 | fortwo | 50000 | 9 | benzin | smart | nein | 2016-03-31 00:00:00 | 0 | 34590 | 2016-04-06 14:17:52 |
13 | 2016-03-23 10:48:32 | Audi_A3_1.6_tuning | privat | Angebot | 1350 | control | limousine | 1999 | manuell | 101 | a3 | 150000 | 11 | benzin | audi | nein | 2016-03-23 00:00:00 | 0 | 12043 | 2016-04-01 14:17:13 |
14 | 2016-03-23 11:50:46 | Renault_Clio_3__Dynamique_1.2__16_V;_viele_Ver... | privat | Angebot | 3999 | test | kleinwagen | 2007 | manuell | 75 | clio | 150000 | 9 | benzin | renault | NaN | 2016-03-23 00:00:00 | 0 | 81737 | 2016-04-01 15:46:47 |
15 | 2016-04-01 12:06:20 | Corvette_C3_Coupe_T_Top_Crossfire_Injection | privat | Angebot | 18900 | test | coupe | 1982 | automatik | 203 | NaN | 80000 | 6 | benzin | sonstige_autos | nein | 2016-04-01 00:00:00 | 0 | 61276 | 2016-04-02 21:10:48 |
16 | 2016-03-16 14:59:02 | Opel_Vectra_B_Kombi | privat | Angebot | 350 | test | kombi | 1999 | manuell | 101 | vectra | 150000 | 5 | benzin | opel | nein | 2016-03-16 00:00:00 | 0 | 57299 | 2016-03-18 05:29:37 |
17 | 2016-03-29 11:46:22 | Volkswagen_Scirocco_2_G60 | privat | Angebot | 5500 | test | coupe | 1990 | manuell | 205 | scirocco | 150000 | 6 | benzin | volkswagen | nein | 2016-03-29 00:00:00 | 0 | 74821 | 2016-04-05 20:46:26 |
18 | 2016-03-26 19:57:44 | Verkaufen_mein_bmw_e36_320_i_touring | privat | Angebot | 300 | control | bus | 1995 | manuell | 150 | 3er | 150000 | 0 | benzin | bmw | NaN | 2016-03-26 00:00:00 | 0 | 54329 | 2016-04-02 12:16:41 |
19 | 2016-03-17 13:36:21 | mazda_tribute_2.0_mit_gas_und_tuev_neu_2018 | privat | Angebot | 4150 | control | suv | 2004 | manuell | 124 | andere | 150000 | 2 | lpg | mazda | nein | 2016-03-17 00:00:00 | 0 | 40878 | 2016-03-17 14:45:58 |
20 | 2016-03-05 19:57:31 | Audi_A4_Avant_1.9_TDI_*6_Gang*AHK*Klimatronik*... | privat | Angebot | 3500 | test | kombi | 2003 | manuell | 131 | a4 | 150000 | 5 | diesel | audi | NaN | 2016-03-05 00:00:00 | 0 | 53913 | 2016-03-07 05:46:46 |
21 | 2016-03-06 19:07:10 | Porsche_911_Carrera_4S_Cabrio | privat | Angebot | 41500 | test | cabrio | 2004 | manuell | 320 | 911 | 150000 | 4 | benzin | porsche | nein | 2016-03-06 00:00:00 | 0 | 65428 | 2016-04-05 23:46:19 |
22 | 2016-03-28 20:50:54 | MINI_Cooper_S_Cabrio | privat | Angebot | 25450 | control | cabrio | 2015 | manuell | 184 | cooper | 10000 | 1 | benzin | mini | nein | 2016-03-28 00:00:00 | 0 | 44789 | 2016-04-01 06:45:30 |
23 | 2016-03-10 19:55:34 | Peugeot_Boxer_2_2_HDi_120_Ps_9_Sitzer_inkl_Klima | privat | Angebot | 7999 | control | bus | 2010 | manuell | 120 | NaN | 150000 | 2 | diesel | peugeot | nein | 2016-03-10 00:00:00 | 0 | 30900 | 2016-03-17 08:45:17 |
24 | 2016-04-03 11:57:02 | BMW_535i_xDrive_Sport_Aut. | privat | Angebot | 48500 | control | limousine | 2014 | automatik | 306 | 5er | 30000 | 12 | benzin | bmw | nein | 2016-04-03 00:00:00 | 0 | 22547 | 2016-04-07 13:16:50 |
25 | 2016-03-21 21:56:18 | Ford_escort_kombi_an_bastler_mit_ghia_ausstattung | privat | Angebot | 90 | control | kombi | 1996 | manuell | 116 | NaN | 150000 | 4 | benzin | ford | ja | 2016-03-21 00:00:00 | 0 | 27574 | 2016-04-01 05:16:49 |
26 | 2016-04-03 22:46:28 | Volkswagen_Polo_Fox | privat | Angebot | 777 | control | kleinwagen | 1992 | manuell | 54 | polo | 125000 | 2 | benzin | volkswagen | nein | 2016-04-03 00:00:00 | 0 | 38110 | 2016-04-05 23:46:48 |
27 | 2016-03-27 18:45:01 | Hat_einer_Ahnung_mit_Ford_Galaxy_HILFE | privat | Angebot | 0 | control | NaN | 2005 | NaN | 0 | NaN | 150000 | 0 | NaN | ford | NaN | 2016-03-27 00:00:00 | 0 | 66701 | 2016-03-27 18:45:01 |
28 | 2016-03-19 21:56:19 | MINI_Cooper_D | privat | Angebot | 5250 | control | kleinwagen | 2007 | manuell | 110 | cooper | 150000 | 7 | diesel | mini | ja | 2016-03-19 00:00:00 | 0 | 15745 | 2016-04-07 14:58:48 |
29 | 2016-04-02 12:45:44 | Mercedes_Benz_E_320_T_CDI_Avantgarde_DPF7_Sitz... | privat | Angebot | 4999 | test | kombi | 2004 | automatik | 204 | e_klasse | 150000 | 10 | diesel | mercedes_benz | nein | 2016-04-02 00:00:00 | 0 | 47638 | 2016-04-02 12:45:44 |
30 | 2016-03-14 11:47:31 | Peugeot_206_Unfallfahrzeug | privat | Angebot | 80 | test | kleinwagen | 2002 | manuell | 60 | 2_reihe | 150000 | 6 | benzin | peugeot | ja | 2016-03-14 00:00:00 | 0 | 57076 | 2016-03-14 11:47:31 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
49970 | 2016-03-21 22:47:37 | c4_Grand_Picasso_mit_Automatik_Leder_Navi_Temp... | privat | Angebot | 15800 | control | bus | 2010 | automatik | 136 | c4 | 60000 | 4 | diesel | citroen | nein | 2016-03-21 00:00:00 | 0 | 14947 | 2016-04-07 04:17:34 |
49971 | 2016-03-29 14:54:12 | W.Lupo_1.0 | privat | Angebot | 950 | test | kleinwagen | 2001 | manuell | 50 | lupo | 150000 | 4 | benzin | volkswagen | nein | 2016-03-29 00:00:00 | 0 | 65197 | 2016-03-29 20:41:51 |
49972 | 2016-03-26 22:25:23 | Mercedes_Benz_Vito_115_CDI_Extralang_Aut. | privat | Angebot | 3300 | control | bus | 2004 | automatik | 150 | vito | 150000 | 10 | diesel | mercedes_benz | ja | 2016-03-26 00:00:00 | 0 | 65326 | 2016-03-28 11:28:18 |
49973 | 2016-03-27 05:32:39 | Mercedes_Benz_SLK_200_Kompressor | privat | Angebot | 6000 | control | cabrio | 2004 | manuell | 163 | slk | 150000 | 11 | benzin | mercedes_benz | nein | 2016-03-27 00:00:00 | 0 | 53567 | 2016-03-27 08:25:24 |
49974 | 2016-03-20 10:52:31 | Golf_1_Cabrio_Tuev_Neu_viele_Extras_alles_eing... | privat | Angebot | 0 | control | cabrio | 1983 | manuell | 70 | golf | 150000 | 2 | benzin | volkswagen | nein | 2016-03-20 00:00:00 | 0 | 8209 | 2016-03-27 19:48:16 |
49975 | 2016-03-27 20:51:39 | Honda_Jazz_1.3_DSi_i_VTEC_IMA_CVT_Comfort | privat | Angebot | 9700 | control | kleinwagen | 2012 | automatik | 88 | jazz | 100000 | 11 | hybrid | honda | nein | 2016-03-27 00:00:00 | 0 | 84385 | 2016-04-05 19:45:34 |
49976 | 2016-03-19 18:56:05 | Audi_80_Avant_2.6_E__Vollausstattung!!_Einziga... | privat | Angebot | 5900 | test | kombi | 1992 | automatik | 150 | 80 | 150000 | 12 | benzin | audi | nein | 2016-03-19 00:00:00 | 0 | 36100 | 2016-04-07 06:16:44 |
49977 | 2016-03-31 18:37:18 | Mercedes_Benz_C200_Cdi_W203 | privat | Angebot | 5500 | control | limousine | 2003 | manuell | 116 | c_klasse | 150000 | 2 | diesel | mercedes_benz | nein | 2016-03-31 00:00:00 | 0 | 33739 | 2016-04-06 12:16:11 |
49978 | 2016-04-04 10:37:14 | Mercedes_Benz_E_200_Classic | privat | Angebot | 900 | control | limousine | 1996 | automatik | 136 | e_klasse | 150000 | 9 | benzin | mercedes_benz | ja | 2016-04-04 00:00:00 | 0 | 24405 | 2016-04-06 12:44:20 |
49979 | 2016-03-20 18:38:40 | Volkswagen_Polo_1.6_TDI_Style | privat | Angebot | 11000 | test | kleinwagen | 2011 | manuell | 90 | polo | 70000 | 11 | diesel | volkswagen | nein | 2016-03-20 00:00:00 | 0 | 48455 | 2016-04-07 01:45:12 |
49980 | 2016-03-12 10:55:54 | Ford_Escort_Turnier_16V | privat | Angebot | 400 | control | kombi | 1995 | manuell | 105 | escort | 125000 | 3 | benzin | ford | NaN | 2016-03-12 00:00:00 | 0 | 56218 | 2016-04-06 17:16:49 |
49981 | 2016-03-15 09:38:21 | Opel_Astra_Kombi_mit_Anhaengerkupplung | privat | Angebot | 2000 | control | kombi | 1998 | manuell | 115 | astra | 150000 | 12 | benzin | opel | nein | 2016-03-15 00:00:00 | 0 | 86859 | 2016-04-05 17:21:46 |
49982 | 2016-03-29 18:51:08 | Skoda_Fabia_4_Tuerer_Bj:2004__85.000Tkm | privat | Angebot | 1950 | control | kleinwagen | 2004 | manuell | 0 | fabia | 90000 | 7 | benzin | skoda | NaN | 2016-03-29 00:00:00 | 0 | 45884 | 2016-03-29 18:51:08 |
49983 | 2016-03-06 12:43:04 | Ford_focus_99 | privat | Angebot | 600 | test | kleinwagen | 1999 | manuell | 101 | focus | 150000 | 4 | benzin | ford | NaN | 2016-03-06 00:00:00 | 0 | 52477 | 2016-03-09 06:16:08 |
49984 | 2016-03-31 22:48:48 | Student_sucht_ein__Anfaengerauto___ab_2000_BJ_... | privat | Angebot | 0 | test | NaN | 2000 | NaN | 0 | NaN | 150000 | 0 | NaN | sonstige_autos | NaN | 2016-03-31 00:00:00 | 0 | 12103 | 2016-04-02 19:44:53 |
49985 | 2016-04-02 16:38:23 | Verkaufe_meinen_vw_vento! | privat | Angebot | 1000 | control | NaN | 1995 | automatik | 0 | NaN | 150000 | 0 | benzin | volkswagen | NaN | 2016-04-02 00:00:00 | 0 | 30900 | 2016-04-06 15:17:52 |
49986 | 2016-04-04 20:46:02 | Chrysler_300C_3.0_CRD_DPF_Automatik_Voll_Ausst... | privat | Angebot | 15900 | control | limousine | 2010 | automatik | 218 | 300c | 125000 | 11 | diesel | chrysler | nein | 2016-04-04 00:00:00 | 0 | 73527 | 2016-04-06 23:16:00 |
49987 | 2016-03-22 20:47:27 | Audi_A3_Limousine_2.0_TDI_DPF_Ambition__NAVI__... | privat | Angebot | 21990 | control | limousine | 2013 | manuell | 150 | a3 | 50000 | 11 | diesel | audi | nein | 2016-03-22 00:00:00 | 0 | 94362 | 2016-03-26 22:46:06 |
49988 | 2016-03-28 19:49:51 | BMW_330_Ci | privat | Angebot | 9550 | control | coupe | 2001 | manuell | 231 | 3er | 150000 | 10 | benzin | bmw | nein | 2016-03-28 00:00:00 | 0 | 83646 | 2016-04-07 02:17:40 |
49989 | 2016-03-11 19:50:37 | VW_Polo_zum_Ausschlachten_oder_Wiederaufbau | privat | Angebot | 150 | test | kleinwagen | 1997 | manuell | 0 | polo | 150000 | 5 | benzin | volkswagen | ja | 2016-03-11 00:00:00 | 0 | 21244 | 2016-03-12 10:17:55 |
49990 | 2016-03-21 19:54:19 | Mercedes_Benz_A_200__BlueEFFICIENCY__Urban | privat | Angebot | 17500 | test | limousine | 2012 | manuell | 156 | a_klasse | 30000 | 12 | benzin | mercedes_benz | nein | 2016-03-21 00:00:00 | 0 | 58239 | 2016-04-06 22:46:57 |
49991 | 2016-03-06 15:25:19 | Kleinwagen | privat | Angebot | 500 | control | NaN | 2016 | manuell | 0 | twingo | 150000 | 0 | benzin | renault | NaN | 2016-03-06 00:00:00 | 0 | 61350 | 2016-03-06 18:24:19 |
49992 | 2016-03-10 19:37:38 | Fiat_Grande_Punto_1.4_T_Jet_16V_Sport | privat | Angebot | 4800 | control | kleinwagen | 2009 | manuell | 120 | andere | 125000 | 9 | lpg | fiat | nein | 2016-03-10 00:00:00 | 0 | 68642 | 2016-03-13 01:44:51 |
49993 | 2016-03-15 18:47:35 | Audi_A3__1_8l__Silber;_schoenes_Fahrzeug | privat | Angebot | 1650 | control | kleinwagen | 1997 | manuell | 0 | NaN | 150000 | 7 | benzin | audi | NaN | 2016-03-15 00:00:00 | 0 | 65203 | 2016-04-06 19:46:53 |
49994 | 2016-03-22 17:36:42 | Audi_A6__S6__Avant_4.2_quattro_eventuell_Tausc... | privat | Angebot | 5000 | control | kombi | 2001 | automatik | 299 | a6 | 150000 | 1 | benzin | audi | nein | 2016-03-22 00:00:00 | 0 | 46537 | 2016-04-06 08:16:39 |
49995 | 2016-03-27 14:38:19 | Audi_Q5_3.0_TDI_qu._S_tr.__Navi__Panorama__Xenon | privat | Angebot | 24900 | control | limousine | 2011 | automatik | 239 | q5 | 100000 | 1 | diesel | audi | nein | 2016-03-27 00:00:00 | 0 | 82131 | 2016-04-01 13:47:40 |
49996 | 2016-03-28 10:50:25 | Opel_Astra_F_Cabrio_Bertone_Edition___TÜV_neu+... | privat | Angebot | 1980 | control | cabrio | 1996 | manuell | 75 | astra | 150000 | 5 | benzin | opel | nein | 2016-03-28 00:00:00 | 0 | 44807 | 2016-04-02 14:18:02 |
49997 | 2016-04-02 14:44:48 | Fiat_500_C_1.2_Dualogic_Lounge | privat | Angebot | 13200 | test | cabrio | 2014 | automatik | 69 | 500 | 5000 | 11 | benzin | fiat | nein | 2016-04-02 00:00:00 | 0 | 73430 | 2016-04-04 11:47:27 |
49998 | 2016-03-08 19:25:42 | Audi_A3_2.0_TDI_Sportback_Ambition | privat | Angebot | 22900 | control | kombi | 2013 | manuell | 150 | a3 | 40000 | 11 | diesel | audi | nein | 2016-03-08 00:00:00 | 0 | 35683 | 2016-04-05 16:45:07 |
49999 | 2016-03-14 00:42:12 | Opel_Vectra_1.6_16V | privat | Angebot | 1250 | control | limousine | 1996 | manuell | 101 | vectra | 150000 | 1 | benzin | opel | nein | 2016-03-13 00:00:00 | 0 | 45897 | 2016-04-06 21:18:48 |
48028 rows × 20 columns
autos["registration_year"].value_counts(normalize=True).head(20)  # relative frequency of the 20 most common registration years
2000    0.06708
2005    0.06030
1999    0.06000
2004    0.05474
2003    0.05454
2006    0.05416
2001    0.05406
2002    0.05066
1998    0.04906
2007    0.04608
2008    0.04462
2009    0.04196
1997    0.04056
2011    0.03268
2010    0.03194
2017    0.02906
1996    0.02888
2012    0.02646
2016    0.02632
1995    0.02626
Name: registration_year, dtype: float64
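Based on the distribution above, I have decided to limit the registration year to the range from 1900 to 2016. The listings were crawled in 2016, so a later registration year (such as the 2017 entries above) cannot be correct, and most of the cars were registered within the last 20 years, as the distribution shows.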
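The year limits described above could be applied with `Series.between`, which is inclusive on both ends by default. This is only a minimal sketch: the toy frame `toy` and its values are made up for illustration, and the exact lower bound of 1900 is an assumption read from "1900s" in the text.

```python
import pandas as pd

# Hypothetical toy frame standing in for `autos`; only the column name
# comes from the notebook, the values are invented for illustration.
toy = pd.DataFrame({"registration_year": [1000, 1985, 2004, 2016, 2017, 9999]})

# True for rows whose year lies in the inclusive range 1900..2016
# (the 1900 lower bound is an assumption, see the note above).
in_range = toy["registration_year"].between(1900, 2016)
filtered = toy[in_range]

print(filtered["registration_year"].tolist())
print(f"share of rows kept: {in_range.mean():.2f}")
```

Printing the share of rows kept before dropping anything is a cheap way to confirm that the chosen limits do not discard a large part of the data.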
unique_brand = autos["brand"].value_counts(normalize = True).head(20)
unique_common_brand= unique_brand[unique_brand > 0.05].index
print(unique_common_brand)
Index(['volkswagen', 'opel', 'bmw', 'mercedes_benz', 'audi', 'ford'], dtype='object')
Out of the top 20 car brands, German brands account for about 61% of all listings (0.21374 + 0.10922 + 0.10858 + 0.09468 + 0.08566 ≈ 0.612). Therefore, I will aggregate the data over unique_common_brand, the brands that each represent more than 5% of the cars.
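As a quick arithmetic check, the five shares quoted in the text above can be summed directly. The list name `shares` is only illustrative; the numbers are copied from the text.

```python
# Shares of the five largest brands, copied from the text above.
shares = [0.21374, 0.10922, 0.10858, 0.09468, 0.08566]

# Combined share as a proportion and as a percentage.
total = sum(shares)
print(f"combined share: {total:.5f} ({total:.1%})")
```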
brand_mean_price = {}  # loop over the common brands to calculate each mean price
for brand in unique_common_brand:
    selected_brand = autos[autos["brand"] == brand]
    mean_price = selected_brand["price"].mean()
    brand_mean_price[brand] = int(mean_price)
autos["brand"].value_counts()
brand_mean_price
{'audi': 8965, 'bmw': 8252, 'ford': 7105, 'mercedes_benz': 29511, 'opel': 5106, 'volkswagen': 6384}
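Based on the results above, the German brands mercedes_benz (29511), audi (8965) and bmw (8252) are the most expensive among the common brands. ford (7105) and volkswagen (6384) are in the middle, while opel (5106) is the least expensive. The mercedes_benz mean is more than three times that of any other brand, which may reflect a few extreme prices rather than typical listings.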
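The per-brand aggregation could also be written with `groupby` instead of an explicit loop. The sketch below is an alternative, not the method used in this notebook: the frame `toy` and its prices are invented, and `common_brands` is a stand-in for `unique_common_brand`. It reports the median next to the mean, since the median is less sensitive to a single extreme price.

```python
import pandas as pd

# Hypothetical toy data; column names follow the notebook, values are made up.
toy = pd.DataFrame({
    "brand": ["audi", "audi", "audi", "opel", "opel", "opel", "seat"],
    "price": [9000, 7000, 80000, 3000, 5000, 1000, 2500],
})
common_brands = ["audi", "opel"]  # stand-in for unique_common_brand

# Restrict to the common brands, then compute mean and median price per brand.
summary = (
    toy[toy["brand"].isin(common_brands)]
    .groupby("brand")["price"]
    .agg(["mean", "median"])
    .sort_values("mean", ascending=False)
)
print(summary)
```

In the toy data the single 80000 listing pulls the audi mean far above its median, which is the kind of gap worth checking for when one brand's mean stands out.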
bmp_series = pd.Series(brand_mean_price).sort_values(ascending=False)  # build a Series from the dictionary using the Series constructor
print(bmp_series)
mercedes_benz    29511
audi              8965
bmw               8252
ford              7105
volkswagen        6384
opel              5106
dtype: int64
df = pd.DataFrame(bmp_series, columns=['mean_price'])
df # Creating dataframe from first series object using dataframe constructor
| | mean_price |
---|---|
mercedes_benz | 29511 |
audi | 8965 |
bmw | 8252 |
ford | 7105 |
volkswagen | 6384 |
opel | 5106 |
brand_mean_mileage = {}  # loop over the common brands to calculate each mean mileage
for brand in unique_common_brand:
    selected_brand = autos[autos["brand"] == brand]
    mean_mileage = selected_brand["odometer_km"].mean()
    brand_mean_mileage[brand] = int(mean_mileage)
autos["brand"].value_counts()
brand_mean_mileage
{'audi': 129643, 'bmw': 132521, 'ford': 124131, 'mercedes_benz': 130886, 'opel': 129298, 'volkswagen': 128955}
bmm_series = pd.Series(brand_mean_mileage).sort_values(ascending = False)
print(bmm_series)
bmw              132521
mercedes_benz    130886
audi             129643
opel             129298
volkswagen       128955
ford             124131
dtype: int64
df = pd.DataFrame(bmm_series, columns=['brand_mean_mileage'])
df
| | brand_mean_mileage |
---|---|
bmw | 132521 |
mercedes_benz | 130886 |
audi | 129643 |
opel | 129298 |
volkswagen | 128955 |
ford | 124131 |
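To compare price and mileage per brand more easily, the two aggregates can be placed side by side in one dataframe. The sketch below reuses the values printed in the two dictionaries above; the name `brand_info` is hypothetical and does not appear in the notebook.

```python
import pandas as pd

# Values copied from the two dictionaries printed earlier in the notebook.
brand_mean_price = {'audi': 8965, 'bmw': 8252, 'ford': 7105,
                    'mercedes_benz': 29511, 'opel': 5106, 'volkswagen': 6384}
brand_mean_mileage = {'audi': 129643, 'bmw': 132521, 'ford': 124131,
                      'mercedes_benz': 130886, 'opel': 129298, 'volkswagen': 128955}

# Building the frame from a dict of Series aligns both columns on the brand index.
brand_info = pd.DataFrame({
    "mean_price": pd.Series(brand_mean_price),
    "mean_mileage": pd.Series(brand_mean_mileage),
}).sort_values("mean_price", ascending=False)
print(brand_info)
```

Because both columns share the brand index, each row shows one brand's mean price next to its mean mileage.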
In comparison, Volkswagen offers a good balance, with a mean mileage of 128955 km and a mean price of 6384. This may help explain why it is the most common brand in the listings. Although mercedes_benz is the most expensive brand, it does not have the highest mean mileage; bmw does, at 132521 km.
In conclusion, Volkswagen is probably the best brand to purchase on the eBay Kleinanzeigen website, given that it is cheap and has a good average mileage.