Since 26 March, GitHub.com has been under attack, the victim of a DDoS.
The Baidu search engine was used as the attack vector.
The malicious scripts appear to be injected by servers at the border of the Chinese network infrastructure.
The attack targets GreatFire and other sites hosted on GitHub that oppose the Internet surveillance carried out by the Chinese government.
Sources:
You are mistaken:
The decade of data?
Major issues: processing large volumes of data, security, privacy.
API: Application Programming Interface
An old term, traditionally used for the functions exposed by a software library.
Web applications (REST APIs): a description of the URLs and their parameters.
Sending parameters:
GET: in the URL. E.g.: http://www.google.fr/?q=parametres+GET
POST: in the request body (for large data).
from IPython.display import HTML
import urllib2
goog = urllib2.urlopen("https://www.google.com/?q=parametres+GET")
HTML(goog.read().decode('iso-8859-1'))
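The two ways of sending parameters can be sketched with the standard library alone. This is a minimal illustration written for Python 3, where the functions of urllib2 live in urllib.request and urllib.parse. The parameter values and the example.com address are placeholders, not part of the course, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Hypothetical parameters, chosen only for illustration.
params = {"q": "parametres GET", "hl": "fr"}

# GET: the parameters are encoded into the URL itself.
query = urlencode(params)                 # spaces become '+'
get_url = "http://www.google.fr/?" + query

# Decoding the URL recovers the parameters (values come back as lists,
# because a key may appear more than once).
decoded = parse_qs(urlsplit(get_url).query)

# POST: the same encoding, but carried in the request body as bytes.
# example.com is a placeholder; the request is built, not sent.
post_request = Request("http://example.com/form", data=query.encode("ascii"))

print(get_url)
print(decoded)
print(post_request.get_method())
```

Building the query with urlencode rather than by string concatenation takes care of escaping spaces and special characters.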
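The web application is free to accept any URL, but there is a convention, built on the URI syntax of RFC 3986, that is followed almost universally:
the query string starts with ?,
each parameter has the form key=value,
parameters are separated by &.
Example: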
https://www.google.fr/?q=recherche&hl=fr
q=recherche: what to search for
hl=fr: interface language
https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv
import pandas as pd
bikes = urllib2.urlopen('https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv')
b = pd.read_csv(bikes)
A format derived from JavaScript:
numbers: 1, 2.0,
strings: "encoded in UTF-8",
lists: ["as", "in", "python"],
objects: { "key" : "value", "other key": "value" }.
Warning: object keys are limited to strings.
The json library: converts JSON into Python data.
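The conversion, and the restriction on keys, can be tried on a small in-memory document before fetching anything from the network. A minimal sketch in Python 3; the JSON text below is made up for illustration.

```python
import json

# A small JSON document covering the basic value types.
text = '{"n": 1, "x": 2.0, "s": "utf8", "l": ["as", "in", "python"], "o": {"key": "value"}}'

data = json.loads(text)          # JSON text -> Python objects
print(type(data["l"]), type(data["o"]))

# Keys must be strings in JSON: an integer key is converted on output,
# so it does not survive a round trip unchanged.
round_trip = json.loads(json.dumps({1: "one"}))
print(round_trip)
```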
Example: http://eu.battle.net/api/sc2/ladder/grandmaster?locale=fr_FR
Note: when the JSON data is flat, it can be read directly with pandas.
import json
data = json.load(urllib2.urlopen('http://eu.battle.net/api/sc2/ladder/grandmaster?locale=fr_FR'))
type(data)
dict
data.keys()
[u'ladderMembers']
type(data['ladderMembers'])
list
data['ladderMembers'][0]
{u'character': {u'clanName': u'', u'clanTag': u'', u'displayName': u'IIIIIIIII', u'id': 3257655, u'profilePath': u'/profile/3257655/1/IIIIIIIII/', u'realm': 1}, u'favoriteRaceP1': u'PROTOSS', u'highestRank': 1, u'joinTimestamp': 1421665350, u'losses': 126, u'points': 2889.0, u'previousRank': 4, u'wins': 218}
sc2 = pd.DataFrame(data['ladderMembers'])
sc2
 | character | favoriteRaceP1 | highestRank | joinTimestamp | losses | points | previousRank | wins |
---|---|---|---|---|---|---|---|---|
0 | {u'displayName': u'IIIIIIIII', u'clanName': u'... | PROTOSS | 1 | 1421665350 | 126 | 2889 | 4 | 218 |
1 | {u'displayName': u'IlIlIlIlIlIl', u'clanName':... | PROTOSS | 2 | 1422633286 | 188 | 2850 | 6 | 296 |
2 | {u'displayName': u'HatsuneMiku', u'clanName': ... | TERRAN | 3 | 1423671811 | 153 | 2827 | 10 | 258 |
3 | {u'displayName': u'PtitDrogo', u'clanName': u'... | PROTOSS | 3 | 1425058466 | 233 | 2759 | 9 | 322 |
4 | {u'displayName': u'PenetraTHOR', u'clanName': ... | TERRAN | 3 | 1425513212 | 108 | 2754 | 9 | 173 |
5 | {u'displayName': u'lllllIIIllIl', u'clanName':... | PROTOSS | 3 | 1423078934 | 88 | 2751 | 6 | 162 |
6 | {u'displayName': u'llllllllllll', u'clanName':... | PROTOSS | 6 | 1422391181 | 170 | 2741 | 6 | 250 |
7 | {u'displayName': u'LiquidSnute', u'clanName': ... | ZERG | 2 | 1426706701 | 108 | 2732 | 8 | 186 |
8 | {u'displayName': u'llllllllllll', u'clanName':... | TERRAN | 3 | 1421669242 | 231 | 2712 | 2 | 399 |
9 | {u'displayName': u'lllllIIIllll', u'clanName':... | ZERG | 6 | 1425790701 | 60 | 2709 | 7 | 240 |
10 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | ZERG | 1 | 1425427183 | 339 | 2693 | 2 | 432 |
11 | {u'displayName': u'KaarisOCLICK', u'clanName':... | TERRAN | 6 | 1427370092 | 83 | 2679 | 0 | 142 |
12 | {u'displayName': u'LiquidMaNa', u'clanName': u... | PROTOSS | 10 | 1422380341 | 124 | 2664 | 18 | 209 |
13 | {u'displayName': u'lIlIlIlIlIlI', u'clanName':... | ZERG | 13 | 1421687897 | 306 | 2654 | 20 | 361 |
14 | {u'displayName': u'fraer', u'clanName': u'ExTr... | PROTOSS | 9 | 1425406077 | 230 | 2580 | 15 | 242 |
15 | {u'displayName': u'IlIlIlIlIlIl', u'clanName':... | PROTOSS | 13 | 1421625605 | 268 | 2571 | 23 | 371 |
16 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | TERRAN | 17 | 1421625625 | 64 | 2567 | 30 | 118 |
17 | {u'displayName': u'elfi', u'clanName': u'PEKKA... | PROTOSS | 11 | 1421663442 | 437 | 2564 | 19 | 478 |
18 | {u'displayName': u'llllllllllll', u'clanName':... | ZERG | 11 | 1422473792 | 195 | 2554 | 27 | 212 |
19 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | ZERG | 14 | 1426411330 | 135 | 2553 | 14 | 183 |
20 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | TERRAN | 13 | 1425348745 | 168 | 2547 | 27 | 204 |
21 | {u'displayName': u'llllllllllll', u'clanName':... | PROTOSS | 20 | 1421776196 | 104 | 2539 | 25 | 134 |
22 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | TERRAN | 20 | 1425397354 | 156 | 2518 | 37 | 176 |
23 | {u'displayName': u'IlIIIIllIlll', u'clanName':... | ZERG | 20 | 1426846393 | 180 | 2512 | 34 | 222 |
24 | {u'displayName': u'Lambo', u'clanName': u'', u... | ZERG | 25 | 1421630424 | 265 | 2497 | 44 | 278 |
25 | {u'displayName': u'CARTIER', u'clanName': u'CR... | ZERG | 20 | 1422687731 | 188 | 2488 | 21 | 292 |
26 | {u'displayName': u'llllllllllll', u'clanName':... | TERRAN | 27 | 1421628225 | 231 | 2472 | 37 | 245 |
27 | {u'displayName': u'FXOStrelok', u'clanName': u... | TERRAN | 27 | 1423750644 | 139 | 2463 | 51 | 197 |
28 | {u'displayName': u'MaDMarC', u'clanName': u'AT... | TERRAN | 17 | 1424450160 | 236 | 2456 | 15 | 310 |
29 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | PROTOSS | 11 | 1421680402 | 339 | 2455 | 18 | 352 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
170 | {u'displayName': u'NorthBrute', u'clanName': u... | TERRAN | 166 | 1425766140 | 305 | 1903 | 176 | 316 |
171 | {u'displayName': u'ReaVer', u'clanName': u'Tea... | PROTOSS | 140 | 1427176238 | 420 | 1900 | 0 | 448 |
172 | {u'displayName': u'IIIIIIII', u'clanName': u''... | TERRAN | 148 | 1423651429 | 238 | 1896 | 171 | 248 |
173 | {u'displayName': u'BobaFett', u'clanName': u''... | TERRAN | 147 | 1427143902 | 99 | 1890 | 0 | 89 |
174 | {u'displayName': u'Itwasluck', u'clanName': u'... | PROTOSS | 148 | 1425306855 | 161 | 1890 | 150 | 176 |
175 | {u'displayName': u'Hephaistas', u'clanName': u... | ZERG | 167 | 1425830784 | 108 | 1888 | 175 | 125 |
176 | {u'displayName': u'Scandicain', u'clanName': u... | PROTOSS | 167 | 1425914514 | 109 | 1886 | 169 | 103 |
177 | {u'displayName': u'Poseidon', u'clanName': u'H... | PROTOSS | 161 | 1421668780 | 206 | 1872 | 171 | 212 |
178 | {u'displayName': u'NoCti', u'clanName': u'Fan ... | TERRAN | 171 | 1423344375 | 260 | 1862 | 176 | 251 |
179 | {u'displayName': u'Talia', u'clanName': u'', u... | TERRAN | 177 | 1426202660 | 87 | 1799 | 182 | 144 |
180 | {u'displayName': u'JeSuisCharli', u'clanName':... | PROTOSS | 177 | 1423578734 | 392 | 1780 | 179 | 415 |
181 | {u'displayName': u'VeniVidiVins', u'clanName':... | ZERG | 174 | 1424819521 | 414 | 1777 | 180 | 427 |
182 | {u'displayName': u'RiSky', u'clanName': u'Team... | ZERG | 182 | 1425861392 | 138 | 1775 | 180 | 141 |
183 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | PROTOSS | 177 | 1426076915 | 318 | 1754 | 180 | 339 |
184 | {u'displayName': u'MafiatA', u'clanName': u'Nu... | ZERG | 181 | 1421638711 | 106 | 1748 | 185 | 114 |
185 | {u'displayName': u'Mööp', u'clanName': u'Heral... | PROTOSS | 180 | 1424175561 | 179 | 1741 | 187 | 190 |
186 | {u'displayName': u'Déca', u'clanName': u'Worke... | PROTOSS | 186 | 1421626457 | 97 | 1655 | 187 | 79 |
187 | {u'displayName': u'QwerelL', u'clanName': u'Ba... | ZERG | 184 | 1424791290 | 92 | 1653 | 183 | 92 |
188 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | ZERG | 184 | 1421670781 | 125 | 1641 | 185 | 113 |
189 | {u'displayName': u'IllllllllIll', u'clanName':... | PROTOSS | 186 | 1425521351 | 84 | 1592 | 184 | 108 |
190 | {u'displayName': u'Teacher', u'clanName': u'Up... | PROTOSS | 187 | 1422631018 | 91 | 1591 | 190 | 90 |
191 | {u'displayName': u'SKéviN', u'clanName': u'New... | TERRAN | 187 | 1421686287 | 92 | 1560 | 192 | 92 |
192 | {u'displayName': u'Justice', u'clanName': u'we... | TERRAN | 191 | 1421627433 | 98 | 1507 | 193 | 101 |
193 | {u'displayName': u'Zlayer', u'clanName': u'Old... | TERRAN | 190 | 1421626220 | 97 | 1479 | 193 | 96 |
194 | {u'displayName': u'ŦṝūḿṕǂAƆƩ', u'clanName': u'... | PROTOSS | 192 | 1425312651 | 75 | 1229 | 193 | 137 |
195 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | PROTOSS | 193 | 1423687954 | 83 | 1186 | 195 | 92 |
196 | {u'displayName': u'IIIIIIIIIIII', u'clanName':... | TERRAN | 194 | 1424778035 | 79 | 1129 | 197 | 139 |
197 | {u'displayName': u'Rayman', u'clanName': u'Wes... | PROTOSS | 196 | 1421626381 | 162 | 1099 | 198 | 146 |
198 | {u'displayName': u'nBroccoli', u'clanName': u'... | TERRAN | 195 | 1423400164 | 100 | 1068 | 196 | 68 |
199 | {u'displayName': u'imRDA', u'clanName': u'Pani... | ZERG | 197 | 1421852264 | 116 | 675 | 198 | 33 |
200 rows × 8 columns
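The character column above still holds nested dictionaries, so this data is not flat. One possible way to flatten a record before building the table is sketched below in plain Python 3, applied to the first ladder member shown earlier. The helper name flatten and the underscore separator are illustrative choices, not part of the course; pandas also provides a json_normalize function for this purpose.

```python
def flatten(record, prefix=""):
    """Return a copy of record with nested dicts merged into one level."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse, prefixing inner keys with the outer key.
            flat.update(flatten(value, prefix=name + "_"))
        else:
            flat[name] = value
    return flat

# First ladder member from the output above.
member = {
    "character": {"clanName": "", "clanTag": "", "displayName": "IIIIIIIII",
                  "id": 3257655, "profilePath": "/profile/3257655/1/IIIIIIIII/",
                  "realm": 1},
    "favoriteRaceP1": "PROTOSS", "highestRank": 1, "joinTimestamp": 1421665350,
    "losses": 126, "points": 2889.0, "previousRank": 4, "wins": 218,
}

flat_member = flatten(member)
print(sorted(flat_member))
```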
sc2['percent'] = sc2.wins / (sc2.wins + sc2.losses)
sc2.groupby('favoriteRaceP1').mean()
favoriteRaceP1 | highestRank | joinTimestamp | losses | points | previousRank | wins | percent |
---|---|---|---|---|---|---|---|
PROTOSS | 84.053333 | 1.424071e+09 | 188.200000 | 2129.186667 | 85.933333 | 213.880000 | 0.535269 |
RANDOM | 117.000000 | 1.423326e+09 | 224.000000 | 2042.000000 | 154.000000 | 229.000000 | 0.505519 |
TERRAN | 88.272727 | 1.424079e+09 | 186.345455 | 2124.672727 | 87.054545 | 212.763636 | 0.536759 |
ZERG | 75.514706 | 1.424795e+09 | 181.514706 | 2159.000000 | 75.426471 | 208.088235 | 0.532856 |
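What groupby followed by mean computes can be spelled out in plain Python. The sketch below uses only four rows copied from the ladder table (rows 0, 2, 3 and 7), so its averages are not those of the full table; the variable names are illustrative.

```python
from collections import defaultdict

# (favoriteRaceP1, wins, losses) for rows 0, 2, 3 and 7 of the table.
rows = [
    ("PROTOSS", 218, 126),
    ("TERRAN", 258, 153),
    ("PROTOSS", 322, 233),
    ("ZERG", 186, 108),
]

# Group the per-player win ratio by race.
groups = defaultdict(list)
for race, wins, losses in rows:
    groups[race].append(wins / (wins + losses))

# Average each group, as groupby(...).mean() does column by column.
mean_percent = {race: sum(vals) / len(vals) for race, vals in groups.items()}
print(mean_percent)
```

Note that this is the mean of the individual win ratios, which is what the percent column reports, and not the ratio of total wins to total games.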
"XML is crap. Really. There are no excuses. XML is nasty to parse for humans, and it's a disaster to parse even for computers. There's just no reason for that horrible crap to exist."
Example:
Two libraries
import datetime
import dateutil.parser
date = datetime.datetime(2015, 3, 2)
date
datetime.datetime(2015, 3, 2, 0, 0)
date.ctime()
'Mon Mar 2 00:00:00 2015'
dateutil.parser.parse('2015-3-2')
datetime.datetime(2015, 3, 2, 0, 0)
dateutil.parser.parse('2/3/2015')
datetime.datetime(2015, 2, 3, 0, 0)
dateutil.parser.parse('20/3/2015')
datetime.datetime(2015, 3, 20, 0, 0)
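The last three calls show that the parser's guess depends on the string: 2/3/2015 is read month first, while 20/3/2015 can only be day first. When the format is known in advance, one option is to state it explicitly with datetime.strptime from the standard library, as in this sketch (the format string assumes day/month/year input).

```python
import datetime

# Explicit day/month/year format: no guessing involved.
fmt = "%d/%m/%Y"
d1 = datetime.datetime.strptime("2/3/2015", fmt)
d2 = datetime.datetime.strptime("20/3/2015", fmt)
print(d1, d2)
```

The dateutil parser also accepts a dayfirst argument for the same purpose.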
delta = datetime.datetime.now() - datetime.datetime(2015, 3, 4)
delta
datetime.timedelta(26, 24180, 3905)
delta.total_seconds()
2270580.003905
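A timedelta stores days, seconds and microseconds, and total_seconds combines the three. The sketch below rebuilds the value shown above from a fixed timestamp instead of now(), so that it is reproducible. The timestamp of 30 March 2015 at 06:43:00.003905 is inferred from the printed delta; it is an assumption, not something stated in the course.

```python
import datetime

start = datetime.datetime(2015, 3, 4)
# Assumed moment at which now() was evaluated, inferred from the output.
end = datetime.datetime(2015, 3, 30, 6, 43, 0, 3905)

delta = end - start
print(delta.days, delta.seconds, delta.microseconds)

# total_seconds() = days * 86400 + seconds + microseconds / 1e6
by_hand = delta.days * 86400 + delta.seconds + delta.microseconds / 1e6
print(delta.total_seconds(), by_hand)
```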
Back to the weather
b.head()
 | instant | dteday | season | yr | mnth | holiday | weekday | workingday | weathersit | temp | atemp | hum | windspeed | casual | registered | cnt |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2011-01-01 | 1 | 0 | 1 | 0 | 6 | 0 | 2 | 0.344167 | 0.363625 | 0.805833 | 0.160446 | 331 | 654 | 985 |
1 | 2 | 2011-01-02 | 1 | 0 | 1 | 0 | 0 | 0 | 2 | 0.363478 | 0.353739 | 0.696087 | 0.248539 | 131 | 670 | 801 |
2 | 3 | 2011-01-03 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.196364 | 0.189405 | 0.437273 | 0.248309 | 120 | 1229 | 1349 |
3 | 4 | 2011-01-04 | 1 | 0 | 1 | 0 | 2 | 1 | 1 | 0.200000 | 0.212122 | 0.590435 | 0.160296 | 108 | 1454 | 1562 |
4 | 5 | 2011-01-05 | 1 | 0 | 1 | 0 | 3 | 1 | 1 | 0.226957 | 0.229270 | 0.436957 | 0.186900 | 82 | 1518 | 1600 |
type(b['dteday'][0])
str
b.dtypes
instant         int64
dteday         object
season          int64
yr              int64
mnth            int64
holiday         int64
weekday         int64
workingday      int64
weathersit      int64
temp          float64
atemp         float64
hum           float64
windspeed     float64
casual          int64
registered      int64
cnt             int64
dtype: object
b.dteday - b.dteday
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-f242b7452d43> in <module>()
----> 1 b.dteday - b.dteday

/usr/lib/python2.7/dist-packages/pandas/core/ops.pyc in wrapper(left, right, name)
    503             rvalues = com.take_1d(rvalues, ridx)
    504
--> 505         arr = na_op(lvalues, rvalues)
    506
    507         return left._constructor(wrap_results(arr), index=index,

/usr/lib/python2.7/dist-packages/pandas/core/ops.pyc in na_op(x, y)
    456             result = np.empty(x.size, dtype=dtype)
    457             mask = notnull(x) & notnull(y)
--> 458             result[mask] = op(x[mask], y[mask])
    459         else:
    460             result = pa.empty(len(x), dtype=x.dtype)

TypeError: unsupported operand type(s) for -: 'str' and 'str'
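The error arises because the dteday column holds plain strings, and Python does not define subtraction between strings. A minimal illustration with the standard library, using two dates from the table above (the variable names are illustrative):

```python
import datetime

later, earlier = "2011-01-05", "2011-01-01"

# Strings cannot be subtracted: this is the TypeError seen above.
try:
    later - earlier
    failed = False
except TypeError:
    failed = True

# Once parsed into dates, subtraction yields a timedelta.
fmt = "%Y-%m-%d"
gap = (datetime.datetime.strptime(later, fmt)
       - datetime.datetime.strptime(earlier, fmt))
print(failed, gap.days)
```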
Warning: date parsing is very slow on large datasets.
bikes = urllib2.urlopen('https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv')
bb = pd.read_csv(bikes, parse_dates=["dteday"])
bb.dtypes
instant                int64
dteday        datetime64[ns]
season                 int64
yr                     int64
mnth                   int64
holiday                int64
weekday                int64
workingday             int64
weathersit             int64
temp                 float64
atemp                float64
hum                  float64
windspeed            float64
casual                 int64
registered             int64
cnt                    int64
dtype: object
type(bb['dteday'][0])
pandas.tslib.Timestamp
bb.dteday - bb.dteday
0    0 days
1    0 days
2    0 days
3    0 days
4    0 days
5    0 days
6    0 days
7    0 days
8    0 days
9    0 days
10   0 days
11   0 days
12   0 days
13   0 days
14   0 days
...
716   0 days
717   0 days
718   0 days
719   0 days
720   0 days
721   0 days
722   0 days
723   0 days
724   0 days
725   0 days
726   0 days
727   0 days
728   0 days
729   0 days
730   0 days
Name: dteday, Length: 731, dtype: timedelta64[ns]
from geopy.geocoders import Nominatim
coder = Nominatim()
l = coder.geocode("45 avenue des États Unis, Versailles")
l
Location((48.8084125, 2.1460823, 0.0))
l2 = coder.reverse((46.0,4.0))
l2.address
u'Route de Villemontais, Ouches, Roanne, Loire, Rh\xf4ne-Alpes, France m\xe9tropolitaine, 42155, France'
l.latitude, l.longitude
(48.8084125, 2.1460823)
from geopy.distance import distance
distance(l.point, l2.point)
Distance(342.001142917)
distance(l.point, (49.0, 3.4))
Distance(94.3627744207)
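As a rough cross-check of these distances, the great-circle (haversine) formula can be written in a few lines. This sketch assumes a spherical Earth with a mean radius of 6371 km, so it is an approximation and is not expected to reproduce geopy's figures digit for digit. The function name haversine_km is illustrative, and (46.0, 4.0) is the query point passed to reverse, which may differ slightly from l2.point.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

versailles = (48.8084125, 2.1460823)   # coordinates returned by the geocoder above
print(haversine_km(versailles, (46.0, 4.0)))
print(haversine_km(versailles, (49.0, 3.4)))
```

Both results should land within about a kilometre of the values printed by geopy, which is the level of agreement one can expect from a spherical model.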
Integration with IPython: