import pandas as pd
A pandas Series is similar to a NumPy array, but it supports a lot of extra functionality, such as Series.describe().
Basic access is similar to a NumPy array: it supports access by index ( s[5] ) and slicing ( s[5:10] ).
It also supports vectorized operations and looping, like a NumPy array.
The underlying operations are implemented in C, so they run fast.
s=pd.Series([2,3,4,5,6])
print(s.describe())
count    5.000000
mean     4.000000
std      1.581139
min      2.000000
25%      3.000000
50%      4.000000
75%      5.000000
max      6.000000
dtype: float64
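The access patterns and vectorized operations described above can be sketched as follows. This is a minimal illustration; the variable names middle, doubled and total are made up for the example and do not appear in the original notebook.

```python
import pandas as pd

s = pd.Series([2, 3, 4, 5, 6])

# Slicing returns a new Series covering positions 1 and 2.
middle = s[1:3]

# Arithmetic is applied element-wise, without an explicit loop.
doubled = s * 2

# Iterating over a Series yields its values, as with a list or NumPy array.
total = 0
for value in s:
    total += value

print(middle.tolist())   # [3, 4]
print(doubled.tolist())  # [4, 6, 8, 10, 12]
print(total)             # 20
```

The vectorized form ( s * 2 ) is usually the better choice; the explicit loop is shown only to confirm that iteration works.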
A Series is a hybrid of a list and a Python dictionary: it maps keys (index labels) to values.
sal = pd.Series([40, 12, 43, 56],
                index=['Ram', 'Syam', 'Rahul', 'Ganesh'])
print(sal)
Ram       40
Syam      12
Rahul     43
Ganesh    56
dtype: int64
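print(sal[0])
40
print(sal.loc["Syam"])
12
Using sal[position] is not preferred; use sal.iloc[position] instead. The index of a Series holds labels, which need not match positions, so iloc avoids confusion. Recent pandas versions deprecate positional access through sal[position] on a label-indexed Series.
print(sal.iloc[3])
56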
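The confusion between labels and positions is easiest to see when the labels are themselves integers. The sketch below uses a hypothetical Series named scores (not part of the original notebook) whose integer labels are deliberately out of order.

```python
import pandas as pd

# Hypothetical data: integer labels that do not match the positions.
scores = pd.Series([10, 20, 30], index=[2, 0, 1])

by_label = scores.loc[0]      # the element whose label is 0
by_position = scores.iloc[0]  # the first element in storage order

print(by_label)     # 20
print(by_position)  # 10
```

Writing .loc or .iloc explicitly states which of the two lookups is intended.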
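idxmax() returns the index label of the maximum element. In older pandas versions argmax() did the same; since pandas 1.0, argmax() returns the integer position instead.
print(sal.idxmax())
Ganesh
print(sal.loc["Ganesh"])
print(sal.max())
56
56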
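To make the difference between the label and the position of the maximum concrete, here is a short sketch. It assumes pandas 1.0 or later, where argmax() returns an integer position; on older versions the second call would behave differently.

```python
import pandas as pd

sal = pd.Series([40, 12, 43, 56], index=["Ram", "Syam", "Rahul", "Ganesh"])

label = sal.idxmax()     # index label of the largest value
position = sal.argmax()  # integer position of the largest value (pandas >= 1.0)

print(label)               # Ganesh
print(position)            # 3
print(sal.iloc[position])  # 56
```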
a = pd.Series([1, 2, 3, 4],
              index=["a", "b", "c", "d"])
b = pd.Series([9, 8, 7, 6],
              index=["c", "d", "e", "f"])
print(a)
a    1
b    2
c    3
d    4
dtype: int64
print(b)
c    9
d    8
e    7
f    6
dtype: int64
print(a + b)
a   NaN
b   NaN
c    12
d    12
e   NaN
f   NaN
dtype: float64
The labels c and d are present in both Series, so their values are added correctly. Every other label exists in only one of the two, so it is assigned the value NaN (Not a Number).
res = a + b
print(res.dropna())
c    12
d    12
dtype: float64
res = a.add(b, fill_value=0)
print(res)
a     1
b     2
c    12
d    12
e     7
f     6
dtype: float64
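As a rough sketch of how label alignment can be inspected, the snippet below lists the labels that ended up as NaN after plain addition and compares that with the fill_value variant. The names aligned, missing and filled are illustrative only.

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
b = pd.Series([9, 8, 7, 6], index=["c", "d", "e", "f"])

# Plain addition aligns on labels; a label missing from either side gives NaN.
aligned = a + b
missing = aligned[aligned.isna()].index.tolist()

# fill_value=0 substitutes 0 for a label that is missing from one operand.
filled = a.add(b, fill_value=0)

print(missing)           # ['a', 'b', 'e', 'f']
print(filled.tolist())   # [1.0, 2.0, 12.0, 12.0, 7.0, 6.0]
```

The result is float64 in both cases, because NaN cannot be stored in an integer Series during alignment.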
s.apply(function_name) applies a function to each element of the Series.
As an example, take adding 5 to each element. Because a Series is a vector, this can be done simply with series + 5, but let's also do it with the new technique, s.apply(function).
print(res)
a     1
b     2
c    12
d    12
e     7
f     6
dtype: float64
print(res + 5)
a     6
b     7
c    17
d    17
e    12
f    11
dtype: float64
def add_5(x):
    return x + 5
print(res.apply(add_5))
a     6
b     7
c    17
d    17
e    12
f    11
dtype: float64
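The equivalence between apply and the vectorized form can be checked directly. The sketch below rebuilds a Series with the same values as res (as integers, for simplicity) and also shows a lambda, which is a common shorthand when the function is only needed once.

```python
import pandas as pd

# Same values as res above, rebuilt here so the example is self-contained.
res = pd.Series([1, 2, 12, 12, 7, 6], index=list("abcdef"))

def add_5(x):
    """Return the input value plus five."""
    return x + 5

via_apply = res.apply(add_5)             # named function, called per element
via_lambda = res.apply(lambda x: x + 5)  # anonymous function, same effect
via_vector = res + 5                     # vectorized arithmetic

print(via_apply.tolist())   # [6, 7, 17, 17, 12, 11]
print(via_lambda.tolist())  # [6, 7, 17, 17, 12, 11]
print(via_vector.tolist())  # [6, 7, 17, 17, 12, 11]
```

For simple arithmetic the vectorized form is generally faster; apply is more useful when the per-element logic cannot be written as a vector expression.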
Series.plot() automatically plots the index against the data.
%pylab inline
res.plot()
Populating the interactive namespace from numpy and matplotlib
<matplotlib.axes.AxesSubplot at 0x5746350>