Just like NumPy, pandas offers powerful vectorized methods. It also leans on broadcasting.

Let's explore!

In [1]:
import pandas as pd

test_balance_data = {
'pasan': 20.00,
'treasure': 20.18,
'ashley': 1.05,
'craig': 42.42,
}

test_deposit_data = {
'pasan': 20,
'treasure': 10,
'ashley': 100,
'craig': 55,
}

balances = pd.Series(test_balance_data)
deposits = pd.Series(test_deposit_data)


Vectorization¶

While it is indeed possible to loop through each item and apply it to another...

In [2]:
for label, value in deposits.items():
    balances[label] += value
balances

Out[2]:
pasan        40.00
treasure     30.18
ashley      101.05
craig        97.42
dtype: float64

...it's important to remember to lean on vectorization and skip the loops altogether. Vectorization is faster and, as you can see, easier to read and write.

In [3]:
# Undo the change using in-place subtraction
balances -= deposits

# This is the same as the loop above, using in-place addition
balances += deposits
balances

Out[3]:
pasan        40.00
treasure     30.18
ashley      101.05
craig        97.42
dtype: float64
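
To check that the loop and the vectorized operation really agree, here is a minimal sketch. It rebuilds the starting balances so that it runs on its own, and it iterates with `Series.items()`. The names `start`, `looped`, and `vectorized` are illustrative and not part of the notebook.

```python
import pandas as pd

start = pd.Series({'pasan': 20.00, 'treasure': 20.18, 'ashley': 1.05, 'craig': 42.42})
deposits = pd.Series({'pasan': 20, 'treasure': 10, 'ashley': 100, 'craig': 55})

# Loop version: update one label at a time on a copy
looped = start.copy()
for label, value in deposits.items():
    looped[label] += value

# Vectorized version: a single expression, aligned by label
vectorized = start + deposits

# Compare the two results entry by entry
print(looped.equals(vectorized))
```

The vectorized line does the same work as the three-line loop, and it leaves `start` untouched because `+` returns a new Series.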

Also, just as with NumPy arrays, the mathematical operators have been overloaded to use the vectorized version of the same operation.

In [4]:
# 5 is broadcast and added to each and every value. This returns a new Series.
balances + 5

Out[4]:
pasan        45.00
treasure     35.18
ashley      106.05
craig       102.42
dtype: float64
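
Because the operator returns a new Series, the original is unchanged unless the result is assigned back. A small sketch of that behavior, using a hypothetical two-entry Series named `sample`:

```python
import pandas as pd

sample = pd.Series({'pasan': 40.00, 'craig': 97.42})

# The scalar 5 is broadcast across every entry; the result is a new Series
bumped = sample + 5

# The operator form and the method form give the same result
same = bumped.equals(sample.add(5))

print(bumped)
print(sample)  # still holds the original values
```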

Labels are used to line up entries. When a label exists on only one side, np.nan (not a number) is put in its place.

CashBox is giving out free coupons that users can scan into the app to get $1 added to their accounts.

In [5]:
coupons = pd.Series(1, ['craig', 'ashley', 'james'])
coupons

Out[5]:
craig     1
ashley    1
james     1
dtype: int64

Now we are going to add the coupons to people who cashed them in. This addition will return a new Series.

In [6]:
# Returns a new Series
balances + coupons

Out[6]:
ashley      102.05
craig        98.42
james          NaN
pasan          NaN
treasure       NaN
dtype: float64

Notice how values that are not in both Series are set to np.nan. This isn't what we want! Pasan had $40.00 and now he has nothing. He is going to be so bummed!

Also take note that James is not in the balances Series but he is in the coupons Series. Note how he is now added to the new Series, but his value is also set to np.nan.
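
One way to see the alignment at work is to compare the two indexes directly. The sketch below recreates both Series so it runs standalone; `matched` and `unmatched` are illustrative names.

```python
import pandas as pd

balances = pd.Series({'pasan': 40.00, 'treasure': 30.18, 'ashley': 101.05, 'craig': 97.42})
coupons = pd.Series(1, index=['craig', 'ashley', 'james'])

total = balances + coupons

# Labels present in both Series keep a real value
matched = balances.index.intersection(coupons.index)

# Labels present on only one side come back as NaN
unmatched = total[total.isna()].index

print(sorted(matched))
print(sorted(unmatched))
```

The result covers the union of both indexes, which is why James appears even though he has no balance.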

Using the fill_value parameter¶

It is possible to fill missing values so that everything aligns. The idea is to call the add method directly and pass it the keyword argument fill_value.

In [7]:
# Returns a new Series
balances.add(coupons, fill_value=0)

Out[7]:
ashley      102.05
craig        98.42
james         1.00
pasan        40.00
treasure     30.18
dtype: float64
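
One design point worth spelling out: `fill_value` stands in for the missing entry before the addition happens, which is not the same as patching NaN values afterward. The sketch below contrasts the two, assuming a fill value of 0. The names `filled` and `patched` are illustrative.

```python
import pandas as pd

balances = pd.Series({'pasan': 40.00, 'treasure': 30.18, 'ashley': 101.05, 'craig': 97.42})
coupons = pd.Series(1, index=['craig', 'ashley', 'james'])

# Missing entries are treated as 0 before adding, so existing balances survive
filled = balances.add(coupons, fill_value=0)

# Replacing NaN after the addition wipes out every unmatched balance
patched = (balances + coupons).fillna(0)

print(filled['pasan'])   # 40.0
print(patched['pasan'])  # 0.0
```

With `fill_value=0`, Pasan keeps his $40.00 and James starts out with his $1 coupon.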