# Just like how NumPy is almost always abbreviated as np...
import numpy as np
# pandas is usually shortened to pd
import pandas as pd
Let's use some sample data from CashBox. They want to track their users' balances, meaning how much money each user currently has in their account. CashBox requires that users create a username.
In our example, test_balance_data is just a standard Python dictionary: the key is the username, and the value is that user's current account balance.
test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}
The Series constructor accepts any dict-like object.
balances = pd.Series(test_balance_data)
Notice that the labels have been set from test_balance_data.keys() and the values from test_balance_data.values().
balances
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
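As a quick check of the point about keys and values, here is a minimal sketch that rebuilds the same dictionary so it runs on its own, then compares the Series index and values against it. The names labels and values are only illustrative.

```python
import pandas as pd

test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}
balances = pd.Series(test_balance_data)

# The index holds the dictionary keys, in insertion order
labels = list(balances.index)
# The values hold the dictionary values, in the same order
values = balances.tolist()

print(labels)
print(values)

# A single balance can be looked up by its label, much like a dictionary
print(balances['craig'])
```

Label-based lookup is the main thing a Series built from a dictionary gives you over a bare list of numbers.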
You can also pass an ordered iterable, such as a list, as the first argument.
unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])
NOTE: When no labels are supplied, they default to incremental integers starting at 0.
unlabeled_balances
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
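To see the default labels directly, the following sketch inspects the index of a Series created without one. It assumes a reasonably recent pandas, where the default index is a RangeIndex; default_labels and first_balance are illustrative names.

```python
import pandas as pd

unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

# With no labels supplied, pandas generates integer labels starting at 0
default_labels = list(unlabeled_balances.index)
print(default_labels)

# The default index is a RangeIndex, a lightweight integer range
print(type(unlabeled_balances.index).__name__)

# Here the integer 0 is a label, and it refers to the first value
first_balance = unlabeled_balances[0]
print(first_balance)
```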
You can also provide the index argument, which requires an iterable the same size as your data.
labeled_balances = pd.Series(
    [20.00, 20.18, 1.05, 42.42],
    index=['pasan', 'treasure', 'ashley', 'craig']
)
Note that the labels appear in the same order as the supplied index.
labeled_balances
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
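The size requirement on index is easy to trip over, so here is a small sketch of what happens when the lengths differ. It relies on pandas raising a ValueError in that case; the mismatch_raised flag is just an illustrative way to record the outcome.

```python
import pandas as pd

data = [20.00, 20.18, 1.05, 42.42]
names = ['pasan', 'treasure', 'ashley', 'craig']

# Matching lengths: labels are paired with values position by position
labeled_balances = pd.Series(data, index=names)
print(list(labeled_balances.index))

# Mismatched lengths: four values but only two labels
try:
    pd.Series(data, index=['pasan', 'treasure'])
    mismatch_raised = False
except ValueError as error:
    mismatch_raised = True
    print(error)
```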
One thing to remember is that a NumPy array is also iterable, so you can create a new Series from an ndarray. In fact, you'll find that NumPy and pandas get along very well together.
ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
pd.Series(ndbalances)
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
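As one illustration of how the two libraries cooperate, this sketch moves data from an ndarray into a Series and back, and passes the Series to a NumPy function. It assumes a pandas version that provides Series.to_numpy(); round_trip and total are illustrative names.

```python
import numpy as np
import pandas as pd

ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
series = pd.Series(ndbalances)

# The Series keeps the dtype of the source array
print(series.dtype)

# to_numpy() hands the values back as an ndarray
round_trip = series.to_numpy()
print(type(round_trip).__name__)

# NumPy functions accept a Series directly
total = np.sum(series)
print(total)
```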
If you pass in a scalar (remember, that is a single value), it will be broadcast to each of the keys specified in the index keyword argument.
pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])
guil     20.0
jay      20.0
james    20.0
ben      20.0
nick     20.0
dtype: float64
In other words, each key is assigned the same scalar value for the entire Series.
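One possible use for scalar broadcasting is giving a group of new accounts the same starting balance. The sketch below assumes that scenario (it is not from the CashBox data) and shows that each entry can still be changed independently afterward; starting_balances and all_same are illustrative names.

```python
import pandas as pd

names = ["guil", "jay", "james", "ben", "nick"]

# The scalar is repeated once for every label in the index
starting_balances = pd.Series(20.00, index=names)

# Every entry holds the same value right after construction
all_same = bool((starting_balances == 20.00).all())
print(all_same)

# Entries are independent: updating one leaves the others unchanged
starting_balances["jay"] = 25.00
print(starting_balances["jay"], starting_balances["guil"])
```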