Creating a Series

There are a couple of ways to create a new Series from scratch.

Let's explore.

In [1]:
# Just like how NumPy is almost always abbreviated as np...
import numpy as np
# ...pandas is usually shortened to pd
import pandas as pd

Creating from a dictionary

Let's use some sample data from CashBox, which wants to track the balances of its users. Each balance is how much money a user currently has in their account. CashBox requires every user to create a username.

In our example, test_balance_data is just a standard Python dictionary: the key is the username, and the value is that user's current account balance.

In [2]:
test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}

The Series constructor accepts any dict-like object.

In [3]:
balances = pd.Series(test_balance_data)

Notice that the labels have been set from test_balance_data.keys() and the values from test_balance_data.values().

In [4]:
balances
Out[4]:
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
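
As a quick check of the claim above, the sketch below compares the Series labels and values against the dictionary's keys and values. It assumes a reasonably recent pandas and Python 3.7+, where dictionary insertion order is preserved; the names labels and values are only for illustration.

```python
import pandas as pd

test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}
balances = pd.Series(test_balance_data)

# The index labels come from the dictionary keys, in insertion order
labels = list(balances.index)
print(labels == list(test_balance_data.keys()))

# The values come from the dictionary values
values = list(balances.values)
print(values == list(test_balance_data.values()))

# A label can then be used to look up a single balance
print(balances['craig'])
```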

Creating from an Iterable

You can pass an ordered iterable, such as a list or tuple, as the first argument.

In [5]:
unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

NOTE: When labels are not present, they default to incremental integers starting at 0.

In [6]:
unlabeled_balances
Out[6]:
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
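
To see the default labels more directly, here is a small sketch that inspects the index of the unlabeled Series. It assumes the default index is a pandas RangeIndex, which is the case in current pandas releases.

```python
import pandas as pd

unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

# With no labels supplied, pandas generates integer labels starting at 0
default_labels = list(unlabeled_balances.index)
print(default_labels)

# The generated labels work like any other label for lookups
third_balance = unlabeled_balances.loc[2]
print(third_balance)
```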

You can also provide the index argument, which requires an iterable the same size as your data.

In [7]:
labeled_balances = pd.Series(
    [20.00, 20.18, 1.05, 42.42],
    index=['pasan', 'treasure', 'ashley', 'craig']
)

Note that the labels appear in the same order as the supplied index.

In [8]:
labeled_balances
Out[8]:
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
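
The same-size requirement on index can be seen by deliberately supplying too few labels. The sketch below assumes that pandas raises a ValueError in this case; the variable mismatch_error exists only so the error can be inspected afterwards.

```python
import pandas as pd

mismatch_error = None
try:
    # Four values but only three labels
    pd.Series(
        [20.00, 20.18, 1.05, 42.42],
        index=['pasan', 'treasure', 'ashley']
    )
except ValueError as error:
    # Keep the exception so its message can be printed
    mismatch_error = error

print(mismatch_error)
```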

One thing to remember is that a NumPy array is also iterable, so you can create a new Series from an ndarray. In fact, you'll find that NumPy and pandas get along very well together.

In [9]:
ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
pd.Series(ndbalances)
Out[9]:
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
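
As one small illustration of how the two libraries fit together, the sketch below builds a labeled Series from an ndarray and then converts it back with to_numpy(). The name round_trip is only for illustration.

```python
import numpy as np
import pandas as pd

ndbalances = np.array([20.00, 20.18, 1.05, 42.42])

# An ndarray can be combined with the index argument like any other iterable
labeled_from_array = pd.Series(
    ndbalances,
    index=['pasan', 'treasure', 'ashley', 'craig']
)

# to_numpy() hands the values back as a NumPy array
round_trip = labeled_from_array.to_numpy()
print(type(round_trip))
print(np.array_equal(round_trip, ndbalances))
```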

Creating from a scalar and an index

If you pass in a scalar (remember, that is a single value), it will be broadcast to each of the keys specified in the index keyword argument.

In [10]:
pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])
Out[10]:
guil     20.0
jay      20.0
james    20.0
ben      20.0
nick     20.0
dtype: float64

In other words, each key is assigned the same scalar value for the entire Series.
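
A brief sketch to confirm the broadcast: the resulting Series has one entry per label, and every entry equals the scalar. The name starting_balances is only for illustration.

```python
import pandas as pd

starting_balances = pd.Series(
    20.00,
    index=["guil", "jay", "james", "ben", "nick"]
)

# One entry is created for each label in the index
print(len(starting_balances))

# Every entry holds the same scalar value
all_equal = bool((starting_balances == 20.00).all())
print(all_equal)
```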