Creating a Series

There are a couple of ways to create a new Series from scratch.

Let's explore.

In [1]:
# Just like how NumPy is almost always abbreviated as np...
import numpy as np
# pandas is usually shortened to pd
import pandas as pd

Creating from a dictionary

Let's use this sample data. In our example, test_balance_data is just a standard Python dictionary: each key is a username, and each value is that user's current account balance.

In [2]:
test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}

The Series constructor accepts any dict-like object.

In [3]:
balances = pd.Series(test_balance_data)

Notice that the labels have been set from test_balance_data.keys() and the values from test_balance_data.values().

In [4]:
balances
Out[4]:
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
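
As a small sketch of how the dictionary keys drive the labels, the example below passes an explicit index together with a dictionary. The names sample_balances and selected, and the user 'kenneth', are made up for illustration. When a dict and an index are both given, pandas looks values up by key, so a label with no matching key ends up as NaN.

```python
import pandas as pd

# Hypothetical sample data mirroring the dictionary used above
sample_balances = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}

# With a dict, the index argument selects and orders entries by key.
# 'kenneth' is not a key in the dict, so that entry is filled with NaN.
selected = pd.Series(sample_balances, index=['craig', 'pasan', 'kenneth'])

print(selected)
```

This is handy when you only want a subset of a larger dictionary, or want the labels in a specific order.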

Creating from an iterable

You can pass an ordered iterable, such as a list or a tuple, as the first argument. Unordered collections such as a set are rejected.

In [5]:
unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

NOTE: When labels are not provided, they default to incrementing integers starting at 0.

In [6]:
unlabeled_balances
Out[6]:
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
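
To make the default labels concrete, here is a minimal sketch that inspects the index of an unlabeled Series. The variable names example, first and last are illustrative only. It assumes a reasonably recent pandas, where the default index is a RangeIndex.

```python
import pandas as pd

example = pd.Series([20.00, 20.18, 1.05, 42.42])

# The default index is a RangeIndex that counts up from 0
print(example.index)

# The integer labels can be used to look values up...
first = example[0]

# ...while .iloc always works by position, whatever the labels are
last = example.iloc[-1]

print(first, last)
```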

You can also provide the index argument, which requires an iterable of the same length as your data.

In [7]:
labeled_balances = pd.Series(
    [20.00, 20.18, 1.05, 42.42],
    index=['pasan', 'treasure', 'ashley', 'craig']
)

Note that the labels are paired with the values by position, so the order you provide is preserved.

In [8]:
labeled_balances
Out[8]:
pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
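
The same-length requirement is easy to check for yourself. The sketch below deliberately passes three labels for four values and catches the resulting error. The names values, too_few_labels and mismatch_error are hypothetical, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

values = [20.00, 20.18, 1.05, 42.42]
too_few_labels = ['pasan', 'treasure', 'ashley']  # only three labels for four values

mismatch_error = None
try:
    pd.Series(values, index=too_few_labels)
except ValueError as error:
    # pandas refuses to guess which value should go without a label
    mismatch_error = error
    print(f"Could not build the Series: {error}")
```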

One thing to remember is that a NumPy array is also iterable. In fact, you'll find that NumPy and pandas get along really well together.

In [9]:
ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
pd.Series(ndbalances)
Out[9]:
0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
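
As a rough illustration of how well the two libraries cooperate, the sketch below builds a Series from an array, hands it to a NumPy function, and converts it back. The names nd_example, series_example, total and round_trip are illustrative only.

```python
import numpy as np
import pandas as pd

nd_example = np.array([20.00, 20.18, 1.05, 42.42])
series_example = pd.Series(nd_example)

# The Series keeps the dtype of the array it was built from
print(series_example.dtype)

# NumPy functions accept a Series directly
total = np.sum(series_example)
print(total)

# to_numpy() hands the values back as a NumPy array
round_trip = series_example.to_numpy()
print(type(round_trip))
```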

Creating from a scalar and an index

If you pass in a scalar, that value will be broadcast to every label specified in the index argument.

In [10]:
pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])
Out[10]:
guil     20.0
jay      20.0
james    20.0
ben      20.0
nick     20.0
dtype: float64
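
One way to think about the broadcast is as shorthand for repeating the value once per label. The sketch below compares the two spellings. The names new_users, broadcast, explicit and same are made up for the example.

```python
import pandas as pd

new_users = ["guil", "jay", "james"]

# Scalar form: the single value is repeated for every label
broadcast = pd.Series(20.00, index=new_users)

# Equivalent long form: build the repeated list by hand
explicit = pd.Series([20.00] * len(new_users), index=new_users)

# equals() compares both the values and the labels
same = broadcast.equals(explicit)
print(same)
```

The scalar form is shorter and keeps the length of the data tied to the length of the index.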