Creating a Series

There are a couple of ways to create a new Series from scratch.

Let's explore.

In [1]:

# Just like how NumPy is almost always abbreviated as np...
import numpy as np
# ...pandas is usually shortened to pd
import pandas as pd

Creating from a dictionary

Let's use this sample data. In our example, test_balance_data is just a standard Python dictionary: each key is a username, and each value is that user's current account balance.

In [2]:

test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}

The Series constructor accepts any dict-like object.

In [3]:

balances = pd.Series(test_balance_data)

Notice that the labels have been set from test_balance_data.keys() and the values from test_balance_data.values().

In [4]:

balances

Out[4]:

pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
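
If you'd like to check that mapping yourself, here is a minimal sketch. It reuses the sample dictionary from above; the names labels, values, and craig_balance are just illustrative, and it assumes a reasonably recent pandas and Python, where dictionary insertion order is preserved.

```python
import pandas as pd

test_balance_data = {
    'pasan': 20.00,
    'treasure': 20.18,
    'ashley': 1.05,
    'craig': 42.42,
}
balances = pd.Series(test_balance_data)

# The index holds the dictionary keys, in insertion order
labels = list(balances.index)

# The data holds the dictionary values, in the same order
values = balances.tolist()

# A single balance can be looked up by its label, much like a dict
craig_balance = balances['craig']
```

Label-based lookup is what makes a Series built from a dictionary feel familiar: you keep the keys you already had.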

Creating from an Iterable

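You can pass an ordered iterable, such as a list or a tuple, as the first argument. Unordered collections are the exception: a set, for example, is rejected with a TypeError.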

In [5]:

unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

NOTE: When no labels are provided, they default to incrementing integers starting at 0.

In [6]:

unlabeled_balances

Out[6]:

0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
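
As a quick sketch of what that default looks like under the hood, the snippet below inspects the index of an unlabeled Series. The variable names default_labels and first_balance are only for illustration.

```python
import pandas as pd

unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])

# With no index argument, pandas generates integer labels 0..n-1
default_labels = list(unlabeled_balances.index)

# Those integers are the labels, so they can be used for lookup
first_balance = unlabeled_balances[0]
```

Here the integer labels happen to match the positions, which is why the default index feels like ordinary list indexing.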

You can also provide the index argument, which requires an iterable of labels the same length as your data.

In [7]:

labeled_balances = pd.Series(
    [20.00, 20.18, 1.05, 42.42],
    index=['pasan', 'treasure', 'ashley', 'craig']
)

Note that each label is paired with the value at the same position, so the order you pass in is preserved.

In [8]:

labeled_balances

Out[8]:

pasan       20.00
treasure    20.18
ashley       1.05
craig       42.42
dtype: float64
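
To see the same-length requirement in action, here is a small sketch that deliberately passes three values and only two labels. The flag name mismatch_raised is made up for this example.

```python
import pandas as pd

# Three values but only two labels: the lengths do not match
try:
    pd.Series([20.00, 20.18, 1.05], index=['pasan', 'treasure'])
    mismatch_raised = False
except ValueError:
    # pandas refuses to guess which values the labels belong to
    mismatch_raised = True

# With matching lengths, labels and values pair up by position
labeled_balances = pd.Series(
    [20.00, 20.18, 1.05],
    index=['pasan', 'treasure', 'ashley']
)
```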

One thing to remember is that a NumPy array is also iterable, so you can pass one straight to the Series constructor. In fact, you'll find that NumPy and pandas get along really well together.

In [9]:

ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
pd.Series(ndbalances)

Out[9]:

0    20.00
1    20.18
2     1.05
3    42.42
dtype: float64
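
One way to see how well the two libraries cooperate is to move data in both directions. The sketch below is only an illustration: it applies a NumPy function to a labeled Series and then converts the Series back to an array. The names labels, whole_dollars, and back_to_array are assumptions made for this example.

```python
import numpy as np
import pandas as pd

ndbalances = np.array([20.00, 20.18, 1.05, 42.42])
labels = ['pasan', 'treasure', 'ashley', 'craig']
balances = pd.Series(ndbalances, index=labels)

# NumPy functions accept a Series and return a Series with the same labels
whole_dollars = np.floor(balances)

# to_numpy() hands the underlying values back as a NumPy array
back_to_array = balances.to_numpy()
```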

Creating from a scalar and an index

If you pass in a scalar, that value will be broadcast to every label specified in the index argument.

In [10]:

pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])

Out[10]:

guil     20.0
jay      20.0
james    20.0
ben      20.0
nick     20.0
dtype: float64
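
A scalar plus an index is handy for giving everyone the same starting value and then adjusting individual entries. Here is a minimal sketch of that idea; the names new_users and starting_balances and the 5.00 bonus are invented for illustration.

```python
import pandas as pd

new_users = ["guil", "jay", "james", "ben", "nick"]

# Every label receives its own copy of the scalar 20.00
starting_balances = pd.Series(20.00, index=new_users)

# Updating one label leaves the others untouched
starting_balances["jay"] += 5.00
```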

Learn More

Introduction to Data Structures - Series (pandas documentation)