Hands-on: Python Fundamentals -- Sets

Objectives:

Upon completion of this lesson, you should be able to:

  • Describe the characteristics of the built-in set container in Python

  • Perform basic operations on sets: creation, membership queries, updates, and set operations such as union and intersection

  • Recognize the situations in which sets are, and should be, used

The set data structure

  • In Python, a set is a container of unique items that supports efficient "membership" checking

  • A set is like a dict that has only keys and no values

  • Items in a set are similar to "keys" in a dict and can be of any Python data type BUT

  • they must be hashable, which in practice means immutable (e.g. numbers, strings, tuples of immutable items)
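
To make the restriction on items concrete, here is a minimal sketch. The names items and added_list are made up for this illustration. It tries to add a list, which is mutable, to a set that already holds a number, a string and a tuple:

```python
# Numbers, strings and tuples of immutable items are all accepted
items = {1, 'two', (3, 4)}

# A list is mutable and therefore unhashable, so add() raises TypeError
try:
    items.add([5, 6])
    added_list = True
except TypeError:
    added_list = False

print(added_list)
print(len(items))
```

If a list-like item is needed in a set, converting it to a tuple first is the usual workaround.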

Creating a set

  • There are a number of ways to create and fill a set. E.g. you can create an empty one and keep adding new values
In [ ]:
# Create an empty set
eng = set()
print eng
In [ ]:
eng.add('one')
print eng
In [ ]:
eng.add('two')
print eng

Creating a set the "hardcoded" way

  • Very similar to dict but without values. As with dict, the order of items in a set is unpredictable
In [ ]:
eng = {'one', 'two', 'three'}
print eng
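
Two details of the literal syntax are easy to trip over, so here is a small sketch of both (the variable names are only illustrative). Empty braces create a dict rather than a set, and repeated items in a literal are silently collapsed:

```python
empty_braces = {}     # an empty dict, NOT an empty set
empty_set = set()     # an empty set has to be spelled set()

# Duplicates in the literal are stored only once
digits = {'one', 'two', 'two', 'three', 'one'}

print(type(empty_braces))
print(type(empty_set))
print(len(digits))
```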

Creating a set from a list

You can create a set from an iterable (e.g. list):

In [ ]:
eng = set(['one', 'two', 'three'])
print eng
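
Since a set keeps each item only once, building one from an iterable is also a common way to drop duplicates. A short sketch with made-up sample data follows. Note that the original order of the items is not preserved, so the result is sorted here only to get a stable printout:

```python
words = ['one', 'two', 'one', 'three', 'two']
unique_words = set(words)     # duplicates disappear

# Any iterable works, e.g. a string yields a set of its characters
letters = set('hello')

print(sorted(unique_words))
print(sorted(letters))
```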

Set comprehension

Very similar to dict comprehensions:

In [ ]:
{e for e in ['one', 'two', 'three'] if 'e' in e}
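
As a further sketch (the names words, lengths and with_e are only illustrative), the expression part of a set comprehension can transform the items as well as filter them. Duplicate results are merged, as in any set:

```python
words = ['one', 'two', 'three', 'four']

# Transform: the lengths 3, 3, 5, 4 collapse to three distinct values
lengths = {len(w) for w in words}

# Filter: keep only the words that contain the letter 'e'
with_e = {w for w in words if 'e' in w}

print(sorted(lengths))
print(sorted(with_e))
```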

The in operator

Very similar to lists and tuples:

In [ ]:
6 in {4, 5, 6, 7}
True

Deleting items

In [ ]:
eng.add('five')
print eng
eng.remove('five')
print eng
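
Besides remove, sets offer a few other ways to delete items. The sketch below uses a separate, made-up set called nums so that eng stays untouched. It shows that remove raises KeyError for a missing item while discard does not, and that pop takes out an arbitrary item:

```python
nums = {'one', 'two', 'three'}

nums.discard('five')          # missing item: silently ignored

try:
    nums.remove('five')       # missing item: raises KeyError
    raised = False
except KeyError:
    raised = True

popped = nums.pop()           # removes and returns an arbitrary item
remaining = len(nums)

nums.clear()                  # removes everything that is left

print(raised)
print(remaining)
print(len(nums))
```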

Why the heck "set"s?

Why do we need sets if we could check for membership in lists and tuples?

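1. Because lookup in a set is much faster: a set finds an item by its hash in roughly constant time, while a list or tuple has to be scanned item by item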

In [ ]:
def lookups(container):
    # Run 100 membership tests against the given container
    for i in range(100, 200):
        i in container
In [ ]:
import random
l = list(range(1000))  # an explicit list, so shuffle() also works where range() is lazy
random.shuffle(l)
t = tuple(l)
s = set(l)
In [ ]:
%timeit lookups(l)
In [ ]:
%timeit lookups(t)
In [ ]:
%timeit lookups(s)
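
The %timeit magic is only available in IPython and Jupyter. As a rough sketch of the same comparison for a plain script, the standard timeit module can be used instead. The helper time_lookups and the repeat count of 200 are assumptions made for this example, and the absolute numbers will vary from machine to machine:

```python
import random
import timeit

values = list(range(1000))
random.shuffle(values)
as_list = values
as_tuple = tuple(values)
as_set = set(values)

def lookups(container):
    # Run 100 membership tests against the given container
    for i in range(100, 200):
        i in container

def time_lookups(container, repeats=200):
    # Total time in seconds for `repeats` calls of lookups(container)
    return timeit.timeit(lambda: lookups(container), number=repeats)

list_time = time_lookups(as_list)
tuple_time = time_lookups(as_tuple)
set_time = time_lookups(as_set)

print(list_time)
print(tuple_time)
print(set_time)
```

The set timing should come out far smaller than the other two, and the gap widens as the containers grow.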

2. Because they provide set operations

In [ ]:
print dir(set)
In [ ]:
{1, 2, 3, 'mom', 'dad'}.union({2, 3, 10})
In [ ]:
{1, 2, 3, 'mom', 'dad'} | {2, 3, 10}
In [ ]:
{1, 2, 3, 'mom', 'dad'}.intersection({2, 3, 10})
In [ ]:
{1, 2, 3, 'mom', 'dad'} & {2, 3, 10}
In [ ]:
{1, 2, 3, 'mom', 'dad'}.difference({2, 3, 10})
In [ ]:
{1, 2, 3, 'mom', 'dad'} - {2, 3, 10}
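
A few more operations from the dir(set) listing, sketched with the same sample values (the variable names are only for illustration): symmetric difference, subset and disjointness tests, and an in-place update. The sketch also shows one difference between the two spellings: the methods accept any iterable, while the operators require both operands to be sets:

```python
a = {1, 2, 3, 'mom', 'dad'}
b = {2, 3, 10}

# Items that are in exactly one of the two sets
sym = a ^ b
sym_method = a.symmetric_difference(b)

# Subset and disjointness tests
is_subset = {2, 3} <= b
is_disjoint = a.isdisjoint({10, 20})

# Methods accept any iterable...
from_list = a.union([2, 3, 10])

# ...but operators insist on sets
try:
    a | [2, 3, 10]
    operator_ok = True
except TypeError:
    operator_ok = False

# In-place update: b itself is modified
b |= {11}

print(sym == sym_method)
print(is_subset)
print(is_disjoint)
print(operator_ok)
```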

More on sets can be found in the documentation: https://docs.python.org/2/library/stdtypes.html#set