Learn How to Use Sets in Python

Sets are unordered collections of distinct, hashable objects in Python. They are a popular choice for membership testing, removing duplicates, and computing mathematical operations such as union, intersection, and difference. Like the other collections, sets support the in, not in, and len() operations.
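As a rough sketch of those three uses, here is a small example. The visitor lists are made-up sample data, and sorted() is only used so the printed order is predictable.

```python
# Hypothetical sample data, used only for illustration.
visitors_monday = ["ann", "bob", "ann", "cara"]
visitors_tuesday = ["bob", "dan", "dan"]

# Building a set from a list drops the duplicates.
monday = set(visitors_monday)
tuesday = set(visitors_tuesday)

print(len(monday))               # 3
print("ann" in monday)           # True
print("ann" not in tuesday)      # True

# Union, intersection, and difference.
print(sorted(monday | tuesday))  # ['ann', 'bob', 'cara', 'dan']
print(sorted(monday & tuesday))  # ['bob']
print(sorted(monday - tuesday))  # ['ann', 'cara']
```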

However, unlike sequence types such as lists and tuples, sets give you no way to access an element by index. As mentioned previously, sets are unordered, which means that keeping track of positions is a moot point. If order is important, then a set is not the data structure you want; consider a list or tuple instead. Since sets don’t support indexing, they don’t support slicing either, which is another feature of sequence types in Python.
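Here is a minimal sketch of what happens when you try to index a set, plus one possible workaround: copy the elements into a sorted list first. The colors set is just sample data.

```python
colors = {"red", "green", "blue"}

# Indexing a set raises TypeError because sets have no positions.
try:
    colors[0]
except TypeError as error:
    message = str(error)
    print(message)

# If you need positions, copy the elements into a list first.
ordered = sorted(colors)
print(ordered[0])    # blue
print(ordered[:2])   # ['blue', 'green']
```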

The difference between Sets and Frozensets

There are two built-in set types in Python: set and frozenset. The main difference between them is that a set is mutable, meaning its elements can be added and removed, while a frozenset is immutable, meaning its contents are locked once it is created. Other examples of immutable objects in Python are numbers, strings, and tuples.

Well, tuples are only kind of immutable: a tuple can contain mutable objects like lists, and those can still be modified in place. Since an immutable object cannot be modified, a new object must be created whenever a different value has to be stored. Because frozensets are immutable, they’re hashable, and because sets are mutable (changeable), they’re not. What’s hashability? Well, let’s look at the Python glossary for more details:

  • An object is hashable if it has a hash value which never changes during its lifetime.
  • It needs to implement the __hash__() method.
  • To be compared to other objects, it needs an __eq__() method.
  • Hashable objects which compare equal must have the same hash value.
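The rules above can be sketched with a small value class. Point is a hypothetical class invented for this illustration; it defines __eq__() and __hash__() from the same fields so that equal objects always share a hash value.

```python
class Point:
    """Hypothetical value class, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.x, self.y))


p = Point(1, 2)
q = Point(1, 2)

print(p == q)              # True
print(hash(p) == hash(q))  # True
print(len({p, q}))         # 1, the set treats them as one element
```

For the hash to stay constant, x and y should not be reassigned after a Point has been placed in a set or used as a dictionary key.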

Hashability allows an object to be used as a key in a dictionary and as a member of a set, because dictionaries and sets both use hash values internally. Most of Python’s built-in immutable types are hashable. Mutable containers such as lists and dictionaries are not hashable, and immutable containers such as tuples and frozensets are hashable only if their elements are. Instances of user-defined classes are hashable by default; each one compares equal only to itself. You can get the hash value of any hashable object by using the built-in hash() function. The short snippets below define three classes (each class body simply prints a letter when the class is created) and then hash the class objects themselves; the exact values will differ from run to run and machine to machine:

>>> class a:
...     print('a')
... 
a
>>> class b:
...     print('b')
... 
b
>>> class c:
...     print('c')
... 
c
>>> hash(a)
-9223366121730466307
>>> hash(b)
-9223366121730466236
>>> hash(c)
-9223366121730466165
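To complement the snippet above, here is a small sketch that hashes instances rather than classes and checks which built-in objects are hashable. Plain and is_hashable() are hypothetical names made up for this example.

```python
class Plain:
    """Hypothetical empty class, used only for this illustration."""


def is_hashable(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


first = Plain()
second = Plain()

# By default, instances are hashable and equal only to themselves.
print(first == second)               # False
print(len({first, second}))          # 2

print(is_hashable("text"))           # True
print(is_hashable((1, 2)))           # True
print(is_hashable(frozenset([1])))   # True
print(is_hashable([1, 2]))           # False
print(is_hashable({"k": 1}))         # False
print(is_hashable({1, 2}))           # False
print(is_hashable((1, [2])))         # False, the tuple holds a list
```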

Now that we’ve got some of the basics of sets in Python out of the way, let’s play with some Python code.

How to create sets in Python

You can create sets and frozensets by using their respective constructors in Python:

set([iterable])

frozenset([iterable])

>>> a = set([1, 2, 3])
>>> a
{1, 2, 3}

>>> b = frozenset([4, 'a', 10])
>>> b
frozenset({10, 4, 'a'})

The first statement returns a set object, while the second one returns a frozenset. Since the elements of a set must be hashable and a set itself is mutable (and therefore unhashable), you can’t have a set within a set. However, since frozensets are immutable and hashable, you can have a frozenset inside a set.
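A quick sketch of that nesting rule, using made-up values: a frozenset can be stored as an element, while adding a plain set raises TypeError.

```python
inner = frozenset({1, 2})
outer = {inner, frozenset({3})}

# A frozenset is hashable, so it can be an element of a set.
print(inner in outer)   # True
print(len(outer))       # 2

# A plain set is not hashable, so adding one fails.
try:
    outer.add({4, 5})
except TypeError as error:
    message = str(error)
    print(message)
```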

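Below is another way to create a set, using curly braces.

>>> primes = {2, 3, 5, 7, 11}
>>> primes
{2, 3, 5, 7, 11}

Below is a frozenset:

>>> composite = frozenset([4, 6, 8, 9, 10])

We can add the elements of the composite frozenset to the primes set. Note that update() merges the elements into the set; to store the frozenset itself as a single element, use add() instead.

# update adds elements to set
>>> primes.update(composite)
>>> primes
{2, 3, 4, 5, 6, 7, 8, 9, 10, 11}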
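To make the difference between update() and add() concrete, here is a small sketch with made-up values. update() merges the elements of its argument, whereas add() stores the argument as one element.

```python
evens = frozenset([2, 4])

# update() walks over its argument and adds each element.
merged = {1, 3}
merged.update(evens)
print(merged == {1, 2, 3, 4})   # True

# add() stores the frozenset itself as a single element.
nested = {1, 3}
nested.add(evens)
print(len(nested))              # 3
print(evens in nested)          # True
```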

Properties of both sets and frozensets in Python

Sets and frozensets have many properties in common, but because of their differences, some properties of sets are not available for frozensets. We will first take a look at some of the operations that are available in both of them:

  len(x)                        x in y                  x not in y
  isdisjoint(other)             issubset(other)         issuperset(other)
  union(*others)                intersection(*others)   difference(*others)
  symmetric_difference(other)   copy()                  ==
  <                             <=                      !=
  >                             >=                      |
  &                             -                       ^

Note that some of the methods have corresponding operators that behave the same way. For example:

  difference             -
  symmetric_difference   ^
  intersection           &
  union                  |

The difference is that the methods’ signatures have a parameter named other (or *others), which indicates that they accept any iterable. The operators, on the other hand, only accept other sets. The following code snippet illustrates this:

>>> x = [2, 4, 6]
>>> set(x) | x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'

The above fails because the union operator can’t be used with an arbitrary iterable such as a list. To fix it, we use the union() method instead, as shown in the following code snippet:

>>> set(x).union(x)
{2, 4, 6}
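The same idea extends to the other methods. As a sketch with made-up values, the method forms accept lists, tuples, ranges, and strings, and some of them take several iterables at once:

```python
base = {2, 4, 6}

# Methods accept any iterable, and union() takes several at once.
combined = base.union([1, 2], (8,), range(3))
print(sorted(combined))                     # [0, 1, 2, 4, 6, 8]

print(sorted(base.intersection([2, 6, 10])))   # [2, 6]
print(sorted(base.difference("abc", [4])))     # [2, 6]

# With an operator, convert the other operand to a set first.
print(sorted(base | set([1, 2])))           # [1, 2, 4, 6]
```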

The following code snippet shows the set and frozenset methods in action.

>>> a = {2, 4, 6, 8}
>>> b = {1, 2, 3, 4}
>>> len(a) * len(b)
16
>>> a in b
False
>>> 2 in a and 2 in b
True
>>> {2} in a
False
>>> 3 not in b and 4 not in b
False
>>> a.isdisjoint(b)
False
>>> a.issubset(b)
False
>>> a.issuperset(b)
False
>>> a.intersection(b)
{2, 4}
>>> a.union(b)
{1, 2, 3, 4, 6, 8}
>>> a.difference(b)
{8, 6}
>>> c = b.copy()
>>> c.isdisjoint(b)
False
>>> c.issubset(b)
True
>>> c.issuperset(b)
True
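The comparison operators and the symmetric difference were not covered in the session above, so here is a short sketch of them with made-up values. The operators <= and >= mirror issubset() and issuperset(), while < and > test for proper subsets and supersets.

```python
small = {2, 4}
big = {2, 4, 6, 8}

print(small <= big)      # True, small is a subset of big
print(small < big)       # True, a proper subset (not equal)
print(big < big)         # False, a set is not a proper subset of itself
print(big >= small)      # True, big is a superset of small
print(small == {4, 2})   # True, order does not matter

# Symmetric difference: elements in exactly one of the two sets.
print(sorted(big ^ {1, 2, 3, 4}))   # [1, 3, 6, 8]
```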

Binary operations that mix set instances with frozenset instances return the type of the first operand, as shown in the code snippet below:

>>> frozenset('aeiou') | set('bcde')
frozenset({'a', 'c', 'i', 'o', 'u', 'b', 'e', 'd'})
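As a small sketch of that rule, swapping the operands changes the result type but not the contents. The variable names here are made up for the example.

```python
frozen_vowels = frozenset("aeiou")
letters = set("bcde")

left_frozen = frozen_vowels | letters   # frozenset comes first
left_set = letters | frozen_vowels      # set comes first

print(type(left_frozen).__name__)   # frozenset
print(type(left_set).__name__)      # set

# A set and a frozenset with the same elements compare equal.
print(left_frozen == left_set)      # True
```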

Functions in Sets but not Frozensets

I said it once, and I’ll say it again: sets are mutable, and therefore changeable. Frozensets are immutable, and therefore not changeable. Since a set can be modified, it has built-in methods that allow programmers to modify it. Some of these methods are pop(), add(), and discard(). Below is a list of in-place update methods from the set class, alongside their operator equivalents:

  update(*others)                      set |= other | …
  intersection_update(*others)         set &= other & …
  difference_update(*others)           set -= other | …
  symmetric_difference_update(other)   set ^= other

Below is an example of these methods in action.

>>> vowels = {'a', 'e', 'i'}
>>> vowels.update('o', 'u', 'y')
>>> vowels
{'a', 'i', 'o', 'u', 'y', 'e'}
>>> vowels.intersection_update(['a', 'y', 'e'])
>>> vowels
{'y', 'e', 'a'}
>>> vowels.difference_update(['a', 3, 'i'])
>>> vowels
{'y', 'e'}
>>> vowels.symmetric_difference_update(['y', 'a'])
>>> vowels
{'e', 'a'}
>>> vowels.add('a')
>>> vowels.add('a')
>>> vowels.add('e')
>>> vowels
{'e', 'a'}
>>> vowels.discard('e')
>>> vowels
{'a'}
>>> vowels.pop()
'a'
>>> vowels.clear()
>>> vowels
set()
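A few details are easy to miss here, so the following sketch (with made-up values) shows them: discard() ignores a missing element while remove() raises KeyError, the augmented operators modify the set in place, and a frozenset has none of these mutating methods.

```python
letters = {"a", "b"}

# discard() does nothing if the element is missing.
letters.discard("z")

# remove() raises KeyError if the element is missing.
remove_failed = False
try:
    letters.remove("z")
except KeyError:
    remove_failed = True
print(remove_failed)       # True

# The augmented operators update the set in place.
letters |= {"c", "d"}        # like update()
letters &= {"a", "c", "d"}   # like intersection_update()
letters -= {"d"}             # like difference_update()
letters ^= {"c", "e"}        # like symmetric_difference_update()
print(sorted(letters))     # ['a', 'e']

# A frozenset has no mutating methods at all.
frozen = frozenset(letters)
print(hasattr(frozen, "add"))      # False
print(hasattr(frozen, "update"))   # False
```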

Iterating over sets

You can use a for loop to iterate over the elements of a set. Note that since sets are unordered, insertion order is not preserved, so the output may come out in an order you’re not expecting.

>>> vowels = {'a', 'e', 'i', 'o', 'u', 'y'}
>>> for letter in vowels:
...     print(letter)
a
i
o
y
e
u
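If you need a predictable order, one option is to iterate over sorted(), which returns a new list and leaves the set itself untouched. A minimal sketch:

```python
vowels = {"a", "e", "i", "o", "u", "y"}

# sorted() returns a new list, so the loop order is predictable.
ordered = sorted(vowels)
for letter in ordered:
    print(letter)

# Reverse alphabetical order works the same way.
print(sorted(vowels, reverse=True))   # ['y', 'u', 'o', 'i', 'e', 'a']
```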