8  Collection types

Python provides several collection types, which allow us to store and organize multiple values together. In this chapter, we will look at the most common ones: lists, tuples, sets, and dictionaries.

8.1 Lists

We introduced lists in Chapter 2. Now let us look at them in more detail.

Recall that a list is an ordered collection of elements enclosed in square brackets. For example:
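
A minimal sketch of creating a list and accessing its elements; the variable name and values are illustrative and not taken from the original page:

```python
# Illustrative values only.
lst = ['A', 'B', 'C', 'D']

print(lst[0])    # first element: A
print(lst[-1])   # last element: D
print(lst[1:3])  # slice with the elements at indices 1 and 2: ['B', 'C']

lst[0] = 'Z'     # lists are mutable, so we can replace an element
print(lst)       # ['Z', 'B', 'C', 'D']
```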

Although modifying slices can be used to add and remove elements, it is more convenient to use list methods and del:
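
The sketch below walks through the most common list methods and del, assuming an illustrative list of letters; the comments show the list after each step:

```python
lst = ['A', 'B', 'C']

lst.append('D')         # add at the end        -> ['A', 'B', 'C', 'D']
lst.insert(1, 'X')      # insert at index 1     -> ['A', 'X', 'B', 'C', 'D']
lst.extend(['E', 'F'])  # append several items  -> ['A', 'X', 'B', 'C', 'D', 'E', 'F']
lst.remove('X')         # remove by value       -> ['A', 'B', 'C', 'D', 'E', 'F']
last = lst.pop()        # remove and return the last element ('F')
del lst[0]              # remove by index       -> ['B', 'C', 'D', 'E']

print(lst)   # ['B', 'C', 'D', 'E']
print(last)  # F
```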

Assigning a list to another variable does not create a copy. Instead, both variables refer to the same list:
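
A short sketch of this behavior, using the names lst1 and lst2 with illustrative values:

```python
lst1 = ['A', 'B', 'C', 'D']
lst2 = lst1          # no copy is made; both names refer to the same list

lst2[0] = 'X'        # modify the list through lst2
print(lst1)          # ['X', 'B', 'C', 'D']
print(lst1 is lst2)  # True: it is one and the same object
```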

Note for beginners

This might be difficult to understand at first. You might expect that after lst2 = lst1 the variables lst1 and lst2 would represent two independent lists with the same elements 'A', 'B', 'C', 'D'. But in fact, both variables refer to the same list in memory.

Think about memory as a book. If lst1 represents a list that we have written down on page 26, then lst2 = lst1 means that lst2 also points to page 26. Both variables refer to the same list (the one on page 26), so modifying lst2 will also affect lst1.

This also means that passing a list to a function passes only a reference to it. If the function modifies the list, it modifies the very list that the caller passed in. This is often undesirable, but it can sometimes be used to our advantage.
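
To illustrate, here is a small sketch with a hypothetical function add_zero (the name and the data are assumptions made for this example):

```python
def add_zero(numbers):
    # Modifies the list it receives; no copy is made.
    numbers.append(0)

data = [1, 2]
add_zero(data)
print(data)  # [1, 2, 0]: the caller's list has changed
```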

If we want an independent copy, we must create one explicitly, for example with the .copy method:
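
A minimal sketch with illustrative values:

```python
lst1 = ['A', 'B', 'C', 'D']
lst2 = lst1.copy()   # a new list with the same elements

lst2[0] = 'X'
print(lst1)          # ['A', 'B', 'C', 'D'] (unchanged)
print(lst2)          # ['X', 'B', 'C', 'D']
print(lst1 is lst2)  # False
```

Note that .copy creates a shallow copy: if the list contains other mutable objects (such as nested lists), those are still shared between the original and the copy.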

Just as with strings, we can concatenate lists with + and repeat them with *:
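
For instance (the names a and b are illustrative):

```python
a = [1, 2]
b = [3]

print(a + b)  # [1, 2, 3]
print(a * 3)  # [1, 2, 1, 2, 1, 2]
print(a)      # [1, 2]: both operators create new lists
```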

We can use the in operator to check whether a given value is an element of the list:
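
A short sketch with an illustrative list:

```python
lst = ['A', 'B', 'C']

print('B' in lst)      # True
print('Z' in lst)      # False
print('Z' not in lst)  # True
```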

If we want to know the index of an element, we can use the .index method:
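
For example, with an illustrative list that contains a repeated value:

```python
lst = ['A', 'B', 'C', 'B']

print(lst.index('B'))  # 1: the index of the first occurrence
print(lst.index('C'))  # 2
```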

If the element does not exist, .index raises an error:
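
The sketch below catches the error so that the example runs to completion; the exact wording of the error message may differ between Python versions:

```python
lst = ['A', 'B', 'C']

try:
    lst.index('Z')
    found = True
except ValueError as error:
    # .index raises ValueError when the value is not in the list
    found = False
    print('ValueError:', error)
```

If we are not sure whether the element is present, we can check with the in operator first.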

We can use the .count method to count how many times an element occurs in the list:
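
A minimal sketch with illustrative values:

```python
lst = ['A', 'B', 'A', 'C', 'A']

print(lst.count('A'))  # 3
print(lst.count('Z'))  # 0: no error for a missing element
```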

The following example shows how to use built-in aggregate functions, such as min, max, and sum, to calculate some statistics:
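
A sketch using a small list of numbers (the values are an assumption chosen for illustration):

```python
lst = [1, -2, 3., 4., 5, 0]

print(min(lst))  # -2
print(max(lst))  # 5
print(sum(lst))  # 11.0 (a float, because the list contains floats)
```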

Exercise. There is no built-in function to calculate the average of a list. Write an expression to calculate the average of lst using the functions you have seen before.

Sample solution

lst = [1, -2, 3., 4., 5, 0]
sum(lst) / len(lst)
1.8333333333333333

To sort a list, we have two options:

  • the sorted function, which returns a new list,
  • the .sort method, which modifies the list in place.
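
The difference between the two options can be sketched as follows (values are illustrative):

```python
lst = [3, 1, 2]

new_lst = sorted(lst)   # returns a new sorted list
print(new_lst)          # [1, 2, 3]
print(lst)              # [3, 1, 2] (the original is unchanged)

lst.sort()              # sorts the list in place and returns None
print(lst)              # [1, 2, 3]

lst.sort(reverse=True)  # both options accept reverse=True for descending order
print(lst)              # [3, 2, 1]
```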

We have already seen that we can use a for loop to iterate over a list:
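
As a reminder, a minimal sketch with an illustrative list:

```python
lst = ['A', 'B', 'C']

for element in lst:
    print(element)
# prints A, B and C on separate lines
```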

If we need the index of the current element, we can use the enumerate function:
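
For example (illustrative values):

```python
lst = ['A', 'B', 'C']

# enumerate yields (index, element) pairs, which we unpack in the loop header
for index, element in enumerate(lst):
    print(index, element)
# 0 A
# 1 B
# 2 C
```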

Exercise. Write a function that accepts a list of numbers and returns a new list containing only the negative ones.

Sample solution

def negative_elements(lst):
    result = []
    for element in lst:
        if element < 0:
            result.append(element)
    return result

negative_elements([-2, 1, 7, -1, 2, 5, -6, 0])
[-2, -1, -6]

To reverse the order of elements in a list, we can use the slicing operator with step -1, the built-in reversed function, or the .reverse() method (which modifies the list in place):
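
The three approaches side by side, as a sketch with illustrative values:

```python
lst = [1, 2, 3]

print(lst[::-1])            # [3, 2, 1] (a new list)
print(list(reversed(lst)))  # [3, 2, 1] (reversed returns an iterator, so we convert it)
print(lst)                  # [1, 2, 3] (still unchanged)

lst.reverse()               # reverses the list in place
print(lst)                  # [3, 2, 1]
```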

8.1.1 List comprehensions

List comprehensions offer a concise way to build a new list based on the elements of an existing list:

result = [f(element) for element in lst] 

is equivalent to:

result = []
for element in lst:
    result.append(f(element))

Here, f represents a function we want to apply to all elements of lst.

We can also add a condition to consider only some elements of the existing list:

result = [f(element) for element in lst if condition]

is equivalent to:

result = []
for element in lst:
    if condition:
        result.append(f(element))

Example. Creating a list containing the squares of numbers from 1 to 10 (inclusive):
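
One possible way to write this (the variable names are illustrative):

```python
# range(1, 11) produces the numbers 1, 2, ..., 10
squares = [n ** 2 for n in range(1, 11)]
print(squares)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```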

Example. In the next example we use a list comprehension to filter a list of strings to retain only those that have more than 5 characters:
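
A sketch with an illustrative list of words:

```python
words = ['apple', 'banana', 'kiwi', 'cherry', 'fig']

# keep only the strings with more than 5 characters
long_words = [word for word in words if len(word) > 5]
print(long_words)  # ['banana', 'cherry']
```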

Exercise. Solve the previous exercise using a list comprehension.

Sample solution

def negative_elements(lst):
    return [element for element in lst if element < 0]

negative_elements([-2, 1, 7, -1, 2, 5, -6, 0])
[-2, -1, -6]

8.2 Tuples

Tuples are similar to lists, but they are immutable (unchangeable).

They are defined with parentheses () (instead of brackets []):
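
A minimal sketch (the name point and its values are illustrative); the attempt to modify the tuple is wrapped in try so that the example runs to completion:

```python
point = (3, 4)
print(point[0])  # 3

try:
    point[0] = 10  # tuples are immutable
    changed = True
except TypeError as error:
    changed = False
    print('TypeError:', error)
```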

Parentheses are optional in situations when commas are not used for a different purpose, e.g.:
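
For example:

```python
t = 1, 2, 3     # the same as t = (1, 2, 3)
print(t)        # (1, 2, 3)
print(type(t))  # <class 'tuple'>
```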

Parentheses are mandatory, for example, when passing a tuple as an argument to a function (where commas separate the arguments) or when using it as an element of a list (where commas separate the elements of that list).
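
Both situations can be sketched as follows (the values are illustrative):

```python
# One argument, which is a tuple. Without the inner parentheses,
# len(1, 2, 3) would pass three separate arguments and raise a TypeError.
print(len((1, 2, 3)))  # 3

pairs = [(1, 2), (3, 4)]  # a list of two tuples
flat = [1, 2, 3, 4]       # without parentheses: a list of four numbers
print(len(pairs))         # 2
print(len(flat))          # 4
```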

Be careful when creating a tuple with a single element. We need to include a trailing comma after the element; (1) or 1 is just the number 1:
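
A short sketch of the difference:

```python
a = (1,)  # a tuple with a single element
b = (1)   # just the number 1 in parentheses
c = 1,    # also a tuple; the comma is what matters

print(type(a))  # <class 'tuple'>
print(type(b))  # <class 'int'>
print(type(c))  # <class 'tuple'>
```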

Most of the functions and operators that work with lists, such as +, *, indexing, and slicing, also work with tuples (and so do the .count and .index methods):
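
For instance, with an illustrative tuple:

```python
t = (1, 2, 2, 3)

print(t + (4, 5))  # (1, 2, 2, 3, 4, 5)
print(t * 2)       # (1, 2, 2, 3, 1, 2, 2, 3)
print(t[0])        # 1
print(t[1:3])      # (2, 2)
print(len(t))      # 4
print(t.count(2))  # 2
print(t.index(3))  # 3
```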

One particular and very common use of tuples is when we need to return multiple values from a function, for example:
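
A minimal sketch, assuming a hypothetical function min_max (the function name and the data are illustrative):

```python
def min_max(numbers):
    # The comma creates a tuple, so the function returns (smallest, largest).
    return min(numbers), max(numbers)

minimum, maximum = min_max([3, -1, 7, 2])
print(minimum)  # -1
print(maximum)  # 7
```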

Here the function returned a tuple and we used a parallel assignment to store its first element in the minimum variable and the second element in the maximum variable. Parallel assignment allows us to simultaneously assign elements of an iterable (such as a list or a tuple) to multiple variables. We have actually seen an example of a parallel assignment before when swapping two variables’ values:

x, y = y, x

The expression on the right side (y, x) represents a tuple, and we are assigning its first element to x and its second element to y.

8.3 Sets

Sets, similar to their mathematical counterparts, are unordered collections of unique elements.

To create a non-empty set, we enclose its elements (separated by commas as usual) in curly brackets ({ and }). Note that we cannot use {} to create an empty set (it creates an empty dictionary). To create an empty set, we must use set():
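
A sketch with illustrative values:

```python
s = {1, 2, 3, 2, 1}    # duplicates are dropped
print(s == {1, 2, 3})  # True

empty_set = set()
empty_dict = {}
print(type(empty_set))   # <class 'set'>
print(type(empty_dict))  # <class 'dict'>
```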

As always, we can get the number of elements with len:
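
For example (illustrative values):

```python
s = {'a', 'b', 'c', 'a'}
print(len(s))  # 3, because the duplicate 'a' is stored only once
```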

We can check if a given value is an element of the set with the in operator:
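
A minimal sketch with an illustrative set:

```python
s = {'a', 'b', 'c'}

print('a' in s)      # True
print('z' in s)      # False
print('z' not in s)  # True
```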

We can iterate over all elements of a set using a for loop. Since sets are unordered, the order in which elements appear is arbitrary and should not be relied upon:
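
A short sketch (values are illustrative); if a predictable order is needed, one option is to sort the elements first:

```python
s = {'b', 'c', 'a'}

for element in s:
    print(element)  # the order of the printed elements is arbitrary

for element in sorted(s):
    print(element)  # a, b, c (sorted returns a list in ascending order)
```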

To add and remove elements from sets, we can use the .add, .remove and .discard methods:
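
The sketch below uses illustrative values and also shows how .remove and .discard differ when the element is missing:

```python
s = {1, 2, 3}

s.add(4)       # s is now {1, 2, 3, 4}
s.add(2)       # 2 is already present, so nothing changes
s.remove(1)    # s is now {2, 3, 4}
s.discard(10)  # 10 is not present; .discard silently does nothing

try:
    s.remove(10)  # .remove raises KeyError for a missing element
    removed = True
except KeyError:
    removed = False

print(s == {2, 3, 4})  # True
```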

However, we can neither index nor slice sets:
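
A sketch of what happens when we try; the error is caught so that the example runs to completion:

```python
s = {1, 2, 3}

try:
    s[0]  # sets have no positions, so indexing is not supported
    indexed = True
except TypeError as error:
    indexed = False
    print('TypeError:', error)
```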

To perform set operations such as union, intersection, and difference, we can use | for union, & for intersection, and - for difference:
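
For example (the sets a and b are illustrative):

```python
a = {1, 2, 3}
b = {3, 4}

print(a | b == {1, 2, 3, 4})  # True: union
print(a & b == {3})           # True: intersection
print(a - b == {1, 2})        # True: elements of a that are not in b
print(b - a == {4})           # True: difference is not symmetric
```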

We can easily convert lists to sets and vice versa:
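
A common use of this conversion is removing duplicates from a list, as in this sketch with illustrative values:

```python
lst = [1, 2, 2, 3, 3, 3]

s = set(lst)         # duplicates disappear
print(len(s))        # 3

unique = list(s)     # back to a list; the order of elements is not guaranteed
print(sorted(unique))  # [1, 2, 3]
```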

As with other mutable collections, assigning an existing set to a new variable does not create a copy. If we need to create a copy, we can use the .copy method:
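
A minimal sketch (the variable names are illustrative):

```python
original = {1, 2, 3}
alias = original               # the same set under another name
independent = original.copy()  # a new, independent set

alias.add(4)
independent.add(5)

print(original == {1, 2, 3, 4})     # True: changed through alias
print(independent == {1, 2, 3, 5})  # True: the copy changed separately
```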

8.4 Dictionaries

A dictionary stores key-value pairs. Each value in a dictionary is identified by a unique key (usually a string, but numbers or tuples are also possible).

Note for beginners

You can think of a list as a collection of values identified by their position (index). For example, we access the element at index 2 or change the element at index 7. In a dictionary, instead of using positions, we use keys. We might want to get the value labeled 'first_name' or update the one labeled 'last_name'.

To create a dictionary we define the key-value pairs by enclosing them within curly brackets ({ and }), and separating them with commas. The key and its corresponding value are separated by a colon (:):
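
A sketch of a dictionary describing a person; all keys and values here are assumptions made for illustration:

```python
person = {
    'first_name': 'Adam',
    'last_name': 'Smith',
    'age': 30,
    'languages': ['Python', 'C'],
    'address': {'city': 'Prague'},
}

print(len(person))  # 5 key-value pairs
```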

As you can see from this example, the values in a dictionary can be of different types, allowing flexibility in storing various data structures, including lists, tuples, and even other dictionaries.

To get an element of a dictionary by its key, we can use the [] operator:
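
For example, with an illustrative dictionary; accessing a missing key is wrapped in try so that the example runs to completion:

```python
person = {'first_name': 'Adam', 'age': 30}

print(person['first_name'])  # Adam
print(person['age'])         # 30

try:
    person['email']  # a key that does not exist raises KeyError
    has_email = True
except KeyError:
    has_email = False
```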

To check if there is a given key in the dictionary, we can use the in operator:
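
A short sketch with illustrative data; note that in looks at the keys, not at the values:

```python
person = {'first_name': 'Adam', 'age': 30}

print('age' in person)    # True
print('email' in person)  # False
print('Adam' in person)   # False: 'Adam' is a value, not a key
```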

The .get method allows us to obtain the value associated with a key without raising an error if the key does not exist. If the key is not found, the method returns None or another value provided as an additional argument.
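
A minimal sketch with illustrative data:

```python
person = {'first_name': 'Adam', 'age': 30}

print(person.get('age'))               # 30
print(person.get('email'))             # None
print(person.get('email', 'unknown'))  # unknown
```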

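If we want to get the value associated with a given key and remove it from the dictionary at the same time, we can use .pop instead of .get. Unlike .get, .pop raises an error if the key does not exist, unless a default value is provided as an additional argument.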
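
A sketch of .pop with illustrative data:

```python
person = {'first_name': 'Adam', 'age': 30}

age = person.pop('age')  # returns the value and removes the key
print(age)               # 30
print(person)            # {'first_name': 'Adam'}

# With a second argument, .pop returns that value when the key is missing.
print(person.pop('email', None))  # None
```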

Dictionaries are mutable, so we can update them:
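
For example (keys and values are illustrative):

```python
person = {'first_name': 'Adam', 'age': 30}

person['age'] = 31             # change the value of an existing key
person['last_name'] = 'Smith'  # add a new key-value pair

print(person)  # {'first_name': 'Adam', 'age': 31, 'last_name': 'Smith'}
```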

To remove an existing key, we can use the del keyword (which we have previously seen when removing elements of a list):
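
A minimal sketch with illustrative data:

```python
person = {'first_name': 'Adam', 'age': 30}

del person['age']
print(person)           # {'first_name': 'Adam'}
print('age' in person)  # False
```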

A for loop can be used to iterate over the dictionary keys:
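
For instance (illustrative data):

```python
person = {'first_name': 'Adam', 'last_name': 'Smith', 'age': 30}

for key in person:
    print(key, person[key])
# first_name Adam
# last_name Smith
# age 30
```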

We can also use the .items method to iterate over the key-value pairs:
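
A sketch with illustrative data; each item is a (key, value) tuple that we unpack with a parallel assignment in the loop header:

```python
person = {'first_name': 'Adam', 'age': 30}

for key, value in person.items():
    print(key, '->', value)
# first_name -> Adam
# age -> 30
```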

As with other mutable collections, assigning an existing dictionary to a new variable does not create a copy. To create an independent copy of the dictionary, we can use the .copy method:
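
A minimal sketch (the variable names are illustrative):

```python
original = {'a': 1}
alias = original               # the same dictionary under another name
independent = original.copy()  # a new, independent dictionary

alias['a'] = 2
independent['a'] = 3

print(original)     # {'a': 2}: changed through alias
print(independent)  # {'a': 3}
```

As with lists, .copy creates a shallow copy, so mutable values stored in the dictionary are still shared with the original.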

Exercise. Write a function called stats that accepts a string as input and returns a dictionary containing the unique characters in the string as keys and their respective counts as values.

Sample arguments and return values:

Argument: 'banana'
Return value: {'b': 1, 'a': 3, 'n': 2} 

Argument: ''
Return value: {}

Next, implement another function called print_stats that accepts a string as input, uses the stats function to compute character statistics, and prints the results in the following format:

<character> | <count> time(s)
<character> | <count> time(s)

For example, calling print_stats("banana") will print:

b | 1 time(s)
a | 3 time(s)
n | 2 time(s)
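
One possible solution, given as a sketch rather than the definitive answer; it relies on the .get method with a default value of 0:

```python
def stats(text):
    counts = {}
    for character in text:
        # Start from 0 if we have not seen this character yet.
        counts[character] = counts.get(character, 0) + 1
    return counts

def print_stats(text):
    for character, count in stats(text).items():
        print(f'{character} | {count} time(s)')

print_stats('banana')
```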

8.5 Summary

The following table provides a quick reference guide comparing the most important collection types. Use it when you are unsure which type is most appropriate for your task, or when you need to recall their essential properties at a glance.

| | List | Tuple | Set | Dictionary | Str (as a sequence of characters) |
|---|---|---|---|---|---|
| Empty collection | [] or list() | () or tuple() | set() | {} or dict() | "" or str() |
| Example | [1, "hello", -0.7, 1] | (1, "hello", -0.7, 1); (1,) | {1, "hello", -0.7} | {"name": "Adam", 0: "zero"} | "hello" |
| Mutable | Yes | No | Yes | Yes | No |
| Ordered | Yes | Yes | No | Yes* | Yes |
| Unique elements | No | No | Yes | Unique keys | No |
| Type of elements | Any | Any | Any immutable | Immutable keys | Characters |
| x[i] | Element at index i | Element at index i | - | Value for key i | Character at index i |
| for i in x loops over | Elements | Elements | Elements | Keys | Characters |
| Adding | x.append(element) | - | x.add(element) | x[key] = value | - |
| Removing | del x[index] or x.remove(element) | - | x.remove(element) or x.discard(element) | del x[key] | - |

*) Dictionaries preserve insertion order in Python 3.7 and later.