16 Multi-level indexes

Multi-level (hierarchical) indexes allow us to organize data with multiple levels of labels. This is especially useful when working with grouped or hierarchical data.

Consider the following DataFrame:
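The original table is not reproduced here, so the following is a minimal sketch with made-up values. Only the column names (city, department, revenue, employees) come from the text; the cities, departments, and numbers are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data: the values below are illustrative, not from the original page
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
)
print(df)
```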

Here, the combination of city and department uniquely identifies each row, so it makes sense to use them as a multi-level index:
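One way to build such an index is to pass both column names to set_index. The sketch below assumes small made-up data with the four columns named in the text; sorting the index afterwards is optional, but it keeps later label-based selections efficient.

```python
import pandas as pd

# Hypothetical data with the columns named in the text
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
)

# city becomes index level 0, department becomes index level 1
indexed = df.set_index(["city", "department"]).sort_index()
print(indexed)
```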

We can visualize the hierarchy as Index level 0 (city) → Index level 1 (department) → row data (revenue, employees).

16.1 Selecting rows

To select a single row, we pass a tuple with one label per index level to the .loc indexer:
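As a sketch of the tuple lookup, assuming illustrative data indexed by (city, department); the labels "Amsterdam" and "Sales" are made up for this example:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# One label per level, in level order: (city, department)
row = df.loc[("Amsterdam", "Sales")]
print(row)
```

Because the tuple identifies exactly one row, the result is a Series whose index holds the column names.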

We can select all rows for a specific city (top-level index) using .loc:
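A minimal sketch of selecting by the top level only, again on made-up data; "Berlin" is an assumed label:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# A single label is matched against level 0 (city)
berlin = df.loc["Berlin"]
print(berlin)
```

The city level is dropped from the result, so the remaining index contains only the departments.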

To select rows based on a lower-level index (department in our case), we can either take a cross section with the .xs method:
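A cross section can be sketched with DataFrame.xs and its level argument. The data and the "Sales" label are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# All rows whose department level equals "Sales"
sales = df.xs("Sales", level="department")
print(sales)
```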

Or use .loc(axis=0) to specify labels independently at each level:
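A sketch of the same selection with .loc(axis=0), where a bare colon means "every label" at that level. The data and the "Sales" label are hypothetical:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# First position: all cities; second position: only the "Sales" department
sales = df.loc(axis=0)[:, "Sales"]
print(sales)
```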

Note that in the second example both index levels are preserved, whereas a cross section drops the selected level by default.

16.2 Swapping index levels

If we need to swap the order of index levels, we can use the .swaplevel method:
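A minimal sketch on made-up data. With only two levels, calling swaplevel without arguments exchanges them:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# department becomes level 0, city becomes level 1
swapped = df.swaplevel()
print(swapped)
```

Swapping changes only the order of the levels, not the order of the rows, so a follow-up call to sort_index may be useful if the new top level should be grouped together.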

16.3 Summary functions at different index levels

We can group data based on a specific level, for example summarizing per city (level 0):
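One way to do this is groupby with the level argument. The sketch below sums the made-up revenue and employee figures per city; the choice of sum as the summary function is an assumption:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# Group on index level 0 (city) and sum each column within the group
per_city = df.groupby(level=0).sum()
print(per_city)
```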

Or per department (level 1):
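The same pattern with level=1, sketched here with the mean as an assumed summary function and made-up data:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# Group on index level 1 (department) and average each column within the group
per_department = df.groupby(level=1).mean()
print(per_department)
```

The level can also be given by name, for example level="department", which stays correct if the level order changes.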

16.4 Resetting indexes

Recall the .reset_index method. When applied to a multi-level index, all levels are moved into the DataFrame as separate columns, and a default integer index takes their place:
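A minimal sketch of a full reset, assuming made-up data indexed by (city, department):

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# Both index levels turn back into ordinary columns
flat = df.reset_index()
print(flat)
```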

We can also use this method to reset only a specific level of the index:
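A partial reset can be sketched with the level argument of reset_index. Here the department level is moved into a column while city stays in the index; the data is hypothetical:

```python
import pandas as pd

# Hypothetical data indexed by (city, department)
df = pd.DataFrame(
    {
        "city": ["Amsterdam", "Amsterdam", "Berlin", "Berlin"],
        "department": ["Sales", "IT", "Sales", "IT"],
        "revenue": [100, 150, 200, 250],
        "employees": [10, 15, 20, 25],
    }
).set_index(["city", "department"]).sort_index()

# Only the department level is reset; city remains the index
partial = df.reset_index(level="department")
print(partial)
```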