17 Plotting and visualization
Data visualization helps us understand the structure and patterns in our data. In Python, most visualizations are created with Matplotlib, the Pandas plotting interface built on top of it, and Seaborn. Together they provide a flexible framework for both quick exploration and publication-quality graphics.
17.1 Introduction to Matplotlib
Matplotlib is the foundational library for plotting in Python. By convention, its pyplot interface is imported as plt (import matplotlib.pyplot as plt).
Let’s start with a simple example plotting two polynomial functions:
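A minimal sketch of such a plot is given here. The two polynomials (x^2 and x^3), the plotting range, and the variable names are our own illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate two polynomials on an evenly spaced grid
# (the functions and the range are illustrative choices)
x = np.linspace(-2, 2, 100)
y_quadratic = x**2
y_cubic = x**3

# Create a figure with a single Axes and draw both curves on it
fig, ax = plt.subplots()
ax.plot(x, y_quadratic, label="x^2")
ax.plot(x, y_cubic, label="x^3", linestyle="--")

# Label the axes and add a title and a legend
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Two polynomial functions")
ax.legend()
```

When running this as a script rather than in a notebook, call plt.show() at the end to display the figure.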
17.2 Plotting with Pandas and Seaborn
While Matplotlib provides the foundation for all plots, Pandas and Seaborn offer convenient interfaces built on top of it.
17.2.1 Pandas
As we have already seen in previous chapters, Pandas provides a .plot method for Series and DataFrames to quickly generate plots directly from our data.
Key parameters of the .plot method include:
- kind: type of plot: line (default), bar, barh (horizontal bar plot), hist (histogram), box, kde or density (kernel density estimation), area, pie, scatter (only for DataFrames), hexbin (only for DataFrames)
- x, y: variables (e.g. a column name or a list of column names) to plot
We can also call methods .plot.<kind> directly, e.g. df.plot.bar(...) instead of df.plot(kind="bar", ...) or df.plot.hist(...) instead of df.plot(kind="hist", ...).
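The equivalence of the two calling styles can be checked with a small sketch. The DataFrame below is made up for illustration, and the column names city and sales are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"city": ["A", "B", "C"], "sales": [10, 25, 17]})

# Both calls draw the same bar chart; each returns a Matplotlib Axes
ax1 = df.plot(kind="bar", x="city", y="sales")
ax2 = df.plot.bar(x="city", y="sales")
```

Because both calls return a Matplotlib Axes, the result can be customized further with the usual Matplotlib methods such as ax1.set_ylabel(...).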
17.2.2 Seaborn
In Seaborn there are dedicated functions for different types of plots (e.g. barplot, scatterplot, boxplot, etc.), and most of them support the following parameters:
- data: input dataset (Series or DataFrame)
- x, y: variables to plot
- hue: grouping by a categorical variable represented by different colors (if supported by the plot type)
- row and col: creating subplots for different values of a variable (if supported by the plot type)
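A short sketch of how these parameters fit together, assuming a small made-up DataFrame (the columns height, weight, and group are hypothetical). We use scatterplot for hue and the figure-level function relplot for col.

```python
import pandas as pd
import seaborn as sns

# Hypothetical measurements, for illustration only
df = pd.DataFrame({
    "height": [150, 160, 170, 180, 155, 165, 175, 185],
    "weight": [50, 58, 68, 80, 52, 61, 72, 84],
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
})

# hue: color the points by group within a single Axes
ax = sns.scatterplot(data=df, x="height", y="weight", hue="group")

# col: one subplot per value of group; relplot returns a FacetGrid
grid = sns.relplot(data=df, x="height", y="weight", col="group")
```

Functions such as scatterplot draw on a single Axes and do not accept row or col, while figure-level functions such as relplot and catplot create the grid of subplots themselves.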
Tip: Use Pandas .plot for quick exploratory plots and Seaborn for more detailed visualizations.
17.3 Plot catalog
Below is an overview of the most common plot types and how to create them with Pandas, Seaborn, or both. The goal is not to memorize all functions, but to see patterns in how the plotting APIs are structured.
We will use two datasets for these examples:
17.3.1 Bar plot
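As a minimal sketch, assume a small made-up DataFrame with hypothetical columns species and weight (not one of the chapter's datasets). Pandas plots the values it is given, so we aggregate first, whereas Seaborn's barplot aggregates the raw rows itself.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "species": ["cat", "cat", "dog", "dog", "bird", "bird"],
    "weight": [4.0, 5.0, 20.0, 30.0, 0.1, 0.3],
})

fig, (ax_pd, ax_sns) = plt.subplots(1, 2, figsize=(8, 3))

# Pandas: compute the mean per species, then plot the result
means = df.groupby("species")["weight"].mean()
means.plot(kind="bar", ax=ax_pd, title="Pandas")

# Seaborn: pass the raw rows; barplot shows the mean per species
# by default and adds error bars
sns.barplot(data=df, x="species", y="weight", ax=ax_sns)
ax_sns.set_title("Seaborn")
```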
17.3.2 Count plot (Seaborn only)
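A count plot shows the number of rows in each category. The sketch below assumes a made-up DataFrame with a hypothetical species column.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data, for illustration only
df = pd.DataFrame({"species": ["cat", "dog", "cat", "bird", "cat", "dog"]})

# One bar per category; the bar height is the number of rows
ax = sns.countplot(data=df, x="species")

# The same counts computed with Pandas, for comparison
counts = df["species"].value_counts()
```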
17.3.3 Pie plot (Pandas only)
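A minimal sketch with a made-up Series of shares (the labels and values are hypothetical):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical shares, for illustration only
shares = pd.Series([50, 30, 20], index=["A", "B", "C"], name="share")

fig, ax = plt.subplots()

# The index provides the wedge labels; autopct prints the percentages
shares.plot(kind="pie", autopct="%1.0f%%", ax=ax)

# Pandas uses the Series name as the y-axis label; remove it
ax.set_ylabel("")
```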
17.3.4 Line plot
17.3.5 Scatter plot
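A brief sketch of both interfaces, again assuming a made-up DataFrame with hypothetical columns height, weight, and group:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements, for illustration only
df = pd.DataFrame({
    "height": [150, 160, 170, 180, 155, 165, 175, 185],
    "weight": [50, 58, 68, 80, 52, 61, 72, 84],
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
})

fig, (ax_pd, ax_sns) = plt.subplots(1, 2, figsize=(8, 3))

# Pandas: scatter is available for DataFrames only
df.plot.scatter(x="height", y="weight", ax=ax_pd)

# Seaborn: style varies the marker shape by group
sns.scatterplot(data=df, x="height", y="weight", style="group", ax=ax_sns)
```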
We can use lmplot to visualize simple linear relationships and regression fits:
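A minimal sketch, assuming made-up data that lie roughly on a straight line (the column names x and y are hypothetical):

```python
import pandas as pd
import seaborn as sns

# Hypothetical, roughly linear data, for illustration only
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2],
})

# lmplot draws the points, a fitted regression line, and a
# confidence band; it is figure-level and returns a FacetGrid
grid = sns.lmplot(data=df, x="x", y="y")
```

Since lmplot is a figure-level function, it also accepts hue, row, and col to fit separate regressions per group.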
17.3.6 Histogram and density estimation
17.3.7 Box and violin plots
17.3.8 Strip and swarm plots (Seaborn only)
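Both plot types show every individual observation per category. A minimal sketch with synthetic data (the columns group and value are hypothetical):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, for illustration only: 20 values per group
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": ["a"] * 20 + ["b"] * 20,
    "value": rng.normal(0, 1, 40),
})

fig, (ax_strip, ax_swarm) = plt.subplots(1, 2, figsize=(8, 3))

# Strip plot: points are spread out with random horizontal jitter
sns.stripplot(data=df, x="group", y="value", ax=ax_strip)
ax_strip.set_title("stripplot")

# Swarm plot: points are shifted sideways so they do not overlap
sns.swarmplot(data=df, x="group", y="value", ax=ax_swarm)
ax_swarm.set_title("swarmplot")
```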
17.3.9 Joint and pair plots (Seaborn only)
Further examples and customization options can be found in Seaborn’s example gallery.