Plotting With Pandas#
This chapter covers how to generate matplotlib plots from data stored in a pandas DataFrame. It gives an overview of common plot types, including line plots, bar charts, histograms, box plots, area plots, scatter plots, and pie charts, and explains how pandas’s plotting functions handle missing data.
It also compares the matplotlib and seaborn plotting packages and introduces seaborn with sample code.
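As a rough preview of the approach, the sketch below builds a small DataFrame and draws several of the plot types listed above with `DataFrame.plot`. The data, column names, and titles are made up for illustration and are not taken from the chapter; the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data (not from the chapter): two series over five months.
df = pd.DataFrame(
    {"a": [1.0, 3.0, 2.0, 4.0, 6.0], "b": [2.0, 2.5, 3.5, 3.0, 5.0]},
    index=["Jan", "Feb", "Mar", "Apr", "May"],
)

# DataFrame.plot returns a matplotlib Axes; the `kind` argument selects the plot type.
ax_line = df.plot(kind="line", title="Line plot")
ax_bar = df.plot(kind="bar", title="Bar chart")
ax_hist = df.plot(kind="hist", bins=5, alpha=0.5, title="Histogram")
ax_box = df.plot(kind="box", title="Box plot")
ax_area = df.plot(kind="area", title="Area plot")
ax_scatter = df.plot(kind="scatter", x="a", y="b", title="Scatter plot")
```

Each call opens a new figure by default. In a notebook the figures display inline; in a script, `plt.show()` or `ax.figure.savefig(...)` would be needed to see or save them.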
Acknowledgements#
The author consulted the following resources when writing this tutorial:
Chapter 4.14 “Visualization With Seaborn” from Jake VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (O’Reilly, 2016).
pandas, User Guide, “Visualization”.
pandas, Getting Started, “Plotting”.
Chapter 9 “Plotting and Visualization” from Wes McKinney, Python for Data Analysis: Data Wrangling With pandas, NumPy, and IPython (O’Reilly, 2017).
Chapter 15 “Generating Data” from Eric Matthes, Python Crash Course: A Hands-On, Project-Based Introduction to Programming (No Starch Press, 2019).
Data#
This chapter uses a few different datasets:
- A dataset loaded from a URL (code to load it is included)
- City of South Bend budget data (code to load and process this data is included)
We’ll also use some data generated randomly with numpy. Code to generate random data is included.
The seaborn section of the chapter uses sample datasets that are included in the library:
- tips
- dots
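For orientation, randomly generated data of the kind mentioned above could be produced along these lines. This is a minimal sketch, not the chapter's own code: the seed, the array shape, the column names, and the choice of a cumulative sum (a random walk) are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so the sketch runs headless
import numpy as np
import pandas as pd

# Seeded generator so the "random" data is reproducible between runs.
rng = np.random.default_rng(seed=0)

# 100 standard-normal steps for three hypothetical series, accumulated into random walks.
steps = rng.standard_normal(size=(100, 3))
random_df = pd.DataFrame(steps.cumsum(axis=0), columns=["x", "y", "z"])

# With no `kind` given, DataFrame.plot draws one line per column.
ax = random_df.plot(title="Random walks")
```

Fixing the seed is a convenience for teaching material, since readers then see the same figure as the text describes.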
Application#
A Jupyter Notebook template is available for this chapter’s application problems.