Plotting With Pandas#
This chapter covers how to generate matplotlib plots for data stored in a pandas DataFrame. It gives an overview of common plot types, including line plots, bar charts, histograms, box plots, area plots, scatter plots, and pie charts, and explains how the pandas plotting function handles missing data. It also compares the matplotlib and seaborn plotting packages and introduces seaborn with sample code.
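As a quick preview of the workflow described above, here is a minimal sketch of plotting a DataFrame with its built-in plot method. The data, column names, index name, and title are hypothetical and chosen only for illustration; they are not one of the chapter's datasets. Column "a" contains one missing value, which a line plot should render as a gap in that line.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; column "a" has one missing value.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan, 4.0, 5.0],
        "b": [5.0, 4.0, 3.0, 2.0, 1.0],
    },
    index=pd.RangeIndex(5, name="step"),
)

# DataFrame.plot draws one line per column on a matplotlib Axes and returns it.
ax = df.plot(kind="line", title="Hypothetical example")

# The returned Axes is a regular matplotlib object and can be customized further.
ax.set_ylabel("value")
```

Because the return value is an ordinary matplotlib Axes, any matplotlib customization (labels, limits, annotations) can be applied after the pandas call.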
Acknowledgements#
The author consulted the following resources when writing this tutorial:
Chapter 4.14 “Visualization With Seaborn” from Jake VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (O’Reilly, 2016)
pandas, User Guide, “Visualization”
pandas, Getting Started, “Plotting”
Chapter 9 “Plotting and Visualization” from Wes McKinney, Python for Data Analysis: Data Wrangling With pandas, NumPy, and IPython (O’Reilly, 2017)
Chapter 15 “Generating Data” from Eric Matthes, Python Crash Course: A Hands-On, Project-Based Introduction to Programming (No Starch Press, 2019).
Chapter Contents#
Data#
This chapter uses a few different datasets:
-
Code to load from URL is included
City of South Bend budget data
Code to load/process this data is included
We’ll also use some data generated randomly with numpy. Code to generate random data is included.
The seaborn section of the chapter uses sample datasets that are included in the library: tips and dots.
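The chapter's own data-generation code is not reproduced here. As a rough sketch of what randomly generated data for plotting might look like, the following builds a small DataFrame of random walks with numpy. The seed, shape, and column names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=0)

# 100 rows by 3 columns of standard normal draws, accumulated down each column
# to form random walks. Column names are hypothetical.
random_df = pd.DataFrame(
    rng.standard_normal((100, 3)).cumsum(axis=0),
    columns=["series_a", "series_b", "series_c"],
)

print(random_df.shape)
```

A frame like this can be passed straight to the pandas plotting method to try out the different plot types.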
Application#
Click here for a Jupyter Notebook template for this chapter’s application problems.