Pie Charts

Pie charts are a less natural fit for DataFrames: a pie chart takes one-dimensional data (a single set of values, each with a label), while a DataFrame is a two-dimensional structure.

One option is to generate a subplot for each column.

import numpy as np
import pandas as pd

df = pd.DataFrame(3 * np.random.rand(4, 2), index=["a", "b", "c", "d"], columns=["x", "y"])  # random data: 4 rows, 2 columns
df.plot.pie(subplots=True, figsize=(8, 4))  # draw one pie chart per column

array([<Axes: ylabel='x'>, <Axes: ylabel='y'>], dtype=object)

[Figure: two pie charts side by side, one for column x and one for column y]

The other option is to isolate a subset of the data so that you end up with a one-dimensional data structure.
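As a quick illustration of that second option, here is a minimal sketch that reuses the random-data setup from the first example. The names `demo` and `col_x` are made up for this illustration. Selecting a single column returns a Series, which is one-dimensional and can be passed straight to a pie chart.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data, built the same way as the random DataFrame above
demo = pd.DataFrame(3 * np.random.rand(4, 2), index=["a", "b", "c", "d"], columns=["x", "y"])

col_x = demo["x"]  # selecting one column gives a one-dimensional Series
ax = col_x.plot.pie(figsize=(4, 4))  # one slice per index label (a, b, c, d)
```

Calling `demo.plot.pie(y="x")` should produce a similar chart without creating the intermediate Series.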

Let’s go back to our air quality data.

df = pd.read_csv('https://raw.githubusercontent.com/kwaldenphd/more-with-matplotlib/main/data/air_quality_no2.csv', index_col=0, parse_dates=True) # load data
df.head() # inspect df

                     station_antwerp  station_paris  station_london
datetime
2019-05-07 02:00:00              NaN            NaN            23.0
2019-05-07 03:00:00             50.5           25.0            19.0
2019-05-07 04:00:00             45.0           27.7            19.0
2019-05-07 05:00:00              NaN           50.4            16.0
2019-05-07 06:00:00              NaN           61.9             NaN

Let’s say we want to know what proportion of observations come from each station. We could use some of the aggregating and reshaping functions to get to a one-dimensional structure.

df2 = df.melt() # melt dataframe
df3 = df2['variable'].value_counts() # get number of station observations
df3 # inspect output

station_antwerp    1035
station_paris      1035
station_london     1035
Name: variable, dtype: int64
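To see what the melt step does, here is a small sketch on a hand-built DataFrame that copies the five rows shown in the `df.head()` output above. The names `wide`, `long`, `rows_per_station`, and `readings_per_station` are illustrative only. One thing it shows: `melt()` keeps rows whose value is missing, so counting the `variable` column gives the number of rows per station rather than the number of recorded readings. This would explain why all three counts above are identical. Dropping missing values first is one way to count actual readings instead.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the five rows shown by df.head() (timestamps omitted)
wide = pd.DataFrame(
    {
        "station_antwerp": [np.nan, 50.5, 45.0, np.nan, np.nan],
        "station_paris": [np.nan, 25.0, 27.7, 50.4, 61.9],
        "station_london": [23.0, 19.0, 19.0, 16.0, np.nan],
    }
)

# Wide to long: "variable" holds the old column name, "value" holds the reading
long = wide.melt()

# Counts every melted row per station, including rows where the reading is NaN
rows_per_station = long["variable"].value_counts()

# Counts only rows with an actual reading
readings_per_station = long.dropna(subset=["value"])["variable"].value_counts()
```

On this sample, `rows_per_station` is 5 for every station, while `readings_per_station` differs by station. Which one you plot depends on whether "observations" should include missing readings.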

df3.plot.pie() # generate pie chart

<Axes: ylabel='variable'>

[Figure: pie chart of observation counts per station]
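Since the goal is to read off proportions, it can help to print the percentage on each slice. The sketch below assumes a small, made-up Series of counts (`counts` is not the tutorial's data). It relies on the fact that pandas forwards extra keyword arguments such as `autopct` and `startangle` to matplotlib's `pie` function.

```python
import pandas as pd

# Illustrative counts only, not the real air quality totals
counts = pd.Series(
    {"station_antwerp": 2, "station_paris": 4, "station_london": 4},
    name="readings",
)

# autopct formats each slice's share of the total as a percentage label
ax = counts.plot.pie(autopct="%1.1f%%", startangle=90, figsize=(4, 4))
ax.set_ylabel("")  # hide the Series name that pandas places on the y-axis
```

The same keyword arguments should also work with `df3.plot.pie(...)` from the example above.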

Additional Resources

For more on pie plots:

Pie Charts