Pivot Table

Contents

Pivot Table#

# general syntax
DataFrame.pivot_table(self, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', observed=False)

Parameters:

index
- Row-level grouping
columns
- Column-level grouping
values
- Numeric values that will be summarized
aggfunc
- Summary statistic calculation

Let’s start with a DataFrame.

# import statements
import numpy as np
import pandas as pd

# create df
df = pd.DataFrame({"P": ["f1", "f1", "f1", "f1", "f1",
                         "b1", "b1", "b1", "b1"],
                   "Q": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "R": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "S": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "T": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
# show df
df

Example #1#

In our first example…

Values in the S column will be summarized using Numpy’s sum function
Values in the P & Q columns will be the row indices
Values in the R column will be the column index
Values in the T column will be dropped

# pivot table
pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum)

# assign to new df
dfPivotTable = pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum)
# show new df
dfPivotTable

Example #2#

.pivot_table() has a few additional parameters that can be helpful, depending on what type of summary statistic you’re using.

fill_value
- Value to replace missing values with
dropna
- Ignores columns with NaN values

# fill value example
pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum, fill_value=0)

# dropna example
pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum, dropna=True)

Additional Resources#