Plotting Categorical Data

For our purposes, categorical data is qualitative, nominal, or ordinal data that takes discrete (non-continuous) values, in contrast to continuous numerical data. The type of an axis determines how the data linked to it is plotted in the resulting figure.

Axis types recognized in plotly:

  • linear

  • log

  • date

  • category

  • multicategory

The axis type is auto-detected by plotly based on the data linked to the specific axis.

  • Plotly checks for multicategory, date, and category data, in that order. If the data matches none of them, the axis defaults to linear; log is never selected automatically.

  • When testing for multicategory data, plotly checks whether the axis data is a nested array.

  • When testing for date or category, plotly requires more than twice as many distinct date or category strings as distinct numbers in order to choose one of these axis types.
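The checks above can be sketched in plain Python. This is only an illustration of the rules as stated here, not plotly's actual implementation: `guess_axis_type`, `_is_number`, and `_is_date` are hypothetical helpers, dates are assumed to be ISO-formatted strings, and numeric-looking strings are assumed to count as numbers.

```python
from datetime import datetime


def _is_number(value):
    """True for ints/floats and for strings that parse as a float (assumption)."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            float(value)
            return True
        except ValueError:
            return False
    return False


def _is_date(value):
    """True for ISO-style date strings such as '2024-01-31' (assumed format)."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def guess_axis_type(values):
    """Hypothetical helper mirroring the described checks; not plotly's code.

    Expects either a nested list or a flat list of hashable values.
    """
    # 1. multicategory: the data is a nested array
    if values and all(isinstance(v, (list, tuple)) for v in values):
        return "multicategory"

    distinct = set(values)
    numbers = {v for v in distinct if _is_number(v)}
    dates = {v for v in distinct if v not in numbers and _is_date(v)}
    others = distinct - numbers - dates

    # 2. date: more than twice as many distinct dates as distinct numbers
    if dates and len(dates) > 2 * len(numbers):
        return "date"
    # 3. category: more than twice as many distinct strings as distinct numbers
    if others and len(others) > 2 * len(numbers):
        return "category"
    # 4. nothing matched: fall back to linear
    return "linear"
```

Under these assumptions, `["a", "a", "b", 3]` has two distinct strings and one distinct number. Two is not more than twice one, so the sketch falls back to linear.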

Auto-detection can misclassify categorical data, for example when categories are encoded as numbers. In such cases we can explicitly declare an axis as categorical through the xaxis_type and yaxis_type attributes.

Example #1

The following bar chart plots categorical data on the X axis.

import plotly.express as px  # plotly's high-level plotting interface

# bar chart with string labels on the X axis
fig = px.bar(x=["a", "b", "c", "d"], y=[1, 2, 3, 4])
fig.update_xaxes(type='category')  # explicitly set the X axis type to category
fig.show()  # render the figure

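In this example, plotly would already auto-detect the X axis as category, because every X value is a non-numeric string. Calling .update_xaxes(type='category') sets the type explicitly, which guarantees a categorical axis even when the values would otherwise be detected as numbers or dates.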

Example #2

We can also control the order of categories by passing a dictionary to the category_orders parameter. The following example draws side-by-side (grouped) bar charts from the restaurant tips dataset.

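import plotly.express as px

df = px.data.tips()  # load the built-in restaurant tips dataset

# grouped bars, one facet per sex, with an explicit order for each categorical column
fig = px.bar(df, x="day", y="total_bill", color="smoker", barmode="group", facet_col="sex",
             category_orders={"day": ["Thur", "Fri", "Sat", "Sun"],
                              "smoker": ["Yes", "No"],
                              "sex": ["Male", "Female"]})
fig.show()  # render the figure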

In addition to setting the color, barmode, and facet_col parameters, we pass a dictionary to category_orders that maps each column name to the desired order of its values.

Ordering Categories

We can also sort categories automatically, by name or by total value, by calling .update_xaxes() or .update_yaxes() with the categoryorder parameter.

  • Setting categoryorder to "category ascending" or "category descending" sorts categories alphanumerically by name.

  • Setting categoryorder to "total ascending" or "total descending" sorts categories by the total of their numerical values.

Additional Resources