# Summary & Descriptive Statistics

The `.mean()`, `.median()`, `.describe()`, and `.agg()` methods apply to columns with numeric data and typically skip missing values by default.
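As a minimal sketch of how missing data is handled, the toy dataframe below (hypothetical values, not taken from the Census API) contains one missing age. By default `.mean()` skips it; passing `skipna=False` makes the missing value propagate instead, and `numeric_only=True` limits a dataframe-wide call to the numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame(
    {
        "age": [20, 30, np.nan, 50],   # one missing value
        "name": ["a", "b", "c", "d"],  # non-numeric column
    }
)

mean_skip = toy["age"].mean()               # missing value is skipped: (20 + 30 + 50) / 3
mean_keep = toy["age"].mean(skipna=False)   # missing value propagates, so the result is NaN
numeric_means = toy.mean(numeric_only=True) # only the numeric column is summarized

print(mean_skip, mean_keep)
print(numeric_means)
```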

`.info()` is a good place to start to check data types.

In these examples, we'll work with data returned by the Census Bureau API.
- Specific dataset we're working with: [American Community Survey 1-Year Estimates Public Use Microdata Sample](https://www.census.gov/data/developers/data-sets/census-microdata-api.html)

In [None]:
# creating dataframe from API return
import pandas as pd, requests # import statements
r = requests.get("https://api.census.gov/data/2022/acs/acs1/pums?get=SEX,AGEP,MAR&SCHL=24") # request SEX, AGEP, and MAR for records where SCHL is 24
r.raise_for_status() # stop early if the request failed
data = r.json() # store response as a list of lists
df = pd.DataFrame(data[1:], columns=data[0]) # create the dataframe, using the first sublist as the column headers and the remaining sublists as the rows
df # show output
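To see why the dataframe is built from `data[1:]` with `data[0]` as the headers, here is a small offline sketch. The `sample` payload is hypothetical: it only mimics the list-of-lists shape described above, and its values are made up.

```python
import pandas as pd

# Hypothetical payload in the same list-of-lists shape; values are made up
sample = [
    ["SEX", "AGEP", "MAR", "SCHL"],  # first sublist: column headers
    ["1", "45", "1", "24"],          # remaining sublists: one row each
    ["2", "38", "5", "24"],
]

# Skip the header sublist for the rows and reuse it as the column names
toy_df = pd.DataFrame(sample[1:], columns=sample[0])
print(toy_df)
```

Note that every value in this sketch is a string, which is why a type conversion is needed before doing any arithmetic.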

We can check the data types and get a general summary with `.info()`.

In [None]:
df.info() # show info

The API returns every value as a string, so in order to perform calculations we'll want to convert the columns to integers:

In [None]:
df = df.astype(int) # change datatype for all columns
df.info() # show updated technical summary
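`.astype(int)` raises an error if any value cannot be parsed as an integer. As a hedged alternative for messier data, the sketch below applies `pd.to_numeric` with `errors="coerce"`, which turns unparseable values into missing values instead. The `raw` dataframe and its `"n/a"` entry are hypothetical and are not part of the API return used here.

```python
import pandas as pd

# Hypothetical string data with one value that is not a number
raw = pd.DataFrame({"AGEP": ["45", "38", "n/a"], "MAR": ["1", "5", "3"]})

# Convert each column; values that cannot be parsed become NaN
converted = raw.apply(pd.to_numeric, errors="coerce")

print(converted.dtypes)
print(converted["AGEP"].mean())  # the NaN is skipped when averaging
```

A column that ends up with missing values is stored as floating point rather than integer, so it is worth rerunning `.info()` after a conversion like this.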

Now we can start to explore summary statistics for specific columns.

In [None]:
df['AGEP'].mean() # mean value for AGEP column

In [None]:
df['AGEP'].median() # median value for AGEP column

We can also use `.describe()` to get a predetermined set of summary statistics for all columns.

In [None]:
df.describe() # summary statistics for entire dataframe
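As a small sketch of what `.describe()` returns, the example below runs it on a hypothetical toy dataframe (values made up) and then looks up individual statistics by row label. It also shows selecting a subset of columns first, which can help when only some columns are meaningful to summarize.

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({"AGEP": [20, 30, 40, 50], "MAR": [1, 1, 5, 3]})

summary = toy.describe()               # one column of statistics per numeric column
age_only = toy[["AGEP"]].describe()    # restrict the summary to selected columns

print(summary)
print(summary.loc["mean", "AGEP"])     # look up a single statistic by label
```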

We can use `.agg()` to return specific combinations of aggregate statistics for specific columns.

In [None]:
df.agg(
    {
        'AGEP': ['min', 'max', 'median', 'mean', 'skew'],
        'MAR': ['mean']
    }
)
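When different statistics are requested per column, the result is a dataframe with one row per statistic, and cells for combinations that were not requested are left as missing values. The sketch below illustrates this on hypothetical toy data (values made up).

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({"AGEP": [20, 30, 40, 50], "MAR": [1, 1, 5, 3]})

result = toy.agg(
    {
        "AGEP": ["min", "max", "mean"],  # three statistics for AGEP
        "MAR": ["mean"],                 # only the mean for MAR
    }
)

print(result)
print(result.loc["mean", "MAR"])  # requested, so a value is present
print(result.loc["min", "MAR"])   # not requested, so this cell is NaN
```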

## Additional Resources

- [Pandas documentation, "Descriptive statistics"](https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#descriptive-statistics)
- [Pandas documentation, "How to calculate summary statistics"](https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html)