Art of Data

3.1 Introduction to Descriptive Statistics

We’ve spent a lot of time on how data can be obtained and stored, and on how to evaluate the quality of a dataset. Once we are confident in a dataset, we can start using statistics to describe it.

Introduction (slides)

We will discuss the first step of descriptive statistics: obtaining frequencies. We will also learn some vocabulary for visualizing and discussing data distributions.

⊕ summary statistics
⊕ limitations (sample vs population)
⊕ frequency, mean, mode
⊕ categorical bar chart (Pareto)
⊕ distribution, histogram, normal distribution
⊕ uni/bimodal, shape, skew, symmetric, tail
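
To make the vocabulary above concrete, here is a minimal sketch of counting frequencies, ordering them Pareto-style, and binning numbers for a histogram. The data and the variable names (`colors`, `values`, `bin_width`) are invented for illustration and do not come from the slides.

```python
from collections import Counter

# Hypothetical categorical data, invented for illustration only.
colors = ["red", "blue", "red", "green", "blue", "red"]

# Frequency: how many times each category occurs.
freq = Counter(colors)

# Pareto ordering: categories sorted from most to least frequent,
# which is the bar order used in a Pareto chart.
pareto = freq.most_common()

# Mode: the most frequent category.
mode = pareto[0][0]

for category, count in pareto:
    print(f"{category:<6} {'#' * count} ({count})")

# Hypothetical numeric data with one large value on the right.
values = [1, 2, 2, 3, 3, 3, 4, 4, 9]

# Histogram: group values into equal-width bins and count each bin.
# Each bin is labelled by its lower edge (0 means 0-1, 2 means 2-3, ...).
bin_width = 2
bins = Counter((v // bin_width) * bin_width for v in values)

for lower_edge in range(0, max(values) + 1, bin_width):
    print(f"{lower_edge:>2}-{lower_edge + bin_width - 1:<2} {'#' * bins[lower_edge]}")
```

In the printed histogram most values pile up in one bin (a single peak, so unimodal) while the lone 9 stretches a tail to the right, which is what we mean by a right-skewed shape.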

Measuring Center and Variability (slides)

Two other common ways to describe a dataset’s distribution are its center and its variability.

⊕ mean, median, mode
⊕ range, standard deviation, z-score
⊕ normal distribution, bell curve
⊕ percentile, quartile, interquartile range
⊕ box plot, violin plot
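
As a rough sketch of how the measures listed above are computed, the example below uses Python's standard `statistics` module on a small made-up list of scores. The data are hypothetical, and treating them as a whole population (rather than a sample) is an assumption made to keep the arithmetic simple.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of center.
mean = statistics.mean(scores)      # sum of values divided by how many there are
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Measures of variability.
data_range = max(scores) - min(scores)
# pstdev treats the data as the full population; for a sample drawn
# from a larger population, statistics.stdev would be used instead.
std = statistics.pstdev(scores)

# z-score: how many standard deviations a value lies from the mean.
z_of_max = (max(scores) - mean) / std

# Quartiles split the sorted data into four parts. Several conventions
# exist for computing them, so other tools may give slightly different cut points.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range: spread of the middle half of the data

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "std:", std, "z of max:", z_of_max)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```

A box plot draws exactly these quartile values: the box spans Q1 to Q3, the line inside it marks the median, and the box's width is the interquartile range.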