3.4 Descriptive Lab
Objectives
This lab is intended to evaluate your ability to:
- prepare a dataset for analysis
- calculate summary statistics for a dataset
- describe the distribution of a variable
- analyze associations between variables
- statistically describe a dataset with supporting evidence
Task
Find a dataset on a topic that you are interested in. You will be writing a Python program (or notebook) to analyze this dataset. This time, you will try to draw conclusions about the dataset based on your analysis of summary statistics. You do not have to analyze every attribute of the entire dataset! You are welcome to analyze smaller sections of the dataset that you’re interested in.
Write a blog post for your website with responses to the following:
- Which dataset did you work with?
- Which aspect of this dataset are you interested in? What do you hope to learn from analyzing this dataset?
- Discuss your analysis of the dataset. Include details such as:
  - The variables you looked at
  - Distributions of variables (center and variability)
  - Relationships between variables
  - Visualizations of the dataset
  - Limitations of your analysis and the dataset
- What conclusions can you draw about this dataset? What is your supporting evidence?
Your blog post should showcase your understanding of the material covered in this unit.
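As a rough starting point, the kind of summary described above (center, variability, and a relationship between two variables) might be sketched as follows. This is only an illustration, not a required approach: it uses Python's standard `statistics` module, the variable names `hours_studied` and `exam_score` are hypothetical, and the numbers are made up for the example rather than taken from any real dataset.

```python
import statistics

# Hypothetical toy data, invented purely for illustration.
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]
exam_score = [52.0, 58.0, 65.0, 70.0, 80.0]


def describe(values):
    """Return measures of center and variability for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


summary = describe(exam_score)
r = pearson_r(hours_studied, exam_score)

print(summary)
print(f"correlation between hours studied and exam score: r = {r:.3f}")
```

With your own dataset, you would replace the two lists with columns you have loaded and cleaned. Keep in mind that a strong correlation describes an association in the data and does not by itself show that one variable causes the other.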
Academic Honesty
You are allowed to work with others on this lab, as long as you do not share any code or files! Please refer to the syllabus for more details.
You are allowed to use modules we haven’t talked about in class, as long as you cite them and explain in your blog post how and why you used them.