Statistics & Probability | Instructor: Suman Ganguli

# Category: Assignment Instructions (Page 1 of 2)

• summary statistics: mean, median, max, min, standard deviation
• frequency table and frequency histogram
• optional: time series plot (data over time)
• a short paragraph describing your project:
    • background: what variable you chose, why you were interested in that variable, and your method for recording the data;
    • some comments on the summary statistics, including any patterns you notice in the frequency distribution and/or the time series plot
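If you would like to check your spreadsheet results in code, here is a minimal Python sketch of the summary statistics and frequency table listed above. It is only an illustration, not a required part of the project: the variable name `hours_slept` and its ten values are made up, and it assumes you want the sample standard deviation (the version that divides by n - 1).

```python
import statistics
from collections import Counter

# Hypothetical data: hours slept on 10 consecutive days (made-up values).
hours_slept = [7, 6, 8, 7, 5, 7, 9, 6, 7, 8]

# Summary statistics named in the project requirements.
summary = {
    "mean": statistics.mean(hours_slept),
    "median": statistics.median(hours_slept),
    "max": max(hours_slept),
    "min": min(hours_slept),
    "stdev": statistics.stdev(hours_slept),  # sample standard deviation (n - 1)
}

# Frequency table: how many days each value occurred.
frequency = Counter(hours_slept)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# Print the frequency table with a simple text histogram beside it.
print("value  count")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>5}  {'*' * frequency[value]}")
```

The row of asterisks next to each count is a rough text version of the frequency histogram; a spreadsheet chart will look nicer, but the counts should match.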

You can consult my Personal Data Project spreadsheet and use it as a model.

Please visit and explore the Gapminder website, which I will show in class after spring break:

• the homepage Gapminder.org has links to various features on the site
• the Tools page has an interactive scatterplot tool, which I will also show in class

Gapminder has a LOT of data available for download, so it is a very good source for project topics and datasets. They provide datasets for 519 (!) different “indicators,” listed alphabetically, covering everything from “Adults with HIV (%, age 15-49)” to “Yearly CO2 emissions (1000 tonnes).”

Here is a scatterplot I will show in class, titled “Wealth & Health of Nations,” which plots life expectancy (a measure of a country’s health) against GDP per capita (a measure of its wealth):

Gapminder actually presents this as a time-lapse animation of scatterplots, showing how the paired data set evolved over the past 200 years.

(In fact, they produced a video called “200 years that changed the world,” in which Hans Rosling, the medical doctor and statistician who created Gapminder, provides commentary on this time-lapse data. Rosling became widely known through his TED talks. His first one, from 2006, is titled “The best stats you’ve ever seen,” and it’s worth watching!)

For this project, you will collect and analyze data regarding some “personal metric” of your choosing. This project will count as 5% of your course grade.

Choose something you’re interested in measuring about your daily life. We will discuss some examples in class this Wednesday (and we will post some ideas in the comments below).

You can get some ideas by searching the web for “quantified self” or “self-tracking.” In fact, there is a recent MIT Press book titled Self-Tracking, which has this in its description:

People keep track. In the eighteenth century, Benjamin Franklin kept charts of time spent and virtues lived up to. Today, people use technology to self-track: hours slept, steps taken, calories consumed, medications administered. Ninety million wearable sensors were shipped in 2014 to help us gather data about our lives. This book examines how people record, analyze, and reflect on this data, looking at the tools they use and the communities they become part of.

https://mitpress.mit.edu/books/self-tracking

Data collection:

After you have chosen your personal variable, start recording your data on a (more or less) daily basis:

• Set up a spreadsheet with columns for “Date” and “[Variable name]”; you can also include a third column for “Notes.”