Yesterday in class, I introduced the second project for the semester. (Please remember to continue collecting your commute time data for that project!)
This project will involve:
- finding a paired data set on a topic you’re interested in
- creating a scatterplot with the linear regression trendline
- computing the correlation coefficient and linear regression parameters
- writing up a short (1-2 page) discussion of the data and your findings.
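If you are curious what the computations in the third step look like, here is a minimal sketch in Python. It is only an illustration: the function name `linear_fit` and the numbers are made up for this example, and you may well do these calculations with a spreadsheet or calculator instead. It computes the correlation coefficient r and the slope and intercept of the least-squares trendline for a small paired data set.

```python
import math

def linear_fit(xs, ys):
    """Return (r, slope, intercept) for paired data xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and cross-deviations from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each take more than one distinct value")

    slope = sxy / sxx                     # least-squares slope
    intercept = mean_y - slope * mean_x   # trendline passes through (mean_x, mean_y)
    r = sxy / math.sqrt(sxx * syy)        # Pearson correlation coefficient
    return r, slope, intercept

# Made-up paired data, purely for illustration (not a real data set)
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r, slope, intercept = linear_fit(x_values, y_values)
print(f"r = {r:.3f}, trendline: y = {slope:.2f}x + {intercept:.2f}")
```

For these made-up numbers, the trendline has slope 0.6 and intercept 2.2, and r is about 0.775. Your own data set will of course give different values.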
Here is a timeline for the first steps for this project:
- by Mon Nov 25: decide whether you want to work on this project individually or together with a partner
  - if the latter, find a research partner in the class!
- by Wed Nov 27: decide on a topic (broadly speaking) that you’re interested in studying statistically
  - some examples: economics, sports, public health, law/crime, business, finance, entertainment (movies, music, etc.), demographics (population, race, gender, etc.), politics/elections, transit/transportation, weather, environment, energy, …
  - here are some websites you can browse for topic ideas: Our World in Data, GapMinder, NYC OpenData, FlowingData, The Guardian’s Datablog, FiveThirtyEight, NYT’s Upshot
- by Wed Dec 4: decide on a specific topic & find an appropriate paired data set (we will spend some class time on this, and I will help you individually during that time!)