## Projects: Commute Time & Linear Regression

See below for instructions for the two projects. Both are due this Friday (Dec 18), but the linear regression project is optional and will be counted as extra credit:

• Linear regression project: this project will count as extra credit. Again, you can either print out a hardcopy to hand in, or email me your spreadsheet. Your project should include the following:
• a scatterplot of your paired data set, including the “trendline” (i.e., the linear regression line)
• the correlation coefficient and the linear regression parameters (slope and y-intercept)
• a brief written description (1-2 paragraphs) about the scatterplot and statistics (e.g., how strongly are the variables correlated? is the linear regression line a good model? are there any outliers?)
• you can refer to the solutions to the exam questions about paired data sets and linear regression for examples of how to write about these topics (Exam #1, Question #4 & Exam #2, Question #5)
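
For reference, here is a sketch of the standard formulas behind the statistics listed above. The notation is my own shorthand and may differ from what we used in class: the paired data are $(x_i, y_i)$ for $i = 1, \dots, n$, the means are $\bar{x}$ and $\bar{y}$, $r$ is the correlation coefficient, $b$ is the slope, and $a$ is the y-intercept of the trendline $\hat{y} = a + bx$.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
$$

A value of $r$ close to $+1$ or $-1$ indicates a strong linear relationship, while a value close to $0$ indicates a weak one.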

## OpenLab Assignment: Post your linear regression project topic (part 1)

As I discussed in class and posted here last week, you should choose a topic for your linear regression project today.

To encourage you to do this, I’m making this an OpenLab assignment; completing this simple assignment will earn you one point towards the participation component of your course grade:

• decide whether you want to work on this project individually or together with a partner
• decide on a topic (broadly speaking) that you’re interested in studying statistically
• some examples: economics, sports, public health, law/crime, business, finance, entertainment (movies, music, etc.), demographics (population, race, gender, etc.), politics/elections, transit/transportation, weather, environment, energy, …
• post your topic in the comments below (if you are working with a partner, only one of you has to post, but then mention in the comment who you’re working with)
• this should just be one or two sentences. e.g., “I would like to work on a dataset related to the environment and energy consumption.”

This assignment is due this Friday (November 29). Late submissions will receive partial credit. (But it should only take 10 minutes to complete, so just get it done today!)

There will be a “part 2” to this assignment next week, when I will ask you to decide on a specific topic, e.g., “I will analyze a paired dataset regarding CO2 emissions and wealth (GDP per capita), at the country-level.” You can start thinking about that over the long weekend.

Here are some websites you can browse for ideas for specific topics:

## Linear Regression Project

Yesterday in class, I introduced the second project for the semester. (Please remember to keep collecting your commute time data for the Commute Time Project!)

This project will involve:

• finding a paired data set on a topic you’re interested in
• creating a scatterplot with the linear regression trendline
• computing the correlation coefficient and linear regression parameters
• writing up a short (1-2 pp.) discussion of the data and your findings.
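
If you are curious what the spreadsheet is doing behind the scenes, the computation step above can be sketched in a few lines of Python. This is only an illustration, not a required part of the project: the function name `linear_regression_summary` is hypothetical, and the five data points are made up for the example.

```python
from math import sqrt


def linear_regression_summary(xs, ys):
    """Return (r, slope, intercept) for a paired data set.

    Hypothetical helper for illustration; xs and ys are equal-length
    lists of numbers.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("each variable must take at least two different values")

    r = sxy / sqrt(sxx * syy)              # correlation coefficient
    slope = sxy / sxx                      # slope of the trendline
    intercept = mean_y - slope * mean_x    # y-intercept of the trendline
    return r, slope, intercept


# Made-up paired data, purely for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r, slope, intercept = linear_regression_summary(x, y)
print(f"r = {r:.3f}, slope = {slope:.3f}, y-intercept = {intercept:.3f}")
```

In a spreadsheet, the built-in functions `CORREL`, `SLOPE`, and `INTERCEPT` give the same three numbers, so you do not need to write any code for the project.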

Here is a timeline for the first steps for this project:

• by Mon Nov 25: decide whether you want to work on this project individually or together with a partner
• if the latter, find a research partner in the class!
• by Wed Nov 27: decide on a topic (broadly speaking) that you’re interested in studying statistically
• by Wed Dec 4: decide on a specific topic & find an appropriate paired data set (we will spend some class time on this during which I will help you individually!)

## Commute Time Project

You can find the pdf of the Commute Times Project description I handed out yesterday in Files.

Also, here is the link to the spreadsheet I showed at the end of class: