Quiz #2 / “HW4-Paired Data”

We will have a quiz (Quiz #2) tomorrow (Wednesday, Feb 26). The quiz will be a simple exercise: generating a scatterplot and calculating the correlation coefficient (using the spreadsheet command =correl) for a given paired data set.

To prepare for the quiz, review the class outline on those topics and also review the exercises from “HW4-Paired Data” on scatterplots and the correlation coefficient (exercises #6, 9, 10, 13, 14, 19, 22):

  • you can use the built-in spreadsheet function =correl to calculate the correlation coefficient for #6, 19, and 22
  • #19 and #22 ask for additional statistics related to linear regression; those won’t be covered on tomorrow’s quiz
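
If you want to check a =correl result by hand, here is a minimal Python sketch of the same calculation (the Pearson correlation coefficient). The function name and the paired data are made up for illustration; they are not taken from the homework or the quiz.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up paired data, for illustration only
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
print(round(r, 4))  # 0.7746
```

Entering the same two columns in a spreadsheet and applying =correl to them should give the same value, up to rounding.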

Here are additional notes and hints on “HW4-Paired Data” (which is due Mon March 2):

  • #1-2, 5 (review of equations of lines, independent/dependent variables)
    • recall that if we have y given as a function of x, we call x the independent variable, and y the dependent variable
    • in the context of linear regression, where we fit a linear function (or “linear model”) y = α + βx that seeks to explain the y-variable in terms of the x-variable, x is sometimes called the explanatory or input variable, and y is called the response or output variable
  • #3, 4, 20, 21, 22 (linear regression)
    • for #3 and 22, use the built-in spreadsheet functions =slope(y_data, x_data) and =intercept(y_data, x_data) to find the “least squares line” (i.e., the linear regression line y = α + βx, where α is the y-intercept and β is the slope)
  • #7, 8, 17, 19 ask about the “coefficient of determination”
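
As a rough illustration of what =slope and =intercept compute, here is a short Python sketch that finds the least squares line y = α + βx and then the coefficient of determination for that line. The function names and the data are hypothetical, chosen only for this example. For a least squares line, the coefficient of determination equals the square of the correlation coefficient.

```python
def least_squares(xs, ys):
    """Return (alpha, beta) for the least squares line y = alpha + beta*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx                  # slope
    alpha = mean_y - beta * mean_x    # y-intercept
    return alpha, beta

def r_squared(xs, ys, alpha, beta):
    """Coefficient of determination: 1 - (residual SS) / (total SS)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up paired data, for illustration only
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

alpha, beta = least_squares(x_data, y_data)
r2 = r_squared(x_data, y_data, alpha, beta)
print(round(alpha, 4), round(beta, 4), round(r2, 4))  # 2.2 0.6 0.6
```

Note the argument order in the spreadsheet functions: the y-data comes first, then the x-data.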

Quiz #1: Take-home quiz

Quiz #1 is a take-home quiz that was handed out in class today. (If you weren’t in class, or need to print another copy, you can find the pdf of the quiz under Files.)

Additionally, the dataset for the quiz can be accessed at

https://docs.google.com/spreadsheets/d/1VGJQVJmQY6xnTuW5EwQTcCJU5bWo9ajUtt2Y0i8Fdpc/edit?usp=sharing

The quiz is due at the beginning of class on Wednesday Feb 19 (which is our next class meeting, since the college is closed on Wed Feb 12 and Mon Feb 17).

The quiz is mainly a review of the basic statistics, graphs and spreadsheet commands we have covered so far. Reviewing the spreadsheets we worked on in class may be helpful.

Google Spreadsheet: Blood Cholesterol Example

Here is the Google spreadsheet we created in class yesterday (Mon Feb 3), using the blood cholesterol data from the textbook example (Ross, pp. 32-34):

https://docs.google.com/spreadsheets/d/1D8juK7QANNTCzAZeoA1b84AgmUm-Su8chms8B-rdphQ/edit?usp=sharing

We will continue working with this spreadsheet/data in class tomorrow.

Commute Time Project

(You can find a pdf of this Commute Time Project description in Files.)

Project #1: Commute Time Statistics
Due Date: Friday, May 15

For this project, you will collect and analyze data regarding how long it takes you to commute to campus. This project will count as 5% of your course grade.

Data collection: Each time you commute to campus this semester, record how long your commute takes:

  • Set up a spreadsheet with columns for “Date” and “Commute time”; you can also include an optional third column for “Notes.”
  • Each time you commute to campus, make a note of what time you start your commute and what time you arrive (or just use a stopwatch on your phone). Then enter the data in your spreadsheet.
    • If you are using Google Sheets you can record this data immediately if you install the Google Sheets app on your phone. Alternatively, write down the data, and later transfer it to your spreadsheet.
  • Use the optional “Notes” column to record information that may be useful later when you analyze your commute times. E.g., if you use different commute routes you may want to record which route you used; if your commute takes much longer than usual, you may want to record why (subway delay, stops along the way, etc).
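
If you record start and arrival times rather than using a stopwatch, the commute time is just the difference between the two. Here is a small Python sketch of that calculation; the function name, the time format, and the sample times are assumptions for illustration, and it assumes both times fall on the same day.

```python
from datetime import datetime

def commute_minutes(start, arrive, fmt="%H:%M"):
    """Minutes between two same-day clock times given as 24-hour 'HH:MM' strings."""
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(arrive, fmt)
    return (t1 - t0).total_seconds() / 60

# Made-up example: left at 8:15, arrived at 9:02
minutes = commute_minutes("08:15", "09:02")
print(minutes)  # 47.0
```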

Data analysis: At the end of the semester you will:

  • use your spreadsheet to create a frequency table and histogram using your data, and compute the standard summary statistics (mean, median, variance, standard deviation);
  • briefly describe (in 1-2 paragraphs) the distribution and analyze the summary statistics. Further details (and an example) on how to describe the distribution and analyze the summary statistics will be discussed in class over the course of the semester.
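
To give a sense of what the analysis step involves, here is a minimal Python sketch that computes the summary statistics and a simple frequency table with a text histogram. The commute times and the 10-minute bin width are made up for illustration; your own data and choice of bins will differ. The variance and standard deviation here are the sample versions (dividing by n - 1), which I am assuming matches the convention used in class.

```python
import statistics
from collections import Counter

# Made-up commute times in minutes, for illustration only
commute_times = [42, 38, 55, 47, 40, 61, 44, 39, 48, 52]

mean = statistics.mean(commute_times)
median = statistics.median(commute_times)
variance = statistics.variance(commute_times)  # sample variance (n - 1)
std_dev = statistics.stdev(commute_times)      # sample standard deviation

print(f"mean={mean:.1f} median={median:.1f} "
      f"variance={variance:.2f} std dev={std_dev:.2f}")

# Frequency table: group times into bins of equal width (assumed 10 minutes)
bin_width = 10
freq = Counter((t // bin_width) * bin_width for t in commute_times)

# Text histogram: one star per observation in each bin
for lower in sorted(freq):
    print(f"{lower}-{lower + bin_width - 1}: {'*' * freq[lower]}")
```

In a spreadsheet, the same statistics come from the built-in functions for mean, median, variance, and standard deviation that we have used in class.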

Please set up a spreadsheet now, and start recording your commute times!