Interpreting the linear regression parameters

Yesterday in class I discussed how we can interpret the linear regression parameters, i.e., the y-intercept a (“alpha”) and the slope b (“beta”), which together define the linear regression line (or what we also call a “linear model”):

y = a + bx

See below for a summary (you can also take a look at the Khan Academy videos “Interpreting y-intercept in regression” and “Interpreting slope in regression”):

  • Recall that the linear model is used to predict an “output” value y for a given “input” value x
  • In terms of the line, the y-intercept a is the y-value where the line intersects the y-axis, i.e., when x = 0.  Thus, in terms of the linear regression model, the y-intercept a is the predicted value of the dependent variable y when the independent variable x is 0.
  • In terms of the line, the slope b is how much y increases or decreases if x is increased by 1.  Thus, in terms of the linear regression model, the slope b is the predicted change in the dependent variable y if the independent variable x is increased by 1.
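Both interpretations follow directly from the equation y = a + bx. As a short worked derivation (writing \hat{y}(x) for the predicted value at input x, which is notation introduced here only for illustration):

```latex
\hat{y}(0) = a + b \cdot 0 = a
\qquad\text{and}\qquad
\hat{y}(x+1) - \hat{y}(x) = \bigl[a + b(x+1)\bigr] - \bigl[a + bx\bigr] = b
```

So the prediction at x = 0 is exactly a, and increasing x by 1 changes the prediction by exactly b, no matter which x we start from.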

For example, this was an exercise on the “HW4-Paired Data” WebWork set:

[Figure: paired data set from the WebWork “HW4-Paired Data” exercise]

The results of linear regression for this data set (i.e., regressing the dependent variable y (final grade) on the independent variable x (verbal score)) yield the linear regression parameters:

  • y-intercept a ≈ 99.1 ; this can be interpreted as the predicted final grade of a student who gets a verbal score of 0
  • slope b ≈ -0.333 ; this can be interpreted as saying that the predicted final grade decreases by 0.333 for each 1-point increase in verbal score
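To check these two interpretations numerically, here is a minimal Python sketch that plugs the rounded parameters above into y = a + bx. The function name predict_final_grade and the verbal scores 50 and 51 are hypothetical choices made only for illustration; they are not taken from the WebWork data set.

```python
# Rounded regression parameters quoted above for the WebWork exercise.
A = 99.1     # y-intercept a
B = -0.333   # slope b


def predict_final_grade(verbal_score):
    """Predicted final grade from the linear model y = a + b*x."""
    return A + B * verbal_score


# Interpretation of a: the predicted final grade at a verbal score of 0.
at_zero = predict_final_grade(0)

# Interpretation of b: the change in the prediction when the verbal score
# goes up by 1 (50 -> 51 is an arbitrary, hypothetical starting point).
change_per_point = predict_final_grade(51) - predict_final_grade(50)

print(at_zero)
print(round(change_per_point, 3))
```

The first printed value matches a, and the second matches b up to rounding, whichever starting verbal score is used.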

“The Aging of America”: Frequency Histograms For US Population Age Distributions

Here are some examples of frequency histograms showing the age distributions of the US population at different times in history (and projected into the future):

  • From the New York Times: “The Aging of America” (Published: February 5, 2011)
  • A similar post appeared on the Washington Post’s Wonkblog (published: August 13, 2013), which included this: “This is a mesmerizing little animation created by Bill McBride of Calculated Risk. It shows the distribution of the U.S. population by age over time, starting at 1900 and ending with Census Bureau forecasts between now and 2060.”

What do you notice about how the distributions evolve over time? Click through to either the Calculated Risk blog post on which this animation first appeared or to the Washington Post link to read some discussion.

Also, here is a related set of histograms that was featured in the NYT Business section in May 2014, as part of an article titled “Younger Turn for a Graying Nation”:

[Figure: histograms from the NYT article “Younger Turn for a Graying Nation”]

That was an installment of a weekly column in the NYT Business section titled “Off the Charts,” which discussed a graph and the underlying data.