
Example: Life Expectancy vs Household Income in the US

A major study of income and life expectancy in the United States was published last week in the Journal of the American Medical Association, titled “The Association Between Income and Life Expectancy in the United States, 2001-2014.” You can read the entire paper online (or download a PDF; see below), but the results were also reported by various news sources, including The New York Times, which titled its post “The Rich Live Longer Everywhere. For the Poor, Geography Matters.” (There’s also a related interactive map that lets you look at life expectancies in your area: “Where the Poor Live Longer: How Your Area Compares.” Note that this is county-by-county data.)

The NYT page has a number of graphs and tables, including this scatterplot of life expectancy vs household income for men and women:

Life Expectancy vs Household Income

Below is the full paper, whose lead authors are two Harvard economists (Raj Chetty and David Cutler). Note the “Design and Setting” paragraph, which describes where the data was obtained:

Income data for the US population were obtained from 1.4 billion deidentified tax records between 1999 and 2014. Mortality data were obtained from Social Security Administration death records. These data were used to estimate race- and ethnicity-adjusted life expectancy at 40 years of age by household income percentile, sex, and geographic area, and to evaluate factors associated with differences in life expectancy.

Download (PDF, 2.16MB)

Example/Project Ideas: GapMinder’s “Wealth & Health of Nations”

GapMinder is a website I showed earlier in the semester when we discussed scatterplots. It has a wealth of data available for download, which makes it a very good source for project topics and datasets. They provide datasets for 519 (!) different “indicators,” listed alphabetically: everything from “Adults with HIV (%, age 15-49)” to “Yearly CO2 emissions (1000 tonnes)”!

Browse through the list to get some project ideas. (Clicking under the “Download” column downloads the data in an Excel file; clicking under “View” opens a Google spreadsheet with the dataset.)
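Since each indicator comes as its own dataset, making a scatterplot of two indicators means first pairing them up by country for a single year. Here is a minimal sketch of that pairing step in Python. It assumes each indicator has been saved as a CSV file with one row per country and one column per year; that layout, the function names, the country names, and all of the numbers below are made-up placeholders for illustration, not actual GapMinder data.

```python
import csv
import io

# Hypothetical stand-ins for two downloaded indicator files saved as CSV.
# The country names and every number here are made-up placeholders,
# NOT real GapMinder data. Assumed layout: one row per country, one column per year.
GDP_CSV = """country,2000,2010
Aland,1000,1500
Borduria,20000,25000
Syldavia,5000,
"""

LIFE_CSV = """country,2000,2010
Aland,55.0,60.0
Borduria,78.0,80.0
Freedonia,70.0,72.0
"""


def read_indicator(text, year):
    """Return {country: value} for one year column, skipping blank cells."""
    values = {}
    for row in csv.DictReader(io.StringIO(text)):
        cell = (row.get(year) or "").strip()
        if cell:
            values[row["country"]] = float(cell)
    return values


def paired_points(x_text, y_text, year):
    """Pair two indicators by country, keeping only countries present in both."""
    xs = read_indicator(x_text, year)
    ys = read_indicator(y_text, year)
    return [(c, xs[c], ys[c]) for c in sorted(xs.keys() & ys.keys())]


# Each tuple is one point on the scatterplot: (country, x value, y value).
points = paired_points(GDP_CSV, LIFE_CSV, "2010")
for country, gdp, life in points:
    print(country, gdp, life)
```

Countries that are missing from either table, or that have a blank cell for the chosen year, are simply dropped, so the resulting list contains only complete (x, y) pairs. For a real project you would read the downloaded files instead of the inline strings, and check how many countries were dropped before drawing conclusions.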

The scatterplot I showed in class earlier in the semester depicted the “Wealth & Health of Nations,” with life expectancy (a measure of a country’s health) plotted against GDP per capita (a measure of its wealth):

gapminder

Recall that GapMinder plays a time-lapse movie of such scatterplots, showing how this paired data set evolved over the past 200 years.

(In fact, they produced a video called “200 years that changed the world,” in which Hans Rosling, the medical doctor and statistician who created GapMinder, provides commentary on this time-lapse data. Rosling became widely known through his TED talks. His first one, from 2006, is titled “The best stats you’ve ever seen”; it’s worth watching!)