Math 1372 – Statistics with Probability – Fall 2014

  • Project Outline/Template
  • #19413

    Suman Ganguli
    Participant

    I. Introduction (3-4 paragraphs)

    introduce/mention the following
    –the topic
    –data used (if any)
    –statistical analyses you did
    –results/conclusions

    II. Background/Data (~2 pages)

    –more background on the topic, why study it
    –more on the data (sources, how it was collected)
    –possibly include the data in a table (here or in an Appendix)

    III. Statistical Analyses/Results (~2 pages)

    –describe the statistics used (e.g., linear regression)
    –results (graphs: frequency histogram(s), scatterplot with regression line)
    –report statistics: mean, standard deviation, correlation coefficient

    IV. Discussion (1 page)

    –discuss and interpret statistical results from previous section

    V. Conclusion (a couple paragraphs)

    –conclusion + topics for further study

    VI. Bibliography

    #22810

    Suman Ganguli
    Participant

    Some of you have asked for a sample project to help guide you. Here is a sample expanded project proposal. Assuming I make the time over the next week or two, I will write up a paper based on it:

    For this project, we will examine physical activity and sleep data collected using a wearable personal fitness tracker. Specifically, we will download and analyze data collected by a Jawbone UP (https://jawbone.com/up) over a period of approximately four months. Like other fitness trackers on the market (e.g., Nike’s Fuel Band http://www.nike.com/FuelBand or the FitBit http://www.fitbit.com/), the Jawbone UP acts as an electronic pedometer and can also track the amount of time slept.

    We will analyze this step and sleep data individually, in the form of frequency histograms, aggregated at the daily, weekly, and monthly time intervals. We will also analyze whether there is any correlation between physical activity and sleep, via a scatterplot and calculation of a correlation coefficient.

    Finally, we will comment on the growing trend of fitness tracking, and more generally on the area of “personal analytics” (or what is sometimes termed “the quantified self”; cf. http://en.wikipedia.org/wiki/Quantified_Self and http://www.ted.com/talks/gary_wolf_the_quantified_self).

    #24377

    Suman Ganguli
    Participant

    I suggested last week that you write a detailed outline for your project to help guide you. To give you an example, below is an outline for my sample project.

    Note that the introduction is basically just the expanded project proposal I wrote above. The rest is bullet points, organized according to the template outline I gave you above.

    ——————–

    I. Introduction

    For this project, we have examined physical activity and sleep data collected using a wearable personal fitness tracker. Specifically, we have downloaded and analyzed daily data collected by a Jawbone UP over a period of approximately four months.

    We have analyzed daily step and sleep data individually, in the form of summary statistics (means and standard deviations) and frequency histograms. We have also performed some basic time series analyses on these data, looking at time series plots of the daily data as well as moving averages of the daily data.

    We have also analyzed whether there is any correlation between physical activity and sleep, via a scatterplot and calculation of a correlation coefficient.

    [insert additional text about interpretation of analyses, conclusions]

    Finally, we comment on the growing trend of fitness tracking, and more generally on the area of “personal analytics” (or what is sometimes termed “the quantified self”).

    II. Background/Data

    background: wearable fitness trackers
    * growing popularity of wearable fitness trackers: Jawbone UP, FitBit, Nike Fuel Band; incorporated into the new Apple Watch
    * what data do such fitness trackers collect; what might they collect in the future (physical activity: steps, workouts, etc.; sleep; food/drink; heart rate; weight…)
    * part of “quantified self” movement; potential applications to/benefits for health care

    data:
    * Jawbone’s UP app provides basic data to user: daily, weekly, monthly totals (include screenshots?)
    * allows user to download complete daily data as csv file via website
    * imported csv file into Google spreadsheet
    * over 45 data fields, although no documentation is provided
    * approximately half related to diet: calories, calcium, carbohydrates, cholesterol, fat, fiber, iron, etc.
    * handful of user attributes: age, gender, body fat, height, weight, goal weight
    * handful related to physical activity: active time, inactive time, calories burned, number of steps, distance
    * handful related to sleep: total duration, duration of “light sleep”, duration of “sound sleep”
    * for this project we analyze only two of these: number of steps and total sleep duration

    preliminary data wrangling:
    * converted daily date from text string to date using string parsing functions
    * converted sleep duration to number of hours: the data appear to be given in seconds. Although no documentation is provided, we concluded this by comparing the numbers in the data file with the information shown in the app, which gives daily sleep data in hours. Hence, we divide that column by 3600 to get the duration of sleep in hours.
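
    The two wrangling steps above could be scripted roughly as follows. This is only a sketch: the column names (date, steps, sleep_seconds), the YYYYMMDD date format, and the sample rows are assumptions made up for illustration, since the export file comes with no documentation.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows for illustration only. The column names and the
# YYYYMMDD date format are assumptions, not the documented export layout.
SAMPLE_CSV = """date,steps,sleep_seconds
20140901,8450,25200
20140902,10320,27000
20140903,6120,21600
"""


def load_daily_rows(csv_text):
    """Parse CSV text into dicts with a real date and sleep measured in hours."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            # Convert the text string into a date object.
            "date": datetime.strptime(raw["date"], "%Y%m%d").date(),
            "steps": int(raw["steps"]),
            # Sleep appears to be recorded in seconds: 3600 seconds per hour.
            "sleep_hours": int(raw["sleep_seconds"]) / 3600,
        })
    return rows


rows = load_daily_rows(SAMPLE_CSV)
for row in rows:
    print(row["date"].isoformat(), row["steps"], round(row["sleep_hours"], 2))
```

    With a real export, you would read the file with open(...) instead of the in-memory sample, and adjust the column names and date format to match what the file actually contains.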

    III. Statistical Analyses/Results (~2 pages)

    basic descriptive/summary statistics:
    * summary statistics for step and sleep data
    * frequency histograms
    * comment on values of summary statistics and shape of histograms
    * check whether the data follow the “empirical rule”: calculate what % of the data lies within 1 SD of the mean; show mean and mean +/- 1 SD on histograms?
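
    As a rough sketch of the empirical-rule check in the last bullet: for approximately normal data, about 68% of values should lie within 1 SD of the mean. The step counts below are made up for illustration, and the function name within_one_sd is just a label chosen here.

```python
import statistics


def within_one_sd(values):
    """Return (mean, sample SD, share of values within 1 SD of the mean)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    inside = [v for v in values if mean - sd <= v <= mean + sd]
    return mean, sd, len(inside) / len(values)


# Made-up daily step counts, for illustration only.
daily_steps = [8450, 10320, 6120, 9800, 7600, 11050, 8900, 5400, 9300, 8700]

mean, sd, share = within_one_sd(daily_steps)
print(f"mean = {mean:.0f}, SD = {sd:.0f}, within 1 SD = {share:.0%}")
```

    The same function can be applied to the sleep data. If the share is far from 68%, that is worth commenting on alongside the shape of the histogram.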

    time series:
    * daily time series for step and sleep data
    * explain moving averages
    * show graphs of 15-day and 30-day moving averages
    * interpret patterns in time series–reasons for increased/decreased physical activity or sleep during certain time periods?
    * additional time series analyses?
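
    To make the moving-average idea concrete, here is a minimal sketch of a trailing moving average: each output value is the mean of the current day and the (window - 1) days before it. The short series and 3-day window are made up to keep the example readable; the project itself would use 15-day and 30-day windows.

```python
def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 averages."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages


# Made-up daily values, for illustration only.
daily_values = [1, 2, 3, 4, 5]
smoothed = moving_average(daily_values, window=3)
print(smoothed)
```

    A longer window smooths out more of the day-to-day noise but responds more slowly to real changes, which is why it can help to plot both the 15-day and 30-day versions.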

    scatterplot/correlation:
    * create scatterplot of sleep vs steps
    * explain rationale for considering sleep as the dependent variable and steps as the independent variable
    * calculate correlation coefficient
    * if appropriate (i.e., if a reasonable degree of correlation is found), also do linear regression
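
    The correlation and regression steps above could be computed along these lines. This is a sketch using the standard formulas for the Pearson correlation coefficient and the least-squares line; the function names and the sample numbers are made up for illustration, and the code assumes neither variable is constant.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values, for illustration only: steps (independent), sleep (dependent).
steps = [8450, 10320, 6120, 9800, 7600]
sleep_hours = [7.0, 7.5, 6.0, 7.2, 6.8]

r = pearson_r(steps, sleep_hours)
slope, intercept = fit_line(steps, sleep_hours)
print(f"r = {r:.3f}; sleep = {slope:.5f} * steps + {intercept:.2f}")
```

    The value of r always lies between -1 and 1. The regression line is only worth reporting if r indicates a reasonable degree of linear association.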

    V. Conclusion (a couple paragraphs)

    Conclusions from data analyses

    Compare with UP’s aggregated user data

    Topics for further study:
    * use scripting language (Python or R) to make this sort of analysis quickly repeatable

    * utilize additional fields of data
    * multiple linear regression?

    VI. Bibliography

    Hamilton, J.D. Time Series Analysis
    Jawbone blog: https://jawbone.com/blog/jawbone-up-data-by-city/
    http://en.wikipedia.org/wiki/Quantified_Self
