Monthly Archives: June 2011


Statistics, and chatter about statistics, are prevalent in the popular media and in many work and play situations (such as sports). However, understanding and using statistics properly requires us to start essentially from zero and slowly build the vocabulary and tools we need to grapple with the subject. While its roots are in math, statistics became a separate profession in the latter part of the 20th century. [Computer science underwent a similar split from math over the course of the last 40 years or so.]

Statistics can broadly be divided into two branches, descriptive and inferential. For most of us, statistics means descriptive statistics: easily computed numbers such as the mean and standard deviation, and an array of graphs used to display the distribution of the data, such as pie charts, bar graphs, and line graphs. From science classes, many of us are familiar with scatter plots, which are used to display a relationship (or lack of one) between two variables, such as the force on a spring and its stretch. We will spend much of the first three weeks on descriptive statistics, during which we will get a heavy dose of Microsoft Excel and an introduction to the programming language R. I encourage full use of these two tools for in-class work as well as homework. The more difficult, predictive branch, inferential statistics, requires us first to learn some probability, which is a branch of math. We will study probability for about 6 weeks. With probability under our belt, we will then devote the last 4 weeks of class to inferential statistics.
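To make "easily computed numbers" concrete, here is a minimal sketch of the two descriptive statistics named above, the mean and the standard deviation. It is written in Python purely for illustration; the data values are made up (hypothetical spring-stretch measurements), and the same quantities can be computed in R with `mean()` and `sd()` or in Excel with `AVERAGE` and `STDEV.S`.

```python
import statistics

# Hypothetical data (an assumption for illustration only):
# stretch of a spring, in centimeters, measured in five trials.
stretch_cm = [2.1, 2.4, 1.9, 2.6, 2.0]

# Mean: the sum of the values divided by how many there are.
mean = statistics.mean(stretch_cm)

# Sample standard deviation: the typical distance of a value from the mean,
# computed with n - 1 in the denominator.
sd = statistics.stdev(stretch_cm)

print(f"mean = {mean:.2f} cm, standard deviation = {sd:.2f} cm")
```

Note that `statistics.stdev` gives the sample standard deviation; `statistics.pstdev` would divide by n instead, which is the population version.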

A word about resources. A quick look on Amazon shows that the textbook retails for $80 new and $20 used. In the past, students have been able to find PDFs of the book as well. Probability and statistics are such standard topics that there are many free online resources, including videos and archived course materials. I have decided to focus on the Khan Academy videos (which will be at a somewhat low level for us). In addition, I will make use of lecture notes (Orloff) from an MIT course, which is at a slightly high level for us. There is also an online statistics textbook/course at a level that is low for us, but those of you who like to work in an integrated online environment (text, video, exercises, immediate feedback, etc.) will like it. Exam questions will be heavily based on homework. There are two probability textbook PDFs (Grinstead and Snell, and Ross) that you can use to supplement the required text and that I myself will draw on for additional materials and problems.