2nd week post

As we finish our 2nd week of instruction today (T 9/8), I want to emphasize how important it is NOT to rely on the textbook and classwork alone. In particular, the Blitzstein lectures are quite good, if a bit challenging. Note that the numbering I have in Days 1-9 is a bit behind: you should already have watched the video I have labeled as Blitzstein 5 as part of your preparation for today’s class. I should mention that Blitzstein 4 is quite difficult, and you will NOT be able to follow it the first time you see it. However, it is worth seeing several times, and I recommend that you pause it frequently and verify all the calculations (he doesn’t actually SHOW the calculations, just the answers). By the way, I have emphasized tree diagrams in class, and in Blitzstein 5 he introduces them during his Monty Hall presentation, perhaps validating my use of them. However, the best approach is perhaps to use both the formulas AND the tree, especially when you are applying the concepts.
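As a small illustration of using the tree and the formulas together, here is a sketch of the Monty Hall calculation. It is written in Python, which is not one of the course tools, and the setup is an assumption (the standard version of the game, not copied from the lecture): the contestant picks door 1, the car is equally likely to be behind any door, and the host always opens a goat door other than the contestant's, choosing at random when he has a choice. The "formulas" are taken here to be the law of total probability and Bayes' rule.

```python
from fractions import Fraction

# First level of the tree: where the car is (assumed equally likely).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Second level of the tree: P(host opens door 3 | car location),
# given that the contestant has picked door 1 (assumed rules, see above).
host_opens_3 = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}

# Multiply along each branch of the tree.
branch = {door: prior[door] * host_opens_3[door] for door in prior}

# Law of total probability: add up the branches that end in "host opens 3".
p_open_3 = sum(branch.values())

# Bayes' rule: each branch divided by the total.
posterior = {door: branch[door] / p_open_3 for door in prior}

p_win_stay = posterior[1]    # car is behind the door you already picked
p_win_switch = posterior[2]  # car is behind the other unopened door
print("stay:", p_win_stay, "switch:", p_win_switch)
```

Each entry of `branch` is one path through the tree, so the code is doing exactly what you would do by hand: multiply along the branches, add the relevant branches, then divide.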

A second item to note is that the math department has now posted its own tutoring sessions (as opposed to what is available in the tutoring centers). Unfortunately, the sessions are in Midway, but I encourage you to make the special effort to go there. These are run by nearly-graduated students in the applied math and math ed programs. Many of them have taken 2 or 3 statistics classes BEYOND the one you are taking and may know 10 times more advanced statistics than I do (although I have TAUGHT the introductory topics over and over, and so probably know them better than they do).


Statistics, and chatter about statistics, is quite prevalent in the popular media as well as in many work and play situations (such as sports). However, a good understanding and proper use of statistics requires us to start essentially from zero and slowly develop the vocabulary and tools that we need to grapple with the subject. While its roots are in math, statistics became a separate profession in the latter part of the 20th century. [Computer science underwent a similar split from math over the course of the last 40 years or so.]

Statistics can broadly be divided into 2 branches, descriptive and inferential. For most of us, statistics is descriptive: easily computed numbers such as the mean and standard deviation, and an array of graphs used to display the distribution of the data, such as pie charts, bar graphs, and line graphs. From science classes, many of us are familiar with scatter plots, which are used to display a relation (or lack of one) between two variables, such as force and the stretch of a spring. The more difficult, predictive branch of statistics, inferential, requires us to make a somewhat extensive study of probability, which is the branch of math that provides us the bridge to this separate discipline called statistics.
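To make "easily computed numbers" concrete, here is a minimal sketch that computes a mean and a standard deviation. It uses Python's standard `statistics` module purely for illustration (the course itself uses Excel and R), and the spring-stretch measurements are hypothetical numbers made up for this example.

```python
import statistics

# Hypothetical data, made up for illustration: stretch of a spring in cm.
stretch_cm = [2.0, 4.1, 5.9, 8.0, 10.0]

# Mean: the sum of the values divided by how many there are.
mean_stretch = statistics.mean(stretch_cm)

# Sample standard deviation: divides by n - 1 rather than n.
sd_stretch = statistics.stdev(stretch_cm)

print("mean:", round(mean_stretch, 3))
print("standard deviation:", round(sd_stretch, 3))
```

Note that `statistics.stdev` is the sample version; `statistics.pstdev` divides by n instead and gives a slightly smaller number for the same data.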

A glance at the schedule shows that sessions 1-20 are taken up with probability, that mathematical foundation we are creating so that you may properly study inferential statistics in the latter part of this course as well as in subsequent courses that you may take (e.g., Probability and Statistics II). Along the way, we will make use of many aspects of descriptive statistics, largely through the use of software such as Microsoft Excel and the programming language R. I encourage their full use for in-class work and homework and will make the class as hands-on as possible.

A word about resources. A quick look on Amazon shows that the textbook retails for $120 new. You can pick up the 4th edition used for much less. If someone lends me their 4th edition, I would be happy to check whether the section numbers have changed, although the problem numbers may have changed slightly as well, and that is something you will have to deal with on your own. Probability and statistics are such standard topics that there are many free online resources, including videos and archived course materials. I have decided to focus on the Khan Academy videos (which will be at a bit of a low level for us) and a Harvard course (Blitzstein), which we will use mainly for probability. In addition, I will make use of lecture notes (Orloff) and slides (Sheffield) from 2 MIT courses, which are at slightly low and slightly high levels for us, respectively. There is also an online statistics textbook/course at a level which is low for us, but those of you who like to work in an integrated online environment (text, video, exercises, immediate feedback, etc.) will like it. Our official textbook does have a student solution manual available, which I highly recommend for students who may need that extra boost to get a passing score. Exam questions will be heavily based on homework. Finally, there are 2 probability textbook PDFs (Grinstead and Snell, and Ross) that you can use to supplement the required one, and that I myself will draw on for additional materials and problems.