Statistics, and chatter about statistics, is everywhere in the popular media as well as in many work and play situations (such as sports). However, understanding and using statistics properly requires us to start essentially from zero and slowly build the vocabulary and tools we need to grapple with the subject. While its roots are in math, statistics became a separate profession in the latter part of the 20th century. [Computer science underwent a similar split from math over the course of the last 40 years or so.]
Statistics can broadly be divided into two branches: descriptive and inferential. For most of us, statistics means descriptive statistics: easily computed numbers such as the mean and standard deviation, and an array of graphs used to display the distribution of the data, such as pie charts, bar graphs, and line graphs. From science classes, many of us are familiar with scatter plots, which display a relation (or lack of one) between two variables, such as the force applied to a spring and the resulting stretch. The more difficult, predictive branch, inferential statistics, requires a fairly extensive study of probability, the branch of math that provides the bridge to this separate discipline called statistics.
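To make the descriptive side concrete, here is a minimal sketch in Python of the quantities mentioned above: a mean, a sample standard deviation, and the slope of the straight line that best fits a force-versus-stretch scatter plot. The measurements are made-up illustrative numbers, not real lab data, and the function names are my own; the same calculations can be done in Excel or R.

```python
import math

# Hypothetical spring measurements (illustrative values only, not real data).
forces_newtons = [1.0, 2.0, 3.0, 4.0, 5.0]
stretches_cm = [2.1, 3.9, 6.2, 7.8, 10.0]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def sample_std_dev(values):
    """Sample standard deviation, using n - 1 in the denominator."""
    m = mean(values)
    squared_deviations = sum((v - m) ** 2 for v in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (x, y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


mean_stretch = mean(stretches_cm)
sd_stretch = sample_std_dev(stretches_cm)
slope = least_squares_slope(forces_newtons, stretches_cm)

print(f"mean stretch: {mean_stretch:.2f} cm")
print(f"sample standard deviation: {sd_stretch:.2f} cm")
print(f"slope: {slope:.2f} cm per newton")
```

Everything here only summarizes the data in hand. Asking how far such a slope can be trusted for springs we have not measured is an inferential question, and that is where probability comes in.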
A glance at the schedule shows that sessions 1-20 are taken up with probability, the mathematical foundation we are building so that you can properly study inferential statistics in the latter part of this course as well as in subsequent courses that you may take (e.g., Probability and Statistics II). Along the way, we will make use of many aspects of descriptive statistics, largely through software such as Microsoft Excel and the programming language R. I encourage their full use for in-class work and homework and will make the class as hands-on as possible.
A word about resources. A quick look on Amazon shows that the textbook retails for $120 new. You can pick up a used copy of the 4th edition for much less. If someone lends me their 4th edition, I would be happy to check whether the section numbers have changed, although the problem numbers may have changed slightly as well, and that is something you will have to deal with on your own. Probability and statistics are such standard topics that there are many free online resources, including videos and archived course materials. I have decided to focus on the Khan Academy videos (which will be at a somewhat low level for us) and a Harvard course (Blitzstein), which we will use mainly for probability. In addition, I will make use of lecture notes (Orloff) and slides (Sheffield) from two MIT courses, which are at slightly low and slightly high levels for us, respectively. There is also an online statistics textbook/course at a level that is low for us, but those of you who like to work in an integrated online environment (text, video, exercises, immediate feedback, etc.) will like it. Our official textbook has a student solution manual available, which I highly recommend for students who may need an extra boost to reach a passing score. Exam questions will be heavily based on homework. Finally, there are two probability textbook PDFs (Grinstead and Snell, and Ross) that you can use to supplement the required text and that I will draw on myself for additional materials and problems.