(This is in addition to the Datacamp assignments, of course!)
Below you will find an R script that draws a bar plot of the probability distribution function for problem 3.2.2a. I have added comments that describe the problem and explain some of the code.
I have also included the graph, which I exported as a jpeg.
Your assignment is to write a similar script for the probability distribution functions of problems 3.2.1(a and b) and 3.2.2(b). Each problem should have a separate script.
You may do this by editing my script, but make sure that you change everything that needs to be changed. Also make sure that your variable name(s) are good and descriptive.
You should also read the RStudio help page for the barplot function (type ?barplot in the console) and see which features of the graph you can add or change. If you add a feature, put in a comment to describe what you did.
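As an illustration of the kind of features the barplot help page describes, here is a rough sketch that reuses the 3.2.2a distribution from my script further down. The variable names, the title text, the colours, and the y-axis range are my own choices for this example; they are not required for the assignment, and you may prefer different features.

```r
# Distribution from the script for problem 3.2.2a (X = larger of two draws).
larger_number_dist <- c(1/25, 3/25, 5/25, 7/25, 9/25)
names(larger_number_dist) <- as.character(1:5)

# main adds a title; ylim leaves headroom above the tallest bar for labels;
# border sets the outline colour of each bar.
bar_midpoints <- barplot(larger_number_dist,
                         space = 0,
                         main = "Problem 3.2.2a: larger of two draws",
                         xlab = "larger number",
                         ylab = "probability",
                         ylim = c(0, 0.45),
                         col = "lightblue",
                         border = "darkblue")

# barplot() returns the x position of each bar's centre, so text() can
# print each probability just above its bar (pos = 3 means "above").
text(x = bar_midpoints,
     y = larger_number_dist,
     labels = paste0(c(1, 3, 5, 7, 9), "/25"),
     pos = 3)
```

Each added argument has its own comment, which is the style I am asking for when you add a feature.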
You can write the scripts in a word processing program and save them as text files with the extension .r (although your word processor may object to that!), but you will need to run them in RStudio anyway, so it is probably best to do the final editing there. To start a new script, click the green + sign at the upper left of the RStudio window and choose the “R Script” menu item.
Save your scripts with names of the following format:
Lastname_Firstname_problemnumber_Graph.r
Where Lastname = your last name
Firstname = your first name
problemnumber = the number of the problem
For example, my script was saved under the name
Shaver_Sybil_3.2.2a_Graph.r
Also, export the graphs as either jpegs or pdfs, your choice. The “Export” button is at the top of the Plots tab. Save them under the same names as the scripts, but with the extension .jpeg or .pdf instead of .r.
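If you would rather save the graph from the script itself instead of using the Export button, the following is one possible sketch. It is optional and only an alternative to the menu. It uses the pdf() graphics device, and the width and height values are arbitrary choices of mine. The file is written to your current working directory.

```r
# Distribution from the script for problem 3.2.2a.
larger_number_dist <- c(1/25, 3/25, 5/25, 7/25, 9/25)
names(larger_number_dist) <- as.character(1:5)

# Same name as the script, but with the .pdf extension.
graph_file <- "Shaver_Sybil_3.2.2a_Graph.pdf"

# Open a pdf device, draw the plot into it, then close the device.
# Nothing is written to disk until dev.off() is called.
pdf(graph_file, width = 6, height = 4)
barplot(larger_number_dist, space = 0,
        xlab = "larger number", ylab = "probability", col = "blue")
dev.off()
```

While the pdf device is open, the graph will not appear in the Plots tab; it goes straight to the file.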
Post the three scripts and the three graphs in Piazza in a private note to me. This is how you will submit your work.
The scripts and graphs are due by 10 PM Monday the 23rd of April.
Here is my R script and after it is my graph:
# Problem 3.2.2a: Two numbers are selected from the integers 1 through 5, with replacement.
# X represents the larger of the two numbers. This is the pdf for X.
problem3_2_2a_dist <- c(1/25, 3/25, 5/25, 7/25, 9/25)
# The line below adds labels to the bars showing the X value.
# as.character is used because the “names” attribute must be of character type.
# If we wanted to list the numbers, we would have to put them in quotes to make them characters.
names(problem3_2_2a_dist) <- as.character(1:5)
# I could put a comment here to explain the features I have added to this graph.
barplot(problem3_2_2a_dist, space = 0, xlab = "larger number", ylab = "probability", col = "blue")
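Before plotting a distribution, it can be worth checking that the probabilities are right. The sketch below is not part of the script above; it lists all 25 equally likely ordered pairs for problem 3.2.2a, takes the larger number of each pair, and compares the resulting proportions with the values typed into the script. The variable names are my own for this illustration.

```r
# All 25 equally likely ordered pairs (first draw, second draw), with replacement.
draws <- expand.grid(first = 1:5, second = 1:5)

# X is the larger of the two numbers in each pair.
larger <- pmax(draws$first, draws$second)

# Proportion of pairs giving each value of X.
enumerated_dist <- table(larger) / nrow(draws)

# The probabilities entered by hand in the script.
stated_dist <- c(1/25, 3/25, 5/25, 7/25, 9/25)

# Both checks should print TRUE: the two versions agree,
# and the probabilities add up to 1.
print(isTRUE(all.equal(as.numeric(enumerated_dist), stated_dist)))
print(isTRUE(all.equal(sum(stated_dist), 1)))
```

The same kind of check can be adapted to other small sample spaces, as long as every outcome you list is equally likely.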