DRAFT - PART 3
    Today we live in the Information Age, in which we understand a great deal about the world around us. Much of this information was determined mathematically by using statistics. When used correctly, statistics reveal trends in what happened in the past and can be useful in predicting what may happen in the future. But as Brooks says in one of his articles, "there are many things big data does poorly." One of the problems is the correct interpretation of data and its reliability. In his example about the use of the words "I," "me," and "mine," Brooks states that confident people use fewer of those words, and vice versa.[] But we can look at this from another perspective. What if "confident" people simply know the psychological effect of those words and, in order to persuade someone, avoid using the inappropriate ones? Data can't register one's intentions, thoughts or moral condition; it can only display visible events and facts.
The article "Use and Misuse of Statistics," presented by Harvard Business School, argues that before we use any data we should know how reliable it is. To accomplish that, we should be clear about the purpose of using the data and what we want to discover. For example, a customer satisfaction survey may be summarized by the arithmetic mean, or average, of a group of ratings, which could equal, let's say, 3.5 on a scale of 1 to 5.[] But in reality it could be that no one gave the product a rating of 3.5. Instead, the responses could cluster into a group of very satisfied customers, who scored it a 5, and a group of unsatisfied customers, who gave it a 1. In this case the mean isn't the most helpful metric for research.
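The survey example can be checked with a few lines of Python. This is only a minimal sketch: the eight ratings are hypothetical numbers chosen to produce a mean of 3.5 and do not come from the article.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses (illustrative only, not from the article):
# five very satisfied customers and three unsatisfied ones.
ratings = [5, 5, 5, 5, 5, 1, 1, 1]

average = mean(ratings)          # arithmetic mean of all ratings
distribution = Counter(ratings)  # how many customers gave each score

print(f"mean rating: {average}")
for score in range(1, 6):
    # One '#' per customer who gave this score; missing scores count as 0.
    print(f"{score}: {'#' * distribution[score]}")
```

The mean comes out to 3.5, yet the tally shows that nobody chose a 3 or a 4. Looking at the distribution next to the average makes the two clusters visible.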
Real life is more complicated than a data report, so we shouldn't take cause and effect for granted: if we do such-and-such, then such-and-such will happen. The desire, if not the requirement, that data be used in every decision creates paralysis. In the present world we are taught to seek perfection, but sometimes we forget that the one thing more important than perfection is simply progress.[] If everything had to be based on statistics, we would never have had great unexpected breakthroughs in human history.
With statistics, we can't prove things with 100% certainty.[] For instance, the people recording survey results may be dishonest or sloppy. That question emerged with a survey conducted by two criminologists that has raised doubts about the integrity of the New York Police Department's highly regarded crime-tracking program, CompStat. Relying on the anonymous responses of hundreds of retired high-ranking police officials, the survey found that tremendous pressure to reduce crime, year after year, prompted some supervisors and precinct commanders to distort crime statistics.[]
The biggest limit to big data is our ability to interpret it.[] Gordon B. Drummond, in his work "Data Interpretation: Using Probability," discusses principles of data interpretation. One of his key points is that we should ensure that the sample studied was properly chosen and random. As Drummond states, "It's possible that some scientists are not even clear that the word 'sample' has a special meaning in statistics, or understand the importance of taking an unbiased sample." We can never be certain that a sample will exactly reflect the properties of the entire group of possible candidates available to be studied. Drummond suggests planning the study, establishing hypotheses, and estimating the probability that the observed data could have occurred by chance: "A properly designed study that aims to answer specific questions will have defined outcomes of interest at the outset, before data collection has started. These questions are then recast as hypotheses that need to be tested." In the end, when we draw a conclusion, we should remember that absence of evidence in any study is not evidence of absence. If we can't detect or analyze something, that is no proof it doesn't exist.
In his "What Data Can't Do" publication, Brooks states that raw data is structured and analyzed by people who use their own values to draw conclusions. I can't disagree with this point. Computers can collect the data, but only human beings, with their own prejudices, gaps in education, and sympathies toward certain things, will draw the final conclusions. People can't be 100% impartial. They will always see things through their own life experience and can easily bend accurate data sources simply by asking and changing the question to suit their end goals.
In spite of all these flaws, everything we do in the modern world is data driven. Weather forecasts, academic success, politics, the stock market, and so on all depend on data analysis, which is the best way to understand the present and the past. Big data has its uses, but we should remember that just because we have a lot of data doesn't mean we have the right data to answer a particular question. Data is only useful if it is honestly and thoroughly gathered. When we summarize and interpret data, we shouldn't blindly rely on raw facts. To understand and predict future outcomes, we need to see the problem from different angles and be as objective as possible.
"Data can't register one's intentions, thoughts or moral condition; it can only display visible events and facts." This sentence is interesting and makes me curious about what is to come.
In the second paragraph, I think it would be good if you could add an example that helps the reader understand the topic a little better.
In the fifth paragraph: "A properly designed study that aims to answer specific questions will have defined outcomes of interest at the outset, before data collection has started. These questions are then recast as hypotheses that need to be tested." This part could be better if it were rewritten in a different way.
In paragraph 6, you could add a situation that explains why people can't be 100% impartial.
In conclusion, the topic you chose was interesting and the introduction was well done. However, it could be better if you added some more examples and explained why you chose this topic.