The Art of Statistics: How to Learn From Data - David Spiegelhalter (Highlight: 40; Note: 0)
───────────────
◆ CHAPTER 8 Probability–the Language of Uncertainty and Variability
▪ What would I expect to happen if I tried the experiment a number of times?
▪ there is no such thing as the unconditional probability of an event
▪ mammography is roughly 90% accurate
▪ 1% of women being screened
▪ very counter-intuitive result
▪ in spite of the ‘90% accuracy’ of the scan, the vast majority of women with a positive mammogram do not have breast cancer
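The screening figures above can be checked directly. A minimal sketch, assuming "90% accurate" means both 90% sensitivity and 90% specificity, together with the 1% prevalence quoted for women being screened:

```python
# Assumed reading of "90% accurate": 90% sensitivity AND 90% specificity.
prevalence = 0.01      # P(cancer) among women screened
sensitivity = 0.90     # P(positive | cancer)
specificity = 0.90     # P(negative | no cancer)

# Total probability of a positive mammogram (true + false positives).
p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: probability of cancer given a positive result.
p_cancer_given_pos = sensitivity * prevalence / p_pos

print(f"P(positive) = {p_pos:.3f}")                      # 0.108
print(f"P(cancer | positive) = {p_cancer_given_pos:.3f}")  # 0.083
```

Under these assumptions only about 8% of positive mammograms correspond to actual cancers, which is the counter-intuitive result the highlight refers to.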
▪ prosecutor’s fallacy
▪ There is no probability-ometer
▪ In each of these situations there is a very large number of opportunities for an event to happen, but each with a very low chance of occurrence, and this gives rise to the extraordinarily versatile Poisson distribution.
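The "many opportunities, each with a very low chance" idea can be illustrated by comparing an exact binomial with its Poisson approximation. The numbers below (10,000 opportunities, each with probability 0.0003) are assumed for illustration, not taken from the book:

```python
import math

n, p = 10_000, 0.0003      # many opportunities, each very unlikely
lam = n * p                # Poisson mean = 3

def binom_pmf(k):
    # Exact probability of k events among n independent opportunities.
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k):
    # Poisson approximation with the same mean.
    return math.exp(-lam) * lam**k / math.factorial(k)

for k in range(6):
    print(k, round(binom_pmf(k), 4), round(poisson_pmf(k), 4))
```

The two columns agree to three decimal places, which is why the Poisson distribution is so versatile for rare-event counts.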
◆ CHAPTER 9 Putting Probability and Statistics Together
▪ variability in the observed proportion gets smaller as the sample size increases
▪ Law of Large Numbers,
▪ temptation is to believe that tails is somehow now ‘due’ so that the proportion gets balanced out—this is known as the ‘gambler’s fallacy’
▪ But the coin has no memory—
▪ two types of uncertainty: what is known as aleatory uncertainty before I flip the coin—the ‘chance’ of an unpredictable event—and epistemic uncertainty after I flip the coin—an expression of our personal ignorance about an event that is fixed but unknown.
▪ confidence interval
▪ conventional to use 95% intervals, which are generally set as plus or minus two standard errors
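A short simulation ties these highlights together: the standard error of an observed proportion shrinks like 1/√n, so the "plus or minus two standard errors" interval narrows as the sample grows. The sample sizes below are assumed for illustration:

```python
import math
import random

random.seed(0)
results = {}
for n in (100, 10_000):
    # Flip a fair coin n times and record the observed proportion of heads.
    heads = sum(random.random() < 0.5 for _ in range(n))
    p_hat = heads / n
    # Standard error of a proportion: sqrt(p(1-p)/n).
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    results[n] = (p_hat, se)
    print(f"n={n:>6}: proportion {p_hat:.3f}, 95% interval ±{2 * se:.3f}")
```

The interval for n = 10,000 is a tenth the width of the one for n = 100, the Law of Large Numbers at work.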
◆ CHAPTER 10 Answering Questions and Claiming Discoveries
▪ hypothesis testing
▪ The idea of a null hypothesis now becomes central: it is the simplified form of statistical model that we are going to work with until we have sufficient evidence against it
▪ It is just a working assumption until something better comes along
▪ whether the observed difference of 7% is big enough
▪ tail-area is known as a P-value
▪ P-value is the probability of getting a result at least as extreme as we did, if the null hypothesis (and all other modelling assumptions) were really true.
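The tail-area definition can be made concrete by simulation: generate data under the null hypothesis many times and count how often the result is at least as extreme as the one observed. The example below (60 heads in 100 flips of a supposedly fair coin) is assumed for illustration, not the book's 7% example:

```python
import random

random.seed(42)
n, observed = 100, 60   # assumed: 60 heads observed in 100 flips
sims = 10_000
extreme = 0
for _ in range(sims):
    # Simulate one experiment under the null hypothesis (fair coin).
    heads = sum(random.random() < 0.5 for _ in range(n))
    # Two-sided: at least as far from the expected 50 as the observed 60.
    if abs(heads - n / 2) >= abs(observed - n / 2):
        extreme += 1

p_value = extreme / sims
print(f"estimated two-sided P-value ≈ {p_value:.3f}")
```

The estimate lands near 0.057: under the null, a result this extreme would occur in roughly 6% of repetitions.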
◆ CHAPTER 11 Learning from Experience the Bayesian Way
▪ use probability as an expression of our lack of knowledge
▪ not only for future events
▪ continuously revise our current probabilities in the light of new evidence
▪ Bayes’ theorem
▪ testing and then the truth about doping
▪ doping and then testing
▪ This ‘reversal’ is exactly what Bayes’ theorem does
▪ inverse probability
▪ Bayes’ theorem is essentially prohibited from British courts
▪ allowed in court—the likelihood ratio
▪ Bayes’ theorem, which simply says that the initial odds for a hypothesis × the likelihood ratio = the final odds for the hypothesis
▪ scientifically correct way to change our mind on the basis of new evidence.
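The odds form of Bayes' theorem quoted above takes two lines of arithmetic. The numbers are assumed for illustration: a 1% prior probability, and evidence nine times more likely if the hypothesis is true than if it is false (a likelihood ratio of 9, e.g. 90% vs 10%):

```python
# posterior odds = likelihood ratio × prior odds  (odds form of Bayes)
prior_odds = 0.01 / 0.99        # assumed 1% prior probability, as odds
likelihood_ratio = 9            # evidence 9x more likely if hypothesis true

posterior_odds = likelihood_ratio * prior_odds
# Convert odds back to a probability: p = odds / (1 + odds).
posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"posterior probability = {posterior_prob:.3f}")   # about 0.083
```

With these particular numbers the answer reproduces the mammography figure from Chapter 8: a positive result lifts a 1% prior to only about 8%.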
▪ given something has happened or not happened on a known number of similar occasions, what probability should we give to it happening next time?
▪ before any red balls are thrown at all, we can estimate the position to be (0 + 1)/(0 + 2) = ½, whereas the intuitive approach might suggest that we could not give any answer since there is not yet any data
▪ Bayesian analysis uses knowledge about how the position of the dashed line was decided to establish a prior distribution for its position, combines it with evidence from the data known as the likelihood, to give a final conclusion known as the posterior distribution,
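The (0 + 1)/(0 + 2) = ½ estimate above is an instance of a standard conjugate update, sketched here under the assumption of a uniform Beta(1, 1) prior for the line's position: after s of n balls land to its left, the posterior is Beta(s + 1, n − s + 1), whose mean is (s + 1)/(n + 2):

```python
def posterior_mean(s: int, n: int) -> float:
    # Posterior mean of the line's position under a uniform Beta(1, 1) prior,
    # after s of n balls land to its left: mean of Beta(s + 1, n - s + 1).
    return (s + 1) / (n + 2)

print(posterior_mean(0, 0))   # 0.5: an answer even before any data
print(posterior_mean(3, 5))   # about 0.571 after 3 of 5 land to the left
```

The prior supplies an answer even with zero data, which is exactly the point the highlight makes against the purely intuitive approach.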
▪ main controversy about Bayesian analysis is the source of the prior distribution
▪ multi-level regression and post-stratification (MRP)
▪ posterior odds = likelihood ratio × prior odds.