The Art of Statistics: How to Learn From Data - David Spiegelhalter (Highlight: 40; Note: 0)

───────────────

◆ CHAPTER 8 Probability–the Language of Uncertainty and Variability

▪ What would I expect to happen if I tried the experiment a number of times?

▪ there is no such thing as the unconditional probability of an event

▪ mammography is roughly 90% accurate

▪ 1% of women being screened

▪ very counter-intuitive result

▪ in spite of the ‘90% accuracy’ of the scan, the vast majority of women with a positive mammogram do not have breast cancer

▪ prosecutor’s fallacy

▪ There is no probability-ometer

▪ In each of these situations there is a very large number of opportunities for an event to happen, but each with a very low chance of occurrence, and this gives rise to the extraordinarily versatile Poisson distribution.
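A quick sketch of the 'many opportunities, each with a very low chance' idea: under those conditions the binomial count of events is closely approximated by a Poisson distribution with the same mean. The numbers below are illustrative, not from the book.

```python
import math

# Many opportunities (n large), each with tiny chance p:
# the event count is approximately Poisson with mean lam = n * p.
n, p = 10_000, 0.0003
lam = n * p  # expected number of events = 3

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# The two distributions agree to several decimal places.
for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```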

◆ CHAPTER 9 Putting Probability and Statistics Together

▪ variability in the observed proportion gets smaller as the sample size increases

▪ Law of Large Numbers,
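A small simulation of the Law of Large Numbers as described in these highlights: the running proportion of heads wanders widely at first, then settles near the true probability of 0.5 (the flip counts and seed are arbitrary choices for illustration):

```python
import random

random.seed(2024)  # arbitrary seed, for reproducibility only
heads = 0
checkpoints = {10, 100, 1_000, 10_000, 100_000}
for n in range(1, 100_001):
    heads += random.random() < 0.5  # one fair coin flip
    if n in checkpoints:
        print(f"after {n:>6} flips: proportion of heads = {heads / n:.3f}")
```

Note that it is the *proportion* that converges; the absolute excess of heads over tails need not shrink, which is why the coin is never 'due' to balance itself out.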

▪ temptation is to believe that tails is somehow now ‘due’ so that the proportion gets balanced out—this is known as the ‘gambler’s fallacy’

▪ But the coin has no memory—

▪ two types of uncertainty: what is known as aleatory uncertainty before I flip the coin—the ‘chance’ of an unpredictable event—and epistemic uncertainty after I flip the coin—an expression of our personal ignorance about an event that is fixed but unknown.

▪ confidence interval

▪ conventional to use 95% intervals, which are generally set as plus or minus two standard errors
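A sketch of the 'plus or minus two standard errors' convention for a proportion; the sample numbers are hypothetical, not from the book:

```python
import math

# Hypothetical survey: 53 'yes' answers out of 100 respondents.
successes, n = 53, 100
p_hat = successes / n
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of a proportion
lower, upper = p_hat - 2 * se, p_hat + 2 * se
print(f"estimate {p_hat:.2f}, 95% interval ({lower:.2f}, {upper:.2f})")
```

Here the interval comfortably straddles 0.5, so this (made-up) sample would not be evidence of a real majority.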

◆ CHAPTER 10 Answering Questions and Claiming Discoveries

▪ hypothesis testing

▪ The idea of a null hypothesis now becomes central: it is the simplified form of statistical model that we are going to work with until we have sufficient evidence against it

▪ It is just a working assumption until something better comes along

▪ whether the observed difference of 7% is big enough

▪ tail-area is known as a P-value

▪ P-value is the probability of getting a result at least as extreme as we did, if the null hypothesis (and all other modelling assumptions) were really true.
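A simulation sketch of this definition: assuming (hypothetically) two groups of 100 with an observed 7-percentage-point difference, and a null hypothesis that both share one underlying rate, we can ask how often chance alone produces a gap at least that big. The group sizes and pooled rate are illustrative assumptions, not the book's data.

```python
import random

random.seed(7)  # arbitrary seed for reproducibility
n_per_group, observed_diff = 100, 0.07
pooled_rate = 0.5  # null hypothesis: no real difference between groups

extreme = 0
sims = 10_000
for _ in range(sims):
    # Simulate both groups under the null hypothesis.
    a = sum(random.random() < pooled_rate for _ in range(n_per_group))
    b = sum(random.random() < pooled_rate for _ in range(n_per_group))
    # Count results at least as extreme as the observed difference.
    if abs(a - b) / n_per_group >= observed_diff:
        extreme += 1

p_value = extreme / sims
print(f"estimated P-value = {p_value:.3f}")
```

With these made-up sample sizes a 7% gap turns out to be quite common under the null, illustrating why 'big enough' depends on how much data sits behind the difference.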

◆ CHAPTER 11 Learning from Experience the Bayesian Way

▪ use probability as an expression of our lack of knowledge

▪ not only for future events

▪ continuously revise our current probabilities in the light of new evidence

▪ Bayes’ theorem

▪ testing and then the truth about doping

▪ doping and then testing

▪ This ‘reversal’ is exactly what Bayes’ theorem does

▪ inverse probability

▪ Bayes’ theorem is essentially prohibited from British courts

▪ allowed in court—the likelihood ratio

▪ Bayes’ theorem, which simply says that
the initial odds for a hypothesis × the likelihood ratio = the final odds for the hypothesis

▪ scientifically correct way to change our mind on the basis of new evidence.

▪ given something has happened or not happened on a known number of similar occasions, what probability should we give to it happening next time?

▪ before any red balls are thrown at all, we can estimate the position to be (0 + 1)/(0 + 2) = ½, whereas the intuitive approach might suggest that we could not give any answer since there is not yet any data
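The (0 + 1)/(0 + 2) = ½ calculation in this highlight generalizes to (successes + 1)/(trials + 2), Laplace's rule of succession. A minimal sketch (the 3-out-of-4 call is just an illustrative follow-up):

```python
def rule_of_succession(successes, trials):
    """Estimate the probability of the event next time: (s + 1) / (n + 2)."""
    return (successes + 1) / (trials + 2)

# Before any red balls are thrown at all, the estimate is already 1/2.
print(rule_of_succession(0, 0))   # 0.5
# After, say, 3 balls out of 4 land on one side, the estimate shifts.
print(rule_of_succession(3, 4))
```

The '+1' and '+2' act like two imaginary prior observations, one on each side, which is why the rule can give an answer even before any data arrive.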

▪ Bayesian analysis uses knowledge about how the position of the dashed line was decided to establish a prior distribution for its position, combines it with evidence from the data known as the likelihood, to give a final conclusion known as the posterior distribution,

▪ main controversy about Bayesian analysis is the source of the prior distribution

▪ multi-level regression and post-stratification (MRP)

▪ posterior odds = likelihood ratio × prior odds.