---
######################################
# Click on "Run Document" in RStudio #
######################################
title: "Quick Introduction to Bayesian Statistics"
author: "Miguel de Navascués"
output: learnr::tutorial
runtime: shiny_prerendered
bibliography: src/references.bib
---
These learning resources are available at [GitHub](https://github.com/mnavascues/ABCRFtutorial) & [Zenodo](http://doi.org/10.5281/zenodo.1435503).
## Setup (tl;dr)
This unit uses the R package `learnr` [@Schloerke2020].
```{r setup, include=TRUE}
library(learnr)
# function to calculate likelihood for the coin toss experiment
source("src/coin_toss.R", local = TRUE)
```
## Question, model, experiment, data
**The question**: We have a coin that we use to choose randomly between two alternatives. Is the coin fair?
**The [model](http://en.wikipedia.org/wiki/Bernoulli_trial)**: For each coin toss there is a probability $p$ of getting *heads* and $q=1-p$ of getting *tails*. Under this model, the question can be formulated as: is $p=q=0.5$?
**The experiment**: A coin is tossed 10 times and the results (*heads/tails*, plough/cereal in this case) are recorded.
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/10_lire_1975_O.jpg/240px-10_lire_1975_O.jpg" alt="heads" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/10_lire_1975_O.jpg/240px-10_lire_1975_O.jpg" alt="heads" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/10_lire_1975_O.jpg/240px-10_lire_1975_O.jpg" alt="heads" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/10_lire_1975_R.jpg/240px-10_lire_1975_R.jpg" alt="tails" style="height: 80px; width:80px;"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/10_lire_1975_O.jpg/240px-10_lire_1975_O.jpg" alt="heads" style="height: 80px; width:80px;"/>
**The data**: The sequence of coin tosses is: H, T, T, T, H, T, H, T, T, H. That is `h=4` and `t=6`.
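For reference, the observed sequence can be encoded directly in R and tallied (a minimal sketch; the name `tosses` is illustrative, and only the counts `h` and `t` are used in the exercises):
```{r data_sketch}
# illustrative encoding of the observed sequence of tosses
tosses <- c("H", "T", "T", "T", "H", "T", "H", "T", "T", "H")
table(tosses)            # tally of heads and tails
h <- sum(tosses == "H")  # 4
t <- sum(tosses == "T")  # 6
```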
## Likelihood
The likelihood of $p=q=0.5$ given our data ($D=(h=4,t=6)$) is the probability of the data under the model with $p=q=0.5$:
\[ L(p=0.5|D) = P(D|p=0.5) \]
### Exercise 1
How do you calculate that probability?
```{r likelihood, exercise=TRUE}
p<-0.5; h<-4; t<-6
_____
```
```{r likelihood-hint-1}
p<-0.5; h<-4; t<-6
_____ * p^h * _____
```
```{r likelihood-hint-2}
p<-0.5; h<-4; t<-6
_____ * p^h * (1-p)^t
```
```{r likelihood-solution}
p<-0.5; h<-4; t<-6
choose(h+t, h) * p^h * (1-p)^t
```
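As a quick sanity check (not part of the exercise), the same probability can be obtained from R's built-in binomial density:
```{r dbinom_check}
# binomial probability of h heads in h+t tosses, via the built-in density
p <- 0.5; h <- 4; t <- 6
dbinom(h, size = h + t, prob = p)  # equals choose(h+t, h) * p^h * (1-p)^t
```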
### Exercise 2
This value in itself does not answer your question; you need to compare it with the likelihood for other values of $p$. Create a function that calculates the likelihood and plot the likelihood values for `p=seq(0,1,0.01)`.
```{r likelihood_function, exercise=TRUE}
flip_coin_likelihood <- function(p,h,t){
likelihood <- _____
return (list(p=p,l=likelihood))
}
likelihood_profile <- _____
plot(likelihood_profile$_____,
likelihood_profile$_____,
xlab = "p", ylab = "Likelihood", type = "l", col="red")
```
```{r likelihood_function-solution}
flip_coin_likelihood <- function(p,h,t){
likelihood <- choose(h+t,h) * p^h * (1-p)^t
return (list(p=p,l=likelihood))
}
likelihood_profile <- flip_coin_likelihood(p=seq(0,1,0.01),h=4,t=6)
plot(likelihood_profile$p,
likelihood_profile$l,
xlab = "p", ylab = "Likelihood", type = "l", col="red")
```
Note: the function `flip_coin_likelihood()` has been loaded with `source("src/coin_toss.R")` in the setup chunk and is available in all exercises.
### Exercise 3
From the likelihood profile we can obtain the value of $p$ that best explains the observed data: the [maximum likelihood estimate](http://en.wikipedia.org/wiki/Maximum_likelihood_estimation). What is the maximum likelihood estimate of $p$ ($\hat{p}$)?
```{r maximum_likelihood, exercise=TRUE}
likelihood_profile <- flip_coin_likelihood(p=seq(0,1,0.01),h=4,t=6)
maxL <- _____(likelihood_profile$l)
p_hat <- likelihood_profile$p[which(likelihood_profile$l==_____)]
p_hat
```
```{r maximum_likelihood-solution}
likelihood_profile <- flip_coin_likelihood(p=seq(0,1,0.01),h=4,t=6)
maxL <- max(likelihood_profile$l)
p_hat <- likelihood_profile$p[which(likelihood_profile$l==maxL)]
p_hat
```
For this simple model there is an analytical solution for the maximum likelihood estimator of $p$:
\[ \hat{p}=h/(h+t) \]
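A quick numerical check of this formula against the grid-based estimate obtained above:
```{r mle_formula_check}
# analytical maximum likelihood estimate for h = 4 heads and t = 6 tails
h <- 4; t <- 6
h / (h + t)  # 0.4, the same value as p_hat from the likelihood profile
```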
What about our initial question? Is the coin fair? From the likelihood profile we can also determine whether the maximum likelihood estimate of $p$ is significantly better than $p=0.5$ (*via* the [confidence interval](http://en.wikipedia.org/wiki/Confidence_interval#Methods_of_derivation)). Based on the likelihood-ratio test (with $\alpha=0.05$):
```{r confidence_interval, eval=TRUE, echo=TRUE}
likelihood_profile <- flip_coin_likelihood(p=seq(0,1,0.01),h=4,t=6)
maxL <- max(likelihood_profile$l)
CI95 <- likelihood_profile$p[which(likelihood_profile$l > exp(log(maxL)-1.92))]
CI95 <- CI95[c(1,length(CI95))]
CI95
```
According to these results, we cannot reject the hypothesis that $p=0.5$ (with $\alpha=0.05$).
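The constant 1.92 in the code above is half of the 95% quantile of a $\chi^2$ distribution with one degree of freedom: the likelihood-ratio test keeps the values of $p$ whose log-likelihood lies within $\chi^2_{1,0.95}/2$ of the maximum. A quick check of that constant:
```{r lrt_threshold}
# half of the chi-squared(1) 95% quantile, i.e. the allowed drop in log-likelihood
qchisq(0.95, df = 1) / 2  # approximately 1.92
```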
### Exercise 4
What would happen for a larger number of coin tosses? What about `h=20` and `t=30`? Or `h=400` and `t=600`? Explore these and other values below:
```{r explore_binomial_likelihood, exercise=TRUE}
h=_____
t=_____
likelihood_profile <- flip_coin_likelihood(p=seq(0,1,0.001),h=h,t=t)
maxL <- max(likelihood_profile$l)
p_hat <- likelihood_profile$p[which(likelihood_profile$l==maxL)]
CI95 <- likelihood_profile$p[which(likelihood_profile$l > exp(log(maxL)-1.92))]
CI95 <- CI95[c(1,length(CI95))]
plot(x = likelihood_profile$p,
y = likelihood_profile$l,
xlab = "p", ylab = "Likelihood", type = "l", col="red")
abline(v=p_hat, col="red")
abline(v=CI95[1], lty=2, col="red")
abline(v=CI95[2], lty=2, col="red")
abline(v=0.5)
```
## Bayesian statistics
The posterior probability is the probability of the parameter $p$ given the data $D$: $P(p|D)$. Using Bayes' Theorem:
\[P(p|D)=\frac{P(D|p)P(p)}{P(D)}\]
where $P(D|p)=L(p|D)$ is the likelihood and $P(p)$ is the prior probability. The marginal likelihood, $P(D)$, is constant and does not need to be calculated: $P(p|D)\propto L(p|D)P(p)$.
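This proportionality can be used directly: evaluating likelihood $\times$ prior on a grid of $p$ values and renormalizing gives a numerical approximation of the posterior. A minimal sketch, assuming the `flip_coin_likelihood()` function from the setup and a uniform prior:
```{r grid_posterior_sketch}
# grid approximation of the posterior: likelihood x prior, then renormalize
p      <- seq(0, 1, 0.001)
prior  <- dbeta(p, 1, 1)                           # uniform prior on [0,1]
unnorm <- flip_coin_likelihood(p, h = 4, t = 6)$l * prior
posterior <- unnorm / sum(unnorm * 0.001)          # normalize so the density integrates to 1
plot(p, posterior, type = "l", col = "blue",
     xlab = "p", ylab = "Approximate posterior density")
```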
In this simple case (the coin-flipping experiment) there is an analytical solution for the posterior when using a conjugate prior such as the beta distribution $\mathrm{B}(\alpha, \beta)$: the posterior is then $\mathrm{B}(\alpha+h, \beta+t)$. Note that the uniform distribution $\mathrm{U}(0,1)$ is the same as the beta distribution $\mathrm{B}(1,1)$.
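The conjugacy can be seen by multiplying the likelihood and the prior and dropping all factors that do not depend on $p$:
\[ P(p|D) \propto \underbrace{p^h (1-p)^t}_{\text{likelihood}} \times \underbrace{p^{\alpha-1}(1-p)^{\beta-1}}_{\text{prior } \mathrm{B}(\alpha,\beta)} = p^{(\alpha+h)-1}(1-p)^{(\beta+t)-1}, \]
which is the kernel of the beta distribution $\mathrm{B}(\alpha+h, \beta+t)$.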
In Bayesian statistics, a point estimate of the parameter is often given by the posterior median, and credibility intervals by the posterior quantiles.
### Exercise 5
Practice the Bayesian analysis of the coin toss experiment. How do the results change when the prior parameters change? What is the interpretation of the prior, i.e. of $\alpha$ and $\beta$?
```{r explore_binomial_posterior, exercise=TRUE}
h=4;t=6;a=1;b=1
p=seq(0,1,0.001)
plot(p, y = dbeta(p,a,b), # prior probability
xlab = "p", ylab = "Probability density", type = "l", ylim=c(0,5))
lines(p, y = dbeta(p,a+h,b+t), col="blue") # posterior probability
quant <- qbeta(c(0.5,0.025,0.975),a+h,b+t)
abline(v=quant[1], col="blue")
abline(v=quant[2], lty=2, col="blue")
abline(v=quant[3], lty=2, col="blue")
abline(v=0.5)
```
## Likelihood *vs.* Posterior probability
### Exercise 6
Explore the difference between the likelihood and the posterior probability. Note the scale difference. What happens to the likelihood and posterior probability values when the amount of data ($h+t$) increases? What happens to the likelihood when the prior changes? Note: you might need to change the scale (`ylim=c(0,_____)`) depending on the values you explore.
```{r explore_likelihood_vs_posterior, exercise=TRUE}
h=4;t=6;a=1;b=1
p=seq(0,1,0.001)
plot(p, y = dbeta(p,a,b), # prior probability
xlab = "p", ylab = "Probability density/Likelihood", type = "l", ylim=c(0,3))
lines(p, dbeta(p,a+h,b+t), col="blue") # posterior probability
lines(p, y = flip_coin_likelihood(p,h,t)$l, col = "red") # likelihood
```
## *All models are wrong*
Up to this point we have assumed that the binomial model describes our coin well. However, in addition to heads and tails there are other possible outcomes when tossing a coin, e.g. the coin stays on its side or falls into a drainage grate... [see also @Diaconis2007]
### Exercise 7
Analyze the data under a model that considers three outcomes (i.e. heads/tails/side):
```{r multinomial, exercise=TRUE, exercise.lines=15}
h=4;t=6;s=0
```
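One possible starting point (a sketch, not the tutorial's solution): with three outcomes the likelihood becomes multinomial, and a Dirichlet prior would play the role the beta prior played above. The helper below only evaluates the multinomial likelihood for given head/tail probabilities; its name and the example values are illustrative assumptions.
```{r multinomial_sketch}
# illustrative multinomial likelihood for (heads, tails, side) counts
h <- 4; t <- 6; s <- 0
multinom_likelihood <- function(p_heads, p_tails, h, t, s) {
  p_side <- 1 - p_heads - p_tails
  if (p_side < 0) return(NA)  # probabilities must sum to at most 1
  dmultinom(c(h, t, s), prob = c(p_heads, p_tails, p_side))
}
# a fair coin that never lands on its side vs. one with a 1% chance of landing on its side
multinom_likelihood(0.5, 0.5, h, t, s)
multinom_likelihood(0.495, 0.495, h, t, s)
```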
#### References