# Unidentified: Over-Parameterization of a Normal Mean {#unidentified}
The following example illustrates the need for caution in diagnosing convergence, and is based on an example appearing in @CarlinLouis2000a [p. 174].
Consider a model in which the mean of a normal distribution is the sum of two parameters,
$$
\begin{aligned}[t]
y &\sim \mathsf{Normal}(\mu, 1) \\
\mu &= \theta_1 + \theta_2
\end{aligned}
$$
The data carry no information about $\theta_1$ or $\theta_2$ individually, but they are informative about the sum $\mu = \theta_1 + \theta_2$.
The likelihood function for the two unidentified parameters ($\theta_1$, $\theta_2$) has a ridge along the line,
$$
\left\{ (\theta_1, \theta_2) : \bar{y} = \theta_1 + \theta_2 \right\} ,
$$
where $\bar{y}$ is the mean of the observed data.
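To see why the ridge arises, note that moving mass from one parameter to the other leaves the likelihood unchanged. For any constant $c$ (a symbol introduced here for illustration),
$$
p(y \mid \theta_1 + c, \theta_2 - c) = \mathsf{Normal}\left(y \mid (\theta_1 + c) + (\theta_2 - c), 1\right) = p(y \mid \theta_1, \theta_2) ,
$$
so every point on a line $\theta_1 + \theta_2 = \text{constant}$ has the same likelihood, and the maximum is attained along the whole line $\theta_1 + \theta_2 = \bar{y}$ rather than at a single point.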
Bayesian models require the specification of priors for model parameters. Proper priors ensure unimodal posteriors for $\theta_1$ and $\theta_2$, so
MCMC can be used to sample from the posterior for this problem. @CarlinLouis2000a show (see their Q25, p. 191) the dangers of models of this
type. The posteriors for $\theta_1$ and $\theta_2$ are not identical to their priors (with prior
standard deviations of 10, the posterior standard deviations are 7.05; the code
below uses a larger prior scale of 100), which suggests that the data are somewhat
informative about each $\theta$ parameter individually, when this is not the case.
An inexperienced user of Markov chain Monte Carlo methods might fail to recognize
that the $\theta$ parameters are not identified, and naively report the posterior
summaries for $\theta$ generated by the software. On the other hand, the identified
parameter $\mu = \theta_1 + \theta_2$ is well behaved.
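A short calculation makes these numbers plausible. This is a sketch under assumptions not stated in the text: a single observation $y$, and independent $\mathsf{Normal}(0, \tau^2)$ priors on $\theta_1$ and $\theta_2$, where $\tau$ is a symbol introduced here for the prior standard deviation. The posterior is then bivariate normal with
$$
\begin{aligned}[t]
\operatorname{Var}(\theta_j \mid y) &= \frac{\tau^2 (1 + \tau^2)}{1 + 2 \tau^2} , \\
\operatorname{Cov}(\theta_1, \theta_2 \mid y) &= -\frac{\tau^4}{1 + 2 \tau^2} , \\
\operatorname{Var}(\theta_1 + \theta_2 \mid y) &= \frac{2 \tau^2}{1 + 2 \tau^2} .
\end{aligned}
$$
With $\tau = 10$ the posterior standard deviation of each parameter is about 7.09, close to the value quoted above, and for large $\tau$ it approaches $\tau / \sqrt{2}$. The posterior variance of the sum stays below 1 however diffuse the prior. Under these assumptions, the apparent shrinkage from 10 to roughly 7 comes from the strong negative posterior correlation induced by the constraint on the sum, not from information about either parameter on its own.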
```{r}
library("rstan")
mod_unidentified <- stan_model("stan/unidentified.stan")
```
The priors below use very large scales, though the same behavior is still present with weakly informative scales.
```{r}
data_unidentified <- list(
  y = 0,
  theta_mean = rep(0, 2),
  theta_scale = rep(100, 2)
)
```
```{r results='hide',message=FALSE}
fit_unidentified <- sampling(mod_unidentified, data = data_unidentified,
                             refresh = -1)
```
```{r}
fit_unidentified
```
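The Stan summary can be compared with a closed-form posterior. The following is a minimal sketch in base R. It assumes that the model places independent normal priors with means `theta_mean` and scales `theta_scale` on the two parameters, and that `y` is a single observation with unit variance. These assumptions are not confirmed by the text, since `stan/unidentified.stan` is not shown. The helper `analytic_posterior` is a hypothetical name introduced here.

```{r}
# Closed-form posterior for theta = (theta_1, theta_2) under the ASSUMED model:
#   theta_j ~ Normal(theta_mean[j], theta_scale[j]),
#   y       ~ Normal(theta_1 + theta_2, sigma)
analytic_posterior <- function(y, theta_mean, theta_scale, sigma = 1) {
  a <- c(1, 1)                                   # mu = a' theta
  prior_prec <- diag(1 / theta_scale^2)          # prior precision matrix
  post_prec <- prior_prec + tcrossprod(a) / sigma^2
  post_cov <- solve(post_prec)                   # posterior covariance
  post_mean <- drop(post_cov %*% (prior_prec %*% theta_mean + a * y / sigma^2))
  list(
    theta_mean = post_mean,
    theta_sd = sqrt(diag(post_cov)),
    theta_cor = cov2cor(post_cov)[1, 2],
    mu_mean = sum(post_mean),
    mu_sd = sqrt(drop(t(a) %*% post_cov %*% a))
  )
}

post <- analytic_posterior(y = 0, theta_mean = rep(0, 2),
                           theta_scale = rep(100, 2))
post
```

Under these assumptions each $\theta$ parameter has a posterior standard deviation of roughly 70.7 (about $100 / \sqrt{2}$), the posterior correlation between the two is very close to $-1$, and $\mu$ has a posterior standard deviation just under 1. If the Stan model matches the assumptions, its summary should be near these values, although the near-perfect correlation can make sampling inefficient, so the effective sample sizes and $\hat{R}$ values are worth checking.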
This example is derived from Simon Jackman, "Unidentified: over-parameterization of normal mean", 2007-07-24, [URL](https://web-beta.archive.org/web/20070724034211/http://jackman.stanford.edu:80/mcmc/unidentified.odc).