
# Unidentified: Over-Parameterization of a Normal Mean {#unidentified}

The following example illustrates the need for caution in diagnosing convergence, and is based on an example appearing in @CarlinLouis2000a [p174].

Consider a model in which the mean is the sum of two parameters,
$$
\begin{aligned}[t]
y &\sim \mathsf{Normal}(\mu, 1) \\
\mu &= \theta_1 + \theta_2
\end{aligned}
$$
The data carry no information about $\theta_1$ or $\theta_2$ individually, but they are informative about $\mu = \theta_1 + \theta_2$.
The likelihood function for the two unidentified parameters ($\theta_1$, $\theta_2$) has a ridge along the line,
$$
\left\{ q_1, q_2 : \bar{y} = q_1 + q_2 \right\} ,
$$
where $\bar{y}$ is the mean of the observed data.
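
The ridge can be checked numerically. The following is a minimal sketch in base R, not part of the original example: the data vector `y_example` and the helper `loglik` are hypothetical and chosen only for illustration. It evaluates the log-likelihood at several pairs whose sum equals $\bar{y}$ and at one pair whose sum does not.

```{r}
# Log-likelihood of (theta1, theta2) under y ~ Normal(theta1 + theta2, 1).
# `loglik` is an illustrative helper, not a function from the source.
loglik <- function(theta1, theta2, y) {
  sum(dnorm(y, mean = theta1 + theta2, sd = 1, log = TRUE))
}

y_example <- c(-0.3, 0.4, 1.1)  # hypothetical data, for illustration only
y_bar <- mean(y_example)

# Points on the ridge: theta1 varies freely and theta2 = y_bar - theta1.
theta1_grid <- c(-100, -1, 0, 1, 100)
ridge_loglik <- sapply(theta1_grid, function(t1) {
  loglik(t1, y_bar - t1, y_example)
})

# A point off the ridge, where the sum is y_bar + 1 instead of y_bar.
off_ridge_loglik <- loglik(0, y_bar + 1, y_example)

ridge_loglik
off_ridge_loglik
```

All entries of `ridge_loglik` agree up to floating-point error, however extreme the individual values, while the off-ridge point has a lower log-likelihood. The likelihood depends on the two parameters only through their sum.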

Bayesian models require priors for all model parameters.
Proper priors will ensure unimodal posteriors for $\theta_1$ and $\theta_2$, so the model can still be sampled without apparent difficulty.
@CarlinLouis2000a show (see their Q25, p191) the dangers of models of this type.
The posteriors for $\theta_1$ and $\theta_2$ are not identical to their priors: with prior standard deviations of 10, the posterior standard deviations are 7.05 (the code below uses a prior scale of 100 instead).
This suggests that the data are somewhat informative about each $\theta$ parameter, when this is not the case.
An inexperienced user of Markov chain Monte Carlo methods might fail to recognize that the $\theta$ parameters are not identified, and naively report the posterior summaries for $\theta$ generated by the software.
On the other hand, the identified parameter $\mu = \theta_1 + \theta_2$ is well behaved.
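
The apparent shrinkage can be reproduced analytically. The sketch below makes assumptions that the text does not confirm: independent normal priors centered at zero on each parameter, and a single observation with unit variance. The function name `posterior_cov` and its arguments are hypothetical. Under those assumptions the posterior is bivariate normal, and its precision matrix is the prior precision plus a rank-one term that only informs the direction of the sum.

```{r}
# Assumed setup (not taken from the source): independent Normal(0, prior_sd^2)
# priors on theta1 and theta2, and n_obs observations with unit variance.
posterior_cov <- function(prior_sd, n_obs = 1) {
  prior_precision <- diag(1 / prior_sd^2, 2)
  # The likelihood adds precision only along the direction (1, 1), the sum.
  lik_precision <- n_obs * matrix(1, 2, 2)
  solve(prior_precision + lik_precision)
}

post_cov <- posterior_cov(prior_sd = 10)

post_sd_theta <- sqrt(diag(post_cov))      # marginal sd of each theta
post_cor_theta <- cov2cor(post_cov)[1, 2]  # correlation between the thetas
post_sd_mu <- sqrt(sum(post_cov))          # sd of mu = theta1 + theta2

post_sd_theta
post_cor_theta
post_sd_mu
```

With a prior standard deviation of 10, each marginal standard deviation comes out near 7.1, roughly the prior value divided by $\sqrt{2}$ and close to the figure quoted above, while the standard deviation of the sum is just under 1. The reduction for each $\theta$ comes from the strong negative posterior correlation induced by the constraint on the sum, not from any information about the individual parameters.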

```{r}
library("rstan")
mod_unidentified <- stan_model("stan/unidentified.stan")
```

The prior scales used here are very large, although the same behavior is still present with weakly informative scales.
```{r}
data_unidentified <- list(
  y = 0,
  theta_mean = rep(0, 2),
  theta_scale = rep(100, 2)
)
```
```{r results='hide',message=FALSE}
fit_unidentified <- sampling(mod_unidentified, data = data_unidentified,
                             refresh = -1)
```
```{r}
fit_unidentified
```
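
One way to see the lack of identification in a set of draws is to compare the spread of each parameter with the spread of their sum, and to look at the correlation between the two parameters. The sketch below is illustrative only. The helper `summarize_identification` is hypothetical, and the stand-in draws are simulated in base R under the assumption of independent normal priors centered at zero with scale 100 and a single observation `y = 0` with unit variance; they are not output from the fitted model.

```{r}
# Summarize a two-column matrix of draws of (theta1, theta2).
# `summarize_identification` is an illustrative helper, not part of rstan.
summarize_identification <- function(draws) {
  c(
    sd_theta1 = sd(draws[, 1]),
    sd_theta2 = sd(draws[, 2]),
    sd_sum    = sd(draws[, 1] + draws[, 2]),
    cor_theta = cor(draws[, 1], draws[, 2])
  )
}

# Stand-in draws under the assumed setup. With symmetric normal priors the
# sum and the difference are independent; the data update only the sum.
set.seed(1)
prior_sd <- 100
n_draws <- 4000
sum_draws <- rnorm(n_draws, mean = 0,
                   sd = sqrt(1 / (1 + 1 / (2 * prior_sd^2))))
diff_draws <- rnorm(n_draws, mean = 0, sd = sqrt(2) * prior_sd)
sim_draws <- cbind(
  theta1 = (sum_draws + diff_draws) / 2,
  theta2 = (sum_draws - diff_draws) / 2
)

summarize_identification(sim_draws)

# With the fitted model, and only if the Stan parameter is named `theta`
# (an assumption, since the Stan file is not shown here):
# summarize_identification(as.matrix(fit_unidentified, pars = "theta"))
```

A correlation close to $-1$ together with a sum whose standard deviation is far smaller than that of either component is the signature of a ridge: the draws determine the sum well but not its two parts.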

This example is derived from Simon Jackman, "Unidentified: over-parameterization of normal mean", 2007-07-24, [URL](https://web-beta.archive.org/web/20070724034211/http://jackman.stanford.edu:80/mcmc/unidentified.odc).