In this model there are
I model a handin score $S_{hg}$, given by grader $g$ to handin $h$, as
\begin{align} S_{hg} = T_{h} + B_{g} + \epsilon \end{align}
where $\epsilon$ is an unexplained iid residue.
This model assumes that there is an underlying true score for every handin and a grading bias for every grader.
With this the true score
\begin{align} S_{hg} &\sim N(T_{h}+B_{g}, \epsilon) \end{align}
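To make the generative assumption concrete, the following is a minimal sketch that draws synthetic scores from $S_{hg} \sim N(T_h + B_g, \epsilon)$. It assumes that the second argument of $N$ is a precision (inverse variance). The function name, the default values and the rule that each handin is graded by a fixed number of random graders are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_scores(n_handins=50, n_graders=10, graders_per_handin=3,
                    mu_h=0.6, tau_h=25.0, mu_g=0.0, tau_g=100.0,
                    eps=400.0, seed=0):
    """Draw synthetic scores S_hg = T_h + B_g + noise.

    tau_h, tau_g and eps are treated as precisions (1 / variance);
    this is an assumption about the notation N(mean, precision).
    """
    rng = np.random.default_rng(seed)
    # Latent true score of each handin and bias of each grader.
    T = rng.normal(mu_h, 1.0 / np.sqrt(tau_h), size=n_handins)
    B = rng.normal(mu_g, 1.0 / np.sqrt(tau_g), size=n_graders)
    rows = []
    for h in range(n_handins):
        # Hypothetical assignment: a random subset of graders per handin.
        graders = rng.choice(n_graders, size=graders_per_handin, replace=False)
        for g in graders:
            s = rng.normal(T[h] + B[g], 1.0 / np.sqrt(eps))
            rows.append((h, int(g), s))
    h_idx, g_idx, S = map(np.array, zip(*rows))
    return T, B, h_idx, g_idx, S
```

The observations are returned in long format (one row per graded pair), which avoids having to represent missing entries for handin and grader pairs that never occur.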
Defining the model as a Bayesian hierarchical model, I assume that the true score of a handin $T_h$, the bias of a grader $B_g$ and the residue $\epsilon$ have the priors
\begin{align}
T_h &\sim N(T_h|\mu_h, \tau_h) \\
B_g &\sim N(B_g|\mu_g, \tau_g) \\
\epsilon &\sim \Gamma(\epsilon| \alpha_{\epsilon},\beta_{\epsilon})
\end{align}
The parameters $\mu_h, \tau_h$ and $\mu_g, \tau_g$ are in turn given normal-gamma priors:
\begin{align}
\mu_h, \tau_h &\sim N(\mu_h|\gamma_h, \tau_h*\lambda_h)\Gamma(\tau_h|\alpha_h, \beta_h) \\
\mu_g, \tau_g &\sim N(\mu_g|\gamma_g, \tau_g*\lambda_g)\Gamma(\tau_g|\alpha_g, \beta_g)
\end{align}
Where the parameters
##Markov Chain Monte Carlo
A Markov Chain Monte Carlo (MCMC) algorithm can be used to approximate the posterior distribution, in this case that of the latent score of each handin and the bias of each grader. Multiple algorithms exist for solving such a problem. This thesis will only look into random walk Monte Carlo methods (why?).
###Gibbs Sampling
For a hierarchical model with many parameters, the most efficient random walk method is Gibbs sampling: unlike other methods it does not require any 'tuning', and it samples each variable in turn, conditional on the current values of all the others. Additionally, it can incorporate other methods in the sampling process. The main problem with Gibbs sampling is that it requires the conditional distribution of each parameter of the target distribution. These conditionals can be hard, or even impossible, to calculate.
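The idea of sampling each variable in turn from its conditional distribution can be illustrated on a toy target that is unrelated to the grading model: a standard bivariate normal with correlation `rho`, for which both conditionals are known to be $N(\rho y, 1-\rho^2)$ and $N(\rho x, 1-\rho^2)$ (mean and variance). This is only a sketch of the technique, and the function name is hypothetical.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=20000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0                      # arbitrary starting point
    sd = np.sqrt(1.0 - rho ** 2)         # conditional standard deviation
    out = np.empty((n_samples, 2))
    for i in range(n_samples):
        x = rng.normal(rho * y, sd)      # draw x | y
        y = rng.normal(rho * x, sd)      # draw y | x, using the new x
        out[i] = (x, y)
    return out

samples = gibbs_bivariate_normal(0.8)
burned = samples[1000:]                  # discard an initial burn-in
print(np.corrcoef(burned.T)[0, 1])       # should be close to rho
```

No proposal width or acceptance step appears anywhere, which is what "no tuning" means in practice. The price is that both conditionals had to be known in closed form.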
The conditional probabilities can be found in different ways, but the easiest path is to first write down the full conditional probability of all the parameters. From there, the conditional probability of each parameter is found by keeping only the factors that involve that parameter.
The full conditional probability:
\begin{align} \begin{split}
p(T_{h},B_{g},\epsilon,\mu_g,\tau_g,\mu_h,\tau_h|S_{hg},...) \propto& \prod_{g \in \mathcal{G}(h)}[N(S_{hg}|T_{h}+B_{g},\epsilon)] \\
& \times \prod_{h \in \mathcal{H}(g)}[N(S_{hg}|T_{h}+B_{g},\epsilon)] \\
& \times N(T_h|\mu_h, \tau_h) N(B_g|\mu_g, \tau_g) \\
& \times \Gamma(\epsilon|\alpha_{\epsilon},\beta_{\epsilon}) \\
& \times N(\mu_g|\gamma_g, \tau_g*\lambda_g) \Gamma(\tau_g|\alpha_g, \beta_g) \\
& \times N(\mu_h|\gamma_h, \tau_h*\lambda_h) \Gamma(\tau_h|\alpha_h, \beta_h)
\end{split} \end{align}
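where $\mathcal{G}(h)$ is the set of graders that have graded handin $h$ and $\mathcal{H}(g)$ is the set of handins graded by grader $g$.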
We can then find the conditional probability of each of the parameters.
For $T_h$:
\begin{align} \begin{split}
p(T_h|S_{hg},...) \propto& \prod_{g \in \mathcal{G}(h)}[N(S_{hg}|T_h+B_g,\epsilon)] \\
& \times N(T_h|\mu_h, \tau_h)
\end{split} \end{align}
For $B_g$:
\begin{align} \begin{split}
p(B_g|S_{hg},...) \propto& \prod_{h \in \mathcal{H}(g)}[N(S_{hg}|T_h+B_g,\epsilon)] \\
& \times N(B_g|\mu_g, \tau_g)
\end{split} \end{align}
For $\epsilon$:
\begin{align} \begin{split}
p(\epsilon|S_{hg},...) \propto& \prod_{h \in \mathcal{H}(g)}\prod_{g \in \mathcal{G}(h)}[N(S_{hg}|T_h+B_g,\epsilon)] \\
& \times \Gamma(\epsilon|\alpha_{\epsilon}, \beta_{\epsilon})
\end{split} \end{align}
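The three conditionals for $T_h$, $B_g$ and $\epsilon$ are products of a normal likelihood with a conjugate prior, so each reduces to a known distribution. The sketch below performs one Gibbs sweep over them using the standard conjugate results, under two assumptions that are not stated in the text: the second argument of $N$ is a precision, and $\Gamma(\alpha, \beta)$ uses a rate parameter $\beta$. The hyperparameters $\mu_h, \tau_h, \mu_g, \tau_g$ are held fixed here for brevity, and the function name and the long-format arrays `S`, `h_idx`, `g_idx` are hypothetical.

```python
import numpy as np

def gibbs_sweep(S, h_idx, g_idx, T, B, eps, mu_h, tau_h, mu_g, tau_g,
                alpha_eps, beta_eps, rng):
    """One Gibbs sweep; updates T and B in place and returns the new eps.

    S[i] is the score given to handin h_idx[i] by grader g_idx[i].
    """
    n_h, n_g = len(T), len(B)

    # T_h | rest is normal with precision tau_h + |G(h)| * eps.
    count = np.bincount(h_idx, minlength=n_h)
    total = np.bincount(h_idx, weights=S - B[g_idx], minlength=n_h)
    prec = tau_h + count * eps
    T[:] = rng.normal((tau_h * mu_h + eps * total) / prec, 1.0 / np.sqrt(prec))

    # B_g | rest is normal with precision tau_g + |H(g)| * eps.
    count = np.bincount(g_idx, minlength=n_g)
    total = np.bincount(g_idx, weights=S - T[h_idx], minlength=n_g)
    prec = tau_g + count * eps
    B[:] = rng.normal((tau_g * mu_g + eps * total) / prec, 1.0 / np.sqrt(prec))

    # eps | rest is gamma with shape alpha + N/2 and rate beta + SSR/2.
    resid = S - T[h_idx] - B[g_idx]
    shape = alpha_eps + 0.5 * len(S)
    rate = beta_eps + 0.5 * np.sum(resid ** 2)
    return rng.gamma(shape, 1.0 / rate)  # NumPy takes a scale, i.e. 1 / rate
```

All $T_h$ can be drawn in one vectorised call because, given $B$ and $\epsilon$, they are conditionally independent of each other; the same holds for the $B_g$ given $T$ and $\epsilon$.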
For $\mu_h$ and $\tau_h$:
\begin{align} \label{eq:conh} \begin{split}
p(\mu_h,\tau_h|S_{h},...) \propto& N(T_{h}|\mu_h,\tau_h) \\
& \times N(\mu_h|\gamma_h, \tau_h*\lambda_h) \Gamma(\tau_h|\alpha_h, \beta_h)
\end{split} \end{align}
and for $\mu_g$ and $\tau_g$:
\begin{align} \label{eq:cong} \begin{split}
p(\mu_g,\tau_g|S_{g},...) \propto& N(B_{g}|\mu_g,\tau_g) \\
& \times N(\mu_g|\gamma_g, \tau_g*\lambda_g) \Gamma(\tau_g|\alpha_g, \beta_g)
\end{split} \end{align}
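The two conditionals for the hyperparameters pair a normal likelihood with a normal-gamma prior, which is a conjugate combination, so the posterior is again normal-gamma. The sketch below applies the standard update to a vector of values `x` (for example all current $T_h$, or all current $B_g$). It assumes that one pair $(\mu, \tau)$ is shared by all values in `x` and that the gamma distribution uses a rate parameter; both are assumptions about the model, and the function names are hypothetical.

```python
import numpy as np

def normal_gamma_update(x, gamma0, lam0, alpha0, beta0):
    """Posterior normal-gamma parameters after observing the values in x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    lam_n = lam0 + n
    gamma_n = (lam0 * gamma0 + n * xbar) / lam_n
    alpha_n = alpha0 + 0.5 * n
    beta_n = (beta0
              + 0.5 * np.sum((x - xbar) ** 2)
              + 0.5 * lam0 * n * (xbar - gamma0) ** 2 / lam_n)
    return gamma_n, lam_n, alpha_n, beta_n

def sample_normal_gamma(gamma, lam, alpha, beta, rng):
    """Draw (mu, tau): tau from the gamma, then mu given tau."""
    tau = rng.gamma(alpha, 1.0 / beta)                 # NumPy takes a scale
    mu = rng.normal(gamma, 1.0 / np.sqrt(lam * tau))   # precision lam * tau
    return mu, tau
```

Inside a Gibbs sampler the two functions would be called back to back: update the parameters from the current latent values, then draw a new $(\mu, \tau)$ pair from the result.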
These conditional probabilities can be reduced by inserting the probability density functions (PDFs) of the distributions they consist of. The aim is to reduce each of them to a form that defines a single distribution.
The normal distribution has the PDF:
\begin{align} f(x|\mu,\tau) =& \frac{\sqrt{\tau}}{\sqrt{2\pi}}e^{-\tau\frac{(x-\mu)^2}{2}} \end{align}
the gamma distribution has the PDF:
\begin{align} f(x|\alpha,\beta) =& \frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x} \end{align}
and the PDF of the normal-gamma distribution is:
\begin{align} f(\mu,\tau|\gamma, \lambda, \alpha, \beta) =& \frac{\beta^\alpha\sqrt{\lambda}}{\Gamma(\alpha)\sqrt{2\pi}}\tau^{\alpha-\frac{1}{2}}e^{-\beta\tau}e^{\frac{-\lambda\tau(\mu-\gamma)^2}{2}} \end{align}
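As a worked example of such a reduction, consider the conditional for $T_h$. The steps below assume that $\epsilon$ and $\tau_h$ are precisions, as in the normal PDF above, and write $n_h = |\mathcal{G}(h)|$ for the number of graders of handin $h$ (a symbol introduced here only for this example). Inserting the normal PDF and dropping every factor that does not depend on $T_h$ gives

\begin{align} \begin{split}
p(T_h|S_{hg},...) \propto& \exp\left(-\frac{\epsilon}{2}\sum_{g \in \mathcal{G}(h)}(S_{hg}-T_h-B_g)^2\right) \exp\left(-\frac{\tau_h}{2}(T_h-\mu_h)^2\right) \\
\propto& \exp\left(-\frac{1}{2}\left[(n_h\epsilon+\tau_h)T_h^2 - 2T_h\left(\epsilon\sum_{g \in \mathcal{G}(h)}(S_{hg}-B_g)+\tau_h\mu_h\right)\right]\right) \\
\propto& N\left(T_h \,\Big|\, \frac{\epsilon\sum_{g \in \mathcal{G}(h)}(S_{hg}-B_g)+\tau_h\mu_h}{n_h\epsilon+\tau_h},\; n_h\epsilon+\tau_h\right)
\end{split} \end{align}

The last step completes the square in $T_h$: the coefficient of $T_h^2$ gives the precision, and the coefficient of the linear term divided by that precision gives the mean. The resulting mean is a precision-weighted average of the prior mean $\mu_h$ and the bias-corrected scores $S_{hg}-B_g$.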