Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Group means for transformed parameters #156

Open
Nathaniel-Haines opened this issue May 27, 2023 · 3 comments
Open

Group means for transformed parameters #156

Nathaniel-Haines opened this issue May 27, 2023 · 3 comments
Assignees
Labels

Comments

@Nathaniel-Haines
Copy link
Member

Nathaniel-Haines commented May 27, 2023

@youngahn For transformed parameters (e.g., learning rates), our stan models look something like this:

parameters {
  // group-level pars
  real mu_A_pr;
  real sigma_A;
  
  // person-level (non-centered)
  vector[N] A_pr;
}

transformed parameters {
  // person-level transformed
  vector[N] A;
  
  for (i in 1:N) {
    A[i] = Phi_approx(mu_A_pr + sigma_A * A_pr[i]);
  }
 }

Then, we later compute the group-level mean for the given parameter like so:

generated quantities {
  real  mu_A;
  mu_A = Phi_approx(mu_A_pr);
  ...
}

This is actually not fully correct assuming that we want mu_A to indicate the mean across person-level parameters after undergoing the transformation. This can easily be observed with a simulation:

mu_A_pr = -1
sigma_A = 1

A_pr = rnorm(100, mu_A_pr, sigma_A)

A = pnorm(mu_A_pr + sigma_A * A_pr)

mu_A = pnorm(mu_A_pr)
actual_mu_A = mean(A)

Above, mu_A will consistently mis-estimate actual_mu_A, as it is capturing the group-level median rather than mean. The reason is outlined in #117. The fix is actually rather simple, although it requires some post-hoc simulation. Instead of simply taking mu_A = Phi_approx(mu_A_pr) in the Stan model, we could do something like the following externally:

pars = <extract parameters from Stan>

mu_A = foreach(i=1:n_MCMC_samples, .combine="c") %do% {
  mu_A_pr = pars$mu_A_pr[i] 
  sigma_A = pars$sigma_A[i] 
  A_approx = pnorm(rnorm(10000, mu_A_pr, sigma_A)); 
  
  mean(A_approx)
}

In the above, we simulate 10000 "people/examples" from each MCMC iteration in the group-level distribution, do the transformation, and then compute the mean of the simulated examples. 10000 is arbitrary, but higher values will produce less approximation error so it it a reasonable default. The resulting mu_A will now actually be the posterior distribution on the group-level A parameter on the (0,1) scale.

NOTE: this is only relevant for the transformed parameters. e.g., if we use Phi_approx or similar for bounded parameters. In the case of parameters that are not transformed/are unbounded, the above steps are redundant (mu_A and actual_mu_A would be the same if we did not need Phi_approx/pnorm)

NOTE 2: This is all related to #117, although the solution was not spelled out in that issue. It is not really a huge bug, but still one that should be addressed. Any comparisons that people have made between the group-level parameters are still valid, although the estimand has unintentionally been the median rather than the mean 🤓

@Nathaniel-Haines
Copy link
Member Author

Actually, I think we could do better by doing the full simulation within the generated quantities block, avoiding the need to do something externally in both R and Python. Perhaps we could create a custom function that take the group mean and SD as input and does the simulation, returning the corrected values as needed 🤔

@Nathaniel-Haines
Copy link
Member Author

e.g., the following function signature should do the trick for the parameters that use Phi_approx:

functions {
  real Phi_approx_group_mean_rng(real mu, real sigma, int n_samples) {
    vector[n_samples] par;
    for (i in 1:n_samples) {
      par[i] = Phi_approx(normal_rng(mu, sigma));
    }
    return mean(par);
  }
}

then in generated quantities later on:

// Group-level means
mu_A = Phi_approx_group_mean_rng(mu_A_pr, sigma_A, 10000);
...

@youngahn
Copy link
Member

youngahn commented Jul 3, 2023

Thanks @Nathaniel-Haines and I will follow up on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants