Group means for transformed parameters #156

Nathaniel-Haines · 2023-05-27T15:10:21Z

@youngahn For transformed parameters (e.g., learning rates), our stan models look something like this:

parameters {
  // group-level pars
  real mu_A_pr;
  real sigma_A;
  
  // person-level (non-centered)
  vector[N] A_pr;
}

transformed parameters {
  // person-level transformed
  vector[N] A;
  
  for (i in 1:N) {
    A[i] = Phi_approx(mu_A_pr + sigma_A * A_pr[i]);
  }
 }

Then, we later compute the group-level mean for the given parameter like so:

generated quantities {
  real  mu_A;
  mu_A = Phi_approx(mu_A_pr);
  ...
}

This is actually not fully correct assuming that we want mu_A to indicate the mean across person-level parameters after undergoing the transformation. This can easily be observed with a simulation:

mu_A_pr = -1
sigma_A = 1

A_pr = rnorm(100, mu_A_pr, sigma_A)

A = pnorm(mu_A_pr + sigma_A * A_pr)

mu_A = pnorm(mu_A_pr)
actual_mu_A = mean(A)

Above, mu_A will consistently mis-estimate actual_mu_A, as it is capturing the group-level median rather than mean. The reason is outlined in #117. The fix is actually rather simple, although it requires some post-hoc simulation. Instead of simply taking mu_A = Phi_approx(mu_A_pr) in the Stan model, we could do something like the following externally:

pars = <extract parameters from Stan>

mu_A = foreach(i=1:n_MCMC_samples, .combine="c") %do% {
  mu_A_pr = pars$mu_A_pr[i] 
  sigma_A = pars$sigma_A[i] 
  A_approx = pnorm(rnorm(10000, mu_A_pr, sigma_A)); 
  
  mean(A_approx)
}

In the above, we simulate 10000 "people/examples" from each MCMC iteration in the group-level distribution, do the transformation, and then compute the mean of the simulated examples. 10000 is arbitrary, but higher values will produce less approximation error so it it a reasonable default. The resulting mu_A will now actually be the posterior distribution on the group-level A parameter on the (0,1) scale.

NOTE: this is only relevant for the transformed parameters. e.g., if we use Phi_approx or similar for bounded parameters. In the case of parameters that are not transformed/are unbounded, the above steps are redundant (mu_A and actual_mu_A would be the same if we did not need Phi_approx/pnorm)

NOTE 2: This is all related to #117, although the solution was not spelled out in that issue. It is not really a huge bug, but still one that should be addressed. Any comparisons that people have made between the group-level parameters are still valid, although the estimand has unintentionally been the median rather than the mean 🤓

The text was updated successfully, but these errors were encountered:

Nathaniel-Haines · 2023-05-27T15:35:11Z

Actually, I think we could do better by doing the full simulation within the generated quantities block, avoiding the need to do something externally in both R and Python. Perhaps we could create a custom function that take the group mean and SD as input and does the simulation, returning the corrected values as needed 🤔

Nathaniel-Haines · 2023-05-27T15:45:56Z

e.g., the following function signature should do the trick for the parameters that use Phi_approx:

functions {
  real Phi_approx_group_mean_rng(real mu, real sigma, int n_samples) {
    vector[n_samples] par;
    for (i in 1:n_samples) {
      par[i] = Phi_approx(normal_rng(mu, sigma));
    }
    return mean(par);
  }
}

then in generated quantities later on:

// Group-level means
mu_A = Phi_approx_group_mean_rng(mu_A_pr, sigma_A, 10000);
...

youngahn · 2023-07-03T04:50:51Z

Thanks @Nathaniel-Haines and I will follow up on this!

Nathaniel-Haines added the bug label May 27, 2023

Nathaniel-Haines mentioned this issue May 27, 2023

mu_PARAMETER estimates median, not mean #117

Closed

Nathaniel-Haines mentioned this issue May 28, 2023

add new functions for group means and correlations hollysully/IGT_PP_Shared#10

Merged

youngahn assigned youngahn and lehlsy0904 Jul 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Group means for transformed parameters #156

Group means for transformed parameters #156

Nathaniel-Haines commented May 27, 2023 •

edited

Loading

Nathaniel-Haines commented May 27, 2023

Nathaniel-Haines commented May 27, 2023

youngahn commented Jul 3, 2023

Group means for transformed parameters #156

Group means for transformed parameters #156

Comments

Nathaniel-Haines commented May 27, 2023 • edited Loading

Nathaniel-Haines commented May 27, 2023

Nathaniel-Haines commented May 27, 2023

youngahn commented Jul 3, 2023

Nathaniel-Haines commented May 27, 2023 •

edited

Loading