Open
Description
Thanks for developing this package - I really like the way it structures simulations.
I've been running some simulations and noticed unexpected correlations in my results. I've traced them to the following issue:
Expected behaviour: Data generated from models with different parameter values are independent.
Observed behaviour: Some data generated from models with different parameter values are perfectly correlated.
Example
library(simulator)

make_data <- function(n_covariates) {
  new_model(name = "test",
            label = paste0("n_covariates = ", n_covariates),
            params = list(n_covariates = n_covariates),
            simulate = function(n_covariates, nsim) {
              data <- vector(mode = "list", length = nsim)
              for (i in 1:nsim) {
                x <- vector(mode = "list", length = n_covariates)
                for (j in 1:n_covariates) {
                  x[[j]] <- rnorm(1000)
                }
                data[[i]] <- x
              }
              return(data)
            })
}

sim <- new_simulation(name = "correlated_draws",
                      label = "data from different models is the same") %>%
  generate_model(make_data, seed = 1234,
                 n_covariates = list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 vary_along = c("n_covariates")) %>%
  simulate_from_model(nsim = 2)

# First covariate of the first draw, from model 1 and from model 2
x1_model1.1 <- draws(sim)[[1]]@draws$r1.1[[1]]
x1_model2.1 <- draws(sim)[[2]]@draws$r1.1[[1]]

# This is 1; I'd expect roughly zero, as different draws should be independent
cor(x1_model1.1, x1_model2.1)
I believe this is caused by generate_model passing the same seed to generate_model_single for every value of the parameter being varied along. It happens even when no seed is set, because the default seed is 123. A workaround is to pass seed = NULL to generate_model.
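The suspected mechanism can be reproduced in base R without the package. The sketch below is only an illustration: draw_first_covariate is a hypothetical helper, not part of simulator, and offsetting the seed by the model index is just one assumed way of giving each model its own random stream, not necessarily how the package should fix it.

```r
# Hypothetical stand-in for generating one model's data from a given seed.
# It is NOT simulator code; it only mimics "seed, then draw covariates".
draw_first_covariate <- function(seed, n_covariates) {
  set.seed(seed)
  x <- lapply(seq_len(n_covariates), function(j) rnorm(1000))
  x[[1]]  # return the first covariate only
}

# Same seed reused for two different parameter values:
# the first covariate is bit-for-bit identical.
same_a <- draw_first_covariate(seed = 123, n_covariates = 1)
same_b <- draw_first_covariate(seed = 123, n_covariates = 2)
identical(same_a, same_b)

# Assumed alternative: offset the seed by the model index so that
# each parameter value starts from a different random stream.
diff_a <- draw_first_covariate(seed = 123 + 1, n_covariates = 1)
diff_b <- draw_first_covariate(seed = 123 + 2, n_covariates = 2)
identical(diff_a, diff_b)
cor(diff_a, diff_b)
```

With a shared seed the first identical() call returns TRUE, which matches the correlation of 1 reported above. With distinct seeds the vectors differ and their sample correlation should be close to zero.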