Hi Florian. I have read your vignette on DHARMa, and just want to make sure that I am interpreting my diagnostics correctly, as some issues arose during model diagnostics.
I have the following GAMM:
library(mgcv)
M2a <- gam(Diel ~ fsex + fRiver + fArray + fRiver:fArray + Length.cm +
             s(yday, by = fRiver, k = 10, bs = "cr") +
             s(fTransmitter, bs = "re"),
           method = "REML",
           select = TRUE,
           data = diel_migration,
           family = binomial(link = "logit"))
The model tests for differences in diel patterns (1/0 = day/night) in the arrival times of fish (male/female) from two different rivers, at four locations per river (three in the river and one outside the river mouth). yday is the arrival date at a location, as Julian day of year.
The QQ plot and the residuals vs. predicted plot indicated some issues:
To me they seem quite minor, but I am not sure and would like your opinion. Plotting the scaled residuals against each covariate in the model showed no problems. On further inspection, after fitting a separate model for each river, I found that the yday pattern for one river is a flat, straight line, while for the other it is non-linear. The two separate models look fine, but I would prefer to keep a single model for higher statistical power, provided that model is trustworthy. Based on the residual patterns pictured above, do you think the model including both rivers is fine? Thanks!
Hi @LeneSo, sorry for the delayed response.
The significance of the tests also depends on the number of observations. Small deviations from the expected test statistics will appear significant if you have a large sample size. So I'd suggest you base your conclusion on the test statistics themselves. For example, for the dispersion test, how much larger than 1 (or smaller, for underdispersion) is the dispersion statistic? Note: based on recent tests of the available DHARMa dispersion tests, I recommend the parametric bootstrap of the Pearson chi-squared statistic:
library(DHARMa)
res <- simulateResiduals(model, refit = TRUE)  # refit = TRUE performs the parametric bootstrap
testDispersion(res, type = "DHARMa")           # dispersion test based on the refitted simulations
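To make the sample size point concrete, here is a toy sketch in base R. It has nothing to do with your data: the proportions 0.52 and 0.50 and the helper `p_value_for_n` are made up for illustration. The deviation is the same in both calls and only n changes.

```r
# Toy illustration (not the diel_migration data): the same small deviation
# from the expected proportion (0.52 observed vs 0.50 expected), tested at
# two very different sample sizes.
p_value_for_n <- function(n, observed_prop = 0.52, expected_prop = 0.5) {
  x <- round(observed_prop * n)  # number of "successes" out of n
  prop.test(x = x, n = n, p = expected_prop)$p.value
}

p_small <- p_value_for_n(100)     # small sample
p_large <- p_value_for_n(100000)  # large sample, identical effect size

print(c(n_100 = p_small, n_100000 = p_large))
```

The effect size is identical in both cases, which is why I would look at the size of the statistic rather than only at the p-value.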
Dispersion problems may arise from heteroscedasticity, so I'd check that first: from the plot you shared, I suspect some heteroscedasticity related to your predictors. In particular, the smooth for yday may be causing issues. Have you tried a different k? An interaction of river with other variables might also help the original model fit better.
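A rough sketch of how these checks could look, assuming simulated data in place of yours. The data frame `sim_dat`, its values, and the helper `fit_k` are invented for illustration; only the column names mirror your model. The DHARMa part runs only if the package is installed.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in for diel_migration; all values are made up.
n <- 600
sim_dat <- data.frame(
  fRiver = factor(sample(c("A", "B"), n, replace = TRUE)),
  yday   = runif(n, 120, 300)
)
# River A gets a non-linear seasonal pattern, river B a flat one.
eta <- ifelse(sim_dat$fRiver == "A", sin((sim_dat$yday - 120) / 30), 0)
sim_dat$Diel <- rbinom(n, 1, plogis(eta))

# Hypothetical helper: refit the same model with a different basis dimension k.
fit_k <- function(k) {
  gam(Diel ~ fRiver + s(yday, by = fRiver, k = k, bs = "cr"),
      family = binomial(link = "logit"), method = "REML", data = sim_dat)
}
m_k10 <- fit_k(10)
m_k20 <- fit_k(20)

# k.check() reports k', edf, k-index and a p-value for each smooth;
# an edf close to k' suggests that k may be too small.
print(k.check(m_k10))
print(AIC(m_k10, m_k20))

# Scaled residuals against single predictors, to look for heteroscedasticity.
if (requireNamespace("DHARMa", quietly = TRUE)) {
  res <- DHARMa::simulateResiduals(m_k10)
  DHARMa::plotResiduals(res, form = sim_dat$yday)
  DHARMa::plotResiduals(res, form = sim_dat$fRiver)
}
```

With your own model you would replace `sim_dat` with `diel_migration` and keep the remaining terms of M2a in the formula.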
Moreover, have you compared the predictions of the original model with those of the two split-data models? This may help you see where they differ, and whether the differences are large enough to justify separate models.
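One way such a comparison could be set up, again only a sketch on simulated data. `sim_dat`, `m_pooled`, `m_split` and `grid` are illustrative names, and the models are simplified to the river and yday terms.

```r
library(mgcv)

set.seed(2)
# Simulated stand-in for diel_migration; all values are made up.
n <- 600
sim_dat <- data.frame(
  fRiver = factor(sample(c("A", "B"), n, replace = TRUE)),
  yday   = runif(n, 120, 300)
)
eta <- ifelse(sim_dat$fRiver == "A", sin((sim_dat$yday - 120) / 30), 0)
sim_dat$Diel <- rbinom(n, 1, plogis(eta))

# Pooled model with one yday smooth per river.
m_pooled <- gam(Diel ~ fRiver + s(yday, by = fRiver, k = 10, bs = "cr"),
                family = binomial, method = "REML", data = sim_dat)

# One separate model per river.
m_split <- lapply(split(sim_dat, sim_dat$fRiver), function(d) {
  gam(Diel ~ s(yday, k = 10, bs = "cr"),
      family = binomial, method = "REML", data = d)
})

# Predict both versions on a common grid of yday values per river.
grid <- expand.grid(yday = seq(120, 300, by = 5),
                    fRiver = levels(sim_dat$fRiver))
grid$p_pooled <- as.numeric(predict(m_pooled, newdata = grid, type = "response"))
grid$p_split <- NA_real_
for (r in levels(sim_dat$fRiver)) {
  idx <- grid$fRiver == r
  grid$p_split[idx] <- as.numeric(
    predict(m_split[[r]], newdata = grid[idx, ], type = "response")
  )
}

# Largest absolute difference in predicted probability, per river.
max_diff <- tapply(abs(grid$p_pooled - grid$p_split), grid$fRiver, max)
print(max_diff)
```

If the predicted probabilities from the pooled and split fits are close across the yday range, that would be an argument for keeping the single model.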