How do I combine 3 surveys' worth of data? Only two of them overlap spatially #177
-
Hello, I am fitting a model with the formula Response_variable ~ Flag + s(HBF.ratio, k = 3) + (1 | Vessel_ID). Does it matter whether Vessel_ID is input as a factor or a character string? Also, a portion of the vessels have no ID, and I have labelled these as "missing". I am wondering whether I have specified this correctly, whether there is a more efficient way to configure it, or whether it is unrealistic to include a random effect with so many different vessels. Many thanks,
-
Great questions. Currently, if you have a "missing" label on some vessels, that will be estimated as a separate level of the factor rather than treated as NA. In general, missing levels of random effect groupings aren't allowed, so those rows could be filtered out prior to fitting. With sdmTMB, you can make predictions to data with missing group identifiers (so you could use the fitted model to make predictions for the vessels you don't have group identifiers for). In terms of speed, several thousand random effects may make things a bit slow, but the model will still work. A few things to check: first, are there any vessels with just a few observations? If so, it might make sense to remove them. Second, you can look at the estimated random effect deviations: are they approximately normally distributed, or is there some weirdness that may be affecting estimates? Specifically, I'm thinking about multiple modes in the estimates, etc. You can extract the vessel-level random effects from the fitted model to check this.
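The checks above could be sketched roughly as follows. This is not the code from the original reply: the toy data frame, the threshold of 3 observations, and the helper name check_vessel_effects are assumptions made for illustration. The call tidy(fit, effects = "ran_vals") should return the estimated random intercepts in recent sdmTMB versions, but it is worth confirming against the installed release. Storing Vessel_ID as a factor is the safe choice for a grouping variable.

```r
# Toy data standing in for the real catch records (hypothetical values).
dat <- data.frame(
  Vessel_ID = c("A", "A", "A", "B", "B", "B", "B", "C", "missing", "missing"),
  Response_variable = c(1.2, 0.8, 1.5, 2.1, 1.9, 2.4, 2.0, 0.3, 1.1, 0.9),
  stringsAsFactors = FALSE
)

# 1. Drop rows with no vessel ID so "missing" is not fitted as its own level.
dat_known <- dat[dat$Vessel_ID != "missing", ]

# 2. Drop vessels with fewer than `min_obs` observations (arbitrary threshold).
min_obs <- 3
n_per_vessel <- table(dat_known$Vessel_ID)
keep <- names(n_per_vessel)[n_per_vessel >= min_obs]
dat_fit <- dat_known[dat_known$Vessel_ID %in% keep, ]

# 3. Store the grouping variable as a factor with only the levels still in use.
dat_fit$Vessel_ID <- factor(dat_fit$Vessel_ID)

# 4. After fitting, inspect the estimated vessel deviations.
#    `fit` is assumed to be an sdmTMB model that includes (1 | Vessel_ID).
check_vessel_effects <- function(fit) {
  re <- sdmTMB::tidy(fit, effects = "ran_vals")
  # Look for multiple modes or heavy tails in the deviations.
  hist(re$estimate, main = "Vessel random intercepts", xlab = "Deviation")
  qqnorm(re$estimate)
  qqline(re$estimate)
  invisible(re)
}
```

With a fitted model in hand, check_vessel_effects(fit) would draw a histogram and a normal Q-Q plot of the vessel-level estimates.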
-
Thank you so much for your excellent response and for the suggestions. I will give them a try.
-
Question via email:
I am currently dealing with three different bottom survey time series, i.e., survey 1, survey 2, and survey 3. Surveys 2 and 3 are partially overlapping spatially (but different vessels / gears) - and Survey 1 overlaps neither. In this case, I thought I could use survey 3 time series to impute missing years of data in survey 2 using sdmTMB (or also use survey 1 data to impute missing years of data in survey 2). Can I try forecasting with sdmTMB to impute survey 2 missing data using either survey 3 or survey 1 data?
Response: There are several ways you could go about this to generate an index across the three surveys. It seems like Survey 1 should probably be dealt with separately, in that you can fit an sdmTMB model and make predictions to a grid specific to that survey. You can then fit a second model to data from Surveys 2 and 3, with a couple of ways to incorporate differences between surveys (vessels / gears). Some options:
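Survey as a fixed effect in the main formula: ~ ... + Survey
Vessel as a random intercept: ~ ... + (1 | vessel_id)
Survey-specific dispersion: dispformula = ~ 0 + Survey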
Once the model is fit, you can predict to a grid for Surveys 2 and 3, and then combine predictions across the two areas to generate a total index that includes Survey 1.
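A minimal sketch of that combining step, assuming each model has already produced a yearly index (for example via get_index() in sdmTMB) and that both indices are in the same units, such as total biomass rather than density. The two data frames hold made-up numbers and their column names are assumptions. Only point estimates are summed here; carrying uncertainty through to the total needs separate treatment.

```r
# Hypothetical yearly indices from the two separate models (made-up numbers).
index_s1 <- data.frame(year = 2010:2013, est = c(120, 135, 110, 140))
index_s23 <- data.frame(year = 2011:2014, est = c(300, 280, 310, 295))

# Keep only years covered by both models; a total for a year that one
# model cannot supply would silently leave out an area.
combined <- merge(index_s1, index_s23, by = "year", suffixes = c("_s1", "_s23"))

# Total index = sum of the area-specific estimates on the natural scale.
combined$est_total <- combined$est_s1 + combined$est_s23
print(combined)
```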
There are subtle differences in how these models may be interpreted. When covariates such as Survey enter the dispersion parameter, they let each Survey have a different variance, so you can imagine one being down-weighted relative to the other. When covariates such as Survey are included in the main formula (for example, with a Tweedie response distribution), they affect the estimated latent log biomass available to be sampled (this is shifted up or down by an intercept for each survey). For delta models, Survey could be a covariate either in the presence-absence submodel (probability of species occurrence differs by survey region) or in the positive submodel (catch rates vary by survey). In this particular application, Surveys 2 and 3 overlap spatially, so it may be important to consider which of these is reasonable (e.g. it may be unlikely that occurrence rates differ by Survey).
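To make the distinction more concrete, the base R sketch below shows the design matrices implied by the two formula styles and then writes out candidate formulas without fitting anything. The column names (density, Survey, vessel_id) and the toy data are placeholders. The dispformula argument is taken from the response above and may not be available in every sdmTMB release, and passing a list of two formulas for a delta model is assumed to be supported by the installed version.

```r
# Hypothetical data for the two overlapping surveys.
d <- data.frame(
  Survey = factor(c("S2", "S2", "S3", "S3")),
  density = c(0, 1.4, 2.2, 0)
)

# `~ Survey`: a shared intercept plus an offset for S3 relative to S2.
X_main <- model.matrix(~ Survey, data = d)
colnames(X_main)  # "(Intercept)" "SurveyS3"

# `~ 0 + Survey`: no shared intercept, one separate coefficient per survey.
X_disp <- model.matrix(~ 0 + Survey, data = d)
colnames(X_disp)  # "SurveyS2" "SurveyS3"

# Candidate formulas (not fitted here).
f_survey_fixed <- density ~ Survey               # survey shifts the mean
f_vessel_re    <- density ~ 1 + (1 | vessel_id)  # vessel random intercept
f_dispersion   <- ~ 0 + Survey                   # survey-specific dispersion

# Delta model with Survey only in the positive-catch component:
# first formula = presence-absence submodel, second = positive submodel.
f_delta <- list(density ~ 1, density ~ Survey)
```

In a fit, the first two would go to the formula argument of sdmTMB() and the third to dispformula. For a delta family such as delta_gamma(), f_delta would go to formula, which here encodes the view that catch rates, but not occurrence, differ by survey.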