added vignettes, revised manual

brianstock · Mar 11, 2016 · d689550 · d689550
1 parent 8edf4f7
commit d689550
Show file tree

Hide file tree

Showing 23 changed files with 2,740 additions and 161 deletions.
diff --git a/Manual/mixsiar_manual.lyx b/Manual/mixsiar_manual.lyx
diff --git a/inst/doc/cladocera_ex.R b/inst/doc/cladocera_ex.R
@@ -0,0 +1,64 @@
+## ------------------------------------------------------------------------
+library(MixSIAR)
+mixsiar.dir <- find.package("MixSIAR")
+paste0(mixsiar.dir,"/example_scripts")
+
+## ---- eval=FALSE---------------------------------------------------------
+#  source(paste0(mixsiar.dir,"/example_scripts/mixsiar_script_cladocera.R"))
+
+## ------------------------------------------------------------------------
+library(MixSIAR)
+
+## ------------------------------------------------------------------------
+# Replace the system.file call with the path to your file
+mix.filename <- system.file("extdata", "cladocera_consumer.csv", package = "MixSIAR")
+
+mix <- load_mix_data(filename=mix.filename,
+					 iso_names=c("c14.0","c16.0","c16.1w9","c16.1w7","c16.2w4",
+					 	         "c16.3w3","c16.4w3","c17.0","c18.0","c18.1w9",
+					 	         "c18.1w7","c18.2w6","c18.3w6","c18.3w3","c18.4w3",
+					 	         "c18.5w3","c20.0","c22.0","c20.4w6","c20.5w3",
+					 	         "c22.6w3","BrFA"),
+					 factors="id",
+					 fac_random=FALSE,
+					 fac_nested=FALSE,
+					 cont_effects=NULL)
+
+## ------------------------------------------------------------------------
+# Replace the system.file call with the path to your file
+source.filename <- system.file("extdata", "cladocera_sources.csv", package = "MixSIAR")
+
+source <- load_source_data(filename=source.filename,
+						   source_factors=NULL,
+						   conc_dep=FALSE,
+						   data_type="means",
+						   mix)
+
+## ------------------------------------------------------------------------
+# Replace the system.file call with the path to your file
+discr.filename <- system.file("extdata", "isopod_discrimination.csv", package = "MixSIAR")
+
+discr <- load_discr_data(filename=discr.filename, mix)
+
+## ---- eval=FALSE---------------------------------------------------------
+#  # default "UNINFORMATIVE" / GENERALIST prior (alpha = 1)
+#  plot_prior(alpha.prior=1,source)
+
+## ---- eval=FALSE---------------------------------------------------------
+#  # Write the JAGS model file
+#  model_filename <- "MixSIAR_model.txt"
+#  resid_err <- TRUE
+#  process_err <- FALSE
+#  write_JAGS_model(model_filename, resid_err, process_err, mix, source)
+
+## ---- eval=FALSE---------------------------------------------------------
+#  jags.1 <- run_model(run="test", mix, source, discr, model_filename,
+#                      alpha.prior = 1, resid_err, process_err)
+
+## ---- eval=FALSE---------------------------------------------------------
+#  jags.1 <- run_model(run="normal", mix, source, discr, model_filename,
+#                      alpha.prior = 1, resid_err, process_err)
+
+## ---- eval=FALSE---------------------------------------------------------
+#  output_JAGS(jags.1, mix, source, output_options)
+
diff --git a/inst/doc/cladocera_ex.Rmd b/inst/doc/cladocera_ex.Rmd
@@ -0,0 +1,151 @@
+---
+title: "MixSIAR Script Example (Cladocera)"
+author: "Brian Stock"
+date: "March 10, 2016"
+output: html_vignette
+vignette: >
+  %\VignetteIndexEntry{MixSIAR Script Example (Cladocera)}
+  %\VignetteEngine{knitr::rmarkdown}
+  \usepackage[utf8]{inputenc} 
+---
+
+Here we step through the Cladocera Example using the **script** version of MixSIAR. For a demonstration using the **GUI** version, see the [MixSIAR Manual](https://github.com/brianstock/MixSIAR/blob/master/inst/mixsiar_manual_3.1.pdf). For a thorough walkthrough of how to use MixSIAR in a script, see the [Wolves Example](http://htmlpreview.github.com/?https://github.com/brianstock/MixSIAR/blob/master/inst/doc/wolves_ex.html), which provides more commentary and explanation.
+
+For a clean, runnable `.R` script, look at `mixsiar_script_cladocera.R` in the `example_scripts` folder of the MixSIAR package install:
+```{r}
+library(MixSIAR)
+mixsiar.dir <- find.package("MixSIAR")
+paste0(mixsiar.dir,"/example_scripts")
+```
+
+You can run the Cladocera Example script directly with:
+```{r, eval=FALSE}
+source(paste0(mixsiar.dir,"/example_scripts/mixsiar_script_cladocera.R"))
+```
+
+## Cladocera Example
+
+The Cladocera Example is from [Galloway et al. 2014](http://onlinelibrary.wiley.com/doi/10.1111/fwb.12394/full) and demonstrates MixSIAR applied to a 22-dimensional fatty acid dataset. Here the 14 mixture datapoints are Cladocera (water flea) fatty acid profiles from 6 lakes in Finland over 2 seasons. Besides the high dimensionality, the other difference with this analysis is that we fit each mixture datapoint individually, because there is no clear covariate structure (some sites have 2 seasons, some have 1, some sites in the same lake). We do this by creating an "id" column and treating "id" as a fixed effect.
+
++ **22 biotracers** (carbon 14.0, 16.0, 16.1w9, 16.1w7, 16.2w4, 16.3w3, 16.4w3, 17.0, 18.0, 18.1w9, 18.1w7, 18.2w6, 18.3w6, 18.3w3, 18.4w3, 18.5w3, 20.0, 22.0, 20.4w6, 20.5w3, 22.6w3, BrFA)
++ **Mix datapoints fit independently** ("process error" only, MixSIR)
++ Source data as means and SDs
+
+Here we fit the "process error" model of MixSIR, which we MUST do when we only have one mix datapoint (or here, one mix datapoint per fixed effect). With only one datapoint, there is no information to estimate an additional mixture variance term, so we have to assume a fixed variance based on the variance of the sources (see [Moore and Semmens 2008](http://onlinelibrary.wiley.com/doi/10.1111/j.1461-0248.2008.01163.x/full)).
+
+Here we treat "id" as a fixed effect---this will estimate the diet of each mixture data point separately (sample size of 1). This makes sense to do when you think there will be clear differences between sites/seasons/etc., but only have 1 or 2 points from each site/season (i.e, you don’t have enough data to estimate a site/season effect). If you are interested in the site/season effect, you need replicates within each site/season, and then it is best to fit site/season as a fixed or random effect.
+
+Fatty acid data greatly increase the number of biotracers beyond the typical 2 stable isotopes, d13C and d15N, which gives the mixing model power to resolve more sources. We caution, however, that using fatty acid data is not a panacea for the "underdetermined" problem (# sources > # biotracers + 1). As the number of sources increases, the "uninformative" prior $(\alpha=1)$ has greater influence, even if there are many more biotracers than sources. See the [Cladocera Example prior](https://github.com/brianstock/MixSIAR/blob/master/Manual/cladocera_prior_plot.pdf) with 7 sources and 22 biotracers.
+
+### Load MixSIAR package
+
+```{r}
+library(MixSIAR)
+```
+
+### Load mixture data
+
+See ?load_mix_data for details.
+
+The cladocera consumer data has 1 covariate (`factors="id"`), which we fit as a fixed effect (`fac_random=FALSE`). "Site" is not nested within another factor (`fac_nested=FALSE`). There are no continuous effects (`cont_effects=NULL`).
+
+```{r}
+# Replace the system.file call with the path to your file
+mix.filename <- system.file("extdata", "cladocera_consumer.csv", package = "MixSIAR")
+
+mix <- load_mix_data(filename=mix.filename,
+					 iso_names=c("c14.0","c16.0","c16.1w9","c16.1w7","c16.2w4",
+					 	         "c16.3w3","c16.4w3","c17.0","c18.0","c18.1w9",
+					 	         "c18.1w7","c18.2w6","c18.3w6","c18.3w3","c18.4w3",
+					 	         "c18.5w3","c20.0","c22.0","c20.4w6","c20.5w3",
+					 	         "c22.6w3","BrFA"),
+					 factors="id",
+					 fac_random=FALSE,
+					 fac_nested=FALSE,
+					 cont_effects=NULL)
+```
+
+### Load source data
+
+See ?load_source_data for details.
+
+We do not have any fixed/random/continuous effects or concentration dependence in the source data (`source_factors=NULL`, `conc_dep=FALSE`). We only have source means, SD, and sample size---not the original "raw" data (`data_type="means"`).
+
+```{r}
+# Replace the system.file call with the path to your file
+source.filename <- system.file("extdata", "cladocera_sources.csv", package = "MixSIAR")
+
+source <- load_source_data(filename=source.filename,
+						   source_factors=NULL,
+						   conc_dep=FALSE,
+						   data_type="means",
+						   mix)
+```
+
+### Load discrimination data
+
+See ?load_discr_data for details.
+
+Note that [Galloway et al. 2014](http://onlinelibrary.wiley.com/doi/10.1111/fwb.12394/full) conducted feeding trials to create a "resource library". In the mixing model, the sources are actually consumers fed exclusively each of the sources. This allowed them to set the discrimination = 0 (see `isopod_discrimination.csv`).
+
+```{r}
+# Replace the system.file call with the path to your file
+discr.filename <- system.file("extdata", "isopod_discrimination.csv", package = "MixSIAR")
+
+discr <- load_discr_data(filename=discr.filename, mix)
+```
+
+### Plot data
+
+DO NOT use the `plot_data` function! When there are more than 2 biotracers, MixSIAR currently plots every pairwise combination. Here, that means ${22 \choose 2} = 231$ plots are produced. Yikes! In the future, MixSIAR will offer non-metric multidimensional scaling (NMDS) plots for these cases.
+
+### Plot prior
+
+Define your prior, and then plot using "plot_prior"
+
++ RED = your prior
++ DARK GREY = "uninformative"/generalist (alpha = 1)
++ LIGHT GREY = "uninformative" Jeffrey's prior (alpha = 1/n.sources)
+
+```{r, eval=FALSE}
+# default "UNINFORMATIVE" / GENERALIST prior (alpha = 1)
+plot_prior(alpha.prior=1,source)
+```
+
+Note that the ["uninformative" prior with 7 sources](https://github.com/brianstock/MixSIAR/blob/master/Manual/cladocera_prior_plot.pdf) is starting to look more informative... imagine what it would look like with 15 sources.
+
+### Write JAGS model file
+
+Here we fit the **"Process only" error** model of MixSIR, which we MUST do when we only have one mix datapoint (or here, one mix datapoint per fixed effect). With only one datapoint, there is no information to estimate an additional mixture variance term, so we have to assume a fixed variance based on the variance of the sources. The differences between "Residual * Process", "Residual only", and "Process only" are explained in Stock and Semmens (in revision).
+
+```{r, eval=FALSE}
+# Write the JAGS model file
+model_filename <- "MixSIAR_model.txt"
+resid_err <- TRUE
+process_err <- FALSE
+write_JAGS_model(model_filename, resid_err, process_err, mix, source)
+```
+
+### Run model
+
+First use `run = "test"` to check if 1) the data are loaded correctly and 2) the model is specified correctly:
+```{r, eval=FALSE}
+jags.1 <- run_model(run="test", mix, source, discr, model_filename, 
+                    alpha.prior = 1, resid_err, process_err)
+```
+
+After a test run works, increase the MCMC run to a value that may converge:
+```{r, eval=FALSE}
+jags.1 <- run_model(run="normal", mix, source, discr, model_filename,
+                    alpha.prior = 1, resid_err, process_err)
+```
+
+### Analyze diagnostics and output
+
+See ?output_JAGS for details.
+
+```{r, eval=FALSE}
+output_JAGS(jags.1, mix, source, output_options)
+```
+
+Since we fit "id" as a fixed effect, there is no inference on diet at the overall population level (no p.global). You should see posterior plots for all 14 mixture samples individually ([ID 7](https://github.com/brianstock/MixSIAR/blob/master/Manual/cladocera_posterior_density_diet_p_id%207.pdf), [ID 14](https://github.com/brianstock/MixSIAR/blob/master/Manual/cladocera_posterior_density_diet_p_id%2014.pdf)).