Merge pull request #131 from nutriverse/dev

fix in vignettes, annotated numbered code chunk in quarto not showing properly #130
nutriverse · Dec 7, 2024 · a3c8e5d · a3c8e5d
2 parents 08ae58a + 95f4b19
commit a3c8e5d
Show file tree

Hide file tree

Showing 4 changed files with 160 additions and 40 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -2,7 +2,7 @@ Type: Package
 Package: mwana
 Title: An Efficient Workflow for Plausibility Checks and Prevalence Analysis of
     Wasting in R
-Version: 0.2.1
+Version: 0.2.1.9000
 Authors@R: c(
     person("Tomás", "Zaba", , "tomas.zaba@outlook.com", 
            role = c("aut", "cre", "cph"),

diff --git a/NEWS.md b/NEWS.md
@@ -1,3 +1,5 @@
+# mwana (development version)
+
 # mwana 0.2.1
 
 ## General updates

diff --git a/vignettes/ipc_amn_check.qmd b/vignettes/ipc_amn_check.qmd
@@ -13,38 +13,38 @@ vignette: >
   %\VignetteEncoding{UTF-8}
 ---
 
-```{r}
-#| label: global-setup
-#| echo: false
-#| message: false
 
-library(mwana)
-library(dplyr)
-```
 
 Evidence on the prevalence of acute malnutrition used in the IPC Acute Malnutrition (IPC AMN) can come from different sources: representative surveys, screenings, or community-based surveillance system (known as sentinel sites). The IPC sets minimum sample size requirements for each of these sources [@ipcmanual].
 
 In the IPC AMN analysis workflow, the first step a data analyst has to take is the checking of sample size requirements as set by IPC for each survey area to be included in the IPC AMN analysis. `mwana` provides the `mw_check_ipcamn_ssreq()` function for this purpose.
 
 To demonstrate its usage, we will use the built-in sample data set `anthro.01`.
 
-```{r}
-#| label: view-data
-#| echo: true
 
+``` r
 head(anthro.01)
 ```
 
+```
+## # A tibble: 6 × 11
+##   area       dos        cluster  team sex   dob      age weight height edema  muac
+##   <chr>      <date>       <int> <int> <chr> <date> <int>  <dbl>  <dbl> <chr> <int>
+## 1 District E 2023-12-04       1     3 m     NA        59   15.6  109.  n       146
+## 2 District E 2023-12-04       1     3 m     NA         8    7.5   68.6 n       127
+## 3 District E 2023-12-04       1     3 m     NA        19    9.7   79.5 n       142
+## 4 District E 2023-12-04       1     3 f     NA        49   14.3  100.  n       149
+## 5 District E 2023-12-04       1     3 f     NA        32   12.4   92.1 n       143
+## 6 District E 2023-12-04       1     3 f     NA        17    9.3   77.8 n       132
+```
+
 
 `anthro.01` contains anthropometry data from SMART surveys from anonymized locations. To learn more about this dataset, call `help("anthro.01")` in your `R` console. 
 
 Now that we got acquainted with the data set, we can proceed to executing the task. To achieve this, we simply do:
 
-```{r}
-#| label: check
-#| echo: true
-#| eval: false
 
+``` r
 mw_check_ipcamn_ssreq(
   df = anthro.01,         # <1>
   cluster = cluster,      # <2>
@@ -60,11 +60,8 @@ mw_check_ipcamn_ssreq(
 
 We can also chain `anthro.01` to the function using the native pipe operator `|>`:
 
-```{r}
-#| label: pipe_operator
-#| echo: true
-#| eval: false
 
+``` r
 anthro.01 |>
   mw_check_ipcamn_ssreq(
     cluster = cluster,
@@ -73,15 +70,12 @@ anthro.01 |>
 ```
 
 Either way, the returned output will be: 
-```{r}
-#| label: view_check
-#| echo: false
 
-anthro.01 |>
-  mw_check_ipcamn_ssreq(
-    cluster = cluster,
-    .source = "survey"
-  )
+```
+## # A tibble: 1 × 3
+##   n_clusters n_obs meet_ipc
+##        <int> <int> <chr>   
+## 1         30  1191 yes
 ```
 
 A `tibble` object is returned with three columns:  
@@ -94,11 +88,8 @@ A `tibble` object is returned with three columns:
 
 The above output is not quite useful yet as we often deal with multiple-area datasets. We can get a summarized output by area as follows: 
 
-```{r}
-#| label: group_by
-#| echo: true
-#| eval: false
 
+``` r
 ## Load the dplyr package ----
 library(dplyr)
 
@@ -113,16 +104,13 @@ anthro.01 |>
 
 This will return: 
 
-```{r}
-#| label: view_group_by
-#| echo: false
 
-anthro.01 |>
-  group_by(area) |>
-  mw_check_ipcamn_ssreq(
-    cluster = cluster,
-    .source = "survey"
-  )
+```
+## # A tibble: 2 × 4
+##   area       n_clusters n_obs meet_ipc
+##   <chr>           <int> <int> <chr>   
+## 1 District E         30   505 yes     
+## 2 District G         30   686 yes
 ```
 
 For screening or sentinel site-based data, we approach the task the same way; we only have to change the `.source` parameter to "screening" or to "ssite" as appropriate, as well as to supply `cluster` with the right column name of the sub-areas inside the main area (villages, localities, comunas, communities, etc).

diff --git a/vignettes/ipc_amn_check.qmd.orig b/vignettes/ipc_amn_check.qmd.orig
@@ -0,0 +1,130 @@
+---
+title: "Checking if IPC Acute Malnutrition sample size requirements were met"
+author: Tomás Zaba
+bibliography: references.bib
+csl: harvard-cite-them-right-11th-edition.csl
+knitr:
+  opts_chunk: 
+    collapse: true
+    comment: "#>"
+vignette: >
+  %\VignetteIndexEntry{Checking if IPC Acute Malnutrition sample size requirements were met}
+  %\VignetteEngine{quarto::html}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r}
+#| label: global-setup
+#| echo: false
+#| message: false
+
+library(mwana)
+library(dplyr)
+```
+
+Evidence on the prevalence of acute malnutrition used in the IPC Acute Malnutrition (IPC AMN) can come from different sources: representative surveys, screenings, or community-based surveillance system (known as sentinel sites). The IPC sets minimum sample size requirements for each of these sources [@ipcmanual].
+
+In the IPC AMN analysis workflow, the first step a data analyst has to take is the checking of sample size requirements as set by IPC for each survey area to be included in the IPC AMN analysis. `mwana` provides the `mw_check_ipcamn_ssreq()` function for this purpose.
+
+To demonstrate its usage, we will use the built-in sample data set `anthro.01`.
+
+```{r}
+#| label: view-data
+#| echo: true
+
+head(anthro.01)
+```
+
+
+`anthro.01` contains anthropometry data from SMART surveys from anonymized locations. To learn more about this dataset, call `help("anthro.01")` in your `R` console. 
+
+Now that we got acquainted with the data set, we can proceed to executing the task. To achieve this, we simply do:
+
+```{r}
+#| label: check
+#| echo: true
+#| eval: false
+
+mw_check_ipcamn_ssreq(
+  df = anthro.01,         # <1>
+  cluster = cluster,      # <2>
+  .source = "survey"      # <3>
+)
+```
+
+1. The argument `df` should be specified with the dataset you want to assess sample sizes for. In this case, `anthro.01`.
+
+2. The argument `cluster` should be specified with the unquoted variable name in `df` that contains information for the unique cluster or screening or sentinel site identifiers. In this case, `anthro.01` has a variable called `cluster` which we supply to this argument unquoted.
+
+3. The argument `.source` should be specified with the type of the source for the data in `df`. Since `anthro.01` data is from a survey, we specify this argument as *"survey"*.
+
+We can also chain `anthro.01` to the function using the native pipe operator `|>`:
+
+```{r}
+#| label: pipe_operator
+#| echo: true
+#| eval: false
+
+anthro.01 |>
+  mw_check_ipcamn_ssreq(
+    cluster = cluster,
+    .source = "survey"
+  )
+```
+
+Either way, the returned output will be: 
+```{r}
+#| label: view_check
+#| echo: false
+
+anthro.01 |>
+  mw_check_ipcamn_ssreq(
+    cluster = cluster,
+    .source = "survey"
+  )
+```
+
+A `tibble` object is returned with three columns:  
+
+  + `n_clusters` counts the number of unique cluster or villages or community identifiers in the data set where the data collection took place.
+
+  + `n_obs` counts the number of children from which data were collected.
+
+  + `meet_ipc` indicates whether the IPC AMN sample size requirements (for surveys in this case) were met or not.
+
+The above output is not quite useful yet as we often deal with multiple-area datasets. We can get a summarized output by area as follows: 
+
+```{r}
+#| label: group_by
+#| echo: true
+#| eval: false
+
+## Load the dplyr package ----
+library(dplyr)
+
+## Use the group_by() function ----
+anthro.01 |>
+  group_by(area) |>
+  mw_check_ipcamn_ssreq(
+    cluster = cluster,
+    .source = "survey"
+  )
+```
+
+This will return: 
+
+```{r}
+#| label: view_group_by
+#| echo: false
+
+anthro.01 |>
+  group_by(area) |>
+  mw_check_ipcamn_ssreq(
+    cluster = cluster,
+    .source = "survey"
+  )
+```
+
+For screening or sentinel site-based data, we approach the task the same way; we only have to change the `.source` parameter to "screening" or to "ssite" as appropriate, as well as to supply `cluster` with the right column name of the sub-areas inside the main area (villages, localities, comunas, communities, etc).
+
+# References