Merge pull request #131 from nutriverse/dev
fix in vignettes, annotated numbered code chunk in quarto not showing properly #130
ernestguevarra authored Dec 7, 2024
2 parents 08ae58a + 95f4b19 commit a3c8e5d
Showing 4 changed files with 160 additions and 40 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -2,7 +2,7 @@ Type: Package
Package: mwana
Title: An Efficient Workflow for Plausibility Checks and Prevalence Analysis of
Wasting in R
Version: 0.2.1
Version: 0.2.1.9000
Authors@R: c(
person("Tomás", "Zaba", , "tomas.zaba@outlook.com",
role = c("aut", "cre", "cph"),
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,3 +1,5 @@
# mwana (development version)

# mwana 0.2.1

## General updates
66 changes: 27 additions & 39 deletions vignettes/ipc_amn_check.qmd
@@ -13,38 +13,38 @@ vignette: >
%\VignetteEncoding{UTF-8}
---

```{r}
#| label: global-setup
#| echo: false
#| message: false

library(mwana)
library(dplyr)
```

Evidence on the prevalence of acute malnutrition used in the IPC Acute Malnutrition (IPC AMN) analysis can come from different sources: representative surveys, screenings, or community-based surveillance systems (known as sentinel sites). The IPC sets minimum sample size requirements for each of these sources [@ipcmanual].

In the IPC AMN analysis workflow, the first step for a data analyst is to check that each survey area to be included in the analysis meets the sample size requirements set by the IPC. `mwana` provides the `mw_check_ipcamn_ssreq()` function for this purpose.

To demonstrate its usage, we will use the built-in sample data set `anthro.01`.

```{r}
#| label: view-data
#| echo: true

``` r
head(anthro.01)
```

```
## # A tibble: 6 × 11
## area dos cluster team sex dob age weight height edema muac
## <chr> <date> <int> <int> <chr> <date> <int> <dbl> <dbl> <chr> <int>
## 1 District E 2023-12-04 1 3 m NA 59 15.6 109. n 146
## 2 District E 2023-12-04 1 3 m NA 8 7.5 68.6 n 127
## 3 District E 2023-12-04 1 3 m NA 19 9.7 79.5 n 142
## 4 District E 2023-12-04 1 3 f NA 49 14.3 100. n 149
## 5 District E 2023-12-04 1 3 f NA 32 12.4 92.1 n 143
## 6 District E 2023-12-04 1 3 f NA 17 9.3 77.8 n 132
```


`anthro.01` contains anthropometric data from SMART surveys conducted in anonymized locations. To learn more about this dataset, call `help("anthro.01")` in your `R` console.

Now that we are acquainted with the data set, we can check the sample size requirements as follows:

```{r}
#| label: check
#| echo: true
#| eval: false

``` r
mw_check_ipcamn_ssreq(
df = anthro.01, # <1>
cluster = cluster, # <2>
@@ -60,11 +60,8 @@ mw_check_ipcamn_ssreq(

We can also chain `anthro.01` to the function using the native pipe operator `|>`:

```{r}
#| label: pipe_operator
#| echo: true
#| eval: false

``` r
anthro.01 |>
mw_check_ipcamn_ssreq(
cluster = cluster,
@@ -73,15 +70,12 @@ anthro.01 |>
```

Either way, the returned output will be:
```{r}
#| label: view_check
#| echo: false

anthro.01 |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```
## # A tibble: 1 × 3
## n_clusters n_obs meet_ipc
## <int> <int> <chr>
## 1 30 1191 yes
```

A `tibble` object is returned with three columns:
@@ -94,11 +88,8 @@ A `tibble` object is returned with three columns:

This output is of limited use on its own, because we often deal with datasets that cover multiple areas. We can get a summary by area as follows:

```{r}
#| label: group_by
#| echo: true
#| eval: false

``` r
## Load the dplyr package ----
library(dplyr)

@@ -113,16 +104,13 @@ anthro.01 |>

This will return:

```{r}
#| label: view_group_by
#| echo: false

anthro.01 |>
group_by(area) |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```
## # A tibble: 2 × 4
## area n_clusters n_obs meet_ipc
## <chr> <int> <int> <chr>
## 1 District E 30 505 yes
## 2 District G 30 686 yes
```

For screening or sentinel site-based data, the approach is the same: set the `.source` parameter to "screening" or "ssite" as appropriate, and supply `cluster` with the name of the column that identifies the sub-areas within the main area (villages, localities, comunas, communities, etc.).
130 changes: 130 additions & 0 deletions vignettes/ipc_amn_check.qmd.orig
@@ -0,0 +1,130 @@
---
title: "Checking if IPC Acute Malnutrition sample size requirements were met"
author: Tomás Zaba
bibliography: references.bib
csl: harvard-cite-them-right-11th-edition.csl
knitr:
opts_chunk:
collapse: true
comment: "#>"
vignette: >
%\VignetteIndexEntry{Checking if IPC Acute Malnutrition sample size requirements were met}
%\VignetteEngine{quarto::html}
%\VignetteEncoding{UTF-8}
---

```{r}
#| label: global-setup
#| echo: false
#| message: false

library(mwana)
library(dplyr)
```

Evidence on the prevalence of acute malnutrition used in the IPC Acute Malnutrition (IPC AMN) analysis can come from different sources: representative surveys, screenings, or community-based surveillance systems (known as sentinel sites). The IPC sets minimum sample size requirements for each of these sources [@ipcmanual].

In the IPC AMN analysis workflow, the first step for a data analyst is to check that each survey area to be included in the analysis meets the sample size requirements set by the IPC. `mwana` provides the `mw_check_ipcamn_ssreq()` function for this purpose.

To demonstrate its usage, we will use the built-in sample data set `anthro.01`.

```{r}
#| label: view-data
#| echo: true

head(anthro.01)
```


`anthro.01` contains anthropometric data from SMART surveys conducted in anonymized locations. To learn more about this dataset, call `help("anthro.01")` in your `R` console.

Now that we are acquainted with the data set, we can check the sample size requirements as follows:

```{r}
#| label: check
#| echo: true
#| eval: false

mw_check_ipcamn_ssreq(
df = anthro.01, # <1>
cluster = cluster, # <2>
.source = "survey" # <3>
)
```

1. The argument `df` should be specified with the dataset you want to assess sample sizes for. In this case, `anthro.01`.

2. The argument `cluster` should be specified with the unquoted name of the variable in `df` that holds the unique identifiers of the clusters, screening sites, or sentinel sites. In this case, `anthro.01` has a variable called `cluster`, which we supply to this argument unquoted.

3. The argument `.source` should be specified with the type of the source for the data in `df`. Since `anthro.01` data is from a survey, we specify this argument as *"survey"*.

We can also chain `anthro.01` to the function using the native pipe operator `|>`:

```{r}
#| label: pipe_operator
#| echo: true
#| eval: false

anthro.01 |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```

Either way, the returned output will be:
```{r}
#| label: view_check
#| echo: false

anthro.01 |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```

A `tibble` object is returned with three columns:

+ `n_clusters` counts the number of unique cluster, village, or community identifiers in the data set, that is, the places where data collection took place.

+ `n_obs` counts the number of children from whom data were collected.

+ `meet_ipc` indicates whether the IPC AMN sample size requirements (for surveys in this case) were met or not.
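
To make the first two columns concrete, the sketch below reproduces the same kind of counts by hand in base R. The `toy` data frame and its values are made up for illustration and are not part of `mwana`; the code is also not a description of how `mw_check_ipcamn_ssreq()` works internally. The `meet_ipc` column is left out because it depends on the thresholds given in the IPC manual.

```r
## Made-up data: one row per child, `cluster` holds the cluster identifier ----
toy <- data.frame(
  cluster = c(1, 1, 2, 2, 2, 3),
  muac    = c(146, 127, 142, 149, 143, 132)
)

## Count distinct cluster identifiers and rows (children) ----
n_clusters <- length(unique(toy$cluster))
n_obs      <- nrow(toy)

## Assemble a one-row summary similar in shape to the first two columns ----
summary_row <- data.frame(n_clusters = n_clusters, n_obs = n_obs)
print(summary_row)
```

Counting distinct identifiers rather than rows is the key point: several children share the same cluster, so `n_clusters` is usually much smaller than `n_obs`.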

This output is of limited use on its own, because we often deal with datasets that cover multiple areas. We can get a summary by area as follows:

```{r}
#| label: group_by
#| echo: true
#| eval: false

## Load the dplyr package ----
library(dplyr)

## Use the group_by() function ----
anthro.01 |>
group_by(area) |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```

This will return:

```{r}
#| label: view_group_by
#| echo: false

anthro.01 |>
group_by(area) |>
mw_check_ipcamn_ssreq(
cluster = cluster,
.source = "survey"
)
```
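
For readers who want to see what a per-area summary amounts to, here is a minimal base R sketch of the grouped counts. The `toy_areas` data frame is a made-up example, and the split-and-combine approach is only one possible way to do this; it is not presented as the method used inside `mwana`.

```r
## Made-up data covering two areas ----
toy_areas <- data.frame(
  area    = c("A", "A", "A", "B", "B"),
  cluster = c(1, 1, 2, 1, 2)
)

## Split by area, count clusters and children in each piece, then recombine ----
by_area <- do.call(
  rbind,
  lapply(split(toy_areas, toy_areas$area), function(d) {
    data.frame(
      area       = d$area[1],
      n_clusters = length(unique(d$cluster)),
      n_obs      = nrow(d)
    )
  })
)
rownames(by_area) <- NULL
print(by_area)
```

Each area is summarized independently, so a cluster identifier that is reused in another area is counted once per area.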

For screening or sentinel site-based data, the approach is the same: set the `.source` parameter to "screening" or "ssite" as appropriate, and supply `cluster` with the name of the column that identifies the sub-areas within the main area (villages, localities, comunas, communities, etc.).
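
As a hedged illustration, a call for screening data could look like the sketch below. The `screening_data` data frame and its `site` column are hypothetical and far too small to meet any real requirement; only the function name, its arguments, and the `"screening"` value come from this vignette. The call is guarded so that the code still runs when `mwana` is not installed.

```r
## Hypothetical screening data: one row per child, `site` identifies the village ----
screening_data <- data.frame(
  area = "District X",
  site = rep(c("Village A", "Village B", "Village C"), times = c(4, 3, 5))
)

## Run the check only if mwana is available ----
if (requireNamespace("mwana", quietly = TRUE)) {
  result <- mwana::mw_check_ipcamn_ssreq(
    df = screening_data,
    cluster = site,          # column with the screening site identifiers
    .source = "screening"    # use "ssite" for sentinel site data
  )
  print(result)
}
```

For sentinel site data, the same call applies with `.source = "ssite"` and `cluster` pointing at the column of sentinel site identifiers.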

# References
