Commit

Merge pull request #26 from mitre/t15m-rename-duplicates
Replaces "Duplicates" with "Extraneous-Same-Day"
dchud authored Jan 14, 2021
2 parents 07e876d + 42e8843 commit 5d500bc
Showing 6 changed files with 175 additions and 175 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -9,7 +9,7 @@ Authors@R: c(
 person("Campos","Diego",,"camposd@email.chop.edu","aut")
 )
 Maintainer: Robert Grundmeier <grundmeier@email.chop.edu>
-Description: Cleans growth data that may contain implausible data based on unit or data range
+Description: growthcleanr cleans growth data that may contain implausible data based on unit or data range.
 Imports:
 data.table (>= 1.13.0),
 tidyr (>= 1.1.0),
252 changes: 126 additions & 126 deletions R/growth.R

Large diffs are not rendered by default.

28 changes: 14 additions & 14 deletions README-adjustcarryforward.md
@@ -92,15 +92,15 @@ parameters).
 * Warning: this will take much longer!

 The default number of sweep steps is 9; this can be changed with the option
-`--gridlength`.
+`--gridlength`.

 For testing options of handling strings of multiple carried forward
-values, several options from 0 to 3 have been incorporated. 0 (no change) is the
-default option, and can be changed `--exclude_opt`. More information on each
+values, several options from 0 to 3 have been incorporated. 0 (no change) is the
+default option, and can be changed `--exclude_opt`. More information on each
 option can be found in the `adjustcarryforward()` documentation.

 In addition to multiple options for carried-forward strings, "answers" for a given
-dataset have been incorporated. When the `--add_answers` flag is set to `TRUE`
+dataset have been incorporated. When the `--add_answers` flag is set to `TRUE`
 (`TRUE` by default), a column called `acf_answers` will have, for each height value, "Definitely Exclude", "Definitely Include", or "Unknown" (if it does not fall in
 either category). Weight values are set as `NA`.

@@ -180,16 +180,16 @@ sweep (hence the examples w/5 and 9 step sweeps).
 And the first few result rows in `test_adjustcarrforward_DATE_TIME.csv` would be:

 ```R
-id subjid sex agedays param measurement clean_value run-1 run-2 run-3 run-4 run-5
-1510 775155 0 889 HEIGHTCM 84.9 Exclude-Duplicate Missing Missing Missing Missing Missing
-1511 775155 0 889 HEIGHTCM 89.06 Include No Change No Change No Change No Change No Change
-1512 775155 0 1071 HEIGHTCM 92.5 Include No Change No Change No Change No Change No Change
-1513 775155 0 1253 HEIGHTCM 96.2 Include No Change No Change No Change No Change No Change
-1514 775155 0 1435 HEIGHTCM 96.2 Exclude-Carried-Forward No Change No Change Include Include Include
-1515 775155 0 1435 HEIGHTCM 99.692 Include No Change No Change No Change No Change No Change
-1516 775155 0 1806 HEIGHTCM 106.1 Include No Change No Change No Change No Change No Change
-1517 775155 0 2177 HEIGHTCM 112.3 Include No Change No Change No Change No Change No Change
-1518 775155 0 889 WEIGHTKG 13.1 Include No Change No Change No Change No Change No Change
+id subjid sex agedays param measurement clean_value run-1 run-2 run-3 run-4 run-5
+1510 775155 0 889 HEIGHTCM 84.9 Exclude-Extraneous-Same-Day Missing Missing Missing Missing Missing
+1511 775155 0 889 HEIGHTCM 89.06 Include No Change No Change No Change No Change No Change
+1512 775155 0 1071 HEIGHTCM 92.5 Include No Change No Change No Change No Change No Change
+1513 775155 0 1253 HEIGHTCM 96.2 Include No Change No Change No Change No Change No Change
+1514 775155 0 1435 HEIGHTCM 96.2 Exclude-Carried-Forward No Change No Change Include Include Include
+1515 775155 0 1435 HEIGHTCM 99.692 Include No Change No Change No Change No Change No Change
+1516 775155 0 1806 HEIGHTCM 106.1 Include No Change No Change No Change No Change No Change
+1517 775155 0 2177 HEIGHTCM 112.3 Include No Change No Change No Change No Change No Change
+1518 775155 0 889 WEIGHTKG 13.1 Include No Change No Change No Change No Change No Change
 ```

 The fifth row in the example above demonstrates the results of the experimental script;
64 changes: 32 additions & 32 deletions README.md
@@ -378,31 +378,31 @@ with `cleangrowth()` will likely take a few minutes to complete.
 > setkey(data, subjid, param, agedays)
 > cleaned_data <- data[, clean_value:=cleangrowth(subjid, param, agedays, sex, measurement)]
 > head(cleaned_data)
-id subjid sex agedays param measurement clean_value
-1: 1510 775155 0 889 HEIGHTCM 84.900 Exclude-Duplicate
-2: 1511 775155 0 889 HEIGHTCM 89.060 Include
-3: 1512 775155 0 1071 HEIGHTCM 92.500 Include
-4: 1513 775155 0 1253 HEIGHTCM 96.200 Include
-5: 1514 775155 0 1435 HEIGHTCM 96.200 Exclude-Carried-Forward
-6: 1515 775155 0 1435 HEIGHTCM 99.692 Include
+id subjid sex agedays param measurement clean_value
+1: 1510 775155 0 889 HEIGHTCM 84.900 Exclude-Extraneous-Same-Day
+2: 1511 775155 0 889 HEIGHTCM 89.060 Include
+3: 1512 775155 0 1071 HEIGHTCM 92.500 Include
+4: 1513 775155 0 1253 HEIGHTCM 96.200 Include
+5: 1514 775155 0 1435 HEIGHTCM 96.200 Exclude-Carried-Forward
+6: 1515 775155 0 1435 HEIGHTCM 99.692 Include
 > cleaned_data %>% group_by(clean_value) %>% tally(sort=TRUE)
 # A tibble: 14 x 2
-clean_value n
-<ord> <int>
-1 Include 38875
-2 Exclude-Duplicate 10546
-3 Exclude-Carried-Forward 6694
-4 Exclude-SD-Cutoff 168
-5 Exclude-EWMA-8 135
-6 Exclude-EWMA-Extreme 95
-7 Exclude-EWMA-9 93
-8 Exclude-Min-Height-Change 65
-9 Swapped-Measurements 16
-10 Exclude-Too-Many-Errors 6
-11 Exclude-EWMA-11 5
-12 Exclude-EWMA-12 2
-13 Exclude-Pair-Delta-18 2
-14 Exclude-Max-Height-Change 1
+clean_value n
+<ord> <int>
+1 Include 38875
+2 Exclude-Extraneous-Same-Day 10546
+3 Exclude-Carried-Forward 6694
+4 Exclude-SD-Cutoff 168
+5 Exclude-EWMA-8 135
+6 Exclude-EWMA-Extreme 95
+7 Exclude-EWMA-9 93
+8 Exclude-Min-Height-Change 65
+9 Swapped-Measurements 16
+10 Exclude-Too-Many-Errors 6
+11 Exclude-EWMA-11 5
+12 Exclude-EWMA-12 2
+13 Exclude-Pair-Delta-18 2
+14 Exclude-Max-Height-Change 1
 ```

 If you are able to run these steps and see a similar result, you have the
@@ -596,7 +596,7 @@ the algorithm's step labels and labels used in comment text in `growthcleanr`.
 | - | - | - | - |
 | 2d | 0 | Include | - |
 | 2d | 1 | Missing | - |
-| 5b | 2 | Exclude-Temporary-Duplicate | - |
+| 5b | 2 | Exclude-Temporary-Extraneous-Same-Day | - |
 | 7d | - | Swapped-Measurement | - |
 | 8f | - | Unit-Error-High | - |
 | 8f | - | Unit-Error-Low | - |
@@ -605,7 +605,7 @@ the algorithm's step labels and labels used in comment text in `growthcleanr`.
 | 10c | 4 | Exclude-SD-Cutoff | 10d, 10e |
 | 11d | 5 | Exclude-EWMA-Extreme | 11e |
 | 11f.ii | 6 | Exclude-EWMA-Extreme-Pair | 11i (R only) |
-| 12d.i | 7 | Exclude-Duplicate | 12diii, 12ei, 12f |
+| 12d.i | 7 | Exclude-Extraneous-Same-Day | 12diii, 12ei, 12f |
 | 14f.i | 8 | Exclude-EWMA-8 | Set in 14h (in R) |
 | 14f.ii | 9 | Exclude-EWMA-9 | Set in 14h (in R) |
 | 14f.iii | 10 | Exclude-EWMA-10 | Set in 14h (in R) |
@@ -826,13 +826,13 @@ parameter type as `type`, specify each, with quotes:

 ```R
 > head(my_cleaned_data)
-id subjid sex aged type measurement clean_value
-1: 1510 775155 0 889 HEIGHTCM 84.90 Exclude-Duplicate
-2: 1511 775155 0 889 HEIGHTCM 89.06 Include
-3: 1518 775155 0 889 WEIGHTKG 13.10 Include
-4: 1512 775155 0 1071 HEIGHTCM 92.50 Include
-5: 1519 775155 0 1071 WEIGHTKG 14.70 Include
-6: 1513 775155 0 1253 HEIGHTCM 96.20 Include
+id subjid sex aged type measurement clean_value
+1: 1510 775155 0 889 HEIGHTCM 84.90 Exclude-Extraneous-Same-Day
+2: 1511 775155 0 889 HEIGHTCM 89.06 Include
+3: 1518 775155 0 889 WEIGHTKG 13.10 Include
+4: 1512 775155 0 1071 HEIGHTCM 92.50 Include
+5: 1519 775155 0 1071 WEIGHTKG 14.70 Include
+6: 1513 775155 0 1253 HEIGHTCM 96.20 Include
 > longwide(my_cleaned_data, agedays="aged", param="type")
 ```

2 changes: 1 addition & 1 deletion man/cleangrowth.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion tests/testthat/test-utils.R
@@ -340,7 +340,7 @@ test_that("longwide works as expected with custom values", {
 # run longwide on changed data with some exclusion types included
 inc_types <- c("Include",
 "Exclude-Carried-Forward",
-"Exclude-Duplicate")
+"Exclude-Extraneous-Same-Day")
 wide_syn <- longwide(sub_syn,
 clean_value = "cv",
 inclusion_types = inc_types)
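
Results saved before this rename still carry the old labels, so code that filters on `"Exclude-Extraneous-Same-Day"` would miss them. The following is a minimal base-R sketch of relabeling such values. `rename_duplicate_labels()` and its mapping table are hypothetical and not part of `growthcleanr`. The mapping covers only the two labels visible in this diff; `R/growth.R` is not rendered here, so other renamed labels may exist.

```R
# Hypothetical helper (not part of growthcleanr): map pre-rename labels
# to the names introduced in this commit. Assumes only the two labels
# shown in the README diff were renamed.
rename_duplicate_labels <- function(clean_value) {
  old_to_new <- c(
    "Exclude-Duplicate" = "Exclude-Extraneous-Same-Day",
    "Exclude-Temporary-Duplicate" = "Exclude-Temporary-Extraneous-Same-Day"
  )
  # Work on character values so factor input is handled too
  labels <- as.character(clean_value)
  # Replace only the entries that match an old label; leave the rest as-is
  hit <- labels %in% names(old_to_new)
  labels[hit] <- unname(old_to_new[labels[hit]])
  labels
}

old_values <- c("Include", "Exclude-Duplicate", "Exclude-Carried-Forward")
new_values <- rename_duplicate_labels(old_values)
print(new_values)
```

The helper returns a plain character vector. The README output shows `clean_value` as an ordered factor (`<ord>`), so any level ordering would need to be reapplied by the caller.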
