-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add haven
option to raw_or_label
#180
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Some minor comments, but I think the output looks like what's needed and there were some clever methods used here.
I would next see if we can get the issuer to test this and get their feedback on if it meets expectations.
} else if (is.numeric(ptype)) { | ||
out <- parse_double(x) | ||
} else if (is.Date(ptype)) { | ||
out <- parse_date(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can address these Codecov warnings by adding the associated data types to tests in your new tests. I don't think it's useful to write a test for this function in particular since it's so basic, but you could do either.
REDCapTidieR/tests/testthat/test-utils.R
Line 436 in c58e783
test_that("apply_labs_haven works", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Will do! Btw are those code coverage warnings getting added to the files changed screen new? I've never seen them before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure, I started noticing them in the past month or two. Relatively helpful for helping keep our % up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay interesting finding (and I think why I skipped including those tests to start).
With labelled
if you try to apply labels to vector that has no corresponding val_labels<-
method it just returns the unlabelled vector:
labels <- as.Date(c("The first day of 2023" = "2023-01-01"))
x <- labelled::set_value_labels(as.Date("2023-01-01"), .labels = labels)
x
#> [1] "2023-01-01:
class(x)
#> [1] "Date"
That means if someone has some crazy redcap where the value options are dates and they do raw_or_label = "haven"
, their value labels won't get applied at all. BUT if they went ahead and implemented a val_labels<-.Date
method then they would get applied. This is such a weird situation I don't think we should even care tbh. I added tests for force_cast()
just to get rid of the coverage warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, very interesting find. In that case yea, much better to just test force_cast()
. If for some reason someone encounters this then an issue can be opened.
I figured we wouldn't test force_cast()
since most of the options for it were already tested in the parent function but whatever appeases the CodeCov gods is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value labels could be applied only to character or numeric vectors. It is not possible to add value labels to date vectors.
In fact, an error should have occured. I have open an issue on labelled website: larmarange/labelled#156
Description
This PR modifies
read_redcap()
to let users specifyraw_or_label = "haven"
and have categorical fields converted tohaven_labelled
vectors instead of factors. I keptlabelled
inSuggests
and added a check that the user has it installed if they specifyraw_or_label = "haven"
.Implementation considerations
haven_labelled
vectors, unlike factors, preserve the underlying data values/types. The non-trivial part about this is that to apply the labels, the data types in the data must match the data types of the vector of values we read from the metadata. This is slightly tricky because there's no foolproof way to say "cast vector x to type of vector y" for arbitrary y. My approach was to basically do our best at casting usingreadr
's parsing functions and fall back to converting the underlying values tochr
if all else fails.See below for a concrete example if it's helpful.
Proposed Changes
"haven"
option toread_redcap()
raw_or_label
and update appropriate arg checksmulti_choice_to_labels()
raw_or_label
argument fromread_redcap()
label_handler
function which will be one ofapply_labs_factor()
orapply_labs_haven()
Worked Example
Suppose we have a redcap field coded like this:
The
db_data
we get from redcap will have a field like this:where the
int
datatype was actually determined byreadr
since we don't allow users to pass data type specifications toredcap_read_oneshot()
The
db_metadata
will containwhere the values are stored as
chr
rather thanint
.My implementation:
db_data$my_field
and finds it'sint
db_metadata
,c(apple = "3", orange = "5", banana = "9")
, toint
withreadr::parse_integer()
labelled::set_value_labels()
If step 2 had failed for some reason then it would cast
db_data$my_field
tochr
and apply the labels to that.Issue Addressed
Related to #178
PR Checklist
Before submitting this PR, please check and verify below that the submission meets the below criteria:
.RDS
) updated underinst/testdata/create_test_data.R
usethis::use_version()
Code Review
This section to be used by the reviewer and developers during Code Review after PR submission
Code Review Checklist