Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-factor functions to wrangle wfhz and muac data #57

Merged
merged 6 commits into from
Oct 31, 2024
Merged

Conversation

tomaszaba
Copy link
Collaborator

@tomaszaba tomaszaba commented Oct 27, 2024

Ernest, in this PR I addressed issues #26 #27 #28. There were some tasks related to issues #24 #25. For these two I was not quite sure if you did not address them yet. Please let me know.

Now, onto the changes made and the rationale behind

  1. In utils.R:
  • In flag_outliers() and remove_flags()`:

    • I changed class of the @param x from double to numeric due to the presence of the absolute MUAC values. In principle, if it was just WFHZ or MFAZ, it would be correct to say double as they are. For MUAC at this stage, when flag_muac() is used/called, MUAC would have been set to millimetres; A MUAC in mm would be an integer. Thus, a middle point for both type of vectors would be numeric.

    • I changed the argument unit to .from as I believe it makes more sense like that. Maybe .in could also make sense?

    • In the choices, I changed "crude" to "absolute" to align with the use of absolute MUAC values in the documentation.

  • In recode_muac():

    • You will see that I in the documentation I changed to class double or numeric as I want to capture the fact that MUAC can come in centimeters or millimeters. For the latter, I tried to put integer but it fails as checking with either "is(x, "integer") or is.integer(x) fails (FALSE).

Up to now, all functions inside utils.R are exported.

  • In mw_wrangle_muac() I did not check if age is numeric or not. This is because:
    • This would have been checked in mw_wrangle_age()
    • I did not feel like it would be necessary because this vector is just used to control flow, what is actually used is age_days from mw_wrangle_age(), which is an output that is checked in the tests.
    • I added an argument .decimals so user can control the number of decimal places for the zscores
    • I did not check the class of sex. Although I indicated that they should either be character() or numeric(), I think checking for this can be measleading as users can supplie any vector. You may say that this is what is in my docs, which is true, but later on in the docs I mention this specificity about this character or numeric(). To me, this is what I don’t think I should check for. happy to read your review though.

On the general file organisation

  • I split the functions in two files. This is to address your comment about my files being too long. So I have decided to put one major function into one .R.

Thanks

@tomaszaba tomaszaba added documentation Improvements or additions to documentation refactor code improvement/refactoring testing code testing labels Oct 27, 2024
@tomaszaba
Copy link
Collaborator Author

I have just made some changes in three files. I realised that my application of NSE was inconsistent; I had some lines of code where I had difused and tidy evaluated some arguments but I was simply not using them. So I fixed this in mw_wrangle_age, mw_wrangle_wfhz and mw_wrangle_muac, while keeping the code neat.

Copy link
Member

@ernestguevarra ernestguevarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomaszaba, see my detailed feedback below. I make suggestions. You decide if you want to follow them or not.

On testing, I did devtools::test_coverage_active_file() when the test file for wrangle_muac is open and I note 5 lines of code note covered by testing. Please add test that cover these five lines of code.

R/utils.R Outdated Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/wrangle_wfhz.R Outdated Show resolved Hide resolved
R/wrangle_wfhz.R Outdated Show resolved Hide resolved
R/wrangle_wfhz.R Outdated Show resolved Hide resolved
R/wrangle_wfhz.R Outdated Show resolved Hide resolved
R/wrangle_wfhz.R Show resolved Hide resolved
@tomaszaba
Copy link
Collaborator Author

@tomaszaba, see my detailed feedback below. I make suggestions. You decide if you want to follow them or not.

On testing, I did devtools::test_coverage_active_file() when the test file for wrangle_muac is open and I note 5 lines of code note covered by testing. Please add test that cover these five lines of code.

@ernestguevarra, thanks for the review. I'll reflect on these and take the appropriate action.

@tomaszaba
Copy link
Collaborator Author

Hi @ernestguevarra,
I've just pushed some commits on changes made in the code re-factoring and improving the clarity and the comprehensiveness of the function documentation. Please see below an overview of the changes made:

  1. Function documentation:

    • I indicated the function's behaviour or reaction when a wrong input vector is provided opposite to the expected. I did this in all relevant @params of .R files we have re-factored thus far.
    • I added a note that gives a context about the flags or flagged record. I hope this is fine.
    • Removed the links and used @reference instead. It looks nice, neat and tidy. Thanks for your suggestion.
  2. Re-codes:

    • Re-factored the function checks to ensure that only expected inputs are supplied. I have to say, working on this task was quite satisfying. It reminded me of common programmes that I use that when I supply a wrong input they error and display a message. That felt like really programming. At the beginning I lacked the idea on how to address this task. Then I checked some of your repos and this resource. I also realised that I still have to workout a lot on the power of string manipulation and regular expressions. As usual, thanks for sharing your wisdom with me.
    • On recode-muac() I decided to broaden the checks to include integers too. This is to avoid an unnecessary issues where R would deny a MUAC vector that would be of class integer. There was no harm in doing it, so I did it.
  3. Test coverage:

    • wrangle_age.R: 100%
    • utils.R: 100%
    • wrangle_wfhz.R: 100%
    • wrangle_muac.R: 97.06% In this file, there is a line of code that I couldn't figure out how to test it. I thought it would have been addressed in the "mother" function recode_muac().

Looking forward to your feedback.

Copy link
Member

@ernestguevarra ernestguevarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomaszaba please merge to dev or main whichever one you want.

@tomaszaba tomaszaba merged commit 17f63c0 into dev Oct 31, 2024
@tomaszaba tomaszaba deleted the wrangle_data branch November 3, 2024 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation refactor code improvement/refactoring testing code testing
Projects
None yet
2 participants