-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Readme #29
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomaszaba, great work on this!
See my specific comments below. Please address them on the same branch. When you push your changes, they will go to this same pull request and then I can do a final review.
DESCRIPTION
Outdated
@@ -1,28 +1,19 @@ | |||
Type: Package | |||
Package: ipccheckr | |||
Title: Toolkit for Performing IPC Acute Malnutrition-related Data Checks | |||
Title: Utilities for analysing children's nutritional status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this title. Good job.
My only comment here is that this title will not pass CRAN. If you look at the rules for what the title of a package should be (https://cran.r-project.org/doc/manuals/R-exts.html#The-DESCRIPTION-file), CRAN says:
The mandatory ‘Title’ field should give a short description of the package. Some package listings may truncate the title to 65 characters. It should use title case (that is, use capitals for the principal words: tools::toTitleCase can help you with this), not use any markup, not have any continuation lines, and not end in a period (unless part of …). Do not repeat the package name: it is often used prefixed by the name. Refer to other packages and external software in single quotes, and to book titles (and similar) in double quotes.
The title is currently not in title case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this review. I skimmed the rules through, then I checked some package titles on GitHub. Found that the major error in this title was probably not capitalised like a title. I have revised this to the following: "Utilities for Analysing Children's Nutritional Status". I have to admit though that I am not quite happy with my title. Do you have any suggestions?
Version: 0.0.0.9000 | ||
Authors@R: c( | ||
person("Tomás", "Zaba", , "tomas.zaba@outlook.com", role = c("aut", "cre", "cph"), | ||
comment = c(ORCID = "0000-0002-7079-3574")), | ||
person("Ernest", "Guevarra", role = c("aut", "cph"), | ||
comment = c(ORCID = "0000-0002-4887-4415")), | ||
person("Douglas", "Jayasekaran", role = "ctb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should revisit Douglas' role. If he reviews the package, then we should put him down as rev
for reviewer. If he gives substantial feedback other than "looks great" or "I wish it can be better", then we can consider upgrading this role to ctb
. For me, my criteria for a contributor is someone who provides at least one of the following:
- feedback on code for function/s with a specific suggestion on what improvement is needed and/or what functionality should be added;
- feedback on documentation other than spelling or grammar focusing on coherence and content that improves documentation for general audience; or.
- provides a PR with suggested changes in functions and/or documentation.
A PR that is accepted and merged will automatically merit a ctb
role as code from that person will actually get incorporated onto the package and will show in the git commits and diffs. The first two above will require really determining the significance of the person's feedback and whether or not we used that feedback to change/edit the package.
So, let us revisit this after you have shared/presented the package to him.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will be able to ascertain the type of his contribution once I share the link with him to review the documentation.I expect his reviews on the Vignettes, other than the function documentation per se. We can revisit this later.
DESCRIPTION
Outdated
Description: A streamlined and comprehensive implementation of SMART Methodology | ||
guidelines for data quality checks and prevalence analyses, with enhanced | ||
efficiency for handling large datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really great description. Well done! I have a few comments that you can consider:
- in line 11, CRAN will most likely comment that SMART abbreviation should be spelled out so I suggest we do that now instead of waiting for CRAN to tell us that when we submit. So, we should say "Standardized Monitoring and Assessment of Relief and Transitions (SMART)".
- There is a good chance that CRAN will ask for a link to SMART Methodology. So I suggested we add that. In DESCRIPTION file, you put a link using
<https://smartmethodology.org/>
- I suggest saying "undernutrition prevalence estimation" rather than "prevalence analyses"
- In lines 12-13 you say "enhanced efficiency for handling large datasets", I think I know what you mean by that but I am not sure it means the same the way you state it here. For someone external, this might seem that you are making a claim that this package will make data quality checks and prevalence estimation can handle large datasets faster than other approaches. But, I think what you are saying here is more in terms of the package helping to streamline the checks and estimation process such that multiple datasets that need checking and analysing (which I think is what you mean by large datasets) can be done through a streamlined, reproducible process. If I misread this, please clarify. Otherwise, if we stick to this language, we will have to be able to show that this package is faster than the gold standard (SMART ENA?) and that it can handle large datasets (tens of thousands to millions of rows of data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this
"But, I think what you are saying here is more in terms of the package helping to streamline the checks and estimation process such that multiple datasets that need checking and analysing (which I think is what you mean by large datasets) can be done through a streamlined, reproducible process. "
is correct about what I meant. It is also on the fact that using ENA, one needs to repeat the workflow X times, depending on the number of provinces, districts in the dataset, as the units of analysis. How do you suggest we clear this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the revised text:
"A streamlined and comprehensive implementation of the Standardized
Monitoring and Assessment of Relief and Transition (SMART) Methodology
https://smartmethodology.org/ guidelines for data quality checks and
prevalence analyses, with enhanced programmable process particularly when
handling large multiple datasets."
I used your text on "programmable process particularly when handling large multiple datasets" from the Motivation edition you made. Does this make the description less ambiguous?
DESCRIPTION
Outdated
License: GPL (>= 3) | ||
URL: https://github.com/tomaszaba/ipccheckr | ||
BugReports: https://github.com/tomaszaba/ipccheckr/issues | ||
URL: https://github.com/nutriverse/mwana |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you can already include the upcoming URL of the website at https://nutriverse.io/mwana by adding it to the github URL after a comma ,
. So something like this:
URL: https://github.com/nutriverse/mwana, https://nutriverse.io/mwana/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
DESCRIPTION
Outdated
Depends: | ||
R (>= 2.10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will probably have to increase the Depends R version to 4.1 which is the version when the |>
native pipe operator was introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental) | ||
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move these badges first on top before the check badges. This is more of a style thing if you see the packages in nutriverse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental) | ||
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip) | ||
<!-- badges: end --> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here before the Background, it would be good to have one paragraph (of 2-3 sentences) that links the DESCRIPTION file to the README. My suggestion is the following:
Child anthropometric assessments, implemented routinely in most countries worldwide, are the cornerstones of child nutrition and food security surveillance around the world. Ensuring the quality of child anthropometric data, the accuracy of child undernutrition prevalence estimates, and the timeliness of reporting is therefore critical in establishing accurate, robust, and up-to-date child undernutrition status globally.
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip) | ||
<!-- badges: end --> | ||
|
||
## Background |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this to ## Motivation
|
||
## Background | ||
|
||
`mwana`, for “child” in *Elómwè*, a local language spoken in the central-northern regions of Mozambique, with a similar meaning across various other Bantu languages, including Swahili, that is spoken in many parts of Africa, is a package designed for analysing acute malnutrition’s prevalence among “mwana”’s aged 6 to 59 months. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this after the paragraph above and before the section on Motivation. Suggest to re-write the text as follows:
`mwana`, term for *child* in the *Elómwè* language spoken in the central-northern regions of Mozambique and also a word with a similar meaning across other Bantu languages (such as Swahili) spoken in many parts of Africa, is a package that streamlines child anthropometric data quality checks and undernutrition prevalence estimation for children 6-59 months old through the comprehensive implementation of the SMART Methodology guidelines in R.
`mwana` was born out of the author’s frequent wrestle when, in his capacity as member of the Quality Assurance team for nutrition of the IPC, is frequently presented with the task of handling large datasets to conduct data quality and prevalence appraisal before every IPC analysis to ensure the use of reliable evidence in the analysis. The typical data appraisal workflow in the context of IPC is usually cumbersome, as it requires significant time and effort, whilst ensuring that the right analysis procedure is used by checking for different conditionals. Analysts often need to switch between software: SPSS or Excel for data processing, then import data into ENA for SMART software to run checks and prevalence analysis, then extract outputs into a summary spreadsheet. This process is repeated one by one for the number of units of analysis in the dataset. Oftentimes this workflow needs to be implemented in relatively short period time, leading to errors in the workflow due to fatigue. | ||
|
||
In this way, more than just an R-based implementation of the ENA for SMART software, mwana’s key added value lies in its ability to simplify the above alluded cumbersome workflow into a wholesome experience, all in one place. This is especially beneficial when handling large datasets, a day-to-day practice at IPC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest re-phrasing this section to the following:
`mwana` was borne out of the author’s own experience of having to work with multiple child anthropometric datasets to conduct data quality assessment and prevalence estimation on as part of the data quality assurance team of the Integrated Phase Classification (IPC). The current standard child anthropometric data appraisal workflow is extremely cumbersome requiring significant time and effort utilising different software tools (SPSS, Excel, Emergency Nutrition Assessment or ENA software) for each step of the process for a single dataset. This process is repeated for every dataset needing to be processed and often needing to be implemented in a relatively short period time. This manual and repetitive process, by its nature, is extremely error-prone.
`mwana`, which is primarily a an R-based implementation of the ENA for SMART software, simplifies this cumbersome workflow into a programmable process particularly when handling large multiple datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomaszaba, all this looks good! Well done! You can merge this to main now and go on to the next task.
Hi Ernest,
Please review this PR.
I addressed issues #2, #3, #4 , #5 , #6 , #7 , #8 , #9 , #11 , #12 , #13.
Regarding issue #10 , I was not sure where to put it, so I left where it was before. Please feel free to move where you find it fit.
On #7, the flowchart does not get centralised when it gets uploaded to GitHub. Locally, it got centralised after I reduced the width parameters. See below:
Please note that this time I am using
quarto
.Regards.