Description
Thank you for your contribution to pycytominer!
Please succinctly summarize your proposed change.
What motivated you to make this change?
Summary
This PR addresses one of the reviewers’ comments:
"When I used the function pycytominer.normalize() on a dataset originating from my own work, I encountered an error message saying “No CP features found. Are you sure this dataframe is from CellProfiler?”.
I could easily fix the problem because I know the properties of a CellProfiler dataset, and the error message was hinting at some “Metadata_” missing.
The fix was very easy, I just needed to add the prefix “Metadata_” to my metadata columns. My dataset comes from another commonly used proprietary software called Harmony. I understand that the authors are in favor of using open-source resources, but I think that the pycytominer tool could be useful in a much larger context, and take as input different types of datasets. In the readme file, that I followed to perform the normalization, I couldn’t find anywhere that the dataset has to come from CellProfiler. ..."
I replaced the `assert` statement with an exception, because relying on assertions for runtime checks is considered bad practice. Assertions are stripped when Python runs with the `-O` option, so they are not reliable for handling errors in production code. Here are some useful links:
Additionally, I updated the error message to clarify that pycytominer infers CellProfiler features by default and provided guidance on how users can manually specify their feature space when working with non-CellProfiler data.
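For readers unfamiliar with the pattern, here is a minimal sketch of the kind of change described above. It is not the actual pycytominer code: the function name `infer_features_sketch`, the default prefixes, the choice of `ValueError`, and the message wording are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence

# Hypothetical default prefixes, chosen for illustration only.
DEFAULT_COMPARTMENTS = ("Cells", "Nuclei", "Cytoplasm")


def infer_features_sketch(
    columns: Iterable[str],
    compartments: Sequence[str] = DEFAULT_COMPARTMENTS,
) -> List[str]:
    """Return the column names that start with a compartment prefix such as 'Cells_'."""
    prefixes = tuple(f"{name}_" for name in compartments)
    features = [col for col in columns if col.startswith(prefixes)]

    # Before: `assert len(features) > 0, "..."`, which is skipped under `python -O`.
    # After: an explicit exception that is raised regardless of interpreter flags.
    if not features:
        # ValueError and this wording are assumptions, not the library's real message.
        raise ValueError(
            "No features found with the default CellProfiler-style prefixes. "
            "If the data come from another tool, specify the feature columns manually."
        )
    return features
```

Calling `infer_features_sketch(["Metadata_Well", "Cells_AreaShape_Area"])` returns the single matching column, while a list with no matching prefixes raises the exception even when the interpreter is started with `-O`.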
Please also link to any relevant issues that your code is associated with.
This code is also an addition to #430
What is the nature of your change?
Checklist
Please ensure that all boxes are checked before indicating that a pull request is ready for review.