Provide note to users for too many missing values in imputation #145
vbrennsteiner wants to merge 10 commits into main
…ts with remaining imputation method tests
update with main
…all imputation methods
…o_high_missing branch
…o_high_missing branch
I would say that it is difficult to set a specific threshold. Example: imagine that you profile a heterogeneous cell population with single-cell proteomics. A certain protein is a very specific marker for a cell population that makes up only 10% of the total cell count. Wouldn't the missingness of this protein always be higher than any reasonable threshold?
Agreed, the point is not necessarily to warn or dissuade, just to make users aware that from here on the majority of values will be imputed. To my knowledge, Perseus has some kind of imputation cutoff where it won't impute unless a minimal proportion of values is non-missing. It would just be a convenience warning that might prompt introspection if imputation is used without really thinking about the implications.
Add a warning for users when they are about to impute features above a predetermined missingness threshold. This is meant to avoid situations where users run imputation by default, not considering that features which are mostly missing may become increasingly unreliable when imputed.
--> As pointed out in the meeting on 2026.01.19, "warning" is maybe not the correct term here and I would prefer "notification". It's not that we want to suggest to users that something is going wrong, but there's a chance that every once in a while we raise the right eyebrows and prompt someone to reconsider a downstream step that would be performed on only imputed values.
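To make the intended behaviour concrete, here is a minimal sketch of such a notification. It is not the code in this PR: the function names `missing_fraction` and `notify_high_missingness`, the logger name, the default threshold of 0.5, and the plain dict-of-lists input are all assumptions chosen to keep the example dependency-free. It logs at INFO level rather than raising a warning, in line with the "notification" wording above, and it never modifies or blocks the data.

```python
import logging
import math

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("imputation_sketch")


def missing_fraction(values):
    """Return the fraction of NaN entries in one feature's values."""
    if not values:
        return 0.0
    return sum(math.isnan(v) for v in values) / len(values)


def notify_high_missingness(data, threshold=0.5):
    """Log a notification for features whose missing fraction exceeds `threshold`.

    `data` maps a feature name to a list of floats, with NaN marking a
    missing value. Returns the names of the flagged features. The data
    itself is left untouched, so imputation can proceed as usual.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")

    flagged = [
        name for name, values in data.items()
        if missing_fraction(values) > threshold
    ]
    if flagged:
        # A notification, not a warning: nothing is wrong, but most values
        # of these features will be imputed from here on.
        logger.info(
            "%d of %d features have more than %.0f%% missing values; "
            "most of their values will be imputed: %s",
            len(flagged), len(data), threshold * 100, ", ".join(flagged),
        )
    return flagged


nan = float("nan")
data = {
    "protein_a": [1.0, 2.0, nan, 4.0],   # 25% missing
    "marker_b": [nan, nan, nan, 5.0],    # 75% missing
}
flagged = notify_high_missingness(data, threshold=0.5)
```

Because the threshold is a parameter, a user profiling a rare cell population (as in the marker example discussed in the comments) could raise it or ignore the message without any change in the imputation result.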
To-do list (outside contributors only)