add filtering #62

kweav · 2024-08-29T20:57:33Z

Description

This is the first in a series of stacked PRs addressing NAs that propagate through the code, from normalization through CRISPR score and genetic interaction score calculation. While investigating issue #55, I found that NAs were being introduced for the test dataset because 2 pgRNA IDs in the input data had no corresponding entry in the annotation. Once the left_join() in the annotation step was performed, NAs were present for those IDs.
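To illustrate the mechanism described above, here is a minimal sketch using toy data. The table and column names (`counts`, `annotation`, `pgRNA_id`, `count`, `gene`) are hypothetical stand-ins, not gimap's actual objects, and the filtering shown is one possible approach rather than the exact code in this PR.

```r
library(dplyr)

# Toy stand-ins for the input data and the annotation table.
# All names and values here are hypothetical.
counts <- tibble(
  pgRNA_id = c("pg1", "pg2", "pg3", "pg4"),
  count = c(10, 25, 7, 13)
)
annotation <- tibble(
  pgRNA_id = c("pg1", "pg2"),
  gene = c("GENE_A", "GENE_B")
)

# left_join() keeps every row of counts, so IDs that are missing
# from the annotation end up with NA in the annotation columns.
joined <- left_join(counts, annotation, by = "pgRNA_id")

# anti_join() returns the rows of counts with no match in annotation,
# which gives the IDs to report to the console.
missing_ids <- anti_join(counts, annotation, by = "pgRNA_id")$pgRNA_id
message("IDs without annotation: ", paste(missing_ids, collapse = ", "))

# Keeping only the matched IDs before joining avoids introducing NAs.
filtered <- counts %>%
  semi_join(annotation, by = "pgRNA_id") %>%
  left_join(annotation, by = "pgRNA_id")
```

In this sketch, `joined` contains NA annotation for the two unmatched IDs, while `filtered` contains only rows with complete annotation.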

The goal for addressing this issue is to

The checkmarked items above are what this PR specifically takes on: it reports to the console which IDs are filtered, filters them, and moves on with the analysis.

Next steps

The unchecked items above will be tackled in stacked PRs down the line.

Fixes #55

How Has This Been Tested?

I've tested this fix locally, using devtools to load the gimap changes and running through the test data from Sita without error messages.

Feedback requested

Do you like the planned course of action? Are there changes you would make?
Could you also verify that this code runs the test data without error for you?

Thanks!!


kweav · 2024-09-23T20:02:52Z

@cansavvy as discussed earlier ...
The first additional commit uses readr::write_csv to write the IDs to a file if more than [user specified parameter] IDs aren't found in the annotation.
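A minimal sketch of that threshold behavior follows. The function name `report_missing_ids` and the arguments `max_to_print` and `out_file` are hypothetical and chosen for illustration; they are not the argument names added in this PR.

```r
library(readr)

# Hypothetical helper: print a short list of IDs to the console,
# or write a long list to a CSV file instead.
report_missing_ids <- function(missing_ids,
                               max_to_print = 20,
                               out_file = tempfile(fileext = ".csv")) {
  if (length(missing_ids) > max_to_print) {
    # Too many IDs to print, so save them and point the user to the file.
    write_csv(data.frame(pgRNA_id = missing_ids), out_file)
    message(length(missing_ids), " IDs lack annotation; written to ", out_file)
    return(invisible(out_file))
  }
  message("IDs lacking annotation: ", paste(missing_ids, collapse = ", "))
  invisible(NULL)
}

# Usage: five IDs with a threshold of three triggers the file output.
path <- report_missing_ids(paste0("pg", 1:5), max_to_print = 3)
```

Returning the file path invisibly lets a caller pick up the file later without cluttering the console.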

Second additional commit was after running devtools::document() since I added a couple of arguments.

We still need to discuss next steps, including what to do if rm_ids_wo_annot is FALSE (not the default). Right now it continues the analysis, but we'll run into the errors we've seen before unless all calculations are made robust enough to ignore NA values rather than propagate them.

We should probably also adjust the comment to tell users to check whether their input data IDs are miscoded or fail to follow the pattern they expect.
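As a small base R illustration of the difference between propagating and ignoring NA values (the vector `scores` is made-up data, not output from gimap):

```r
# Made-up values with one missing entry.
scores <- c(1.5, NA, 3.5)

# By default, summary functions propagate the NA.
propagated <- mean(scores)

# With na.rm = TRUE the NA is dropped before the calculation.
robust <- mean(scores, na.rm = TRUE)
```

Here `propagated` is NA, while `robust` is computed from the two non-missing values.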

cansavvy · 2024-09-27T16:02:05Z

> @cansavvy as discussed earlier ... First additional commit uses readr::write_csv to write to a file if there are a greater than [user specified parameter] number of IDs that aren't found in the annotation.

Perfect! Thanks for doing this! Sorry for my delay in responding.

> Second additional commit was after running devtools::document() since I added a couple of arguments.

I believe you technically shouldn't have to re-run this, because the GitHub Actions should make sure the documentation is up to date, but it certainly doesn't hurt to do so. I'm not 100% sure about this, but I can set it up if the GitHub Actions here don't already do it.

> Still need to discuss next steps (including what to do if rm_ids_wo_annot is FALSE (not default)). Right now it continues the analysis, but we'll run into the errors we've seen before unless all calculations are made more robust to ignore NA values rather than propagating them.

I'll work on this part.

> Still also should probably adjust the comment to tell the user to check if their input data IDs are miscoded/follow the pattern they expect?

Yeah, I can adjust that comment in the next PR.

cansavvy · 2024-09-30T15:26:10Z

Seeing if I can get these errors resolved and then I'll merge and start fresh on the next part of this:

2024-09-23 19:56:55 (26.6 MB/s) - ‘/private/var/folders/0g/hj_q_pzx65bbjnslxz9n0src0000gn/T/RtmpashkeG/Rinste79625282f1/gimap/extdata/CCLE_gene_cn.csv’ saved [840532069/840532069]
Quitting from lines 209-216 [unnamed-chunk-17] (getting_started.Rmd)
Error: Error: processing vignette 'getting_started.Rmd' failed with diagnostics:
is.data.frame(x) is not TRUE
--- failed re-building ‘getting_started.Rmd’
SUMMARY: processing the following file failed:
‘getting_started.Rmd’
Error: Error: Vignette re-building failed.
Execution halted

add filtering

9124e32

kweav requested a review from cansavvy August 29, 2024 20:57

cansavvy and others added 4 commits September 23, 2024 12:47

Merge branch 'main' into handle_lm_issue

357bd61

write to file

89dd88f

Merge branch 'handle_lm_issue' of https://github.com/FredHutch/gimap …

b34c72d

…into handle_lm_issue

from devtools document

f25c407

Fix a minor issue with data writing

fa738ba

cansavvy merged commit 68c0f98 into main Sep 30, 2024
4 of 7 checks passed

cansavvy deleted the handle_lm_issue branch September 30, 2024 18:46
