Beginning of making filtering steps all work together #41

Merged Jul 3, 2024 (3 commits)

@kweav kweav commented Jun 27, 2024

Sorry for the confusing order: this PR is the first step of the filtering work, in which the filter_ki branch builds on the filter_qc_ki5 branch. PR #40 merges filter_ki2 into filter_ki. I submitted the two PRs out of order, which is why this first step ended up as PR #41.

This PR makes a lot of documentation changes, explaining parameters and how to use the gimap_filter() function.

It also lays the groundwork for the filters to work together:

  • Every supported filter starts out as NULL. Depending on which filter(s) the user asks to run, each variable is either overwritten with that filter's result or left as NULL.
  • A list of all possible filters is built, and reduce() is used to cbind them together. This approach ignores NULLs and returns a df with one column of TRUEs and FALSEs per filter that was run.
  • The min_n_filters parameter is then used with rowSums to find the pgRNA constructs flagged for removal by at least that many filters, which creates a consensus (combined) filter.

#TRUE means it should be filtered, FALSE means it shouldn't be filtered
combined_filter <- rowSums(reduce(possible_filters, cbind)) >= min_n_filters
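The consensus step can be sketched with toy data. This is only an illustration under stated assumptions: each filter is assumed to be a logical vector with one entry per pgRNA construct, the names `zero_count_filter`, `hypothetical_filter_b`, and `hypothetical_filter_c` are made up, and base R's `Reduce()` stands in for the `reduce()` call in the PR so the snippet runs without extra packages.

```r
# Toy filters for four pgRNA constructs (TRUE = flagged for removal).
# All names below are hypothetical and only illustrate the idea.
zero_count_filter     <- c(TRUE,  FALSE, TRUE,  FALSE)
hypothetical_filter_b <- NULL  # a filter the user did not ask to run
hypothetical_filter_c <- c(TRUE,  TRUE,  FALSE, FALSE)

# list() keeps NULL entries, so skipped filters are still list elements.
possible_filters <- list(
  zero_count_filter,
  hypothetical_filter_b,
  hypothetical_filter_c
)

# cbind() drops NULL arguments, so only the filters that were run become columns.
# Base R Reduce() is swapped in here for the reduce() call used in the PR.
filter_matrix <- Reduce(cbind, possible_filters)

# Count how many filters flagged each construct.
flag_counts <- rowSums(filter_matrix)

# A construct is removed when at least min_n_filters filters flagged it.
min_n_filters <- 2
combined_filter <- flag_counts >= min_n_filters

print(ncol(filter_matrix))  # 2 columns: the NULL filter contributed nothing
print(combined_filter)      # TRUE FALSE FALSE FALSE
```

With `min_n_filters <- 1` the same toy data would flag the first three constructs, so the parameter controls how strict the consensus is.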


gimap_dataset$filtered <- NULL #TODO: Filtered version of the data can be stored here

Add a nicely readable version of combined_filter to gimap_dataset here, and then store the filtered version of the data in filtered_data.


This is handled in a later PR, so I think this one is set to go!
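
For readers following the thread, here is a rough sketch of what the requested follow-up could look like. It is not the code from the later PR. The `toy_dataset` list, its `raw_counts` matrix, and the `filter_summary` field are hypothetical stand-ins, since the real structure of gimap_dataset is not shown in this conversation. Only the `filtered_data` name comes from the documentation in this PR.

```r
# Hypothetical stand-in for the dataset object; the real gimap_dataset may differ.
toy_dataset <- list(
  raw_counts = matrix(
    c(0, 12, 30, 0,   # sample_1 counts
      5, 40, 22, 0),  # sample_2 counts
    nrow = 4,
    dimnames = list(paste0("pgRNA_", 1:4), c("sample_1", "sample_2"))
  ),
  filtered_data = NULL
)

# Consensus filter from the earlier step (TRUE = remove this construct).
combined_filter <- c(TRUE, FALSE, FALSE, TRUE)

# A readable record of which constructs were flagged (hypothetical field name).
toy_dataset$filter_summary <- data.frame(
  construct = rownames(toy_dataset$raw_counts),
  removed = combined_filter
)

# Keep only the constructs that were not flagged; drop = FALSE keeps a matrix
# even if a single row survives.
toy_dataset$filtered_data <- toy_dataset$raw_counts[!combined_filter, , drop = FALSE]

print(toy_dataset$filter_summary)
print(toy_dataset$filtered_data)
```

Negating `combined_filter` when subsetting matters here because TRUE marks constructs to drop, not constructs to keep.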

@kweav kweav changed the base branch from filter_qc_ki5 to main July 1, 2024 19:16

@cansavvy cansavvy left a comment

MERGE IT

@@ -18,18 +23,57 @@
#'
#' # To see filtered data
#' gimap_dataset$filtered_data
#'
#' # If you want to only use a single filter or some subset, specify which using the filter_type parameter
#' gimap_dataset <- gimap_filter(gimap_dataset, filter_type = "zero_count_only")

YAY DOCUMENTATION!

@cansavvy cansavvy merged commit 655e5f1 into main Jul 3, 2024
6 of 7 checks passed
@cansavvy cansavvy deleted the filter_ki branch July 3, 2024 18:06