Mismatches channels with multiple samples in the input sheet #97
base: dev
Warning: Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.2.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
assets/bidcell_toy_ref/bidcell_toy_ref_sc_breast_markers_neg.csv
assets/bidcell_toy_ref/bidcell_toy_ref_sc_breast_markers_pos.csv
Hi Dongze, can you please go over the comments below?
Done. Also, I want to bring the nf-core module for proseg to your attention. They now have their Docker image as well, although it has not been upgraded to proseg version 3 yet.
The latest commit fixes a cellpose error I encountered, following the suggestion in MIT-LCP/wfdb-python#493. I am working on enabling GPU support for cellpose because it runs really slowly on CPU.
Hi @khersameesh24, I removed all bidcell-related code. It is now good to merge.
Hi @heylf, this PR is about the channel mismatch issue we discussed previously. Please let me know what you think. Thanks!
Hi @DongzeHE, I am working on getting the pipeline to run with multiple samples from the samplesheet. I will merge your changes with it.
Wonderful! Feel free to let me know if there is anything I can help with!
FYI, all my changes are for getting the pipeline to run with multiple samples from the samplesheet.
I am fixing all the subworkflows to support multi-sample runs. Can you add your GitHub username to the contributors section of the README file?
Hi @khersameesh24, I added my name here.
Description of Changes

This PR addresses what I consider an important file-mismatching bug in the `proseg` subworkflow that occurs when processing multiple samples. The changes ensure that all input and output channels within the subworkflow are correctly paired using sample-specific metadata.

Reason for Changes

I'm applying the `proseg` subworkflow to several datasets and have identified a file-mismatching issue. The current `proseg` subworkflow, like most in `spatialxe`, does not pair channels by sample before passing them to downstream tasks. Because Nextflow channels operate on a First-In, First-Out (FIFO) basis (see here), this can lead to incorrect file pairings when processing multiple samples.

For example, the `proseg2baysor` module outputs multiple channels. One channel, for the transcript assignment file, is tagged by `meta`. Another, for the polygon mask file, is not. When these are passed to the `xeniumranger import-segmentation` module (aliased as `xris`), the pipeline discards the `meta` tag from the transcript channel. It then passes two "file-only" channels to `xris` individually, along with the original xenium bundle channel.

Because these channels are independent and each one is consumed in FIFO order, nothing guarantees that the first item in one channel belongs to the same sample as the first item in another. It is therefore nearly certain that `xris` will receive mismatched files when a run includes multiple input samples. This is exactly what happened in my recent analysis.

To fix this, I propose updating the input and output blocks of the affected modules. The changes I've made for the `proseg` subworkflow ensure that all input files are matched by `meta.id` and all output files are tagged with `meta`. I suggest applying these changes to the other subworkflows as well to avoid file mismatches, and I can help with that.

Additional Notes
`bidcell` workflow: This PR also includes an initial draft of the `bidcell` workflow. It is not called anywhere in the pipeline because it is not yet fully functional: it depends on single-cell reference data, an input type the pipeline does not yet support. I suggest merging this code as a foundation, and I will continue its development once the required input functionality is added.

PR checklist

- Make sure your code lints (`nf-core lint`).
- Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).
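
To illustrate the mismatch described under Reason for Changes, here is a minimal Python sketch that simulates pairing items from independent FIFO queues versus pairing them by a sample key. It is a simulation only and does not use any Nextflow API. The sample ids (`s1`, `s2`), the file names, and the helper functions `pair_by_position` and `pair_by_key` are all hypothetical and chosen for illustration. In a Nextflow pipeline the keyed pairing would typically be done with an operator such as `join` on the `meta` element, which is an assumption about how the idea maps over, not a statement of what this PR implements.

```python
from collections import deque


def pair_by_position(*channels):
    """Pair items the way independent FIFO queues would: first with first, and so on."""
    queues = [deque(ch) for ch in channels]
    paired = []
    # Stop as soon as any queue runs dry.
    while queues and all(queues):
        paired.append(tuple(q.popleft() for q in queues))
    return paired


def pair_by_key(*channels):
    """Pair (key, value) items by key, regardless of the order they arrived in."""
    tables = [dict(ch) for ch in channels]
    # Keep only keys present in every channel, in the order of the first channel.
    shared = [k for k in tables[0] if all(k in t for t in tables[1:])]
    return [(k, *(t[k] for t in tables)) for k in shared]


# Hypothetical arrival order: the two channels emit the samples in different order.
transcripts = ["s2_transcripts.csv", "s1_transcripts.csv"]
polygons = ["s1_polygons.geojson", "s2_polygons.geojson"]

# "File-only" channels: pairing is purely positional, so samples get mixed up.
mismatched = pair_by_position(transcripts, polygons)

# Channels tagged with a sample id: pairing is by key, so files stay together.
transcripts_tagged = [("s2", "s2_transcripts.csv"), ("s1", "s1_transcripts.csv")]
polygons_tagged = [("s1", "s1_polygons.geojson"), ("s2", "s2_polygons.geojson")]
matched = pair_by_key(transcripts_tagged, polygons_tagged)

print(mismatched)
print(matched)
```

In the positional case the `s2` transcript file ends up next to the `s1` polygon file, which is the kind of mix-up reported above. Once every item carries its sample id, the arrival order no longer matters.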