an-altosian

Description of Changes

This PR addresses what I consider an important file-mismatching bug in the proseg subworkflow that occurs when processing multiple samples. The changes ensure that all input and output channels within the subworkflow are correctly paired using sample-specific metadata.

Reason for Changes

I'm applying the proseg subworkflow to several datasets and have identified a file-mismatching issue. The current proseg subworkflow, like most in spatialxe, does not pair channels by sample before passing them to downstream tasks. Nextflow channels operate on a First-In, First-Out (FIFO) basis (see here), and each process emits its outputs in the order its tasks finish, so independent channels can carry the same samples in different orders. This can lead to incorrect file pairings when processing multiple samples.

For example, the proseg2baysor module outputs multiple channels. One channel, for the transcript assignment file, is tagged by meta. Another, for the polygon mask file, is not. When these are passed to the xeniumranger import-segmentation module (aliased as xris), the pipeline discards the meta tag from the transcript channel. It then passes two "file-only" channels to xris individually, along with the original xenium bundle channel.

Due to the FIFO nature of these independent channels, it's nearly certain that xris will receive mismatched files when a run includes multiple input samples. This is exactly what happened in my recent analysis.
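
To make the failure mode concrete, here is a minimal Python sketch of it. It is not pipeline code: the sample names and file names are hypothetical, and plain lists stand in for Nextflow channels. Two "file-only" channels emit the same samples in different completion orders, and pairing them by arrival position produces mismatched inputs.

```python
# Illustration only: lists stand in for channels, and the order of each list
# stands in for the order in which upstream tasks happened to finish.
# Sample and file names are hypothetical.
transcript_channel = ["sampleB.transcripts.csv", "sampleA.transcripts.csv"]
polygon_channel = ["sampleA.polygons.geojson", "sampleB.polygons.geojson"]


def sample_of(filename: str) -> str:
    """Recover the sample name from the hypothetical file naming scheme."""
    return filename.split(".")[0]


def pair_by_position(left, right):
    """Pair items purely by arrival order, with no sample key involved."""
    return list(zip(left, right))


pairs = pair_by_position(transcript_channel, polygon_channel)

# Every pair whose two files come from different samples is a mismatch.
mismatches = [(t, p) for t, p in pairs if sample_of(t) != sample_of(p)]

for transcripts, polygons in mismatches:
    print(f"mismatch: {transcripts} paired with {polygons}")
```

With a single sample the two orders can never disagree, which is likely why the problem only shows up in multi-sample runs.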

To fix this, I propose updating the input and output blocks of the affected modules. The changes I've made for the proseg subworkflow ensure that all input files are matched by meta.id and all output files are tagged with meta. I suggest applying these changes to the other subworkflows as well to avoid file mismatches, and I can help with that.
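
The idea of matching by meta.id can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code in this PR: the `join_by_id` helper, the sample names, and the file names are all hypothetical. It is similar in spirit to keyed channel operators such as Nextflow's `join`.

```python
# Illustration only: each channel is a list of (meta, file) tuples, and the
# three channels deliberately emit the samples in different orders.
# All names below are hypothetical.
transcripts = [({"id": "sampleB"}, "sampleB.transcripts.csv"),
               ({"id": "sampleA"}, "sampleA.transcripts.csv")]
polygons = [({"id": "sampleA"}, "sampleA.polygons.geojson"),
            ({"id": "sampleB"}, "sampleB.polygons.geojson")]
bundles = [({"id": "sampleA"}, "sampleA_bundle/"),
           ({"id": "sampleB"}, "sampleB_bundle/")]


def join_by_id(*channels):
    """Group files from several channels by meta['id'], ignoring arrival order."""
    # One lookup table per channel: sample id -> file.
    tables = [{meta["id"]: path for meta, path in channel} for channel in channels]
    # Keep only the samples that are present in every channel.
    shared = set(tables[0]).intersection(*tables[1:])
    return {sid: tuple(table[sid] for table in tables) for sid in sorted(shared)}


matched = join_by_id(transcripts, polygons, bundles)

for sample_id, files in matched.items():
    print(sample_id, files)
```

Because the pairing key travels with every file, the result no longer depends on the order in which upstream tasks finish.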

Additional Notes

  • Lost PR "add bidcell subworkflow" #78: Due to a mistake on my part (a force reset on my fork), the original PR "add bidcell subworkflow" #78 was lost. This PR contains the same changes, recovered from my local copy.
  • Draft bidcell Workflow: This PR also includes an initial draft of the bidcell workflow. It is not called anywhere in the pipeline because it is not yet fully functional: it depends on single-cell reference data, an input type the pipeline does not support yet. I suggest merging this code as a foundation, and I will continue its development once the required input functionality is added.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/spatialxe branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.2.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@khersameesh24 left a comment
Collaborator

Hi Dongze, can you please go over the comments below?

@an-altosian
Author

Done. Also, I want to bring the nf-core module for proseg to your attention. They now have their docker image as well, although it is not upgraded to proseg version 3 yet.

@an-altosian
Author

The latest commit fixes a cellpose error I encountered, following the suggestion in MIT-LCP/wfdb-python#493.

I am working on enabling GPU support for cellpose because it runs really slowly on CPU.

@khersameesh24 added the enhancement and prio3 labels on Sep 15, 2025
@an-altosian
Author

Hi @khersameesh24, I removed all bidcell-related code. It is now ready to merge.

@an-altosian
Author

Hi @heylf, this PR is about the channel mismatch issue we discussed previously. Please let me know what you think. Thanks!

@khersameesh24
Collaborator

> Hi @khersameesh24, I removed all bidcell-related code. It is now ready to merge.

Hi @DongzeHE, I am working to get the pipeline to run with multiple samples from the samplesheet. I will merge your changes with that work.

@an-altosian
Author

Wonderful! Feel free to let me know if there is anything I can help with!

@an-altosian
Author

FYI, all my changes are for getting the pipeline to run with multiple samples from the samplesheet.

@khersameesh24
Collaborator

> FYI, all my changes are for getting the pipeline to run with multiple samples from the samplesheet.

I am fixing all the subworkflows to support multi-sample runs. Can you add your GitHub username to the contributors section of the README file?

@an-altosian
Author

Hi @khersameesh24, I added my name here.
