add COMEBin utility script #7285
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| name: comebin_bam | ||
| owner: iuc | ||
| description: Generate BAM file for COMEBin | ||
| homepage_url: https://github.com/ziyewang/COMEBin | ||
| long_description: | | ||
| COMEBin is a binning method based on contrastive multi-view representation learning. | ||
| COMEBin utilizes data augmentation to generate multiple fragments (views) of each | ||
| contig and obtains high-quality embeddings of heterogeneous features | ||
| (sequence coverage and k-mer distribution) through contrastive learning. | ||
| This script generates the BAM file input for COMEBin. | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/comebin_bam/ | ||
| type: unrestricted | ||
| categories: | ||
| - Metagenomics |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,175 @@ | ||
| <tool id="comebin_bam" name="Generate BAM file for COMEBin" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>COMEBin utility script to generate BAM files using modified MetaWRAP</description> | ||
| <macros> | ||
| <import>macros.xml</import> | ||
| </macros> | ||
| <expand macro="requirements"/> | ||
| <command detect_errors="exit_code"> | ||
| <![CDATA[ | ||
|
|
||
| mkdir 'outputs' && | ||
|
|
||
| #if $assembly.ext.endswith('.gz'): | ||
| ln -s '$assembly' 'assembly.fasta.gz' && | ||
| gunzip -c 'assembly.fasta.gz' > 'assembly.fasta' && | ||
| #else: | ||
| ln -s '$assembly' 'assembly.fasta' && | ||
| #end if | ||
|
|
||
| #if $read_typ.is_select == "normal": | ||
| #if $read_typ.input_typ.is_select == "paired": | ||
| #if $read_typ.input_typ.paired_reads.forward.ext.endswith('.gz'): | ||
| ln -s '$read_typ.input_typ.paired_reads.forward' 'read_1.fastq.gz' && | ||
| ln -s '$read_typ.input_typ.paired_reads.reverse' 'read_2.fastq.gz' && | ||
| gunzip -c 'read_1.fastq.gz' > 'read_1.fastq' && | ||
| gunzip -c 'read_2.fastq.gz' > 'read_2.fastq' && | ||
| #else: | ||
| ln -s '$read_typ.input_typ.paired_reads.forward' 'read_1.fastq' && | ||
| ln -s '$read_typ.input_typ.paired_reads.reverse' 'read_2.fastq' && | ||
| #end if | ||
| #else: | ||
| #if $read_typ.input_typ.forward.ext.endswith('.gz'): | ||
| ln -s '$read_typ.input_typ.forward' 'read_1.fastq.gz' && | ||
| ln -s '$read_typ.input_typ.reverse' 'read_2.fastq.gz' && | ||
| gunzip -c 'read_1.fastq.gz' > 'read_1.fastq' && | ||
| gunzip -c 'read_2.fastq.gz' > 'read_2.fastq' && | ||
| #else: | ||
| ln -s '$read_typ.input_typ.forward' 'read_1.fastq' && | ||
| ln -s '$read_typ.input_typ.reverse' 'read_2.fastq' && | ||
| #end if | ||
| #end if | ||
| #else: | ||
| #if $read_typ.single_reads.ext.endswith('.gz'): | ||
| ln -s '$read_typ.single_reads' 'read.fastq.gz' && | ||
| gunzip -c 'read.fastq.gz' > 'read.fastq' && | ||
| #else: | ||
| ln -s '$read_typ.single_reads' 'read.fastq' && | ||
| #end if | ||
| #end if | ||
|
|
||
| gen_cov_file.sh | ||
| -a 'assembly.fasta' | ||
| -o 'outputs' | ||
| -t \${GALAXY_SLOTS:-1} | ||
| -l ${length} | ||
| #if $read_typ.is_select == "normal": | ||
| 'read_1.fastq' | ||
| 'read_2.fastq' | ||
| #else: | ||
| --single-end 'read.fastq' | ||
| #end if | ||
|
|
||
| && | ||
|
|
||
| mv 'outputs/work_files/read.bam' '$bam_file' | ||
|
|
||
| ]]> | ||
| </command> | ||
| <inputs> | ||
| <param name="assembly" type="data" format="fasta,fasta.gz" label="Input assembly file"/> | ||
| <conditional name="read_typ"> | ||
| <param name="is_select" type="select" label="Type of reads"> | ||
| <option value="normal" selected="true">Paired-end non-interleaved</option> | ||
| <option value="single">Single-end</option> | ||
| </param> | ||
| <when value="normal"> | ||
| <conditional name="input_typ"> | ||
| <param name="is_select" type="select" label="Input type"> | ||
| <option value="paired">Paired collection</option> | ||
| <option value="single">No collection</option> | ||
| </param> | ||
| <when value="paired"> | ||
| <param name="paired_reads" type="data_collection" collection_type="paired" format="fastq,fastq.gz" label="Input paired reads collection"/> | ||
| </when> | ||
| <when value="single"> | ||
| <param name="forward" type="data" format="fastq,fastq.gz" label="Input forward reads"/> | ||
| <param name="reverse" type="data" format="fastq,fastq.gz" label="Input reverse reads"/> | ||
| </when> | ||
| </conditional> | ||
| </when> | ||
| <when value="single"> | ||
| <param name="single_reads" type="data" format="fastq,fastq.gz" label="Input single-end reads"/> | ||
| </when> | ||
| </conditional> | ||
| <param name="length" type="integer" value="1000" label="Set minimum contig length"/> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="bam_file" format="bam" label="COMEBin bam file"/> | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="1"> | ||
| <param name="assembly" value="bowtie2-ref.fasta" ftype="fasta"/> | ||
| <conditional name="read_typ"> | ||
| <param name="is_select" value="normal"/> | ||
| <conditional name="input_typ"> | ||
| <param name="is_select" value="paired"/> | ||
| <param name="paired_reads"> | ||
| <collection type="paired"> | ||
| <element name="forward" value="bowtie2-fq_1.fastq" ftype="fastq"/> | ||
| <element name="reverse" value="bowtie2-fq_2.fastq" ftype="fastq"/> | ||
| </collection> | ||
| </param> | ||
| </conditional> | ||
| </conditional> | ||
| <output name="bam_file"> | ||
| <assert_contents> | ||
| <has_size size="17000" delta="1000"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| <test expect_num_outputs="1"> | ||
| <param name="assembly" value="bowtie2-ref.fasta" ftype="fasta"/> | ||
| <conditional name="read_typ"> | ||
| <param name="is_select" value="normal"/> | ||
| <conditional name="input_typ"> | ||
| <param name="is_select" value="single"/> | ||
| <param name="forward" value="bowtie2-fq_1.fastq" ftype="fastq"/> | ||
| <param name="reverse" value="bowtie2-fq_2.fastq" ftype="fastq"/> | ||
| </conditional> | ||
| </conditional> | ||
| <output name="bam_file"> | ||
| <assert_contents> | ||
| <has_size size="17000" delta="1000"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| <test expect_num_outputs="1"> | ||
| <param name="assembly" value="bowtie2-ref.fasta" ftype="fasta"/> | ||
| <conditional name="read_typ"> | ||
| <param name="is_select" value="normal"/> | ||
| <conditional name="input_typ"> | ||
| <param name="is_select" value="single"/> | ||
| <param name="forward" value="bowtie2-fq_1.fastq.gz" ftype="fastq.gz"/> | ||
| <param name="reverse" value="bowtie2-fq_2.fastq.gz" ftype="fastq.gz"/> | ||
| </conditional> | ||
| </conditional> | ||
| <output name="bam_file"> | ||
| <assert_contents> | ||
| <has_size size="17000" delta="1000"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| </tests> | ||
| <help> | ||
|
The help is a bit sparse :)

What is this bash script doing? Why is Bowtie2 not good enough? This looks a bit suspicious to me :)

I changed the help to answer this question. This script was built from a modified version of MetaWRAP's binning.sh, and I assume the tool itself was built to fit these generated BAM files, which means that others could work, but I would not bet on it. Since BAM files are binary, it is hard to see the difference, but I can link you a history with two COMEBin runs, one with the BAM file generated by the utility script and one with the Bowtie2 output: https://usegalaxy.eu/u/santinof/h/metagenome-assembled-genomes-mags-generation-imported-from-uploaded-file

@SantaMcCloud I feel there is still room to update the help text. As of now, it lacks quite a lot of information. Maybe you could add what the script does? Why is this script preferred over Bowtie2? What are the inputs and expected outputs? That would be very helpful for users.

I expanded the help section a little. There is not much left that I could write there, since this script's only use case is preprocessing for COMEBin. Bowtie2 doesn't work here, and this script's output doesn't work for other binners. If this is still not enough, I will have to read up on MetaWRAP, since I don't know it well enough to write in more detail.
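
One way to start investigating the ID mismatch mentioned in this thread would be to compare the reference names in each BAM header against the contig IDs in the assembly. The sketch below is only an illustration and is not part of this PR: it assumes the BAM header has already been dumped to text (for example with `samtools view -H`), the helper names `sq_names`, `fasta_ids`, and `missing_ids` are hypothetical, and the toy data is made up. It does not reproduce whatever check COMEBin itself performs.

```python
def sq_names(header_text):
    """Return reference names from the @SQ lines of a SAM/BAM text header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs; SN holds the name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fasta_ids(fasta_text):
    """Return the first whitespace-delimited token of each FASTA header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            ids.append(tokens[0] if tokens else "")
    return ids


def missing_ids(header_text, fasta_text):
    """Contig IDs present in the assembly but absent from the BAM header."""
    in_bam = set(sq_names(header_text))
    return [cid for cid in fasta_ids(fasta_text) if cid not in in_bam]


# Toy data, invented for illustration only.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:contig_1\tLN:1500\n"
    "@SQ\tSN:contig_2\tLN:2300\n"
)
fasta = ">contig_1 len=1500\nACGT\n>contig_2\nACGT\n>contig_3\nACGT\n"

print(missing_ids(header, fasta))  # prints ['contig_3']
```

Running this on the headers of the two BAM files from the linked history would show whether they reference different sets of contigs; it would not by itself explain why COMEBin fails on one of them.
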
||
| <![CDATA[ | ||
|
|
||
| **Why use this tool instead of Bowtie2?** | ||
|
|
||
| This tool runs a COMEBin utility script that is modified from the "binning.sh" script of MetaWRAP, which means a complete pipeline is run to generate BAM files suited to COMEBin. | ||
|
|
||
| Other BAM file tools might also work, but this utility script is recommended: in some tests, COMEBin always failed on Bowtie2 output because certain IDs were not contained in COMEBin's data. | ||
|
|
||
| This tool serves a single use case: preparing the BAM input for COMEBin. Other binners can use Bowtie2 output, and Bowtie2 is recommended in that case. Only use this script when the binner is COMEBin! | ||
|
|
||
| **Input** | ||
|
|
||
| - An assembly (FASTA) and either single-end reads or paired-end, non-interleaved reads (FASTQ) | ||
|
|
||
| **Output** | ||
|
|
||
| - One BAM file built for use with COMEBin (no other binner can use this file, so please run Bowtie2 for other binners!) | ||
|
|
||
| ]]> | ||
| </help> | ||
| <expand macro="citations"/> | ||
| </tool> | ||
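
As a reading aid for the command block in this file, the sketch below restates its two decisions in plain Python: how each input is named when it is staged (and whether it needs decompressing), and how the `gen_cov_file.sh` call is assembled. The file names and flags are copied from the wrapper; the function names `staged_name` and `build_args` are hypothetical and exist only for this illustration. Galaxy evaluates the Cheetah template itself, so nothing like this runs in the tool.

```python
def staged_name(ext, base, suffix):
    """Mirror one #if/#else branch: return (link_name, needs_gunzip)."""
    if ext.endswith(".gz"):
        return f"{base}.{suffix}.gz", True
    return f"{base}.{suffix}", False


def build_args(read_type, threads, min_length):
    """Assemble the gen_cov_file.sh call the way the command block spells it."""
    args = [
        "gen_cov_file.sh",
        "-a", "assembly.fasta",
        "-o", "outputs",
        "-t", str(threads),
        "-l", str(min_length),
    ]
    if read_type == "normal":
        # Paired-end, non-interleaved: two positional FASTQ files.
        args += ["read_1.fastq", "read_2.fastq"]
    else:
        # Single-end: one FASTQ file behind the --single-end flag.
        args += ["--single-end", "read.fastq"]
    return args


print(staged_name("fastq.gz", "read_1", "fastq"))
print(" ".join(build_args("single", 4, 1000)))
```

Whichever branch is taken, the uncompressed file names are fixed, which is why the final `mv` can always pick up `outputs/work_files/read.bam`.
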
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| <macros> | ||
| <xml name="requirements"> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">comebin</requirement> | ||
| </requirements> | ||
| </xml> | ||
| <token name="@TOOL_VERSION@">1.0.4</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">24.2</token> | ||
| <xml name="citations"> | ||
| <citations> | ||
| <citation type="doi">10.1038/s41467-023-44290-z</citation> | ||
| </citations> | ||
| </xml> | ||
| </macros> |
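
For readers less familiar with Galaxy macro tokens: the `version` attribute in the tool file is built by substituting the tokens defined in this macros file into the attribute text. The sketch below imitates that substitution in Python to show the resulting version string. `expand_tokens` is a hypothetical helper written for this illustration, not Galaxy's implementation.

```python
# Token values copied from macros.xml in this PR.
TOKENS = {
    "@TOOL_VERSION@": "1.0.4",
    "@VERSION_SUFFIX@": "0",
    "@PROFILE@": "24.2",
}


def expand_tokens(text, tokens):
    """Replace every token name in text with its value."""
    for name, value in tokens.items():
        text = text.replace(name, value)
    return text


print(expand_tokens("@TOOL_VERSION@+galaxy@VERSION_SUFFIX@", TOKENS))  # prints 1.0.4+galaxy0
```
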