
Minimap2 --split-fasta? #75

Open
dhoconno opened this issue Nov 15, 2024 · 0 comments
Labels
bug Something isn't working

dhoconno commented Nov 15, 2024

Description of the bug

The workflow is failing at MINIMAP2_ALIGN with two errors:

  1. Minimap2 warns that no @SQ header lines are written for a multi-part index unless --split-prefix is specified.
  2. samtools sort is called with -m 0M, which triggers this error: [bam_sort] -m setting (0 bytes) is less than the minimum required (1M).

Command used and terminal output

Command is:


minimap2 \
       -ax sr \
      -t 9 -I 0M \
      44_10_1_S11.dwnld.references.fasta \
      44_10_1_S11.standard8.classified_1.fastq.gz 44_10_1_S11.standard8.classified_2.fastq.gz \
         \
      -L \
      -a | samtools sort -@ 9 -m 0M | samtools view  -q 4  -@ 9 -b -h -o 44_10_1_S11.44_10_1_S11.dwnld.references.bam

This gives the errors:

 [bam_sort] -m setting (0 bytes) is less than the minimum required (1M).
  
  Trying to run with -m too small can lead to the creation of a very large number
  of temporary files.  This may make sort fail due to it exceeding limits on the
  number of files it can have open at the same time.
  
  Please check your -m parameter.  It should be an integer followed by one of the
  letters K (for kilobytes), M (megabytes) or G (gigabytes).  You should ensure it
  is at least the minimum above, and much higher if you are sorting a large file.
  [main_samview] fail to read the header from "-".
  [M::mm_idx_gen::0.098*0.98] collected minimizers
  [M::mm_idx_gen::0.127*1.69] sorted minimizers
  [WARNING] For a multi-part index, no @SQ lines will be outputted. Please use --split-prefix.
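
The rule stated in the samtools message (an integer followed by K, M or G, and at least 1M) can be checked before a command is launched. The following is a minimal sketch, not part of samtools or taxtriage: the function names mem_to_bytes and mem_is_valid are hypothetical, and it assumes binary multiples (1K = 1024 bytes) and accepts only the uppercase suffixes the message names.

```shell
#!/usr/bin/env bash

# Hypothetical helper: convert a samtools-style memory string (e.g. 768M)
# into bytes. Returns non-zero if the string is not <integer><K|M|G>.
mem_to_bytes() {
    local value="$1"
    if [[ "$value" =~ ^([0-9]+)([KMG])$ ]]; then
        # 10# forces base 10 so a value like 08M is not read as octal.
        local n=$(( 10#${BASH_REMATCH[1]} ))
        case "${BASH_REMATCH[2]}" in
            K) echo $(( n * 1024 )) ;;
            M) echo $(( n * 1024 * 1024 )) ;;
            G) echo $(( n * 1024 * 1024 * 1024 )) ;;
        esac
    else
        return 1
    fi
}

# Hypothetical helper: succeeds only when the value parses and is at least
# the 1M minimum named in the error message.
mem_is_valid() {
    local bytes
    bytes="$(mem_to_bytes "$1")" || return 1
    (( bytes >= 1024 * 1024 ))
}

mem_is_valid 0M   || echo "0M is below the 1M minimum"
mem_is_valid 768M && echo "768M is accepted"
```

With this check, the -m 0M value from the failing command is rejected before samtools sort is ever started, while any value of 1M or more passes.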

I can manually override and get the command to run as expected if I run:

minimap2 \
     -ax sr \
    -t 8 -I 0M \
    --split-prefix \
    44_10_1_S11.dwnld.references.fasta \
    44_10_1_S11.standard8.classified_1.fastq.gz 44_10_1_S11.standard8.cla>
       \
    -L \
    -a | samtools sort -@ 8 | samtools view  -q 4  -@ 8 -b -h -o 44_10_1_>
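
One way to confirm that a run like the one above produced a usable header is to count the @SQ lines in the SAM header text. This is a minimal sketch: count_sq_lines is a hypothetical helper name, and the sample header is made-up illustration data, not output from this workflow.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read SAM header text on stdin and print the number
# of @SQ lines. grep -c exits non-zero when the count is 0, so "|| true"
# keeps the function's status at 0 in that case.
count_sq_lines() {
    grep -c '^@SQ' || true
}

# Made-up header for illustration only.
sample_header=$'@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:ref1\tLN:1000\n@SQ\tSN:ref2\tLN:2000\n@PG\tID:minimap2\tPN:minimap2'

n="$(printf '%s\n' "$sample_header" | count_sq_lines)"
echo "@SQ lines: $n"
```

On a real BAM file the header can be piped in with samtools view -H, for example samtools view -H output.bam | count_sq_lines, where output.bam is a placeholder file name. A count of 0 would match the situation the minimap2 warning describes.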

I'm invoking the overall workflow with:

nextflow run https://github.com/jhuapl-bio/taxtriage \
-r main -latest \
--input /home/dhoconno/git/experiments/2024/30806/30806-samplesheet.csv \
--outdir /scratch/dhoconno/30806/out \
--max_cpus 16 \
--max_memory 512 \
-work-dir /scratch/dhoconno/30806/work \
-profile local,singularity

The small example test command provided in the docs works fine.

Any guidance?

Thanks,

dave

Relevant files

No response

System information

No response

dhoconno added the bug label Nov 15, 2024