All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Read mean Qscore and alignment flag added to stats_from_bam output.
- Add option to filter_bam to keep unmapped reads that pass (non-alignment) filters.
- Enable multithreaded bam file reading and writing in filter_bam.
- Pin numpy<2.0.
- mini_align can take a bam as input and optionally retain all or a subset of bam tags.
- Drop porechop dependency from pypi package due to blocking package uploads. Add warning to the user if porechop is called by mini_assemble but does not exist.
- subsample_bam and coverage_from_bam now have unified read filtering options and logic.
- filter_bam to filter a bam with the same logic used in subsample_bam.
- subsample_bam was previously subsampling proportionally before filtering, resulting in lower than expected depth.
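The effect described in the entry above can be shown with simple arithmetic. The following is only a sketch: the function names are hypothetical, and it assumes the sampling proportion is computed as target depth divided by available depth, which is not confirmed by this changelog.

```python
def depth_subsample_then_filter(raw_depth, target, pass_fraction):
    """Expected depth when the proportion is derived from unfiltered depth."""
    proportion = min(1.0, target / raw_depth)
    # Filtering afterwards removes a further share of the sampled reads.
    return raw_depth * proportion * pass_fraction


def depth_filter_then_subsample(raw_depth, target, pass_fraction):
    """Expected depth when reads are filtered first, then subsampled."""
    filtered_depth = raw_depth * pass_fraction
    proportion = min(1.0, target / filtered_depth)
    return filtered_depth * proportion


# Hypothetical numbers: 200x raw depth, 50x target, 80% of reads pass filters.
before = depth_subsample_then_filter(200, 50, 0.8)
after = depth_filter_then_subsample(200, 50, 0.8)
print(f"subsample then filter: {before:.1f}x, filter then subsample: {after:.1f}x")
```

With these numbers, sampling before filtering falls short of the target while filtering first reaches it.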
- subsample_bam: --force_low_coverage saves contigs with coverage below the target
- subsample_bam: --force_non_primary saves multimapping for the subsampled reads
- coverage_from_bam: --primary_only considers only primary reads when computing the depth
- bedtools: upgraded to v2.31
- porechop: switched to using Artic version
- Option -C for mini_align to copy fastx comments into bam tags
- Minor compatibility fixes to support pandas>=2.0
- subsample_bam: --quality filtering now uses mean error probability, not mean of quality scores as previously.
- subsample_bam: enable filtering for proportional subsampling.
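The difference between the two averaging schemes named in the --quality entry above can be sketched as follows. The function names are hypothetical and this is not taken from the subsample_bam source; it only applies the standard Phred relation Q = -10 log10(p).

```python
import math


def mean_of_qscores(quals):
    """Arithmetic mean of Phred scores (the previous behaviour, as described)."""
    return sum(quals) / len(quals)


def mean_qscore_from_error_prob(quals):
    """Average the per-base error probabilities, then convert back to Phred."""
    probs = [10 ** (-q / 10) for q in quals]
    mean_p = sum(probs) / len(probs)
    return -10 * math.log10(mean_p)


quals = [10, 20, 30]
print(mean_of_qscores(quals))              # plain average of the scores
print(mean_qscore_from_error_prob(quals))  # dominated by the low-quality base
```

Averaging error probabilities gives more weight to low-quality bases, so a read with mixed qualities scores lower than under a plain mean of Q values.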
- Fix crashes in subsample_bam with alignment filtering and common_errors_from_bam
- assess_assembly -H uses correct output directory.
- Handling of comments in bed files.
- Added Q(sub) to summary output.
- Ported bed file handling from intervaltrees to ncls, speeding up assessment and multithreading efficiency.
- stats_from_bam: handle cigar strings using = and X instead of M.
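As an illustration of the cigar entry above, the sketch below counts operations in a CIGAR string and treats M, = and X alike as aligned columns. It is a minimal, hypothetical example and does not reproduce how stats_from_bam parses alignments.

```python
import re

# Operation letters defined by the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_counts(cigar):
    """Return the total length of each operation in a CIGAR string."""
    counts = {}
    for length, op in CIGAR_RE.findall(cigar):
        counts[op] = counts.get(op, 0) + int(length)
    return counts


def aligned_columns(cigar):
    """Count columns where a read base is paired with a reference base."""
    counts = cigar_counts(cigar)
    # M is ambiguous (match or mismatch); = and X spell out the same columns.
    return counts.get("M", 0) + counts.get("=", 0) + counts.get("X", 0)


print(aligned_columns("10M"), aligned_columns("5=1X4="))
```

Both styles yield the same number of aligned columns, but only the =/X form exposes mismatches directly in the cigar.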
- Include mapping quality in stats_from_bam output.
- Handling of LRA bams in which NM tag is number of matches rather than edit distance.
- Added an option (-y) to assess_assembly and mini_align to include supplementary alignments.
- Added an option (-d) to mini_align and assess_assembly to select minimap2 alignment preset.
- Added accumulation of errors over a number of chunks (-a option in summary_from_stats and assess_assembly) to get better stats.
- Use -L option for minimap2.
- Updated versions of minimap2, samtools, bcftools, bedtools, seqkit in Makefile to the most recent ones.
- Reduced memory consumption of catalogue_errors.
- fast_convert qa now properly outputs a fasta file
- Fixed long_fastx --others option
- Fixed split_fastx fastq output
- assess_homopolymers can use multiple threads
- Install paftools.js from minimap2 and k8
- Speed improvements to several benchmarking and analysis scripts
- Quoted all variables in mini_align to handle spaces in inputs.
- stats_from_bam no longer throws an exception when no alignments have been processed.
- coverage_from_bam now has a --one_file option to better specify the output in the common usage.
- Python 3.5 support
- Python >3.6 support