Multipatch

@ekg ekg released this 20 Jul 15:56

Buildable source tarball: wfmash-v0.17.0.tar.gz

This release introduces multipatch alignment, which significantly improves wfmash's handling of complex genomic structures, particularly inversions and other rearrangements. Multipatching extends the process in which the initial wflign traceback is patched: we check whether an inverted orientation of the patch is preferable (as introduced in v0.16.0) and, new in v0.17.0, attempt multiple patching steps to span the gap. Key improvements include:

Multipatch Alignment:

  • Implemented a progressive alignment approach that can detect and align multiple patches, including inversions, within a single alignment region.
  • Added a new tag patch:Z:true to indicate multipatch alignments in the output.
  • Introduced an inv:Z:true/false tag to specify whether a patch is inverted.
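
The two tags above can be read back from the output with a few lines of Python. This is a minimal sketch, assuming the alignments are written as PAF with SAM-style optional tags after the 12 mandatory columns; the helper names and the sample record are made up for illustration and are not part of wfmash.

```python
def parse_tags(paf_line):
    """Return the optional tags of one PAF record as {name: value}."""
    fields = paf_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[12:]:  # the first 12 PAF columns are mandatory
        name, _type, value = field.split(":", 2)
        tags[name] = value
    return tags


def is_multipatch(paf_line):
    """True if the record carries patch:Z:true."""
    return parse_tags(paf_line).get("patch") == "true"


def is_inverted(paf_line):
    """True if the record carries inv:Z:true."""
    return parse_tags(paf_line).get("inv") == "true"


# Made-up record for illustration only; not real wfmash output.
record = "\t".join([
    "query1", "1000", "100", "400", "-",
    "target1", "2000", "500", "800", "290", "300", "60",
    "patch:Z:true", "inv:Z:true",
])
print(is_multipatch(record), is_inverted(record))
```

Records without either tag are simply reported as `False` by both helpers, so the same filter can be run over output from older releases.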

Alignment Refinements:

  • Implemented trimming of alignments to remove leading and trailing indels, improving alignment quality.
  • Added bounds detection for alignments to better handle partial matches.
  • Increased the default chain gap to 6x segment length or 30k, allowing for detection of larger variants.
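
To illustrate what trimming leading and trailing indels means, here is a small sketch that operates on a CIGAR string. It is not the wfmash implementation (which is C++ and works on its internal alignment representation); the function name and the coordinate convention are assumptions made for the example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def trim_indels(cigar, query_start, target_start):
    """Strip leading/trailing I and D operations from a CIGAR string.

    Returns the trimmed CIGAR and the adjusted start coordinates.
    Assumes I consumes query bases and D consumes target bases.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    # Leading indels: drop them and move the start past the bases they consumed.
    while ops and ops[0][1] in ("I", "D"):
        length, op = ops.pop(0)
        if op == "I":
            query_start += length
        else:
            target_start += length

    # Trailing indels: dropping them only shortens the aligned end.
    while ops and ops[-1][1] in ("I", "D"):
        ops.pop()

    trimmed = "".join(f"{n}{op}" for n, op in ops)
    return trimmed, query_start, target_start


print(trim_indels("3I10=2D5=4D", 100, 200))
```

Internal indels such as the `2D` in the example are left alone; only the operations at the two ends of the alignment are removed.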

Output Enhancements:

  • Modified the output format to clearly distinguish multipatch alignments.
  • Improved logging and debugging output for better insight into the alignment process.

Code Improvements:

  • Enhanced the alignment_t class with new accessors for query and target begin/end positions.
  • Implemented pruning of overlapping patches to avoid redundant alignments.
  • Refactored several core functions for better modularity and readability.
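
The pruning step can be pictured as interval filtering over the begin/end positions that the new accessors expose. The sketch below is one plausible scheme, not the criterion wfmash actually uses: the `Patch` class, its `score` field, and the greedy keep-the-best rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    # Hypothetical stand-in for an alignment with begin/end accessors.
    query_begin: int
    query_end: int
    target_begin: int
    target_end: int
    score: int  # assumed: higher is better


def overlaps(a, b):
    """True if two patches share query or target bases (half-open intervals)."""
    on_query = a.query_begin < b.query_end and b.query_begin < a.query_end
    on_target = a.target_begin < b.target_end and b.target_begin < a.target_end
    return on_query or on_target


def prune_overlapping(patches):
    """Greedily keep the best-scoring patches that do not overlap a kept one."""
    kept = []
    for patch in sorted(patches, key=lambda p: p.score, reverse=True):
        if not any(overlaps(patch, k) for k in kept):
            kept.append(patch)
    return sorted(kept, key=lambda p: p.query_begin)


patches = [
    Patch(0, 100, 0, 100, score=90),
    Patch(50, 150, 50, 150, score=80),    # overlaps the first patch
    Patch(200, 300, 200, 300, score=70),
]
for patch in prune_overlapping(patches):
    print(patch.query_begin, patch.query_end, patch.score)
```

A greedy pass like this is simple and deterministic, but it is only meant to show why overlapping patches would otherwise yield redundant alignments.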

Build System:

  • Added libdeflate as a dependency in the Guix build configuration.

Overall, v0.17.0 improves wfmash's handling of complex genomic alignments, particularly those involving local inversions and other structural variation. The multipatch approach gives a more complete representation of genomic relationships in challenging regions than other methods provide.

Happy aligning with enhanced structural variation breakpoint resolution! 🧬🔍🧮

What's Changed

Full Changelog: v0.16.0...v0.17.0