tackling the all-vs-all matrix
Buildable Source Tarball: wfmash-v0.14.0.tar.gz
This release adds support for subsetting the queries, in addition to the existing target subsetting. A list of queries can now be given. (We still work with only a single target, though.) The idea is that this lets us subdivide the all-versus-all alignment matrix and run many small jobs, each aligning multiple queries against a single target. Running all queries against one target in a single job could become computationally impractical, because there might be many hundreds of thousands of queries. There are some other bug fixes and updates as well, but the main difference that triggers a release is the change in the command line API.
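To make the subdivision concrete, here is a minimal Python sketch of how an all-versus-all matrix could be cut into jobs with one target and a bounded batch of queries each. The function names, the `queries_per_job` parameter, and the choice to skip self-pairs are assumptions for illustration only; this is not how wfmash or its scripts implement it.

```python
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def plan_jobs(names, queries_per_job):
    """Split the all-vs-all matrix into (target, [queries]) jobs.

    Each job pairs one target with a batch of queries.
    Self-pairs are skipped here (an assumption of this sketch).
    """
    jobs = []
    for target in names:
        others = [n for n in names if n != target]
        for batch in batched(others, queries_per_job):
            jobs.append((target, batch))
    return jobs


if __name__ == "__main__":
    # Hypothetical sequence names.
    for target, queries in plan_jobs(["seqA", "seqB", "seqC", "seqD"], 2):
        print(target, queries)
```

With four sequences and at most two queries per job, every target gets two jobs, so the matrix is covered by eight small jobs instead of one large one.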
changelog
Query filtering and specification improvements
- Added support for specifying a comma-delimited list of query name prefixes to filter queries with the `-Q`/`--query-prefix` option.
- Added `-A`/`--query-list` option to specify a file containing a list of query sequence names to use.
- Updated internal sequence iteration and counting logic to properly apply the new query filtering options.
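The filtering described above can be pictured with a small Python sketch. It is not wfmash's internal logic: `select_queries` is a hypothetical helper, `prefixes` mimics the comma-delimited `-Q` argument, `allowed` stands in for the names read from an `-A` file, and requiring a name to pass both filters when both are given is an assumption.

```python
def select_queries(names, prefixes=None, allowed=None):
    """Return the names that pass the prefix filter and the name-list filter.

    prefixes: comma-delimited string of name prefixes, or None for no filter.
    allowed:  iterable of exact sequence names, or None for no filter.
    """
    # An empty prefix string is treated as "no prefix filter".
    prefix_tuple = tuple(p for p in (prefixes or "").split(",") if p) or None
    allowed_set = set(allowed) if allowed is not None else None

    selected = []
    for name in names:
        if prefix_tuple is not None and not name.startswith(prefix_tuple):
            continue
        if allowed_set is not None and name not in allowed_set:
            continue
        selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical sequence names.
    names = ["HG002#1#chr1", "HG002#2#chr1", "HG003#1#chr1", "CHM13#1#chr1"]
    print(select_queries(names, prefixes="HG002,CHM13"))
    print(select_queries(names, allowed=["HG003#1#chr1"]))
```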
Target filtering option name changes
- Renamed target prefix filtering option from `-P`/`--target-prefix` to `-T`/`--target-prefix` for consistency.
- Renamed target list filtering option from `-A`/`--target-list` to `-R`/`--target-list`.
All-to-all alignment script improvements
- Updated `scripts/all2all_jobs.py` to:
  - Support grouping by genome, haplotype, or contig.
  - Allow specifying different grouping levels for target and query sequences.
  - Directly generate wfmash command lines.
- Added `scripts/make_source_targball.sh` to generate a source tarball for releases.
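As a rough illustration of grouping at different levels, the Python sketch below derives group prefixes from sequence names and assembles one command line per target group. It does not reproduce `scripts/all2all_jobs.py`. It assumes PanSN-style names (`sample#haplotype#contig`), a single FASTA passed as a positional argument, and hypothetical helper names; only the `-T` and `-Q` flags come from the changelog above.

```python
import shlex
from collections import defaultdict

# Number of '#'-separated PanSN fields kept at each grouping level.
LEVEL_DEPTH = {"genome": 1, "haplotype": 2, "contig": 3}


def group_key(name, level):
    """Return the name prefix identifying `name`'s group at `level`."""
    depth = LEVEL_DEPTH[level]
    key = "#".join(name.split("#")[:depth])
    # Keep the trailing '#' so that "HG002#" cannot match "HG0021#...".
    return key + "#" if depth < 3 else key


def group_names(names, level):
    """Map each group prefix to the names it contains, in input order."""
    groups = defaultdict(list)
    for name in names:
        groups[group_key(name, level)].append(name)
    return dict(groups)


def build_commands(names, fasta, target_level, query_level):
    """Build one command line per target group."""
    targets = list(group_names(names, target_level))
    queries = list(group_names(names, query_level))
    commands = []
    for target in targets:
        # Skip query groups that contain, or are contained in, the target.
        others = [q for q in queries
                  if not (q.startswith(target) or target.startswith(q))]
        if others:
            argv = ["wfmash", "-T", target, "-Q", ",".join(others), fasta]
            commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # Hypothetical names and file name.
    names = ["HG002#1#chr1", "HG002#2#chr1", "HG003#1#chr1"]
    for cmd in build_commands(names, "seqs.fa", "haplotype", "genome"):
        print(cmd)
```

Here targets are grouped per haplotype and queries per genome, which shows why separate grouping levels are useful: each job stays small on the target side while still covering whole genomes on the query side.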
Build and testing updates
- Added back `rt` library to CMake configuration.
- Updated CI tests to run on the `main` branch.
- Adjusted CI test cases for the subset of the LPA dataset.
Bug fixes
- Fixed a `heap-use-after-free` error in `wflign_affine_wavefront()`.