ExpansionHunter-v5.0.0

@ctsa released this 20 Aug 15:13

ExpansionHunter v5.0.0 introduces substantial changes to accelerate analysis of large genome-wide STR catalogs, in addition to build system improvements, bug fixes and catalog updates.

Large Catalog Support

  • ExpansionHunter genotyping can now be accelerated across multiple threads by using the new --threads option.
  • To further support large catalog analysis, streaming mode memory requirements have been decreased by a factor of 20.
  • Together with additional runtime optimizations, a catalog of 240,441 STRs can now be genotyped on a 35x human sample in less than 31 minutes (on 16 threads) with 25 GB of memory.
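A multi-threaded run can be scripted along the following lines. This is a minimal sketch, not an official wrapper: only `--threads` is named in these notes; the remaining flags (`--reads`, `--reference`, `--variant-catalog`, `--output-prefix`, `--analysis-mode`) are assumed from the tool's command-line documentation, and the file names and the `build_command` helper are hypothetical.

```python
import shlex


def build_command(reads, reference, catalog, output_prefix,
                  threads=16, analysis_mode="streaming"):
    """Assemble an ExpansionHunter argument list (hypothetical helper).

    Only --threads comes from these release notes; the other flags are
    assumptions based on the tool's documented command line.
    """
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return [
        "ExpansionHunter",
        "--reads", reads,
        "--reference", reference,
        "--variant-catalog", catalog,
        "--output-prefix", output_prefix,
        "--analysis-mode", analysis_mode,
        "--threads", str(threads),
    ]


# Placeholder file names; substitute real paths before running.
cmd = build_command("sample.bam", "reference.fa",
                    "variant_catalog.json", "sample_out")
print(shlex.join(cmd))
# To execute the command: subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.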

Comparison of v5.0.0 to v4.0.2 for a catalog of 37,413 STRs on a 35x human sample:

Version  Analysis Mode  Threads  Wall Time (mm:ss)  Peak RSS (GB)
v4.0.2   streaming      1        93:10              85.6
v5.0.0   streaming      16       4:35               3.5
v4.0.2   seeking        1        272:23             0.5
v5.0.0   seeking        16       19:00              1.1
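The speedups implied by the table can be worked out with a few lines of arithmetic. The sketch below uses only the figures listed above; the `RUNS` dictionary and the helper names are illustrative. Note that the ratios compare a 1-thread v4.0.2 run against a 16-thread v5.0.0 run, so they combine the effect of threading with the other runtime optimizations.

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' wall-time string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (version, mode) -> (wall time mm:ss, peak RSS in GB), copied from the table.
RUNS = {
    ("v4.0.2", "streaming"): ("93:10", 85.6),
    ("v5.0.0", "streaming"): ("4:35", 3.5),
    ("v4.0.2", "seeking"): ("272:23", 0.5),
    ("v5.0.0", "seeking"): ("19:00", 1.1),
}


def speedup(mode):
    """Wall-time ratio of v4.0.2 (1 thread) to v5.0.0 (16 threads)."""
    old = to_seconds(RUNS[("v4.0.2", mode)][0])
    new = to_seconds(RUNS[("v5.0.0", mode)][0])
    return old / new


def memory_ratio(mode):
    """Peak RSS of v5.0.0 relative to v4.0.2 (below 1 means less memory)."""
    return RUNS[("v5.0.0", mode)][1] / RUNS[("v4.0.2", mode)][1]


for mode in ("streaming", "seeking"):
    print(f"{mode}: {speedup(mode):.1f}x faster, "
          f"{memory_ratio(mode):.2f}x the peak memory")
```

On these numbers, streaming mode is roughly 20x faster while using a small fraction of the memory, whereas seeking mode is roughly 14x faster at the cost of about 2.2x the peak memory.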

Additional Updates

  • Added definitions for GIPC1 repeats and sorted all catalogs
  • Updated depth calculation to include singleton reads
  • BAM/CRAM input to streaming mode is no longer required to be sorted or indexed
  • Bug fixes
    • Fixed minor errors in depth computation when analyzing large catalogs in streaming mode
    • Fixed a shadowing issue that prevented the new MinimalLocusCoverage setting from taking effect (#137 from @Lenbok)
  • Reorganized source code and build system
    • The build system has been redesigned to download most third-party libraries. An active internet connection is now required to build from source.

Contributors: @ctsa, @yjqiu, @egor-dolzhenko


ExpansionHunter-v5.0.0-linux_x86_64.tar.gz and ExpansionHunter-v5.0.0-macOS.tar.gz are binary distributions for 64-bit Linux and macOS, respectively.