Releases: Illumina/ExpansionHunter
ExpansionHunter-v5.0.0
ExpansionHunter v5.0.0 introduces substantial changes to accelerate analysis of large genome-wide STR catalogs, in addition to build system improvements, bug fixes and catalog updates.
Large Catalog Support
- ExpansionHunter genotyping can now be accelerated across multiple threads by using the new `--threads` option.
- To further support large catalog analysis, streaming mode memory requirements have been decreased by a factor of 20.
- Together with additional runtime optimizations, a catalog of 240,441 STRs can now be genotyped on a 35x human sample in less than 31 minutes (on 16 threads) with 25 GB of memory.
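As a rough illustration of how the new option fits into a run, the sketch below assembles an ExpansionHunter command line in Python without executing it. The file names are placeholders and the helper `build_expansionhunter_command` is hypothetical; the flags other than `--threads` are taken from the tool's general usage rather than from these notes, so check them against `ExpansionHunter --help` for your version.

```python
import shlex


def build_expansionhunter_command(reads, reference, catalog, output_prefix,
                                  threads=16, analysis_mode="streaming"):
    """Assemble an ExpansionHunter argument list (hypothetical helper; does not run the tool)."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return [
        "ExpansionHunter",
        "--reads", reads,                    # BAM/CRAM input (placeholder path)
        "--reference", reference,            # reference FASTA (placeholder path)
        "--variant-catalog", catalog,        # STR catalog JSON (placeholder path)
        "--output-prefix", output_prefix,    # prefix for output files
        "--analysis-mode", analysis_mode,    # "seeking" or "streaming"
        "--threads", str(threads),           # option introduced in v5.0.0
    ]


cmd = build_expansionhunter_command(
    "sample.bam", "reference.fa", "variant_catalog.json", "sample_out"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.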
Comparison of v5.0.0 to v4.0.2 for a catalog of 37,413 STRs on a 35x human sample:
Version | Analysis Mode | Threads | Wall Time (mm:ss) | Peak RSS (GB) |
---|---|---|---|---|
v4.0.2 | streaming | 1 | 93:10 | 85.6 |
v5.0.0 | streaming | 16 | 4:35 | 3.5 |
v4.0.2 | seeking | 1 | 272:23 | 0.5 |
v5.0.0 | seeking | 16 | 19:00 | 1.1 |
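
To make the table easier to interpret, the following sketch converts the wall times to seconds and computes the v4.0.2 to v5.0.0 ratios for each analysis mode. The numbers are copied from the table; the function names are illustrative only. Note that the v5.0.0 runs use 16 threads and the v4.0.2 runs use 1, so the speedup combines threading with the other optimizations.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' wall time (minutes may exceed 59) to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (wall time, peak RSS in GB) copied from the comparison table
runs = {
    "streaming": {"v4.0.2": ("93:10", 85.6), "v5.0.0": ("4:35", 3.5)},
    "seeking": {"v4.0.2": ("272:23", 0.5), "v5.0.0": ("19:00", 1.1)},
}


def compare(mode):
    """Return (wall-time speedup, new/old peak RSS ratio) for one analysis mode."""
    old_time, old_rss = runs[mode]["v4.0.2"]
    new_time, new_rss = runs[mode]["v5.0.0"]
    speedup = to_seconds(old_time) / to_seconds(new_time)
    rss_ratio = new_rss / old_rss
    return speedup, rss_ratio


for mode in runs:
    speedup, rss_ratio = compare(mode)
    print(f"{mode}: {speedup:.1f}x faster, peak RSS x{rss_ratio:.2f}")
```

For these figures, streaming mode comes out roughly 20x faster with about 4% of the previous peak memory, while seeking mode is roughly 14x faster but its peak RSS rises from 0.5 GB to 1.1 GB.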
Additional Updates
- Added definitions for GIPC1 repeats and sorted all catalogs
- Updated depth calculation to include singleton reads
- BAM/CRAM input to streaming mode is no longer required to be sorted or indexed
- Bug fixes
- Reorganized source code and build system
- The build system has been redesigned to download most third-party libraries. An active internet connection is now required to build from source.
Contributors: @ctsa, @yjqiu, @egor-dolzhenko
ExpansionHunter-v5.0.0-linux_x86_64.tar.gz and ExpansionHunter-v5.0.0-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v4.0.2
Added definitions of the NIPA1, GLS, RFC1, and PABPN1 repeats to all catalogs, and of the NOTCH2NL repeat to the hg38 catalog only (defining the NOTCH2NL repeat for the hg19 reference does not appear to be straightforward). Improved alignment accuracy of in-repeat reads located in regions containing multiple long STRs with the same motif. Thank you to Andreas Halman for identifying this issue.
Contributors: @egor-dolzhenko, @kscheffler, @felixschlesinger, @yjqiu.
ExpansionHunter-v4.0.2-linux_x86_64.tar.gz and ExpansionHunter-v4.0.2-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v4.0.1
This is the first release of ExpansionHunter version 4. This version introduces a new STR genotyping algorithm that can handle ambiguously-mapping reads better than before. This enables ExpansionHunter to genotype some repeats more accurately. In particular, the program should perform much better on PHOX2B and other polyalanine repeats.
Contributors: @egor-dolzhenko, @kscheffler, @felixschlesinger, @yjqiu.
ExpansionHunter-v4.0.1-linux_x86_64.tar.gz and ExpansionHunter-v4.0.1-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.2.2
This is a minor release that fixes a bug caused by secondary alignments with an empty sequence.
ExpansionHunter-v3.2.2-linux_x86_64.tar.gz and ExpansionHunter-v3.2.2-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.2.0
This release introduces a low locus-depth VCF filter and updates GraphTools to the latest version.
ExpansionHunter-v3.2.0-linux_x86_64.tar.gz and ExpansionHunter-v3.2.0-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.1.2
This release introduces an improved algorithm for the detection of in-repeat read pairs. This change will result in better size estimates of long C9orf72 and FMR1 repeat expansions.
ExpansionHunter-v3.1.2-linux_x86_64.tar.gz and ExpansionHunter-v3.1.2-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.0.1
This is a minor release that fixes a bug in the extraction of reads from some CRAM files.
ExpansionHunter-v3.0.0
This is the full v3.0.0 release. It introduces a major update to the program aimed at increasing its flexibility and accuracy. To accommodate the new features, this version makes a significant number of backward-incompatible changes compared to v2.x.x.
The new features of the program are described in our application note on bioRxiv. Briefly:
- Repeat regions are now represented by sequence graphs
- Loci containing multiple adjacent repeats (and even small variants) are allowed
- Polyalanine repeats and other repeats whose structure can be defined using IUPAC degenerate base codes can be genotyped
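
To illustrate what a degenerate motif covers, here is a minimal Python sketch that expands standard IUPAC codes into a regular expression and counts how many consecutive motif copies start a sequence. It is not how ExpansionHunter works internally (the program uses sequence graphs); the function names and the example sequence are made up for illustration. The motif `GCN` matches all four alanine codons (GCA, GCC, GCG, GCT).

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a motif with IUPAC degenerate bases into a regex fragment."""
    return "".join(IUPAC[base] for base in motif.upper())


def count_leading_repeats(sequence, motif):
    """Count consecutive motif copies at the start of the sequence."""
    pattern = re.compile(f"(?:{motif_to_regex(motif)})*")
    match = pattern.match(sequence.upper())
    return len(match.group(0)) // len(motif)


# Five alanine codons (GCA GCG GCT GCC GCG) followed by non-repeat sequence
print(count_leading_repeats("GCAGCGGCTGCCGCGTTT", "GCN"))
```

A plain `GCA` motif would match only the first codon of this sequence, which is why degenerate codes are needed to describe polyalanine repeats as a single unit.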
We have also released a tool for visualizing alignments of reads overlapping STRs. It should be very useful for assessing the accuracy of ExpansionHunter calls.
See documentation and release notes for pre-releases v3.0.0-rc1 through v3.0.0-rc5 for more information.
ExpansionHunter-v3.0.0-linux_x86_64.tar.gz and ExpansionHunter-v3.0.0-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.0.0-rc5
This release introduces the following changes:
- Several bugs in processing of regions with very high or low coverage were fixed
- Regions with coverage below 10x are now removed from the analysis
- Algorithm for calculating size confidence intervals was improved to better handle high-coverage regions
- Format of the JSON output files was changed (see documentation) and now includes estimated coverage at each region
Huge thank you to @christopher-schroeder for testing the program and reporting multiple bugs.
ExpansionHunter-v3.0.0-rc5-linux_x86_64.tar.gz and ExpansionHunter-v3.0.0-rc5-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.
ExpansionHunter-v3.0.0-rc4
This is a minor release that fixes bugs in variant catalog ingestion and read summarization components.
ExpansionHunter-v3.0.0-rc4-linux_x86_64.tar.gz and ExpansionHunter-v3.0.0-rc4-macOS.tar.gz are binary distributions for 64-bit Linux and macOS respectively.