bcftools release 1.17:
Download the source code here: bcftools-1.17.tar.bz2. (The "Source code" downloads are generated by GitHub and are incomplete: they don't bundle HTSlib and are missing some generated files.)
Changes affecting the whole of bcftools, or multiple commands:
-
The -i/-e filtering expressions
- Error checks were added to prevent incorrect use of vector arithmetic. For example, when evaluating the sum of two vectors A and B, the resulting vector could contain nonsense values when the input vectors were not of the same length. The fix introduces the following logic:
- evaluate to C_i = A_i + B_i when length(A)==length(B) and set length(C)=length(A)
- evaluate to C_i = A_i + B_0 when length(B)=1 and set length(C)=length(A)
- evaluate to C_i = A_0 + B_i when length(A)=1 and set length(C)=length(B)
- throw an error when length(A)!=length(B) AND length(A)!=1 AND length(B)!=1
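The four rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in these notes, not the bcftools implementation; the function name add_vectors is hypothetical.

```python
def add_vectors(a, b):
    """Element-wise sum of two vectors following the four rules listed above.

    Hypothetical helper for illustration only; not part of bcftools.
    """
    if len(a) == len(b):
        # C_i = A_i + B_i, length(C) = length(A)
        return [x + y for x, y in zip(a, b)]
    if len(b) == 1:
        # C_i = A_i + B_0, length(C) = length(A)
        return [x + b[0] for x in a]
    if len(a) == 1:
        # C_i = A_0 + B_i, length(C) = length(B)
        return [a[0] + y for y in b]
    # Lengths differ and neither vector is a single value
    raise ValueError(
        f"incompatible vector lengths: {len(a)} and {len(b)}"
    )
```

For example, add_vectors([1, 2, 3], [10]) adds 10 to every element, while add_vectors([1, 2], [1, 2, 3]) raises an error instead of silently producing nonsense values.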
- Arrays in Number=R tags can now be subscripted by alleles found in FORMAT/GT. For example,
    FORMAT/AD[GT] > 10          .. require support of more than 10 reads for each allele
    FORMAT/AD[0:GT] > 10        .. same as above, but in the first sample
    sSUM(FORMAT/AD[GT]) > 20    .. require total sample depth bigger than 20
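To make the GT subscript concrete, here is a minimal Python sketch of what selecting Number=R values by genotype could look like for a single sample. It is an illustration under stated assumptions, not bcftools code: the helper name ad_by_gt is hypothetical, and skipping missing alleles and counting each allele only once are assumptions of this sketch rather than documented behavior.

```python
import re


def ad_by_gt(gt, ad):
    """Return the AD values of the alleles present in one sample's genotype.

    gt: genotype string such as "0/1" or "1|2"
    ad: per-allele depths (Number=R: reference first, then each ALT)

    Hypothetical helper. Assumptions of this sketch: missing alleles (".")
    are skipped and each allele index is used once, so a homozygous
    genotype selects a single value.
    """
    selected = []
    for token in re.split(r"[/|]", gt):
        if token == ".":
            continue
        index = int(token)
        if index not in selected:
            selected.append(index)
    return [ad[i] for i in selected]


# A 0/2 genotype selects the depths of the reference and the second ALT allele
depths = ad_by_gt("0/2", [12, 0, 15])
total = sum(depths)
```

With these inputs, depths holds the two selected values and total is their sum, which is the quantity a sample-wise sum over the subscripted array would compare against a threshold.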
-
The commands consensus -H and +split-vep -H
- Drop unnecessary leading space in the first header column and newly print #[1]columnName instead of the previous # [1]columnName (#1856)
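Scripts that parse this header may want to accept both the old and the new form. The following Python sketch shows one way to do so; the function name parse_header_columns is hypothetical, and it assumes that column names contain no whitespace.

```python
import re


def parse_header_columns(line):
    """Extract column names from a header such as '#[1]CHROM\t[2]POS'.

    Works for both the old form ('# [1]CHROM') and the new one ('#[1]CHROM').
    Hypothetical helper; assumes column names contain no whitespace.
    """
    return [name for _, name in re.findall(r"\[(\d+)\](\S+)", line)]


old_style = parse_header_columns("# [1]CHROM\t[2]POS\t[3]REF")
new_style = parse_header_columns("#[1]CHROM\t[2]POS\t[3]REF")
```

Because the pattern keys on the [N] prefix rather than on the characters after the leading #, both header styles yield the same list of names.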
Changes affecting specific commands:
-
bcftools +allele-length
- Fix overflow for indels longer than 512bp and aggregate alleles equal or larger than that in the same bin (#1837)
-
bcftools annotate
-
bcftools call
-
bcftools consensus
- BREAKING CHANGE: the option -I, --iupac-codes newly outputs IUPAC codes based on FORMAT/GT of all samples. The -s, --samples and -S, --samples-file options can be used to subset samples. In order to ignore samples and consider only the REF and ALT columns (the original behavior prior to 1.17), run with -s - (#1828)
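As an illustration of what "IUPAC codes based on FORMAT/GT" means for a simple SNV site, the Python sketch below collects the alleles carried by the selected samples and maps them to the standard IUPAC ambiguity code. It is not the bcftools implementation: the function name iupac_from_genotypes is hypothetical, only single-base alleles are handled, and falling back to REF when every genotype is missing is an assumption of this sketch.

```python
import re

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases
IUPAC = {
    frozenset(bases): code
    for bases, code in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
        "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
    }.items()
}


def iupac_from_genotypes(ref, alts, genotypes):
    """Return the IUPAC code for the bases present in the given genotypes.

    Hypothetical helper for SNV sites only (single-base REF and ALT).
    Assumption of this sketch: missing alleles are ignored, and REF is
    returned when no allele is called in any sample.
    """
    alleles = [ref] + list(alts)
    bases = set()
    for gt in genotypes:
        for token in re.split(r"[/|]", gt):
            if token != ".":
                bases.add(alleles[int(token)].upper())
    if not bases:
        return ref.upper()
    return IUPAC[frozenset(bases)]
```

For instance, a site with REF A and ALT G yields R when at least one selected sample carries both alleles between them, and plain A when all selected samples are homozygous reference.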
-
bcftools convert
- Make variantkey conversion work for sites without an ALT allele (#1806)
-
bcftools csq
- Fix a bug where a MNV with multiple consequences (e.g. missense + stop_gained) would report only the less severe one (#1810)
- GFF file parsing was made slightly more flexible: IDs can now be just XXX rather than, for example, gene:XXX
- New gff2gff perl script to fix GFF formatting differences
-
bcftools +fill-tags
- More of the available annotations are now added by the -t all option
-
bcftools +fixref
- New INFO/FIXREF annotation
- New -m swap mode
-
bcftools +mendelian
- The +mendelian plugin has been deprecated and replaced with +mendelian2. The function of the plugin is the same, but the command line options and the output format have changed, and for this reason it was introduced as a new plugin.
-
bcftools mpileup
- Most of the annotations generated by mpileup are now optional via the -a, --annotate option, and several new (mostly experimental) annotations were added.
- New option --indels-2.0 for an EXPERIMENTAL indel calling model. This model aims to address some known deficiencies of the current indel calling algorithm; specifically, it uses a diploid reference consensus sequence. Note that in the current version it has the potential to increase sensitivity but at the cost of decreased specificity.
- Make the FS annotation (Fisher exact test strand bias) functional and remove it from the default annotations
-
bcftools norm
- New --multi-overlaps option allows overlapping alleles to be set either to the ref allele (the current default) or to a missing allele (#1764 and #1802)
- Fixed a bug in -m - which did not split missing FORMAT values correctly and could lead to empty FORMAT fields such as :: instead of the correct :.: (#1818)
- The --atomize option previously would not split complex indels such as C>GGG. Newly these will be split into two records C>G and C>CGG (#1832)
-
bcftools query
- Fix a rare bug where the printing of SAMPLE field with query was incorrectly suppressed when the -e option contained a sample expression while the formatting query did not. See #1783 for details.
-
bcftools +setGT
-
bcftools +split-vep
- New options -g, --gene-list and --gene-list-fields which allow prioritizing consequences from a list of genes, or restricting output to the listed genes
- New -H, --print-header option to print the header with -f
- Work around a bug in the LOFTEE VEP plugin used to annotate gnomAD VCFs. There the LoF_info subfield contains commas which, in general, makes it impossible to parse the VEP subfields. The +split-vep plugin can now work with such files, replacing the offending commas with slash (/) characters. See also Ensembl/ensembl-vep#1351
- Newly the -c, --columns option can be omitted when a subfield is used in an -i/-e filtering expression. Note that -c may still have to be given when it is not possible to infer the type of the subfield. Note that this is an experimental feature.
-
bcftools stats
- The per-sample stats (PSC) would not be computed when -i/-e filtering options and the -s - option were given but the expression did not include sample columns (#1835)
-
bcftools +tag2tag
- Revamp of the plugin to allow a wider range of tag conversions, specifically all combinations from FORMAT/GL,PL,GP to FORMAT/GL,PL,GP,GT
-
bcftools +trio-dnm2
- New -n, --strictly-novel option to downplay alleles which violate Mendelian inheritance but are not novel
- Allow the --pn and --pns options to be set separately for SNVs and indels, and make the indel settings more strict by default
- Output missing FORMAT/VAF values in non-trio samples, rather than random nonsense values
-
bcftools +variant-distance
- New option -d, --direction to choose the directionality: forward, reverse, nearest (the default) or both (#1829)
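The four directions can be pictured with a small Python sketch that computes distances between sorted variant positions on one chromosome. This is an illustration only, not the plugin's code: the function name variant_distances is hypothetical, and it assumes that "forward" means the distance to the next variant, "reverse" the distance to the previous one, and that a missing neighbour is reported as None.

```python
def variant_distances(positions, direction="nearest"):
    """Distances between neighbouring variants for sorted positions.

    Hypothetical helper. Assumptions of this sketch: "forward" is the
    distance to the next variant, "reverse" to the previous one, "nearest"
    the smaller of the two, and "both" a (previous, next) pair. None marks
    a missing neighbour at either end.
    """
    result = []
    for i, pos in enumerate(positions):
        prev_dist = pos - positions[i - 1] if i > 0 else None
        next_dist = positions[i + 1] - pos if i + 1 < len(positions) else None
        if direction == "forward":
            result.append(next_dist)
        elif direction == "reverse":
            result.append(prev_dist)
        elif direction == "both":
            result.append((prev_dist, next_dist))
        elif direction == "nearest":
            known = [d for d in (prev_dist, next_dist) if d is not None]
            result.append(min(known) if known else None)
        else:
            raise ValueError(f"unknown direction: {direction}")
    return result
```

For positions 100, 150 and 400, the nearest distances are 50, 50 and 250, while the forward distances are 50, 250 and a missing value for the last variant.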