1.14
Download the source code here: bcftools-1.14.tar.bz2. (The "Source code" downloads are generated by GitHub and are incomplete as they don't bundle HTSlib and are missing some generated files.)
Changes affecting the whole of bcftools, or multiple commands
- New --regions-overlap and --targets-overlap options which address a long-standing design problem with subsetting VCF files by region. BCFtools recognizes two sets of options, one for streaming (-t/-T) and one for index-jumping (-r/-R). They behave differently: the first includes only records whose POS coordinate lies within the regions, while the other also includes records that overlap the regions. The two new options make it possible to modify the default behaviour; see the man page for more details.
- The --output-type option can be used to override the default compression level
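The difference between the two matching behaviours described above can be sketched in a few lines of Python. This is only an illustration, not BCFtools code: the `Record` class and both function names are hypothetical, and a record is assumed to span from POS to POS plus the REF length minus one.

```python
from dataclasses import dataclass


@dataclass
class Record:
    pos: int  # 1-based POS coordinate
    ref: str  # REF allele; the record is assumed to span pos .. pos + len(ref) - 1


def pos_in_region(rec: Record, start: int, end: int) -> bool:
    """Streaming-style match: only POS has to lie inside the region."""
    return start <= rec.pos <= end


def record_overlaps_region(rec: Record, start: int, end: int) -> bool:
    """Index-style match: any base of the record overlapping the region counts."""
    rec_end = rec.pos + len(rec.ref) - 1
    return rec.pos <= end and rec_end >= start


# A record whose REF allele starts just before the region 100-200 and runs into it
deletion = Record(pos=98, ref="ACGTA")
print(pos_in_region(deletion, 100, 200))           # False
print(record_overlaps_region(deletion, 100, 200))  # True
```

A record like this one is skipped by the first test and kept by the second, which is the kind of discrepancy the new options are meant to control.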
Changes affecting specific commands
- bcftools annotate:
  - when --set-id and --remove are combined, --set-id cannot use tags deleted by --remove. This is now detected and the program exits with an informative error message instead of segfaulting (#1540)
  - while non-symbolic variants are uniquely identified by POS, REF, ALT, symbolic alleles starting at the same position were indistinguishable. This prevented correct matching of records with the same position and variant type but different length given by INFO/END (samtools/htslib@60977f2). When annotating from a VCF/BCF, the matching is done automatically. When annotating from a tab-delimited text file, this feature can be invoked by using -c INFO/END.
  - add a new "." modifier to control whether missing values should be carried over from a tab-delimited file or not. For example:
    - -c TAG .. adds TAG if the source value is not missing. If TAG exists in the target file, it will be overwritten.
    - -c .TAG .. adds TAG even if the source value is missing. This can overwrite non-missing values with a missing value and can create empty VCF fields (TAG=.)
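The effect of the "." modifier on missing values can be modelled with a small Python sketch. It is a simplified illustration under stated assumptions, not the annotate implementation: `apply_column` is a hypothetical helper, INFO fields are represented as a plain dictionary, and a missing value is the string ".".

```python
MISSING = "."


def apply_column(target_info: dict, column_spec: str, source_value: str) -> dict:
    """Return a copy of target_info after applying one column of the -c list.

    column_spec is either "TAG" or ".TAG"; the leading dot means that
    missing source values are carried over as well.
    """
    carry_missing = column_spec.startswith(".")
    tag = column_spec[1:] if carry_missing else column_spec
    result = dict(target_info)
    if source_value != MISSING or carry_missing:
        # Existing values are overwritten, as described in the notes above
        result[tag] = source_value
    return result


target = {"TAG": "5"}
print(apply_column(target, "TAG", "7"))   # {'TAG': '7'}
print(apply_column(target, "TAG", "."))   # {'TAG': '5'}
print(apply_column(target, ".TAG", "."))  # {'TAG': '.'}
```

The last call shows how the dotted form can replace a non-missing value with a missing one.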
- bcftools +check-ploidy:
  - by default missing genotypes are not used when determining ploidy. With the new option -m, --use-missing it is possible to use the information carried in the missing and half-missing genotypes (e.g. ".", "./." or "./1")
- bcftools concat:
  - new --ligate-force and --ligate-warn options for finer control of -l, --ligate behavior in imperfect overlaps. The new default is to throw an error when sites present in one chunk but absent in the other are encountered. To drop such sites and proceed, use the new --ligate-warn option (previously this was the default). To keep such sites, use the new --ligate-force option (#1567).
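The three ways of handling imperfect overlaps can be summarised with a toy Python model. It is a deliberately simplified sketch, not how bcftools concat is implemented: `ligate_overlap` and its mode names are hypothetical, sites are reduced to bare positions, and phasing is ignored entirely.

```python
def ligate_overlap(sites_a, sites_b, mode="error"):
    """Combine the site positions of two chunks within their overlap.

    mode "error": fail if a site is present in only one chunk (new default)
    mode "warn":  drop such sites and proceed (the previous default)
    mode "force": keep such sites
    """
    if mode not in ("error", "warn", "force"):
        raise ValueError(f"unknown mode: {mode}")
    shared = set(sites_a) & set(sites_b)
    mismatched = set(sites_a) ^ set(sites_b)
    if mismatched and mode == "error":
        raise ValueError(f"sites present in only one chunk: {sorted(mismatched)}")
    if mode == "force":
        return sorted(shared | mismatched)
    return sorted(shared)


chunk_a = [100, 150, 200]
chunk_b = [100, 200]
print(ligate_overlap(chunk_a, chunk_b, mode="warn"))   # [100, 200]
print(ligate_overlap(chunk_a, chunk_b, mode="force"))  # [100, 150, 200]
```

With the default mode the same input raises an error, mirroring the new default described above.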
- bcftools consensus:
  - Apply mask even when the VCF has no notion of the chromosome. It was possible to encounter this problem when contig lines were not present in the VCF header and no variants were called on that chromosome (#1592)
- bcftools +contrast:
  - support for chunking within a map/reduce framework, allowing NASSOC counts to be collected even for empty case/control sample sets (#1566)
- bcftools csq:
  - bug fix, compound indels were not recognised in some cases (#1536)
  - compound variants were incorrectly marked as 'inframe' even when a stop codon would occur before the frame was restored (#1551)
  - bug fix, FORMAT/BCSQ bitmasks could have been assigned incorrectly to some samples at multiallelic sites, a superset of the correct consequences would have been set (#1539)
  - bug fix, the upstream stop could be falsely assigned to all samples in a multi-sample VCF even if the stop was relevant for a single sample only (#1578)
  - further improve the detection of mismatching chromosome naming (e.g. "chrX" vs "X") in the GFF, VCF and fasta files
- bcftools merge:
  - keep (sum) INFO/AN,AC values when merging VCFs with no samples (#1394)
- bcftools mpileup:
  - new --indel-size option which allows the maximum considered indel size to be increased; large deletions in long read data are otherwise lost.
- bcftools norm:
  - atomization now supports Number=A,R string annotations (#1503)
  - assign as many alternate alleles as possible to genotypes at multiallelic sites in the -m + mode, disregarding the phase. Previously the program assumed it was being run as the inverse operation of -m -, but when that was not the case, reference alleles would have been filled in instead of multiple alternate alleles (#1542)
- bcftools sort:
  - increase accuracy of the --max-mem option limit; previously the limit could be exceeded by more than 20% (#1576)
- bcftools +trio-dnm:
  - new --with-pAD option to allow processing of VCFs without FORMAT/QS. The existing --ppl option was changed to the analogous --with-pPL
- bcftools view:
  - the functionality of the option --compression-level lost in 1.12 has been restored