Status: In progress
This is a guide to clarify the use of positions in the POS, END, and MATE_POS fields for structural variants inferred at base-pair resolution and rendered to VCF 4.1 (or higher). The VCF spec leaves many clues on this topic, but there is considerable room for off-by-one errors at both the VCF generation and parsing stages.
All discussions below are based on the forward DNA strand, with "left" used to denote positions with a lower reference coordinate relative to "right" positions.
Most SVs will have some amount of breakpoint homology, and its representation is already reasonably well defined in the VCF spec. The guide below follows the NGS community convention of standardizing variants by describing each local breakend at the left-most position within its homology range.
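As a rough illustration of the left-shifting convention, the sketch below slides a deletion leftwards for as long as the resulting allele is unchanged. The function name `left_shift_deletion`, the 1-based inclusive interval arguments, and the toy reference string are all assumptions made for this sketch (the sequence is synthetic, not Platinum Genomes data), and the check that keeps one anchor base to the left is a simplification.

```python
from typing import Tuple


def left_shift_deletion(ref: str, first: int, last: int) -> Tuple[int, int]:
    """Return the left-most equivalent of a deletion of bases first..last.

    Coordinates are 1-based and inclusive; `ref` is the forward-strand
    reference sequence, so position p is ref[p - 1].
    """
    # Moving the deletion one base to the left gives the same allele exactly
    # when the base just before the deletion equals the last deleted base.
    # `first > 2` keeps at least one reference base to the left as an anchor
    # (an assumption of this sketch).
    while first > 2 and ref[first - 2] == ref[last - 1]:
        first -= 1
        last -= 1
    return first, last


# Toy reference: GG followed by three copies of ATC, then TT.
toy_ref = "GGATCATCATCTT"

# Deleting the third ATC copy (bases 9..11) is equivalent to deleting
# the first copy (bases 3..5), which is the left-shifted form.
shifted = left_shift_deletion(toy_ref, 9, 11)  # (3, 5)
```

In this toy case the deletion can sit anywhere within a 6 bp homology range, and the left-most placement is the one that would be reported.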
All examples are drawn from public Platinum Genomes data for NA12878.
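POS indicates the right-most position before crossing the left breakend of the left-shifted deletion, that is, the last reference base before the deleted sequence.
END indicates the right-most position before crossing the right breakend of the left-shifted deletion, that is, the last deleted base.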
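The off-by-one relationship between the deleted interval and the POS/END pair can be sketched as follows. The helper names `deletion_to_pos_end` and `pos_end_to_deletion` are hypothetical, coordinates are assumed to be 1-based and already left-shifted, and the example interval is a toy value rather than a real call.

```python
from typing import Tuple


def deletion_to_pos_end(first_deleted: int, last_deleted: int) -> Tuple[int, int]:
    """Map a 1-based inclusive deleted interval to VCF (POS, END)."""
    pos = first_deleted - 1  # last base before the left breakend
    end = last_deleted       # last base before the right breakend
    return pos, end


def pos_end_to_deletion(pos: int, end: int) -> Tuple[int, int, int]:
    """Recover (first_deleted, last_deleted, deleted_length) from POS and END."""
    first_deleted = pos + 1
    last_deleted = end
    deleted_length = end - pos  # equals last_deleted - first_deleted + 1
    return first_deleted, last_deleted, deleted_length


# Toy example: bases 3..5 are deleted.
pos, end = deletion_to_pos_end(3, 5)         # POS = 2, END = 5
interval = pos_end_to_deletion(pos, end)     # (3, 5, 3)
```

Note that POS is excluded from the deleted sequence while END is included, so the number of deleted bases is END minus POS, with no extra plus or minus one.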