Formalized Nomenclature
The standard human sequence variant nomenclature of the Human Genome Variation Society (HGVS) is used to describe gene variants and their consequences at RNA and protein levels in an unambiguous manner.
The standard nomenclature is provided by the HGVS in textual format and the recommendations are divided across DNA, RNA and protein sections. This makes it difficult to get the overview necessary to design detailed and clear rules for standard nomenclature checkers, such as Mutalyzer.
Approaching the standard nomenclature as a scientific sublanguage, we have created two formal descriptions of the syntax in Extended Backus-Naur Form (EBNF): one at the [DNA-RNA level](http://www.biomedcentral.com/content/supplementary/1471-2105-12-s4-s5-s1.pdf) and one at the protein level. Please note that for backwards compatibility these descriptions include rules for several description types used in previous nomenclature versions.
The DNA and RNA variant nomenclature EBNF v.2.0.0 has been used to generate the context-free nomenclature parser of the current Mutalyzer 2 Syntax Checker. Updated versions of the EBNFs used in future versions of the parser will be made available.
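To give a feel for how EBNF rules translate into a context-free parser, here is a minimal sketch in Python. The grammar in the comment is a hypothetical toy subset written for this example. It is not the published EBNF v.2.0.0, and the `Parser`, `ParseError` and `parse` names are illustrative only, not part of Mutalyzer.

```python
# Hypothetical toy grammar (NOT the published EBNF):
#   Var   = Coord , "." , Pos , ( Subst | Del ) ;
#   Coord = "c" | "g" | "n" | "m" ;
#   Pos   = Digit , { Digit } ;
#   Subst = Nt , ">" , Nt ;
#   Del   = "del" ;
#   Nt    = "A" | "C" | "G" | "T" ;


class ParseError(ValueError):
    """Raised when the input does not match the toy grammar."""


class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Current character, or "" at end of input.
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, literal):
        # Consume a fixed terminal such as "." or ">".
        if not self.text.startswith(literal, self.pos):
            raise ParseError(f"expected {literal!r} at offset {self.pos}")
        self.pos += len(literal)

    def one_of(self, chars, what):
        # Consume one character from a set of single-character terminals.
        ch = self.peek()
        if ch == "" or ch not in chars:
            raise ParseError(f"expected {what} at offset {self.pos}")
        self.pos += 1
        return ch

    def number(self):
        # Pos = Digit , { Digit } ;
        start = self.pos
        while self.peek().isdigit():
            self.pos += 1
        if start == self.pos:
            raise ParseError(f"expected position at offset {self.pos}")
        return int(self.text[start:self.pos])

    def var(self):
        # Var = Coord , "." , Pos , ( Subst | Del ) ;
        coord = self.one_of("cgnm", "coordinate type")
        self.expect(".")
        position = self.number()
        if self.text.startswith("del", self.pos):
            self.expect("del")
            change = {"type": "del"}
        else:
            ref = self.one_of("ACGT", "nucleotide")
            self.expect(">")
            alt = self.one_of("ACGT", "nucleotide")
            change = {"type": "subst", "ref": ref, "alt": alt}
        if self.pos != len(self.text):
            raise ParseError(f"unexpected input at offset {self.pos}")
        return {"coord": coord, "position": position, **change}


def parse(description):
    """Parse one description with the toy grammar above."""
    return Parser(description).var()
```

Each method corresponds to one grammar rule, so extending the toy grammar means adding a rule to the comment and a matching method. A parser generated from the full EBNF covers far more description types than this sketch.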
Apart from syntactic rules, the standard nomenclature uses semantic rules, which have been recapitulated in:
- Taschner PE, den Dunnen JT. Describing structural changes by extending HGVS sequence variation nomenclature. Hum Mutat. 2011, 32:507–511. doi:10.1002/humu.21427
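The difference between a syntactic and a semantic rule can be sketched with a small Python example. It assumes, purely for illustration, a simple range deletion of the form `c.10_20del` and a single semantic constraint: the range start must not exceed the range end. The `RANGE_DEL` pattern and the `check_range_deletion` function are hypothetical and are not taken from the cited paper or from Mutalyzer.

```python
import re

# Hypothetical pattern for a simple range deletion such as "c.10_20del".
RANGE_DEL = re.compile(r"^[cgnm]\.(\d+)_(\d+)del$")


def check_range_deletion(description):
    """Return 'ok', 'syntax error' or a semantic error message."""
    match = RANGE_DEL.match(description)
    if match is None:
        # The string does not have the expected shape at all.
        return "syntax error"
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        # Well-formed string, but the positions make no sense together.
        return "semantic error: range start exceeds range end"
    return "ok"
```

Under these assumptions, `c.20_10del` passes the syntax check but fails the semantic check, which is the kind of error a grammar alone cannot catch.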
For more details about the EBNFs and the parser:
- Laros JFJ, Blavier A, den Dunnen JT, Taschner PEM. A formalized description of the standard human variant nomenclature in Extended Backus-Naur Form. BMC Bioinformatics 2011, 12(Suppl 4):S5. doi:10.1186/1471-2105-12-S4-S5