Back Translator
The purpose of the Mutalyzer Back Translator is to help gene variant database curators and researchers to predict the potential nucleotide substitutions underlying variants reported as predictions at the protein level only.
Back translation from amino acid substitutions to nucleotide substitutions is achieved by considering all single nucleotide substitutions that may lead to the observed amino acid change.
Note: if more than one substitution is needed to explain the observed amino acid change, the Back Translator will return no results.
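The enumeration described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard genetic code and one-letter amino acid codes; the names `CODON_TABLE` and `back_translate` are hypothetical and are not taken from the Mutalyzer code base.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def back_translate(ref_aa, alt_aa):
    """Return the set of (offset, ref_base, alt_base) tuples for every
    single nucleotide substitution that turns some codon of ref_aa into
    a codon of alt_aa. offset is the 0-based position within the codon.
    """
    substitutions = set()
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for offset, ref_base in enumerate(codon):
            for alt_base in BASES:
                if alt_base == ref_base:
                    continue
                mutated = codon[:offset] + alt_base + codon[offset + 1:]
                if CODON_TABLE[mutated] == alt_aa:
                    substitutions.add((offset, ref_base, alt_base))
    return substitutions
```

In this sketch, `back_translate("L", "F")` yields five substitutions (one at the first codon position and four at the third), while `back_translate("W", "F")` yields an empty set because Trp to Phe requires two nucleotide changes.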
We distinguish two types of back translation. In one case we know what the reference sequence of the transcript is; in the other case we do not. Knowledge of the transcript reference sequence may be important as illustrated in the following example.
Suppose we have the predicted amino acid substitution p.Leu92Phe. Amongst the possible nucleotide substitutions that may lead to this description are c.276A>T, c.276G>T and c.274C>T. By knowing the reference sequence of the transcript at position 276, two out of these three options can be discarded.
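The effect of knowing the reference codon can be illustrated with a short Python sketch. It assumes the standard genetic code and that amino acid position p corresponds to coding positions 3(p - 1) + 1 through 3p, which agrees with the positions 274 to 276 used for residue 92 here. The function name and the reference codons passed to it are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def back_translate_with_reference(ref_codon, alt_aa, protein_position):
    """List the c. substitutions that change the known reference codon
    into a codon for alt_aa (one-letter code)."""
    # Coding position of the first base of the codon (assumption: c.1 is
    # the first base of the start codon).
    first = 3 * (protein_position - 1) + 1
    results = []
    for offset, ref_base in enumerate(ref_codon):
        for alt_base in BASES:
            if alt_base == ref_base:
                continue
            mutated = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
            if CODON_TABLE[mutated] == alt_aa:
                results.append(f"c.{first + offset}{ref_base}>{alt_base}")
    return results
```

If the reference codon for Leu92 were `CTC`, for instance, `back_translate_with_reference("CTC", "F", 92)` would leave only `c.274C>T`; with `TTA` it would leave only substitutions of the A at position 276.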
The Mutalyzer Back Translator has two methods of retrieving the transcript reference sequence:
- Directly, by providing the transcript accession number (e.g., NM_003002.3).
- Indirectly, by providing the protein accession number (e.g., NP_002993.1).
In the latter case, an attempt is made to link the protein accession number to the corresponding transcript accession number using the NCBI databases. When this succeeds, the two methods provide equally reliable results. Otherwise, a warning is issued and the fallback method described below is used.
Even if the transcript reference is not known, it is possible to do a meaningful back translation by considering all possible reference codons. In general this method will yield more possibilities, but not always. The amino acid substitution p.Asp92Tyr, for example, can only be explained by the single nucleotide substitution c.274G>T, so in this case the lack of a transcript reference sequence is not detrimental to our results.
The Mutalyzer Back Translator is aware of all amino acid substitutions whose back translation becomes more specific when a nucleotide reference sequence is supplied. If a back translation of such a substitution is requested without one, warnings are issued to inform the user that the predictions could be improved.
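One way to decide whether a reference sequence would narrow the result is to compare the substitutions found for each individual reference codon with the union over all reference codons. The sketch below assumes the standard genetic code and uses hypothetical function names; it is not necessarily how Mutalyzer implements this check.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitutions_from(codon, alt_aa):
    """Single nucleotide substitutions turning codon into a codon of alt_aa."""
    found = set()
    for offset, ref_base in enumerate(codon):
        for alt_base in BASES:
            if alt_base == ref_base:
                continue
            mutated = codon[:offset] + alt_base + codon[offset + 1:]
            if CODON_TABLE[mutated] == alt_aa:
                found.add((offset, ref_base, alt_base))
    return found


def reference_would_help(ref_aa, alt_aa):
    """True if knowing the reference codon could shrink the candidate list."""
    per_codon = [
        substitutions_from(codon, alt_aa)
        for codon, aa in CODON_TABLE.items()
        if aa == ref_aa
    ]
    union = set().union(*per_codon)
    # A reference helps when at least one codon explains fewer
    # substitutions than the reference-free union does.
    return any(found != union for found in per_codon)
```

Under these assumptions, `reference_would_help("D", "Y")` is false, matching the p.Asp92Tyr case above, whereas `reference_would_help("L", "F")` is true.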
For the following substitution, the Back Translator can find a link to the transcript reference sequence.
NP_002993.1:p.Asp92Glu
The results are fully HGVS compliant and can be used directly in the Name Checker:
NM_003002.3:c.276C>A
NM_003002.3:c.276C>G
For the next substitution, the link to the transcript reference sequence cannot be found. Without this knowledge the number of possibilities cannot be restricted, which is detrimental to the output.
NP_000000.0:p.Leu92Phe
Two warnings will be issued: one stating that no nucleotide reference sequence could be found; the other informing the user that the back translation could be improved by supplying this reference sequence. In addition, a list of possible nucleotide substitutions is given.
UNKNOWN:c.274C>T
UNKNOWN:c.276G>T
UNKNOWN:c.276G>C
UNKNOWN:c.276A>C
UNKNOWN:c.276A>T
These variant descriptions cannot be used directly in the Name Checker or other interfaces because they lack the required accession numbers.