Name Checker
The Mutalyzer Name Checker can check the correctness of a variant description under the following conditions:
- The Syntax Checker is able to parse the description.
- A valid Reference Sequence record is provided.
- The reference sequence record contains all the sequence affected by the variant description.
- The reference sequence record annotation contains sufficient information to support the selected position numbering scheme.
- The semantic nomenclature rules applicable to the variant description are supported by the Name Checker.
If you are not familiar with the HGVS standard human sequence variant nomenclature, try the Name Generator first or check Variant Descriptions.
The Name Checker expects sequence variant descriptions in the following format:
<accession number>.<version number>:<sequence type>.<variant>
Example
NM_003002.1:c.5delC
AL449423.14:g.61866_85191del
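The format above can be split into its parts with a regular expression. The following is a minimal sketch, not Mutalyzer's own parser: the function name `split_description` and the pattern are assumptions made for illustration, and the pattern only covers accession numbers that look like the two examples (letters, an optional underscore, digits).

```python
import re

# Hypothetical pattern for <accession number>.<version number>:<sequence type>.<variant>
DESCRIPTION_RE = re.compile(
    r"^(?P<accession>[A-Za-z]+_?\d+)"  # e.g. NM_003002 or AL449423
    r"\.(?P<version>\d+)"              # version number after the dot
    r":(?P<seq_type>[a-z])"            # sequence type, e.g. c or g
    r"\.(?P<variant>.+)$"              # everything after the type prefix
)


def split_description(description):
    """Split a variant description into accession, version, type and variant."""
    match = DESCRIPTION_RE.match(description.strip())
    if match is None:
        raise ValueError(f"not in the expected format: {description!r}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


print(split_description("NM_003002.1:c.5delC"))
print(split_description("AL449423.14:g.61866_85191del"))
```

A description without a version number, such as NM_003002:c.5delC, does not match this pattern and raises a ValueError in this sketch.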
If the reference sequence record contains multiple genes, transcript variants or protein isoforms and position numbering becomes ambiguous, this format is extended to:
<accession number>.<version number><(Gene Symbol)>:<sequence type>.<variant>
The gene symbol has to be extended with transcript variant or protein isoform numbers (e.g., _v001 or _i001, respectively), if multiple transcript variants or protein isoforms are annotated.
Example
The genomic description AL449423.14:g.61866_85191del is equivalent to the following unambiguous descriptions:
- 8 descriptions relative to CDKN2A transcript variants:
  AL449423.14(CDKN2A_v001):c.-271-u19352_234del
  AL449423.14(CDKN2A_v002):c.5_400del
  AL449423.14(CDKN2A_v003):n.1-u19623_508del
  AL449423.14(CDKN2A_v004):n.42_437del
  AL449423.14(CDKN2A_v005):n.449+371_705del
  AL449423.14(CDKN2A_v006):n.481+371_565del
  AL449423.14(CDKN2A_v007):n.53+371_859+d18212del
  AL449423.14(CDKN2A_v008):n.1-u23242_84del
- 1 description relative to an MTAP transcript variant:
  AL449423.14(MTAP_v005):n.*60994-u23670_*60994-u345del
- 2 descriptions relative to CDKN2B transcript variants:
  AL449423.14(CDKN2B_v001):c.*3084+d8453_*3084+d31778del
  AL449423.14(CDKN2B_v002):c.*303+d11537_*303+d34862del
- 1 description relative to a C9orf53 transcript variant:
  AL449423.14(C9orf53_v001):c.*312+d3374_*312+d26699del
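The extended format, with its optional gene symbol and _v or _i suffix, can be taken apart in the same spirit. This is a rough sketch under stated assumptions: `parse_extended` and its pattern are hypothetical names, not part of Mutalyzer, and gene symbols are assumed to consist of letters, digits and hyphens only.

```python
import re

# Hypothetical pattern for
# <accession number>.<version number><(Gene Symbol)>:<sequence type>.<variant>
EXTENDED_RE = re.compile(
    r"^(?P<reference>[^(:]+)"                   # accession.version
    r"(?:\((?P<gene>[A-Za-z0-9-]+)"             # optional gene symbol
    r"(?:_(?P<kind>[vi])(?P<number>\d+))?\))?"  # optional _v001 / _i001 suffix
    r":(?P<seq_type>[a-z])\.(?P<variant>.+)$"
)

KINDS = {"v": "transcript variant", "i": "protein isoform"}


def parse_extended(description):
    """Return the parts of a description; absent parts are None."""
    match = EXTENDED_RE.match(description.strip())
    if match is None:
        raise ValueError(f"not in the expected format: {description!r}")
    parts = match.groupdict()
    if parts["kind"] is not None:
        parts["kind"] = KINDS[parts["kind"]]
        parts["number"] = int(parts["number"])
    return parts


print(parse_extended("AL449423.14(CDKN2A_v002):c.5_400del"))
print(parse_extended("AL449423.14:g.61866_85191del"))
```

For a description without a gene symbol, the gene, kind and number fields are simply None, which mirrors the case where position numbering is already unambiguous.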
The Name Checker will try to regenerate the variant sequence and apply the semantic rules of the HGVS standard human sequence variant nomenclature to name it accordingly.
Before presenting the results of the analysis, the Mutalyzer Name Checker issues warnings when it corrects an entry, when it encounters inconsistencies or incomplete sequences or annotation, and when it identifies variations with potential effects on splicing. Errors are generated when an entry cannot be processed properly (see the conditions mentioned above).
Click the link below for a Name Checker output example:
Within the input box:
- The submitted description
- Top sequence: part of the reference sequence affected by the variant with 25 nucleotide upstream and downstream flanking sequences in 5' to 3' orientation
- Bottom sequence: the variant sequence with 25 nucleotide upstream and downstream flanking sequences in 5' to 3' orientation
The raw variant description shows the variation type and the position of the variant from the start of the reference sequence.
- The "View original variant in UCSC Genome Browser" link. Click this link to see the Mutalyzer custom variant track in the UCSC Genome Browser. Please note that the Base Position track displayed as Full will show amino acid codons for the forward orientation of the chromosomal reference sequence, whereas the codon affected might be on the reverse strand.
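The flanking-sequence display described above (the affected part plus 25 nucleotides on either side) can be illustrated with plain string slicing. This is only a sketch: the function name `with_flanks` is hypothetical, positions are taken as 1-based and inclusive, and truncating the flanks at the ends of the reference sequence is an assumption rather than documented Mutalyzer behaviour.

```python
def with_flanks(reference, start, end, flank=25):
    """Return (upstream, affected, downstream) for 1-based inclusive positions."""
    left = max(start - 1 - flank, 0)          # do not run past the 5' end
    right = min(end + flank, len(reference))  # do not run past the 3' end
    upstream = reference[left:start - 1]
    affected = reference[start - 1:end]
    downstream = reference[end:right]
    return upstream, affected, downstream


reference = "A" * 30 + "CCC" + "G" * 30
upstream, affected, downstream = with_flanks(reference, 31, 33)
print(upstream, affected, downstream)
```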
The genomic description of the variant using the reference sequence specified (only shown for genomic sequence records). If the reference sequence annotation contains mapping to a chromosomal reference sequence, the corresponding description will be listed under the heading: Alternative chromosomal position
Only shown for transcript sequence records. (Not for use in LSDBs in case of protein-coding transcripts.) The description of the variant using the non-coding transcript position numbering. The link should not be used in combination with protein-coding transcripts, since the n. position will be interpreted as a c. position!
Lists all descriptions relative to transcript variants of genes affected by the variant. These descriptions are not predictions of variant effects at the RNA level.
Note: Substitution descriptions for genes transcribed in the opposite orientation will use the reverse complement of the nucleotides shown in the genomic description. Positions of insertions and deletions in those transcripts can shift in the opposite direction due to the position shift rule: according to the standard nomenclature, a deletion of a G in a stretch of G's is described using the position of the most 3' G.
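The note above can be made concrete with a small sketch. The helper names `reverse_complement` and `shift_deletion_3prime` are hypothetical, positions are 1-based and inclusive, and the code covers only simple deletions; it is an illustration of the rule, not Mutalyzer's implementation.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def shift_deletion_3prime(reference, start, end):
    """Move a deletion (1-based, inclusive) to its most 3' equivalent position.

    Shifting by one base gives the same variant sequence exactly when the
    first deleted base equals the base directly after the deletion.
    """
    while end < len(reference) and reference[start - 1] == reference[end]:
        start += 1
        end += 1
    return start, end


forward = "ATGGGGC"                    # G stretch at positions 3 to 6
print(shift_deletion_3prime(forward, 3, 3))

reverse = reverse_complement(forward)  # the same stretch, now read as C's
print(reverse, shift_deletion_3prime(reverse, 2, 2))
```

On the forward strand the deletion ends up at position 6. On the opposite strand the same stretch occupies positions 2 to 5 and the deletion ends up at position 5, which corresponds to forward position 3: the two descriptions have shifted in opposite directions.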
Lists all descriptions relative to protein isoforms of genes affected by the variant. The protein variant descriptions following the p. prefix are shown between parentheses to indicate that they are predictions. The descriptions are generated by translation of the variant coding sequence under the simple assumption that the annotated splice sites unaffected by the variant are still used.
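As a rough illustration of predicting a protein by translating the variant coding sequence, the sketch below translates an already spliced coding sequence with the standard genetic code and reports the first amino acid that differs. It does not model splicing at all; the names `translate` and `first_difference` and the toy sequences are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence up to and including the first stop (*)."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)


def first_difference(reference_protein, variant_protein):
    """Return the 1-based position of the first differing residue, or None."""
    for position, (a, b) in enumerate(zip(reference_protein, variant_protein), 1):
        if a != b:
            return position
    if len(reference_protein) != len(variant_protein):
        return min(len(reference_protein), len(variant_protein)) + 1
    return None


reference_cds = "ATGGCTTGA"  # toy coding sequence
variant_cds = "ATGTCTTGA"    # the same sequence with c.4G>T applied
print(translate(reference_cds), translate(variant_cds))
print(first_difference(translate(reference_cds), translate(variant_cds)))
```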
Only displayed when descriptions relative to a specific transcript or protein are checked.
Reference protein sequence in single letter amino acid code. Amino acids affected by the variant are shown in red.
Predicted variant protein sequence in single letter amino acid code. Amino acids not present in the reference protein are shown in red.
Transcript information extracted from the reference sequence annotation, presented in tabular format. Lists all exons of the transcript with their corresponding numbers, genomic (g.) start and end positions and coding DNA (c.) start and end positions.
Lists the Coding sequence (CDS) start and end positions extracted from the reference sequence annotation.
Lists all restriction sites that are created or deleted by the variant. Restriction sites are identified in the sequence using Biopython. The list is created by comparing the restriction sites present in the reference sequence with those present in the variant sequence.
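The comparison step can be sketched without Biopython. The example below is a simplified stand-in, not the Biopython-based analysis Mutalyzer uses: it searches for a tiny, hand-picked table of three palindromic recognition sites (so scanning one strand is enough), and the names `ENZYME_SITES`, `count_sites` and `compare_restriction_sites` are hypothetical.

```python
# Small illustrative table of recognition sequences (all palindromic).
ENZYME_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def count_sites(sequence):
    """Count occurrences of each recognition sequence, overlaps included."""
    seq = sequence.upper()
    return {
        name: sum(
            1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        )
        for name, site in ENZYME_SITES.items()
    }


def compare_restriction_sites(reference, variant):
    """Report enzymes that gain or lose sites in the variant sequence."""
    before = count_sites(reference)
    after = count_sites(variant)
    return {
        "created": sorted(n for n in ENZYME_SITES if after[n] > before[n]),
        "deleted": sorted(n for n in ENZYME_SITES if after[n] < before[n]),
    }


print(compare_restriction_sites("TTGAATTCAA", "TTGGATCCAA"))
```

Comparing counts rather than mere presence means that losing one of two sites for the same enzyme is still reported as a deletion.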
Lists all genes, transcript variants and protein isoforms extracted from the reference sequence annotation and the method to link them to each other.
Allows the user to download the reference sequence file.
Example
AB026906.1:c.3_4insG
AB026906.1:c.[1del;4G>T]
AL449423.14(CDKN2A_v1):c.1_10del
UD_127955523176(DMD_v002):c.136G>T
LRG_1t1:c.266G>T