Terminology
This is the naming convention used across the web interfaces:
Object Type | Name | Example |
---|---|---|
SRA run accession | run | https://serratus.io/query?run=ERR2756788 |
GenBank accession | genbank | https://serratus.io/query?genbank=EU769558.1 |
Viral family name | family | https://serratus.io/query?family=Coronaviridae |
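As a minimal sketch of how the names in the table map onto query URLs, the Python snippet below builds a link from a name and a value. The base URL and the three parameter names are taken from the examples above; the helper `build_query_url` and the constant names are hypothetical and not part of any Serratus codebase.

```python
from urllib.parse import urlencode

# Base URL and parameter names as they appear in the table's example links.
BASE_URL = "https://serratus.io/query"
VALID_PARAMS = ("run", "genbank", "family")


def build_query_url(param, value):
    """Return a query URL such as https://serratus.io/query?run=ERR2756788.

    Hypothetical helper for illustration only.
    """
    if param not in VALID_PARAMS:
        raise ValueError(f"unknown query parameter: {param!r}")
    # urlencode percent-escapes any characters that are unsafe in a URL.
    return f"{BASE_URL}?{urlencode({param: value})}"


print(build_query_url("run", "ERR2756788"))
print(build_query_url("genbank", "EU769558.1"))
print(build_query_url("family", "Coronaviridae"))
```

Rejecting names outside the three listed in the table keeps a typo such as `genebank` from silently producing a link with an unrecognized parameter.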
Because this project is biological in nature, some of its terminology may be unfamiliar to the average developer:
Family is a rank in the classification of organisms (taxonomy). Viral families (e.g. Coronaviridae) are of particular interest for Serratus. Family is the highest taxonomic level at which our data is organized.
https://serratus.io/query?family=Coronaviridae
The Sequence Read Archive (SRA) is a public bioinformatics database that contains raw DNA sequencing data from most open access studies.
SRA entities are publicly accessioned at four hierarchical levels, each with its own set of possible accession prefixes:
- Study: SRP, ERP, DRP
- Sample: SRS, ERS, DRS
- Experiment: SRX, ERX, DRX
- Run: SRR, ERR, DRR
Serratus aims to analyze all Run accessions.
https://serratus.io/query?run=ERR2756788
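The prefix list above can be sketched as a small lookup that reports which SRA level an accession belongs to. The prefixes come from the list; the function `sra_level`, the constant names, and the assumption that an accession is a three-letter prefix followed only by digits are illustrative and not taken from Serratus itself.

```python
import re

# Accession prefixes per SRA level, copied from the list above.
SRA_PREFIXES = {
    "Study": ("SRP", "ERP", "DRP"),
    "Sample": ("SRS", "ERS", "DRS"),
    "Experiment": ("SRX", "ERX", "DRX"),
    "Run": ("SRR", "ERR", "DRR"),
}

# Invert the mapping so each prefix points to its level.
PREFIX_TO_LEVEL = {
    prefix: level
    for level, prefixes in SRA_PREFIXES.items()
    for prefix in prefixes
}

# Assumption: three letters followed by one or more digits.
ACCESSION_PATTERN = re.compile(r"^([A-Z]{3})(\d+)$")


def sra_level(accession):
    """Return the SRA level name for an accession, or None if unrecognized."""
    match = ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        return None
    return PREFIX_TO_LEVEL.get(match.group(1))


print(sra_level("ERR2756788"))   # a run accession from the example URL
print(sra_level("EU769558.1"))   # a GenBank accession, so not an SRA level
```

Since Serratus works at the run level, a check such as `sra_level(acc) == "Run"` could be used to validate a value before passing it as the `run` query parameter.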
GenBank is another public database that contains DNA sequences. A GenBank accession represents a single sequence submission in which raw reads (SRA run-level data) have been processed and aligned. Importantly, Serratus analysis results associate GenBank accessions directly with SRA runs. Each GenBank accession can also be associated with a particular family.