Home
Welcome to the Glycopeptide Sequence Finder Wiki! This wiki is your centralized resource for exploring and understanding predicted glycoproteomes across various species. The goal is to provide clear, detailed, and accessible documentation for researchers and scientists working with glycoproteomics data in lesser-studied organisms.
This wiki serves as a comprehensive guide to:
- Species Biology: Learn about the biological context and unique characteristics of each species.
- Glycobiology Insights: Discover detailed information on glycosylation patterns and glycoprotein profiles.
- Practical Use Cases: Explore how glycopeptide data can be applied in areas such as biomarker discovery, drug target identification, and comparative studies.
- Sample Types & Preparation: Find protocols and best practices for preparing different sample types, including blood, tissue, and cultured cells.
- Datasets: Access and review the datasets that underpin our predictions.
- References: Consult key literature, tools, and resources that support the data and methodologies presented.
- Species Pages: Each species has its own dedicated page with detailed sections on biology, glycobiology, sample preparation, and more.

- Extraction Guides: General Guide to Extracting Glycopeptides from Various Sample Types. This page covers step-by-step protocols and best practices for extracting glycopeptides from a range of sample types.
- Linkage Types: Guide to Glycoprotein Linkage Types. Learn about different glycoprotein linkage types, their biological significance, and methods for identification and analysis.

- Browse the Wiki: Use the sidebar to navigate between species pages and other sections.
- Review Protocols: Check out the extraction guides and datasets to understand the experimental setups and data structures.
- Deep Dive into Glycobiology: Explore our detailed guide on glycoprotein linkage types to enhance your analysis and understanding.
- Contribute: We welcome contributions and feedback! Please refer to the contribution guidelines in our GitHub repository if you have suggestions or improvements.
For more technical details and the latest updates on the Glycopeptide Sequence Finder, please visit our GitHub Repository.
We hope this wiki enhances your research and deepens your understanding of glycopeptide sequences. Thank you for exploring it!
— Richard Shipman
Test proteome FASTA files from UniProt are available in the test_proteomes folder. The species gathered are listed below. Only Swiss-Prot reviewed proteins were downloaded, so not every sequence available for a species is included.
I used these test proteomes to generate a zoo of glycopeptides under constrained conditions so that the output fits in a GitHub repo. To build the full zoo, remove the constraints in the batch processing script.
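For readers who want to fetch a comparable reviewed-only proteome themselves, the following is a minimal sketch that builds a download URL for UniProt's public REST streaming endpoint from a taxon ID. It is not necessarily how the files in test_proteomes were produced; the function name `reviewed_fasta_url` is hypothetical, and the query fields (`organism_id`, `reviewed`) are assumed to match the current UniProt query syntax.

```python
from urllib.parse import urlencode

# Public UniProtKB streaming endpoint (assumed; check the UniProt API docs).
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"


def reviewed_fasta_url(taxon_id: int) -> str:
    """Build a URL for the Swiss-Prot (reviewed) entries of one taxon as FASTA.

    Hypothetical helper for illustration only.
    """
    query = f"(organism_id:{taxon_id}) AND (reviewed:true)"
    params = {"query": query, "format": "fasta"}
    return f"{UNIPROT_STREAM}?{urlencode(params)}"


if __name__ == "__main__":
    # 9606 is the taxon ID listed for Human in the table below.
    print(reviewed_fasta_url(9606))
```

The resulting URL can be passed to any HTTP client (for example `curl` or `urllib.request.urlopen`) to save the FASTA file locally.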
Species template: template_species.md
Common Name | Scientific Name | Taxon ID |
---|---|---|
Alpaca | Vicugna pacos | 30538 |
Amoeba | Naegleria gruberi | 5762 |
Anemone | Nematostella vectensis | 45351 |
Ant | Camponotus floridanus | 104421 |
Apple | Malus domestica | 3750 |
Arabidopsis | Arabidopsis thaliana | 3702 |
Aspergillus fumigatus | Aspergillus fumigatus (strain ATCC MYA-4609 / CBS 101355 / FGSC A1100 / Af293) | 330879 |
Aspergillus nidulans | Emericella nidulans (strain FGSC A4 / ATCC 38163 / CBS 112.46 / NRRL 194 / M139) | 227321 |
Avocado | Persea americana | 3435 |
Banana | Musa acuminata | 4641 |
Barley | Hordeum vulgare | 4513 |
Bat | Myotis lucifugus | 59463 |
Black Cherry | Prunus serotina | 23207 |
Black Truffle | Tuber melanosporum (strain Mel28) | 656061 |
Blood Fluke | Schistosoma mansoni | 6183 |
Brine Shrimp | Artemia franciscana | 6661 |
Brown Alga | Ectocarpus siliculosus | 2880 |
Bushbaby | Otolemur garnettii | 30611 |
Camel | Camelus bactrianus | 9837 |
Candida albicans (Yeast, human pathogen) | Candida albicans (strain SC5314 / ATCC MYA-2876) | 237561 |
Cat | Felis catus | 9685 |
C. elegans | Caenorhabditis elegans | 6239 |
Chameleon | Anolis carolinensis | 28377 |
Charcoal Rot | Macrophomina phaseolina (strain MS6) | 1126212 |
Chicken | Gallus gallus | 9031 |
Chimpanzee | Pan troglodytes | 9598 |
Chinchilla | Chinchilla lanigera | 34839 |
C. jejuni | Campylobacter jejuni | 197 |
Coffee | Coffea arabica | 13443 |
Cow | Bos taurus | 9913 |
Crocodile | Crocodylus porosus | 8502 |
Cryptococcus | Cryptococcus neoformans var. neoformans serotype D (strain JEC21 / ATCC MYA-565) | 214684 |
Cytomegalovirus | Human cytomegalovirus (strain Merlin) | 295027 |
Corn Smut | Mycosarcoma maydis | 5270 |
Date Palm | Phoenix dactylifera | 42345 |
Debaryomyces hansenii (yeast) | Debaryomyces hansenii (strain ATCC 36239 / CBS 767 / BCRC 21394 / JCM 1990 / NBRC 0083 / IGC 2968) | 284592 |
Deer Tick | Ixodes scapularis | 6945 |
Diatom | Thalassiosira pseudonana | 35128 |
Dictyostelium | Dictyostelium discoideum | 44689 |
Dog | Canis lupus familiaris | 9615 |
Donkey | Equus asinus | 9793 |
Duck | Cairina moschata | 8855 |
Dugbe Virus | Dugbe virus (isolate ArD44313) | 766194 |
Ebola | Zaire ebolavirus (strain Mayinga-76) | 128952 |
Elephant | Loxodonta africana (African Elephant) | 9785 |
Fall Armyworm | Spodoptera frugiperda (Fall Armyworm) | 7108 |
Ferret | Mustela putorius furo | 9669 |
Fission Yeast | Schizosaccharomyces japonicus (strain yFS275 / FY16936) | 402676 |
Frog | Xenopus laevis | 8355 |
Fruit Fly | Drosophila melanogaster | 7227 |
Goat | Capra hircus | 9925 |
Gorilla | Gorilla gorilla gorilla | 9595 |
Grape | Vitis vinifera | 29760 |
Green Alga | Chlamydomonas reinhardtii | 3055 |
Guinea Pig | Cavia porcellus | 10141 |
Hamster | Mesocricetus auratus | 10036 |
Hemp | Cannabis sativa | 3483 |
HHV-1 | Human herpesvirus 1 (strain 17) | 10299 |
HIV-1 | Human immunodeficiency virus type 1 group N (isolate YBF30) | 388818 |
HIV-2 | Human immunodeficiency virus type 2 subtype A (isolate BEN) | 11714 |
Honeybee | Apis mellifera | 7460 |
Horse | Equus caballus | 9796 |
HRSV S-2 | Human respiratory syncytial virus A (strain S-2) | 410078 |
Human | Homo sapiens | 9606 |
Influenza B | Influenza B virus (strain B/Lee/1940) | 518987 |
Influenza C | Influenza C virus (strain C/Ann Arbor/1/1950) | 11553 |
JEV | Japanese encephalitis virus (strain M28) | 2555554 |
Kidney Bean | Phaseolus vulgaris | 3885 |
Kluyveromyces lactis (lactate processing yeast) | Kluyveromyces lactis (strain ATCC 8585 / CBS 2359 / DSM 70799 / NBRC 1267 / NRRL Y-1140 / WM37) | 284590 |
LASV | Lassa virus (strain Mouse/Sierra Leone/Josiah/1976) | 11622 |
LCMV | Lymphocytic choriomeningitis virus (strain Armstrong) | 11624 |
Lemur | Microcebus murinus | 30608 |
Macaque (Rhesus monkey) | Macaca mulatta | 9544 |
Maize | Zea mays | 4577 |
Measles virus | Measles virus (strain Ichinose-B95a) | 645098 |
Monkey (cynomolgus, crab-eating) | Macaca fascicularis | 9541 |
Mosquito (African malaria) | Anopheles gambiae | 7165 |
Mouse | Mus musculus | 10090 |
Naked Mole Rat | Heterocephalus glaber | 10181 |
Nematode (roundworm) | Caenorhabditis briggsae | 6238 |
Norovirus | Norovirus (strain Human/NoV/United States/Norwalk/1968/GI) | 524364 |
Octopus | Octopus vulgaris | 6645 |
Olive | Olea europaea | 4146 |
Opossum | Monodelphis domestica | 13616 |
Orange | Citrus sinensis | 2711 |
Orangutan | Pongo abelii | 9601 |
Oyster | Magallana gigas | 29159 |
Paramecium | Paramecium tetraurelia | 5888 |
Peach | Prunus persica | 3760 |
Penicillium | Penicillium rubens (strain ATCC 28089 / DSM 1075 / NRRL 1951 / Wisconsin 54-1255) | 500485 |
Pig (Domestic) | Sus scrofa domesticus | 9823 |
Platypus | Ornithorhynchus anatinus | 9258 |
Poplar Leaf Rust Fungus | Melampsora larici-populina (strain 98AG31 / pathotype 3-4-7) | 747676 |
Potato | Solanum tuberosum | 4113 |
Psilocybe mushroom | Psilocybe cubensis | 181762 |
Pufferfish | Takifugu rubripes | 31033 |
Rabbit | Oryctolagus cuniculus | 9986 |
Rat | Rattus norvegicus | 10116 |
Red Alga | Cyanidioschyzon merolae (strain NIES-3377 / 10D) | 280699 |
Rice | Oryza sativa subsp. japonica | 39947 |
Rice Blast Fungus | Pyricularia oryzae (strain 70-15 / ATCC MYA-4617 / FGSC 8958) | 242507 |
Rice Fish (Japanese) | Oryzias latipes | 8090 |
RVA | Rotavirus A (isolate RVA/Monkey/South Africa/SA11-H96/1958/G3P5B[2]) | 450149 |
RVB | Rotavirus B (isolate RVB/Human/China/ADRV/1982) | 10942 |
RVC | Rotavirus C (isolate RVC/Human/United Kingdom/Bristol/1989) | 31567 |
SARS-CoV | SARS-CoV (Severe Acute Respiratory Syndrome Coronavirus) | 694009 |
SFTSV | SFTS phlebovirus (isolate SFTSV/Human/China/HB29/2010) | 992212 |
Shark | Callorhinchus milii | 7868 |
Sheep | Ovis aries | 9940 |
Silk Moth | Bombyx mori | 7091 |
Silveira (Coccidioides Silveira strain) | Coccidioides posadasii (strain RMSCC 757 / Silveira) | 443226 |
Snake (Brown Eastern) | Pseudonaja textilis | 8673 |
Softshell Turtle | Pelodiscus sinensis | 13735 |
Spike Moss (lycophyte) | Selaginella moellendorffii | 88036 |
Sponge | Amphimedon queenslandica | 400682 |
Sorghum | Sorghum bicolor | 4558 |
Squirrel | Ictidomys tridecemlineatus | 43179 |
Starfish | Patiria pectinifera | 7594 |
Strawberry | Fragaria ananassa | 3747 |
Sugarcane | Saccharum officinarum | 4547 |
Sunflower | Helianthus annuus | 4232 |
Sycamore | Platanus occidentalis | 4403 |
Tea plant | Camellia sinensis | 4442 |
Tobacco | Nicotiana tabacum | 4097 |
Tilapia | Oreochromis niloticus | 8128 |
Tomato | Solanum lycopersicum | 4081 |
Trout (Brown) | Salmo trutta | 8032 |
Turkey | Meleagris gallopavo | 9103 |
Urchin | Strongylocentrotus purpuratus | 7668 |
VZV | Varicella-zoster virus (strain Dumas) | 10338 |
Wasp (parasitoid) | Nasonia vitripennis | 7425 |
Watermelon | Citrullus lanatus | 3654 |
Wheat | Triticum aestivum | 4565 |
Whisk fern | Psilotum nudum | 3240 |
Wild Rice (North America) | Oryza nivara | 4536 |
WNV | West Nile virus | 11082 |
XMAn v2 Missense | Homo sapiens - Unknown Mutation Analysis (Human missense peptide library) Download at: https://github.com/lazarlab/XMAn-v2 | 9606 |
XMAn v2 Nonsense | Homo sapiens - Unknown Mutation Analysis (Human nonsense peptide library) Download at: https://github.com/lazarlab/XMAn-v2 | 9606 |
Yak | Bos mutus grunniens | 30521 |
Yeast (Budding, Baker's) | Saccharomyces cerevisiae (strain ATCC 204508 / S288c) | 559292 |
Yeast (Fission) | Schizosaccharomyces pombe (strain 972 / ATCC 24843) | 284812 |
Zebra Finch | Taeniopygia guttata | 59729 |
Zebrafish | Danio rerio | 7955 |
Zebu | Bos indicus | 9915 |
Zika | Zika virus | 64320 |
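
A table of this size is easy to get out of sync, so it can help to check it programmatically. The sketch below parses rows in the same pipe-delimited layout as the table above and reports taxon IDs that appear on more than one row. The function names `parse_species_rows` and `shared_taxon_ids` are hypothetical, and the sample text is an abbreviated excerpt rather than the full table. A shared ID is not always an error (the XMAn libraries intentionally reuse the human ID), but it flags rows worth a second look.

```python
from collections import defaultdict


def parse_species_rows(table_text):
    """Return (common_name, scientific_name, taxon_id) tuples from table text."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Header and separator rows have no numeric third cell, so skip them.
        if len(cells) != 3 or not cells[2].isdigit():
            continue
        rows.append((cells[0], cells[1], int(cells[2])))
    return rows


def shared_taxon_ids(rows):
    """Map each taxon ID used by more than one row to the common names using it."""
    by_id = defaultdict(list)
    for common_name, _scientific_name, taxon_id in rows:
        by_id[taxon_id].append(common_name)
    return {tid: names for tid, names in by_id.items() if len(names) > 1}


# Abbreviated sample in the same layout as the table above.
SAMPLE = """
Common Name | Scientific Name | Taxon ID |
---|---|---|
Human | Homo sapiens | 9606 |
Mouse | Mus musculus | 10090 |
XMAn v2 Missense | Homo sapiens (missense peptide library) | 9606 |
"""

if __name__ == "__main__":
    print(shared_taxon_ids(parse_species_rows(SAMPLE)))
    # prints {9606: ['Human', 'XMAn v2 Missense']}
```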