LandthalerLab/wastewater_virome

Pipelines and Scripts for Wyler, E. et al. 2024

Wyler, E., Lauber, C., Manukyan, A., Deter, A., Quedenau, C., Alves, L. G. T., ... & Landthaler, M. (2024). Pathogen dynamics and discovery of novel viruses and enzymes by deep nucleic acid sequencing of wastewater. Environment International, 108875.

DOI: https://doi.org/10.1016/j.envint.2024.108875

This repository contains two pipelines for processing RNA and DNA samples generated from wastewater:

  • Kaiju Pipeline for taxonomy classification:

    • Duplicate removal of reads with CD-HIT.
    • Taxonomy classification and annotation with Kaiju.
    • Custom R scripts for summarizing annotated reads per sample.
  • CCTyper Pipeline for predicting Cas proteins.
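
As a rough illustration of the per-sample summarization step in the Kaiju pipeline, the sketch below counts classified reads per NCBI taxon ID. It assumes Kaiju's default tab-separated output, where the first column is `C` (classified) or `U` (unclassified), the second is the read name, and the third is the taxon ID. The repository's own summarization is done with custom R scripts; this Python version, the function name `count_reads_per_taxon`, and the example reads are hypothetical and only show the idea.

```python
import io
from collections import Counter


def count_reads_per_taxon(kaiju_lines):
    """Count classified reads per NCBI taxon ID from Kaiju output lines.

    Assumes the default Kaiju output layout (tab-separated):
    column 1 = "C" or "U", column 2 = read name, column 3 = taxon ID.
    Returns (Counter of taxon ID -> read count, number of unclassified reads).
    """
    counts = Counter()
    unclassified = 0
    for line in kaiju_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, taxon_id = fields[0], fields[2]
        if status == "C":
            counts[taxon_id] += 1
        else:
            unclassified += 1
    return counts, unclassified


# Hypothetical example input standing in for one sample's Kaiju output file.
example = io.StringIO(
    "C\tread_1\t10239\n"
    "C\tread_2\t10239\n"
    "U\tread_3\t0\n"
    "C\tread_4\t2697049\n"
)
counts, unclassified = count_reads_per_taxon(example)
for taxon_id, n_reads in counts.most_common():
    print(taxon_id, n_reads)
print("unclassified", unclassified)
```

Running the function once per sample and collecting the resulting counters would give a taxon-by-sample count table of the kind the downstream analysis works on.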

In addition to the preprocessing pipelines, we provide custom scripts for downstream analysis of both the taxonomy classification and the Cas protein detection results:

  • Metagenomic Analysis of taxonomically classified reads:

    • Retrieving taxonomy ranks and lineages with taxizedb.
    • Processing and PCA of taxonomy counts.
    • Visualizing heatmaps with ComplexHeatmap.
  • Downstream Analysis of reads associated with CRISPR-Cas genes:

    • Clustering open reading frames (ORFs) with CD-HIT.
    • Aligning ORFs to the NCBI NR database with rBLAST.
    • Protein embeddings of ORFs using ProtTrans.
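
For the processing of taxonomy counts listed above, one common preparation before PCA or heatmap visualization is to scale each sample to counts per million (CPM) and apply a log transform, so that samples with different sequencing depths become comparable. The sketch below shows that idea only. Whether the repository's R scripts use this exact normalization is an assumption, and the function name `log_cpm`, the pseudocount, and the sample data are hypothetical.

```python
import math


def log_cpm(counts_by_sample, pseudocount=1.0):
    """Convert raw taxon counts to log2(CPM + pseudocount) per sample.

    counts_by_sample: dict of sample name -> dict of taxon ID -> raw read count.
    Taxa missing from a sample are treated as zero counts, so every sample
    ends up with values for the same set of taxa.
    """
    all_taxa = sorted({t for counts in counts_by_sample.values() for t in counts})
    result = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        row = {}
        for taxon in all_taxa:
            raw = counts.get(taxon, 0)
            cpm = raw / total * 1_000_000 if total > 0 else 0.0
            row[taxon] = math.log2(cpm + pseudocount)
        result[sample] = row
    return result


# Hypothetical counts for two samples with different sequencing depths.
raw_counts = {
    "sample_A": {"10239": 1, "2": 3},
    "sample_B": {"10239": 50},
}
normalized = log_cpm(raw_counts)
for sample, row in normalized.items():
    print(sample, {taxon: round(value, 3) for taxon, value in row.items()})
```

The resulting sample-by-taxon table is the kind of input a PCA or a heatmap would typically be computed from.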
