Library-on-library screen of betacoronavirus stem helix peptides vs S2P6 mutants

Objective

This project studies the evolutionary trajectories of S2P6 toward breadth expansion, using a library-on-library screen of 27 unique betacoronavirus stem helix peptides against 1,024 S2P6 variants.

Dependencies

Input files

Linking barcodes to variants based on PacBio sequencing data

  1. Identify the sequences of stem helix peptide, S2P6 variant, and barcode in each read
    python3 script/PacBio_fastq2seq.py

    • Input file:
      • Fastq file from the PacBio sequencing
    • Output file:
      • data/barcode_SHpep_mutID.tsv
  2. Filter barcodes with low read counts and perform error correction
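
The README does not name a script for the filtering step, so the following is only a minimal sketch of one way it could work. It assumes each PacBio read has already been reduced to a (barcode, stem helix peptide, S2P6 mutant ID) triple, and it treats "error correction" as taking the majority pair for each barcode. The function name `build_barcode_table` and the `min_reads` cutoff are hypothetical, not taken from the repository.

```python
from collections import Counter, defaultdict


def build_barcode_table(reads, min_reads=2):
    """Map each barcode to its consensus (peptide, mutant) pair.

    reads: iterable of (barcode, sh_peptide, mut_id) tuples, one per read.
    min_reads: hypothetical cutoff; barcodes seen fewer times are dropped.
    """
    # Tally how often each (peptide, mutant) pair is observed per barcode.
    pairs_by_barcode = defaultdict(Counter)
    for barcode, sh_peptide, mut_id in reads:
        pairs_by_barcode[barcode][(sh_peptide, mut_id)] += 1

    table = {}
    for barcode, pair_counts in pairs_by_barcode.items():
        total = sum(pair_counts.values())
        if total < min_reads:
            continue  # too few reads to trust this barcode
        pair, n = pair_counts.most_common(1)[0]
        # Keep the barcode only if one pair holds a strict majority of reads.
        if 2 * n > total:
            table[barcode] = pair
    return table
```

A barcode whose reads disagree without a clear majority is discarded rather than assigned, which trades some coverage for fewer mislinked barcodes.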

Analyze the barcode sequencing data for the library-on-library screen

  1. Merging Illumina sequencing reads
    python3 script/merge_reads.py

    • Input file:
      • All .fastq files in [fastq/]
    • Output files:
      • merged files in [fastq/merged]
  2. Counting unique barcode sequences
    python3 script/fastq2count.py

  3. Splitting count file for faster processing
    python3 script/split_count_df.py

  4. Identifying pairs of stem helix peptide and S2P6 mutant
    python3 script/mut_ID.py

  5. Calculating the frequency of each variant
    python3 script/count2freq.py

  6. Calculating the expression scores and binding scores
    python3 script/freq2score.py
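
The README does not state how frequencies and scores are computed. As a rough illustration of the last two steps, the sketch below assumes that a frequency is a pseudocount-adjusted fraction of total reads, and that a score is the log10 ratio of a variant's frequency in a selected sample to its frequency in the input sample. Both formulas, the function names, and the `pseudocount` parameter are assumptions and may differ from what `count2freq.py` and `freq2score.py` actually do.

```python
import math


def counts_to_freq(counts, pseudocount=1):
    """Convert {variant: read count} to {variant: frequency}.

    A pseudocount is added to every variant so that zero counts
    do not produce zero frequencies (assumed, not from the README).
    """
    total = sum(counts.values()) + pseudocount * len(counts)
    return {k: (c + pseudocount) / total for k, c in counts.items()}


def enrichment_scores(input_counts, selected_counts, pseudocount=1):
    """Score each variant as log10(selected frequency / input frequency)."""
    # Use the union of variants so both samples share the same denominator basis.
    variants = set(input_counts) | set(selected_counts)
    pre = counts_to_freq({k: input_counts.get(k, 0) for k in variants}, pseudocount)
    post = counts_to_freq({k: selected_counts.get(k, 0) for k in variants}, pseudocount)
    return {k: math.log10(post[k] / pre[k]) for k in variants}
```

Under these assumptions, a positive score means the variant is enriched after selection and a negative score means it is depleted.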

Plotting

  1. Plot correlation between expression scores and binding scores
    python3 script/plot_replicate_qc.py

  2. Plot correlation between binding scores and effect of different frequency cutoffs
    Rscript script/plot_QC.R
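
To illustrate the idea behind the frequency-cutoff QC, here is a minimal sketch that computes the Pearson correlation between two sets of scores, restricted to variants whose input frequency is at or above a cutoff. It is not the repository's plotting code; the function names, the `min_freq` parameter, and the choice of Pearson correlation are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists; NaN if undefined."""
    n = len(x)
    if n < 2:
        return float("nan")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlation_at_cutoff(scores_1, scores_2, input_freq, min_freq):
    """Correlate two score sets over variants with input_freq >= min_freq.

    Returns (r, n) where n is the number of variants that passed the cutoff.
    """
    shared = sorted(
        k for k in scores_1
        if k in scores_2 and input_freq.get(k, 0.0) >= min_freq
    )
    r = pearson_r([scores_1[k] for k in shared], [scores_2[k] for k in shared])
    return r, len(shared)
```

Calling `correlation_at_cutoff` over a range of `min_freq` values shows how the agreement between score sets changes as low-frequency variants are excluded, and how many variants remain at each cutoff.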