
v2.0.38: Bam processing + Prime Editing updates

kclem released this 02 Jul 01:18
· 358 commits to master since this release
  • Input can now be read from a BAM file using the parameter --bam_input and, optionally, --bam_chr_loc to use only the reads at that location in the BAM file as input.
    An output BAM file is produced with an additional space-separated field prefixed by c2 (e.g. c2:Z:ALN=Inferred CLASS=Inferred_MODIFIED MODS=D47;I0;S0 DEL=56(47) INS= SUB= ALN_REF=TTGGCGGATGTTCCAATCAGTACGCAGAGAGTCGCCGTCTCCAAGGTGAAAGCGGAAGTAGGGCCTTCGCGCACCTCATGGAATCCCTTCTGCAGCACCTGGATCGCTTTTCCGAGCTTCTGGCGGTCTCAAGCACTACCTACGTCAGCACCTGGGACCCCGCCACCGTGCGCCGGGCCTTGCAGTGGGCGCGCTACCTGCGCCACATCCATCGGCGCTTTGGTCGGCATGGCCCCATTCGCACGGCTCT----------------------------------------------- ALN_SEQ=ACACCGGATGTTCCAATCAGTACGCAGAGAGTCGCCGTCTCCAAGGTGAAAGCGGA-----------------------------------------------TCGCTTTTCCGAGCTTCTGGCGGTCTCAAGCACTACCTACGTCAGCACCTGGGACCCCGCCACCGTGCGCCGGGCCTTGCAGTGGGCGCGCTACCTGCGCCACATCCATCGGCGCTTTGGTCGGCATGGCCCCATTCGCACGGCTCTGGAGCGGCGGCTGCACAACCAGTGGAGGCAAGAGGGCGGCTTTGGGC). Note that the alignment details (location, CIGAR string, etc.) are not modified (this may be done in the future). BAM file input cannot be trimmed or pre-processed with quality filtering.

  • Detection of prime editing scaffold incorporation is now more accurate (it looks for the scaffold sequence at the expected position directly after the extension sequence). A plot showing the number of bases matching the scaffold, as well as insertions after the extension sequence, is produced, along with a data file containing these numbers. Added parameter --prime_editing_pegRNA_scaffold_min_match_length to define the minimum match length required to classify a read as 'Scaffold-incorporated'.

  • Renamed the --split_paired_end parameter to --split_interleaved_input for interleaved input

  • Auto mode now considers 5000 reads to detect amplicon sequences

  • Added new parameter --annotate_wildtype_allele to annotate wildtype alleles on the allele plots

  • Updated the output when reporting missing files: only the first 15 files in the current directory and in the directory of the input parameter are listed

  • Links now reference https instead of http
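
The c2 field described in the BAM item above is a run of space-separated KEY=VALUE pairs, so it can be split into a dictionary with a few lines of Python. The following is only a minimal sketch based on the single example shown in these notes: the function names parse_c2_tag and parse_mods are hypothetical and not part of CRISPResso2, and the assumption that each MODS entry is one letter followed by an integer count comes from that example alone.

```python
from typing import Dict


def parse_c2_tag(tag: str) -> Dict[str, str]:
    """Split a c2 tag into its KEY=VALUE fields.

    Hypothetical helper: the layout is inferred from the example in the
    release notes, not from a formal specification.
    """
    prefix = "c2:Z:"
    # Accept the tag with or without its SAM-style "c2:Z:" prefix.
    body = tag[len(prefix):] if tag.startswith(prefix) else tag
    fields: Dict[str, str] = {}
    for token in body.split(" "):
        if not token:
            continue  # tolerate repeated spaces
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"field without '=': {token!r}")
        fields[key] = value  # empty values such as "INS=" are kept as ""
    return fields


def parse_mods(mods: str) -> Dict[str, int]:
    """Turn a MODS value such as 'D47;I0;S0' into {'D': 47, 'I': 0, 'S': 0}.

    Assumes each entry is a single letter followed by an integer.
    """
    return {item[0]: int(item[1:]) for item in mods.split(";") if item}


# Shortened version of the example tag (ALN_REF and ALN_SEQ omitted).
example = "c2:Z:ALN=Inferred CLASS=Inferred_MODIFIED MODS=D47;I0;S0 DEL=56(47) INS= SUB="
fields = parse_c2_tag(example)
mods = parse_mods(fields["MODS"])
print(fields["CLASS"], mods)
```

Keeping the values as plain strings avoids guessing at the meaning of fields such as DEL=56(47); only MODS is converted to numbers, since its format is the most regular in the example.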