PuppetMaster

Tools for manipulating sequencing data, multiple sequence alignments and phylogenetic trees

VARIABLE SITES EXTRACTION

The aim of many microbial typing pipelines is to compare isolates at the SNP level. After multiple sequence alignment, phylogenetic inferences can be made by comparing SNPs across isolates. However, since SNPs are usually called with respect to a reference genome, one typically ends up with many SNP sites that are redundant when the isolates are compared with each other rather than with the reference.

In parsimony methods (as opposed to likelihood or distance methods), constant sites (i.e. sites where the isolates being compared do not differ) can safely be excluded, as they are not informative for tree inference.
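As an illustration of the idea (a minimal sketch, not the code in this repository), constant columns can be found by transposing the alignment and checking how many distinct characters each column holds. The function name `drop_constant_columns` and the isolate names are made up for the example, and the sequences are assumed to be upper-case and of equal length.

```python
def drop_constant_columns(seqs):
    """Remove columns in which every isolate carries the same character.

    seqs: dict mapping isolate name -> aligned sequence (equal lengths assumed).
    """
    # zip(*...) transposes the alignment so that we iterate over columns.
    columns = zip(*seqs.values())
    keep = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Hypothetical three-isolate alignment; only columns 3 and 6 vary.
alignment = {
    "isolate_A": "ACGTAC",
    "isolate_B": "ACGTAT",
    "isolate_C": "ACCTAT",
}
trimmed = drop_constant_columns(alignment)
```

Here `trimmed` holds two columns per isolate, since the other four columns are identical across all three sequences.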

Similarly, sites where only one isolate differs from all the others, although variable, may not be of interest since they do not help discriminate between tree topologies. Finally, sites where all four bases are represented are also non-informative and should be trimmed.
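The criteria in the last few paragraphs can be written down as a small per-column classifier. This is a sketch that follows the wording used here, not necessarily the logic of the script itself; the function name `classify_site` and its labels are hypothetical, and the column is assumed to contain upper-case characters only.

```python
from collections import Counter


def classify_site(column):
    """Label one alignment column according to the criteria described above.

    column: string or sequence with one character per isolate.
    """
    counts = Counter(column)
    if len(counts) == 1:
        return "constant"        # all isolates agree
    if counts.most_common(1)[0][1] == len(column) - 1:
        return "singleton"       # exactly one isolate differs from the rest
    if set("ACGT") <= set(counts):
        return "four_bases"      # all four bases are represented
    return "kept"


for col in ("AAAA", "AAAT", "ACGT", "AATT"):
    print(col, classify_site(col))
```

Only columns labelled `kept` would survive trimming under these rules.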

Furthermore, multiple sequence alignments are typically cluttered with gap characters "-" and ambiguous characters "N". Sites containing these characters are not valuable for phylogenetic inference and should be trimmed.
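Removing such columns amounts to dropping every position where at least one isolate has an unwanted character. The sketch below assumes equal-length sequences and treats lower-case "n" like "N"; the function name `drop_columns_with` and its `unwanted` parameter are illustrative, not part of the script.

```python
def drop_columns_with(seqs, unwanted="-N"):
    """Remove every column in which any isolate has a character in `unwanted`.

    seqs: dict mapping isolate name -> aligned sequence (equal lengths assumed).
    """
    bad = set(unwanted)
    keep = [
        i
        for i, col in enumerate(zip(*seqs.values()))
        if not bad & {c.upper() for c in col}
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Hypothetical alignment: columns 2, 3 and 5 contain an N or a gap.
example = {"a": "AC-TN", "b": "ACGTA", "c": "ANGTA"}
cleaned = drop_columns_with(example)
```

A single gap or N is enough to discard the whole column, so this filter can shorten an alignment considerably when a few isolates have poor coverage.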

This script allows the user to trim away N-containing columns, gapped columns and non-variable/non-informative sites from a multiple sequence alignment FASTA file.
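To make the overall workflow concrete, here is a rough, self-contained sketch of reading an alignment FASTA file, trimming it, and writing the result. It is not the script's actual interface: the function names, the file names, and the choice to drop only gapped, N-containing and constant columns are assumptions made for the example.

```python
import os
import tempfile


def read_fasta(path):
    """Parse a FASTA file into a dict of header -> upper-case sequence."""
    parts, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:].strip()
                parts[name] = []
            elif name is not None:
                parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def keep_column(column):
    """Keep a column only if it has no gap or N and is not constant."""
    states = set(column)
    return not states & {"-", "N"} and len(states) > 1


def trim_alignment(seqs):
    """Return the alignment restricted to the columns accepted by keep_column."""
    if len({len(s) for s in seqs.values()}) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    keep = [i for i, col in enumerate(zip(*seqs.values())) if keep_column(col)]
    return {n: "".join(s[i] for i in keep) for n, s in seqs.items()}


def write_fasta(seqs, path):
    """Write one header line and one sequence line per isolate."""
    with open(path, "w") as handle:
        for name, seq in seqs.items():
            handle.write(f">{name}\n{seq}\n")


# Round trip on a tiny, made-up alignment in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example_alignment.fasta")
    dst = os.path.join(tmp, "example_trimmed.fasta")
    with open(src, "w") as handle:
        handle.write(">iso1\nACGT-A\n>iso2\nACCTAA\n>iso3\nANCTAG\n")
    write_fasta(trim_alignment(read_fasta(src)), dst)
    result = read_fasta(dst)
```

In this example only the third and sixth columns survive, because the others are constant or contain a gap or an N.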

Author: Ola Brynildsrud
