Functional Annotation Tools

Objective

Perform a full functional annotation of the genes and proteins predicted by the Gene Prediction group, focusing on features relevant to C. jejuni. We divide the functional annotation tools into clustering, homology-based, and ab-initio-based tools.

Pipeline

We analyze our DNA and protein sequences (in faa, fna, and gff files) using homology-based and ab-initio-based techniques. Within each of the following categories, we will narrow the candidates down to one tool based on efficiency and performance.

Clustering

Clustering Sequences: CD-HIT

./cd-hit -i <input_file> -o <output_file_name>
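
Alongside the representative sequences, CD-HIT writes a cluster report to `<output_file_name>.clstr`. The sketch below shows one possible way to flatten that report into a membership table. It assumes the usual `.clstr` layout (a `>Cluster N` header followed by one line per sequence, with `*` marking the representative); the function name `clstr_to_membership` is hypothetical and not part of CD-HIT.

```shell
# Convert a CD-HIT .clstr file into a three-column, tab-separated table:
# cluster number, sequence ID, role (representative or member).
# Assumption: sequence IDs contain no whitespace.
clstr_to_membership() {
    awk '
        /^>Cluster/ { cluster = $2; next }      # header line, e.g. ">Cluster 0"
        NF >= 3 {
            id = $3                             # e.g. ">protA..."
            sub(/^>/, "", id)                   # drop the leading ">"
            sub(/\.\.\.$/, "", id)              # drop the trailing "..."
            role = ($NF == "*") ? "representative" : "member"
            print cluster "\t" id "\t" role
        }
    ' "$1"
}
```

A call such as `clstr_to_membership <output_file_name>.clstr > membership.tsv` would then give one row per input sequence. Note that CD-HIT may truncate long IDs in the `.clstr` file unless its description-length option is adjusted, so the IDs in the table may need to be matched by prefix.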

Homology

Antibiotic Resistance: CARD

wget https://card.mcmaster.ca/latest/data
tar -xvf data ./card.json
rgi load --card_json <path to card.json> --local
rgi main -i <path to cluster.faa> -o <output_file_name> -t protein --local
rgi tab -i <path to output_file_name.json>

Virulence: VFDB

makeblastdb -in VFDB_db -dbtype 'nucl' -out <db_name>
blastn -db <db_name> -query <cluster> -out <result> -max_hsps 1 -max_target_seqs 1 -outfmt "6 qseqid length qstart qend sstart send evalue bitscore stitle" -perc_identity 100 -num_threads 5
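
Because `-perc_identity 100` only constrains identity, very short exact matches can still appear in the results. The following sketch filters the tabular output by alignment length. The column positions follow the `-outfmt` string in the command above; the function name `filter_vfdb_hits` and the length threshold are hypothetical choices for illustration, not part of the pipeline.

```shell
# Keep VFDB hits whose alignment length is at least MIN_LEN and print
# query ID, alignment length, e-value and subject title.
# Column order in the input file:
# 1 qseqid, 2 length, 3 qstart, 4 qend, 5 sstart, 6 send,
# 7 evalue, 8 bitscore, 9 stitle
filter_vfdb_hits() {
    hits_file=$1
    min_len=$2      # hypothetical threshold, chosen by the user
    awk -F '\t' -v min_len="$min_len" '
        $2 + 0 >= min_len + 0 { print $1 "\t" $2 "\t" $7 "\t" $9 }
    ' "$hits_file"
}
```

For example, `filter_vfdb_hits <result> 100` would keep only hits aligned over at least 100 bases.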

Operons: MicrobesOnline

makeblastdb -in <fasta file> -dbtype prot -out <database>
blastp -query cdhit/faa_rep_seq.faa -db tmp/db_operon -evalue 0.01 -max_target_seqs 1 -max_hsps 1 -outfmt 6 -out tmp/hits_0.01.txt -num_threads 5
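
After the search it can be useful to know which representative proteins found no match in the operon database. The sketch below compares the FASTA headers with the first column (`qseqid`) of the default `-outfmt 6` table. It assumes the query ID is the first whitespace-delimited token of each FASTA header; the function name `queries_without_hits` is hypothetical.

```shell
# Print the IDs of FASTA records that do not appear as a query
# in a BLAST tabular (-outfmt 6) hits file.
queries_without_hits() {
    fasta_file=$1
    hits_file=$2
    awk -v hits="$hits_file" '
        BEGIN {
            # Load every query ID (column 1) that produced a hit.
            while ((getline line < hits) > 0) {
                split(line, f, "\t")
                seen[f[1]] = 1
            }
            close(hits)
        }
        /^>/ {
            id = substr($0, 2)          # drop the leading ">"
            sub(/[ \t].*$/, "", id)     # keep only the first token
            if (!(id in seen)) print id
        }
    ' "$fasta_file"
}
```

With the paths from the command above, this could be run as `queries_without_hits cdhit/faa_rep_seq.faa tmp/hits_0.01.txt`.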

Fully Automated Functional Annotation: eggNOG Mapper

./emapper.py -i <cluster> --output <result> -d bact -m diamond

Ab-initio

Transmembrane Proteins: TMHMM2

tmhmm <input multifasta file> > <output_file>
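
The sketch below pulls out the proteins with at least one predicted transmembrane helix from the TMHMM report. It assumes the default long output, in which each protein has a comment line of the form `# <id> Number of predicted TMHs:  <n>`; if the short output format is used instead, the pattern would need to change. The function name `list_tm_proteins` is hypothetical.

```shell
# List proteins with one or more predicted transmembrane helices.
# Output: sequence ID and helix count, tab-separated.
# Assumption: sequence IDs contain no whitespace.
list_tm_proteins() {
    awk '
        /^#/ && /Number of predicted TMHs:/ {
            # $2 is the sequence ID, the last field is the helix count
            if ($NF + 0 > 0) print $2 "\t" $NF
        }
    ' "$1"
}
```

For example, `list_tm_proteins <output_file>` would print one line per candidate membrane protein.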

Signal Peptide: SignalP v5.0

signalp -fasta <input_sequence_file> -org gram- -format short -gff3
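
The short-format summary can be reduced to just the proteins with a predicted signal peptide. The sketch below assumes the SignalP 5.0 summary layout: tab-separated columns with the sequence ID first and the prediction label second, `OTHER` meaning no signal peptide, and header lines starting with `#`. The function name `list_signal_peptides` is hypothetical.

```shell
# Print sequence ID and predicted signal peptide type for every
# protein whose prediction is not "OTHER".
list_signal_peptides() {
    awk -F '\t' '
        /^#/ { next }                                   # skip header lines
        NF >= 2 && $2 != "OTHER" { print $1 "\t" $2 }
    ' "$1"
}
```

The summary is typically written to a file ending in `_summary.signalp5`, so a call might look like `list_signal_peptides <prefix>_summary.signalp5`.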

CRISPR: PilerCR

pilercr -in <input multifasta file> -out <output file> -noinfo -quiet

Repository: compgenomics2020/Team2-FunctionalAnnotation

Folders and files

Latest commit

History

Repository files navigation

Functional Annotation Tools

Objective

Pipeline

Clustering

Clustering Sequences: CDHit

Homology

Antibiotic Resistance: CARD

Virulence: VFDB

Operons: MicrobesOnline

Fully Automated Functional Annotation: eggNOG Mapper

Ab-initio

Trans Membrane Protein: TMHMM2

Signal Peptide: SignalP v5.0

CRSIPR: PilerCR

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages