omicR_linux command line


Introduction

omicR creates FASTA files, downloads genomes from NCBI using the RefSeq number, creates databases for BLAST+, runs BLAST+, and filters the results to obtain the best match per sequence.

These scripts can be used to run BLAST alignments of short-read (DArTseq data) and long-read sequences (Illumina, PacBio, etc.). As a reference you can use genomes from NCBI, genomes from your private collection, contigs, scaffolds, or any other genetic sequence.
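As an illustration of the first step, creating FASTA files, here is a minimal Python sketch. It is not the omicR script itself. The function name `rows_to_fasta` and the column names `id` and `sequence` are assumptions made for this example, so adjust them to the layout of your own table.

```python
import csv
import io


def rows_to_fasta(rows, id_field="id", seq_field="sequence", width=60):
    """Return FASTA text for an iterable of dict-like rows.

    id_field and seq_field are hypothetical column names, not omicR's.
    """
    chunks = []
    for row in rows:
        seq = row[seq_field].strip().upper()
        if not seq:
            continue  # skip rows that carry no sequence
        chunks.append(f">{row[id_field].strip()}")
        # Wrap the sequence at a fixed width, as is conventional for FASTA.
        chunks.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return ("\n".join(chunks) + "\n") if chunks else ""


# Made-up in-memory table; a real run would open a CSV file instead.
table = io.StringIO("id,sequence\nlocus1,ACGTACGT\nlocus2,ttgaccaa\n")
fasta_text = rows_to_fasta(csv.DictReader(table))
print(fasta_text, end="")
```

The resulting text can be written to a file and used as the `-query` input for BLAST+.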

Requirements

• NCBI BLAST+ V4 or later (https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/)

• Python 3 or later (https://www.python.org/downloads/)

• Biopython (https://biopython.org/)

• omicR

Add these programs to your environment path variables.
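To confirm that the requirements are visible from your environment, a quick check along these lines may help. It is only a sketch: the helper name `missing_tools` is made up for this example, and it assumes the BLAST+ executables are called `blastn` and `makeblastdb` and that Biopython is importable as `Bio`.

```python
import importlib.util
import shutil


def missing_tools(executables=("blastn", "makeblastdb"), modules=("Bio",)):
    """Return the names of executables and Python modules that cannot be found."""
    # shutil.which searches the directories listed in the PATH variable.
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


not_found = missing_tools()
if not_found:
    print("Not found:", ", ".join(not_found))
else:
    print("All requirements were found.")
```

If an executable is reported as not found even though it is installed, its folder is probably absent from the PATH variable.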

If you are running omicR on an HPC system, it is likely that you already know how to use a command line. In that case, I suggest that you use only two of the scripts, “create fasta files” and “filter”, because running the download, database creation and BLAST steps through omicR can take longer than running BLAST+ directly. To produce input for the filtering script, BLAST must be run with the following command line:

blastn -db [ ] -query [ ] -out [ ] -word_size [ ] -perc_identity [ ] -num_threads [ ] -outfmt '6 qseqid sacc stitle qseq sseq nident mismatch pident length evalue bitscore qstart qend sstart send gapopen gaps qlen slen'
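The command above can also be assembled from Python, and its tabular output reduced to one hit per query. The sketch below is not the omicR filtering script. The function names are made up for this example, the parameter values are placeholders, and "best match" is assumed here to mean the hit with the highest bitscore, which may differ from the criteria omicR applies.

```python
# Column order copied from the -outfmt string above.
FIELDS = ("qseqid sacc stitle qseq sseq nident mismatch pident length evalue "
          "bitscore qstart qend sstart send gapopen gaps qlen slen").split()


def build_blastn_command(db, query, out, word_size, perc_identity, num_threads):
    """Return the blastn call as an argument list, e.g. for subprocess.run."""
    return [
        "blastn", "-db", db, "-query", query, "-out", out,
        "-word_size", str(word_size),
        "-perc_identity", str(perc_identity),
        "-num_threads", str(num_threads),
        # Passed as one argument, so no shell quoting is needed.
        "-outfmt", "6 " + " ".join(FIELDS),
    ]


def best_hit_per_query(lines):
    """Keep, for each qseqid, the row with the highest bitscore (an assumption)."""
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        row = dict(zip(FIELDS, line.split("\t")))
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


def _demo_line(qseqid, sacc, bitscore):
    """Build one made-up output line; every other column is a dummy '0'."""
    values = dict.fromkeys(FIELDS, "0")
    values.update(qseqid=qseqid, sacc=sacc, bitscore=str(bitscore))
    return "\t".join(values[name] for name in FIELDS)


# Placeholder file names and values, chosen only for illustration.
command = build_blastn_command("my_db", "query.fasta", "hits.tsv", 11, 90, 4)
demo = [_demo_line("q1", "A", 50), _demo_line("q1", "B", 80.5), _demo_line("q2", "C", 10)]
hits = best_hit_per_query(demo)
print(" ".join(command[:7]))
print({query: row["sacc"] for query, row in hits.items()})
```

On real data, pass an open file handle for the `-out` file to `best_hit_per_query` in place of the demo list.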

For usage, please refer to the file "OmicR_User_guide_25_02_21.pdf" available in this repository.

If you use this script, please cite:

Berenice Talamantes-Becerra, Jason Carling, Arthur Georges. omicR: A tool to facilitate BLASTn alignments for sequence data, SoftwareX, Volume 14, 2021, 100702, ISSN 2352-7110, https://doi.org/10.1016/j.softx.2021.100702. Website: https://www.sciencedirect.com/science/article/pii/S2352711021000479

