CORASON is a visual tool that identifies gene clusters sharing a common genomic core and reconstructs multi-locus phylogenies of these gene clusters to explore their evolutionary relationships.
Input: query gene and RAST genome database.
Output: SVG graph with clusters sorted according to the multi-locus phylogeny of the common core.
CORASON was developed to find and prioritize biosynthetic gene clusters, but can be used for any kind of clusters.
-SVG graphs Scalable graphs that make metadata easy to display.
-Interactive CORASON is not a static database; it lets you explore your own genomes.
-Reproducibility CORASON runs on Docker, so you can always perform the same analysis even if your Linux/perl/blast/muscle/Gblocks/quicktree distributions change.
CORASON is available in two modes: GenBank and RAST files. To install CORASON as presented in "A computational framework for systematic exploration of biosynthetic diversity from large-scale genomic data", use the installation guide on the BiG-SCAPE CORASON site.
The following steps are the installation guide for CORASON in RAST mode.
- Install docker engine
- Download nselem/corason docker-image
- Run CORASON
Follow the steps and type the commands into your terminal. Do not type the leading $.
CORASON runs on Docker. If you already have Docker Engine installed, skip this step. This is a minimal Docker installation guide for Linux; if you don't use Linux, or you want a detailed tutorial on installing Docker Engine on Linux/Windows/Mac, please consult the Docker Getting Started guide.
$ curl -fsSL https://get.docker.com/ | sh
*If you don't have curl, see the curl installation commands in this document.
Then type:
$ sudo usermod -aG docker your-user
Log out of your Ubuntu session and log back in to your user session before the next step. You may need to restart your computer, not just log out, for the changes to take effect.
Test your docker engine with the command:
$ docker run hello-world
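Before installing anything, it can help to see which of the tools used in this guide are already on your PATH. The sketch below is only an illustration: check_prereqs is a hypothetical helper name, not part of CORASON or Docker, and it only tests whether a command exists, not whether the Docker daemon is running.

```shell
# check_prereqs: hypothetical helper, not part of CORASON.
# Prints "ok: <cmd>" or "missing: <cmd>" for every command name given,
# and returns 1 if at least one of them is not on PATH.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so that pasting this block changes nothing):
# check_prereqs curl docker
```

If docker is reported as missing, follow the installation step above; if curl is missing, use the curl installation commands in this document.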
$ docker pull nselem/corason:latest
docker pull may be slow depending on your internet connection, because the large nselem/corason docker image is being downloaded. This only needs to happen once.
Create an empty directory to hold your input files: the RAST genome database, the RAST_IDs file and file.query.
$ mkdir mydir
Place your files inside the directory mydir:
GENOMES (dir)
RAST_IDs (tab separated file)
file.query (amino acid FASTA file) Save as many queries as you wish to process.
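The layout above can be checked before starting the container. This is a minimal sketch, assuming the directory is literally named GENOMES, the IDs file is named RAST_IDs unless you pass another name, and query files end in .query. check_inputs is a hypothetical helper, not part of CORASON, and it does not validate file contents beyond warning when the IDs file contains no tab character.

```shell
# check_inputs: hypothetical helper, not part of CORASON.
# Usage: check_inputs [directory] [ids_file_name]
# Returns 0 when the expected layout is present, 1 otherwise.
check_inputs() {
    dir=${1:-.}
    ids=${2:-RAST_IDs}   # assumption: default name taken from the listing above
    status=0

    if [ ! -d "$dir/GENOMES" ]; then
        echo "missing directory: $dir/GENOMES"
        status=1
    fi

    if [ ! -f "$dir/$ids" ]; then
        echo "missing file: $dir/$ids"
        status=1
    elif ! grep -q "$(printf '\t')" "$dir/$ids"; then
        # Warning only: the guide describes this file as tab separated.
        echo "warning: no tab character found in $dir/$ids"
    fi

    # At least one query file must exist.
    found=0
    for q in "$dir"/*.query; do
        if [ -f "$q" ]; then found=1; fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no *.query file in $dir"
        status=1
    fi

    return "$status"
}

# Example call (commented out): check_inputs mydir RAST_IDs
```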
$ docker run --rm -i -t -v $(pwd):/home/output nselem/corason /bin/bash
$(pwd) points to your working directory where you store your query file and GENOMES database.
Use absolute paths. If you do not know the path to your current working directory, type in the terminal:
$ pwd
/home/output is fixed in the docker image; you should always use this name.
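If your input directory is not the current directory, the mount can be built from an absolute path first. The following is a sketch under that assumption: corason_shell_cmd is a hypothetical helper that only prints the docker run line from this guide with the directory resolved to an absolute path; it does not start the container.

```shell
# corason_shell_cmd: hypothetical helper, not part of CORASON.
# Prints the interactive docker run command with the given directory
# resolved to an absolute path and mounted on /home/output.
corason_shell_cmd() {
    dir=${1:-.}
    # cd in a subshell so the caller's working directory is untouched.
    abs=$(cd "$dir" && pwd) || return 1
    echo "docker run --rm -i -t -v \"$abs:/home/output\" nselem/corason /bin/bash"
}

# Example call (commented out): corason_shell_cmd mydir
```

Copy the printed line into your terminal to open the container with that directory mounted.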
$ corason.pl -q yourquery.query -rast_ids yourRAST.Ids -s yourspecial_org
Once you have finished all your queries, exit the container:
$ exit
You can also run CORASON directly from the image, without the interactive terminal. The next line is equivalent to steps 2.2 (Run your docker nselem/corason image) and 2.3 (Run CORASON inside your docker):
$ docker run --rm -v $(pwd):/home/output nselem/corason:latest SSHcorason.pl yourquery.query yourRAST.Ids yourspecial_org
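When several query files are saved in the same directory, the non-interactive command can be repeated once per file. This is only a sketch: corason_batch and the CORASON_DRY_RUN variable are hypothetical names introduced here, and the loop assumes that every query uses the same IDs file and the same special organism argument. With CORASON_DRY_RUN set, the commands are printed instead of executed.

```shell
# corason_batch: hypothetical helper, not part of CORASON.
# Usage: corason_batch yourRAST.Ids yourspecial_org
# Runs the non-interactive command from this guide once per *.query file
# in the current directory. Set CORASON_DRY_RUN=1 to print the commands only.
corason_batch() {
    ids_file=$1
    special_org=$2
    for query in *.query; do
        # Skip the literal pattern when no query file exists.
        if [ ! -f "$query" ]; then continue; fi
        ${CORASON_DRY_RUN:+echo} docker run --rm -v "$(pwd):/home/output" \
            nselem/corason:latest SSHcorason.pl "$query" "$ids_file" "$special_org"
    done
}

# Example calls (commented out):
# CORASON_DRY_RUN=1; corason_batch yourRAST.Ids yourspecial_org   # print only
# unset CORASON_DRY_RUN; corason_batch yourRAST.Ids yourspecial_org
```

Run it from inside the directory that holds GENOMES, the IDs file and the query files, since $(pwd) is what gets mounted on /home/output.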
Outputs will be in the new folder /mypath/mydir/query
- query.svg SVG file with clusters similar to your query, sorted phylogenetically
- query_Report Functional cluster genomic core report.
- *.tre Phylogenetic tree of the genomic cluster core.
In this example the query file was yourquery.query and the input directory was /home/mydir. Output files are located in /home/mydir/yourquery
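The naming rule in the example above (query file name without its .query extension, inside the input directory) can be sketched as a small helper for locating results after a run. Both function names are hypothetical, and the file patterns are assumed from the output list above (*.svg, *_Report, *.tre); adjust them if your run produces different names.

```shell
# corason_output_dir: hypothetical helper, not part of CORASON.
# Prints the output folder expected for a query file, following the
# example above: <input directory>/<query name without .query>.
corason_output_dir() {
    workdir=$1
    query_file=$2
    name=$(basename "$query_file" .query)
    printf '%s/%s\n' "${workdir%/}" "$name"
}

# list_corason_outputs: hypothetical helper.
# Lists the SVG, report and tree files found in an output folder.
list_corason_outputs() {
    outdir=$1
    for f in "$outdir"/*.svg "$outdir"/*_Report "$outdir"/*.tre; do
        if [ -e "$f" ]; then echo "$f"; fi
    done
}

# Example calls (commented out):
# corason_output_dir /home/mydir yourquery.query   # prints /home/mydir/yourquery
# list_corason_outputs "$(corason_output_dir /home/mydir yourquery.query)"
```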
The CORASON source code and docker file are located at:
Code
Docker
$ which curl
$ sudo apt-get update
$ sudo apt-get install curl
perl gbkIndex.pl yourgbkfolder