1 change: 0 additions & 1 deletion metagenomics/6-annotating.txt
@@ -5,7 +5,6 @@
.. ::



Installing Prokka
=================

106 changes: 106 additions & 0 deletions metagenomics/8-installing-blastkit.txt
@@ -0,0 +1,106 @@
===============================
8. BLASTing your assembled data
===============================

One thing everyone wants to do is BLAST sequence data, right? Here's a
simple way to set up a stylish little BLAST server that lets you search
your newly assembled sequences.

Installing blastkit
-------------------

Start by installing some prerequisites::

   pip install pygr
   pip install whoosh
   pip install git+https://github.com/ctb/pygr-draw.git
   pip install git+https://github.com/ged-lab/screed.git
   apt-get -y install lighttpd

Then enable CGI support in lighttpd and restart it::

   cd /etc/lighttpd/conf-enabled
   ln -fs ../conf-available/10-cgi.conf ./
   echo 'cgi.assign = ( ".cgi" => "" )' >> 10-cgi.conf
   echo 'index-file.names += ( "index.cgi" ) ' >> 10-cgi.conf

   /etc/init.d/lighttpd restart

Next, install BLAST::

   cd /root

   curl -O ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.24/blast-2.2.24-x64-linux.tar.gz
   tar xzf blast-2.2.24-x64-linux.tar.gz
   cp blast-2.2.24/bin/* /usr/local/bin
   cp -r blast-2.2.24/data /usr/local/blast-data

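Now install blastkit itself::

   cd /root
   git clone https://github.com/ctb/blastkit.git -b ec2
   cd blastkit/www
   ln -fs $PWD /var/www/blastkit

   # world-writable directory (with the sticky bit set) for files written by the CGI scripts
   mkdir files
   chmod a+rxwt files
   # let the web server traverse /root to reach the symlinked directory
   chmod +x /root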

Then run check.py to verify the installation::

   cd /root/blastkit
   python ./check.py

It should say everything is OK.
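
If check.py reports a problem, a quick first thing to look at is whether the
BLAST executables copied into /usr/local/bin are actually on your PATH. The
sketch below is one way to do that. It is not what check.py does internally;
the tool list (``formatdb``, ``blastall``) is an assumption based on the
install commands above, and ``shutil.which`` assumes Python 3.

```python
import shutil

# Executables the install steps above are expected to put on the PATH.
# This list is an assumption drawn from those commands, not from check.py.
REQUIRED_TOOLS = ["formatdb", "blastall"]


def find_missing(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All expected executables were found.")
```

The ``which`` argument is only there so the lookup can be swapped out when
testing; in normal use you would call ``find_missing(REQUIRED_TOOLS)``.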

Adding the data
---------------

If you've just finished annotating a metagenome assembly
(:doc:`6-annotating`), you can copy the protein sequences predicted from
your assembly into the right place::

   cp /mnt/work/metag/testasm.faa /root/blastkit/db/db-prot.fa

Alternatively, you can grab our version of the assembly (from running this
tutorial)::

   cd /root/blastkit
   curl -O http://athyra.idyll.org/~t/testasm.faa
   mv testasm.faa db/db-prot.fa

Formatting the database
~~~~~~~~~~~~~~~~~~~~~~~

After you've done either of the above, format and install the database
for blastkit::

   cd /root/blastkit
   formatdb -i db/db-prot.fa -o T -p T
   python index-db.py db/db-prot.fa

Done!

.. note::

   You can install any file of protein sequences you want this way; just copy
   it to /root/blastkit/db/db-prot.fa and rerun the formatting and indexing
   commands above.
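
Before dropping in your own protein file, it can help to sanity-check it.
The following is a minimal sketch, not part of blastkit: the function name
``summarize_fasta`` and the set of accepted characters are assumptions made
for illustration. It counts the FASTA records and lists any whose sequence
lines contain characters outside the usual amino acid alphabet.

```python
import io

# Characters accepted in a protein record: the 20 standard residues plus
# common ambiguity codes and the stop symbol. This set is an assumption.
PROTEIN_CHARS = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def summarize_fasta(handle):
    """Count records and collect names of records with unexpected characters."""
    n_records = 0
    suspicious = []
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the record name is the first word after '>'.
            n_records += 1
            fields = line[1:].split()
            name = fields[0] if fields else ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        elif not set(line.upper()) <= PROTEIN_CHARS:
            if name not in suspicious:
                suspicious.append(name)
    return n_records, suspicious


if __name__ == "__main__":
    # Tiny in-memory example; the second record contains a stray digit.
    demo = io.StringIO(">good example\nMKVL\n>bad\nMK1V\n")
    print(summarize_fasta(demo))
```

To check a real file, open it and pass the handle, for example
``summarize_fasta(open("db/db-prot.fa"))``. Note that this will not notice a
nucleotide file, since A, C, G and T are also valid amino acid codes.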

Running blastkit
----------------

Figure out what your machine name is
(ec2-???-???-???-???.compute-1.amazonaws.com) and go to::

   http://machine-name/blastkit/

Make sure you have enabled port 80 in your security settings on Amazon.
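
If the page doesn't load, you can check from your own computer whether port
80 is reachable at all. This is a rough sketch using only the Python
standard library; the helper names ``blastkit_url`` and ``port_is_open`` are
made up for this example, and you need to substitute your real machine name.

```python
import socket


def blastkit_url(machine_name):
    """Build the blastkit URL for a given public host name."""
    return "http://{}/blastkit/".format(machine_name)


def port_is_open(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host name lookup failures.
        return False
```

Calling ``port_is_open("machine-name")`` with your instance's public name
returns ``False`` when the connection is refused or times out, which usually
points at the security settings rather than at blastkit itself.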

(If you're using the human data set, try this sequence::

   MYLYTSYGTYQFLNQIKLNHQERNLFQFSTNDSSIILEESEGKSILKHPSSYQVIDSTGE
   FNEHHFYSAIFVPTSEDHRQQLEKKLLHVDVPLSNFGGFKSYRLLKPTEGSTYKIYFGFA
   NRTAYEDFKASDIFNENFSKDALSQYFGASGQHSSYFERYLYPIEDH

It should match something in your assembly.)
3 changes: 2 additions & 1 deletion metagenomics/index.txt
@@ -20,7 +20,8 @@ for the latest released version.
4-assemble
5-mapping-and-quantitation
6-annotating
installing-blastkit
7-quast-evaluation
8-installing-blastkit

Reference material
------------------