vibrio-tnseq - Tn-seq pipeline for Vibrio sp.

Vibrio Tn-seq

We foster the openness, integrity, and reproducibility of scientific research.

Scripts and tools used to analyse Vibrio anguillarum Tn-seq project. The data were generated using Illumina MiSeq platform.

Associated publication

Essential genes of Vibrio anguillarum and other Vibrio spp. guide the development of new drugs and vaccines. Bekaert M, Goffin N, McMillan S and Desbois A. Front. Microbiol. 12:755801

How to use this repository?

This repository hosts both the scripts and tools used by this study and the raw results generated at the time. Feel free to adapt the scripts and tools, but remember to cite their authors!

To look at our scripts and raw results, browse through this repository. If you want to reproduce our results, you will need to clone this repository, build the docker, and the run all the scripts. If you want to use our data for our own research, fork this repository and cite the authors.

Prepare a docker

All required files and tools run in a self-contained docker image.

Clone the repository

git clone https://github.com/pseudogene/vibrio-tnseq.git
cd vibrio-tnseq

Create a docker

docker build --rm=true -t vibrio-tnseq .

# test
docker run -i -t --rm -t vibrio-tnseq /usr/local/bin/run_pipeline.pl -v --gff /databases/vibrio.gff --cgview 1 --png 1 --infolder /data

Start the docker

To import the raw read files and export the results of your analyse you need to link a folder to the docker. It this example your data will be store in /home/myaccount (current filesystem) which will be seem as been /data from within the docker by using -v <USERFOLDER>:/data.

docker run -i -t --rm -v <absolute_path>:/data -t vibrio-tnseq /bin/bash

Run manually a new analysis

Make sure your raw read file are in <absolute_path>. To run manually a new analysis:

gunzip -c <input.fastq.gz> > <input.fastq>

cutadapt -g TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTTCAGAGTTCTACAGTCCGACGATCACAC \
  -a TAACAGGTTGGATGATAAGTCCCCGGTCTCTGTCTCTTATACACATCTCCGAGCCCACGAGAC -O 3 \
  -m 10 -M 18 -e 0.15 --times 2 --trimmed-only \
  -o <output.fastq> \
  <input.fastq>

bowtie2 --no-1mm-upfront --end-to-end --very-fast -x /databases/vibrio -U <output.fastq> -S <output.sam>

sam_to_map.pl --sam <output.sam> --cgview 1 --png 1 -v > output.log

Use your own genome and genome annotation

By default de database available is for Vibrio anguillarum NB10. If you used another species or strain you will have to update the database

cd /databases/

# get the genome sequence (FASTA format)
wget -cO - ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/786/425/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1_genomic.fna.gz > Vibrio_anguillarum_NB10.fa.gz

# get the genome sequence (GFF format)
wget -cO - ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/786/425/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1_genomic.gff.gz > Vibrio_anguillarum_NB10.gff.gz

# pre-process the genome sequence for Bowtie2
gunzip /databases/*.gz
bowtie2-build -q -f Vibrio_anguillarum_NB10.fa Vibrio_anguillarum_NB10

#Your "database" : /database/Vibrio_anguillarum_NB10
#Your "gff": /database/Vibrio_anguillarum_NB10.gff

cd /data/
run_pipeline.pl -v --gff /database/Vibrio_anguillarum_NB10.gff --database /database/Vibrio_anguillarum_NB10 --cgview 1 --png 1 --infolder reads

Run a pipeline

Make sure your compressed raw read files are in <absolute_path>/reads. To run a new pipeline:

run_pipeline.pl --infolder reads

Re-run the analysis of this study

You will need to download the dataset from the EBI ENA repository, project PRJEB39186:

docker run -i -t --rm -v <absolute_path>:/data -t vibrio-tnseq /bin/bash

mkdir -p reads
cd reads
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/[...].fastq.gz
[...]
cd ..
run_pipeline.pl -v --cgview 1 --png 1 --infolder reads
exit

Issues

If you have any problems with or questions about the scripts, please contact us through a GitHub issue. Any issue related to the scientific results themselves must be done directly with the authors.

Contributing

You are invited to contribute new features, fixes, or updates, large or small; we are always thrilled to receive pull requests, and do our best to process them as fast as we can.

License and distribution

The content of this project itself including the raw data and work are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License, and the source code presented is licensed under the GPLv3 license.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Vibrio Tn-seq

Associated publication

How to use this repository?

Prepare a docker

Clone the repository

Create a docker

Start the docker

Run manually a new analysis

Use your own genome and genome annotation

Run a pipeline

Re-run the analysis of this study

Issues

Contributing

License and distribution

Files

README.md

Latest commit

History

README.md

File metadata and controls

Vibrio Tn-seq

Associated publication

How to use this repository?

Prepare a docker

Clone the repository

Create a docker

Start the docker

Run manually a new analysis

Use your own genome and genome annotation

Run a pipeline

Re-run the analysis of this study

Issues

Contributing

License and distribution