Skip to content
This repository has been archived by the owner on Oct 31, 2024. It is now read-only.
/ vibrio-tnseq Public archive

Tn-seq pipeline for Vibrio sp.

License

Notifications You must be signed in to change notification settings

pseudogene/vibrio-tnseq

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

vibrio-tnseq - Tn-seq pipeline for Vibrio sp.

Build Status Codacy Badge

Vibrio Tn-seq

We foster the openness, integrity, and reproducibility of scientific research.

Scripts and tools used to analyse Vibrio anguillarum Tn-seq project. The data were generated using Illumina MiSeq platform.

Associated publication

Essential genes of Vibrio anguillarum and other Vibrio spp. guide the development of new drugs and vaccines. Bekaert M, Goffin N, McMillan S and Desbois A. Front. Microbiol. 12:755801

DOI

How to use this repository?

This repository hosts both the scripts and tools used by this study and the raw results generated at the time. Feel free to adapt the scripts and tools, but remember to cite their authors!

To look at our scripts and raw results, browse through this repository. If you want to reproduce our results, you will need to clone this repository, build the docker, and the run all the scripts. If you want to use our data for our own research, fork this repository and cite the authors.

Prepare a docker

All required files and tools run in a self-contained docker image.

Clone the repository

git clone https://github.com/pseudogene/vibrio-tnseq.git
cd vibrio-tnseq

Create a docker

docker build --rm=true -t vibrio-tnseq .

# test
docker run -i -t --rm -t vibrio-tnseq /usr/local/bin/run_pipeline.pl -v --gff /databases/vibrio.gff --cgview 1 --png 1 --infolder /data

Start the docker

To import the raw read files and export the results of your analyse you need to link a folder to the docker. It this example your data will be store in /home/myaccount (current filesystem) which will be seem as been /data from within the docker by using -v <USERFOLDER>:/data.

docker run -i -t --rm -v <absolute_path>:/data -t vibrio-tnseq /bin/bash

Run manually a new analysis

Make sure your raw read file are in <absolute_path>. To run manually a new analysis:

gunzip -c <input.fastq.gz> > <input.fastq>

cutadapt -g TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTTCAGAGTTCTACAGTCCGACGATCACAC \
  -a TAACAGGTTGGATGATAAGTCCCCGGTCTCTGTCTCTTATACACATCTCCGAGCCCACGAGAC -O 3 \
  -m 10 -M 18 -e 0.15 --times 2 --trimmed-only \
  -o <output.fastq> \
  <input.fastq>

bowtie2 --no-1mm-upfront --end-to-end --very-fast -x /databases/vibrio -U <output.fastq> -S <output.sam>

sam_to_map.pl --sam <output.sam> --cgview 1 --png 1 -v > output.log

Use your own genome and genome annotation

By default de database available is for Vibrio anguillarum NB10. If you used another species or strain you will have to update the database

cd /databases/

# get the genome sequence (FASTA format)
wget -cO - ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/786/425/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1_genomic.fna.gz > Vibrio_anguillarum_NB10.fa.gz

# get the genome sequence (GFF format)
wget -cO - ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/786/425/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1/GCF_000786425.1_Vibrio_anguillarum_NB10_serovar_O1_genomic.gff.gz > Vibrio_anguillarum_NB10.gff.gz

# pre-process the genome sequence for Bowtie2
gunzip /databases/*.gz
bowtie2-build -q -f Vibrio_anguillarum_NB10.fa Vibrio_anguillarum_NB10

#Your "database" : /database/Vibrio_anguillarum_NB10
#Your "gff": /database/Vibrio_anguillarum_NB10.gff
cd /data/
run_pipeline.pl -v --gff /database/Vibrio_anguillarum_NB10.gff --database /database/Vibrio_anguillarum_NB10 --cgview 1 --png 1 --infolder reads

Run a pipeline

Make sure your compressed raw read files are in <absolute_path>/reads. To run a new pipeline:

run_pipeline.pl --infolder reads

Re-run the analysis of this study

You will need to download the dataset from the EBI ENA repository, project PRJEB39186:

docker run -i -t --rm -v <absolute_path>:/data -t vibrio-tnseq /bin/bash

mkdir -p reads
cd reads
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/[...].fastq.gz
[...]
cd ..
run_pipeline.pl -v --cgview 1 --png 1 --infolder reads
exit

Issues

If you have any problems with or questions about the scripts, please contact us through a GitHub issue. Any issue related to the scientific results themselves must be done directly with the authors.

Contributing

You are invited to contribute new features, fixes, or updates, large or small; we are always thrilled to receive pull requests, and do our best to process them as fast as we can.

License and distribution

The content of this project itself including the raw data and work are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License, and the source code presented is licensed under the GPLv3 license.

About

Tn-seq pipeline for Vibrio sp.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published