Codon optimization using the dnachisel library
Conda is an open source package and environment management system that runs on Windows, macOS and Linux. To learn more about Conda, see the workshop Introduction to Conda for (Data) Scientists or the YouTube workshop prepared by the KAUST Visualization Core Lab.
- Install Miniconda on your system. For macOS:
  - Download the installer to your home directory: https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
  - Open your terminal, go to your home directory (or wherever you downloaded the file) and run the installation script:
    bash Miniconda3-latest-MacOSX-x86_64.sh
  - The script will present several prompts that allow you to customize the Miniconda install. It is recommended to accept the default settings. However, when prompted with the following:
    Do you wish the installer to initialize Miniconda3 by running conda init?
    type yes so that you do not have to initialize Conda manually later. If you accidentally type no, run the following command once the script finishes:
    conda init bash
  - Close and reopen the terminal. You will see either (miniconda3) or (base) before the terminal prompt, which means Conda was installed correctly and the base environment is activated.
- Open your terminal and go to the directory where you would like to save the project. Then run the following command:
  git clone https://github.com/strubelab/CoolerCodonOpt.git
  - The directory CoolerCodonOpt/ is created.
- Create a virtual environment to work in:
  - Go to the CoolerCodonOpt directory:
    cd CoolerCodonOpt
  - Create a virtual environment:
    conda env create --prefix ./env --file environment.yml
  - Activate the virtual environment:
    conda activate ./env
- Install the CoolerCodonOpt package in your virtual environment:
  pip install -e .
- Alternatively, to install without Conda, first verify that Python 3 is installed on your computer:
  - Open your terminal and type
    python --version
  - If the version number is less than 3, for example Python 2.7.10, download the current Python version from https://www.python.org/ or follow the installation instructions at https://realpython.com/installing-python/
- Open your terminal and go to the directory where you would like to save the project. Then run the following command:
  git clone https://github.com/strubelab/CoolerCodonOpt.git
  - The directory CoolerCodonOpt/ is created.
- Create a virtual environment to work in:
  - Go to the CoolerCodonOpt directory:
    cd CoolerCodonOpt
  - Create a virtual environment:
    python3 -m venv venv
  - Activate the virtual environment:
    source venv/bin/activate
- Install the requirements and the CoolerCodonOpt package:
  pip install -r requirements.txt
  pip install -e .
- When you finish working, you can deactivate the virtual environment:
  deactivate
- Now you can call the optimize script from any location:
  optimize --help

  usage: optimize [-h] [--species SPECIES] [-v] [-d DESTINATION] input

  Takes a DNA sequence and optimizes it for expression in E. coli

  positional arguments:
    input                 Fasta file with the sequence(s) to be optimized, or a directory with fasta
                          files.

  optional arguments:
    -h, --help            show this help message and exit
    --species SPECIES     Species for which the sequence will be codon-optimized. Can be either a TaxID
                          (requires internet connection) or the name of the species from the available
                          choices: b_subtilis, c_elegans, d_melanogaster, e_coli, g_gallus, h_sapiens,
                          m_musculus, m_musculus_domesticus, s_cerevisiae
    -v, --verbose         Show the constraints evaluations, and optimization objectives score.
    -d DESTINATION, --destination DESTINATION
                          Path for saving the resulting sequences. It defaults to the same directory as
                          the input.
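The usage message above has the shape of one produced by Python's argparse module. As a rough illustration of the interface it describes, here is a minimal sketch of an equivalent parser. It is not the project's actual source: the default species of e_coli is an assumption based on the description line, and `build_parser` and `is_taxid` are hypothetical names.

```python
import argparse

# Species names copied from the help text above.
SPECIES_CHOICES = [
    "b_subtilis", "c_elegans", "d_melanogaster", "e_coli", "g_gallus",
    "h_sapiens", "m_musculus", "m_musculus_domesticus", "s_cerevisiae",
]


def build_parser():
    """Build a parser that mirrors the usage line shown above (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="optimize",
        description="Takes a DNA sequence and optimizes it for expression in E. coli",
    )
    parser.add_argument(
        "input",
        help="Fasta file with the sequence(s) to be optimized, "
             "or a directory with fasta files.",
    )
    # Assumption: e_coli is the default; the help text does not state one.
    # No `choices=` here, because a numeric TaxID is also accepted.
    parser.add_argument(
        "--species",
        default="e_coli",
        help="TaxID or one of: " + ", ".join(SPECIES_CHOICES),
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Show the constraints evaluations, and optimization objectives score.",
    )
    parser.add_argument(
        "-d", "--destination",
        default=None,
        help="Path for saving the resulting sequences.",
    )
    return parser


def is_taxid(species):
    """Return True when the value looks like a numeric TaxID, not a species name."""
    return species.isdigit()


if __name__ == "__main__":
    args = build_parser().parse_args(["sequence.fasta", "--verbose"])
    print(args.input, args.species, args.verbose, args.destination)
```

Leaving `--species` as free text rather than a fixed list of choices is what allows both a species name and a TaxID to pass through the same option.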
The following command will optimize the given sequence, output the optimized sequence and optimization score to the terminal and save the optimized sequence to a file in fasta format:
python optimize.py sequence.fasta --verbose
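Codon optimization replaces codons with synonymous ones, so the optimized sequence should still encode the same protein as the input. An independent way to check a result is to translate both sequences and compare them. The sketch below does this with the standard genetic code and only the Python standard library. The helper names `read_fasta`, `translate` and `same_protein` are hypothetical and are not part of CoolerCodonOpt.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def translate(dna):
    """Translate a DNA sequence codon by codon ('*' marks a stop codon)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def same_protein(original, optimized):
    """Return True when both DNA sequences encode the same protein."""
    return translate(original) == translate(optimized)


if __name__ == "__main__":
    original = read_fasta(">seq1\nATGAAGTTA\n")["seq1"]
    optimized = "ATGAAACTG"  # a synonymous rewrite, typed by hand for the example
    print(translate(original), same_protein(original, optimized))
```

In practice you would pass the contents of the input file and of the saved output file to `read_fasta` instead of the inline strings used here.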
The following command will optimize all the sequences in the sequences/ directory and save the results in the optimized_sequences/ directory:
python optimize.py sequences/ --destination optimized_sequences/
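The text above does not say how each input file maps to an output file when a directory is given. The sketch below shows one plausible way to plan that mapping with pathlib. The accepted file extensions, the `_optimized` suffix and the function name `plan_outputs` are assumptions made for illustration, not the tool's documented behavior.

```python
from pathlib import Path

# Assumption: these extensions identify FASTA files; the tool may accept others.
FASTA_SUFFIXES = {".fasta", ".fa", ".fna"}


def plan_outputs(input_path, destination=None):
    """Pair each input FASTA file with the path where its result would be saved."""
    input_path = Path(input_path)
    if input_path.is_dir():
        # Directory input: collect every FASTA file directly inside it.
        inputs = sorted(
            p for p in input_path.iterdir()
            if p.is_file() and p.suffix.lower() in FASTA_SUFFIXES
        )
        default_dir = input_path
    else:
        # Single-file input.
        inputs = [input_path]
        default_dir = input_path.parent
    # Per the help text, results default to the same directory as the input.
    out_dir = Path(destination) if destination is not None else default_dir
    # Hypothetical naming scheme: "<name>_optimized<extension>".
    return [(p, out_dir / f"{p.stem}_optimized{p.suffix}") for p in inputs]


if __name__ == "__main__":
    for source, target in plan_outputs("sequences/", "optimized_sequences/"):
        print(source, "->", target)
```

Planning the output paths before doing any work makes it easy to check for name collisions or a missing destination directory up front.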