Home

Introduction

NucleicNet is hosted on our web server (http://www.cbrc.kaust.edu.sa/NucleicNet/). Here, we distribute a version that runs on Linux (CentOS/Ubuntu). Users may also refer to the other pages on this wiki for a more detailed discussion of file input and the interpretation of results.

Dependencies

NucleicNet depends on the following publicly available software to run efficiently. Users should refer to the instructions and license of each package when installing these prerequisites.

Python 3.6.7 (https://www.python.org/downloads/release/python-367/) Primary programming language

Anaconda 5.3.1 (https://www.anaconda.com/distribution/) Coordination of Python packages

FEATURE 3.1.0 (https://simtk.org/projects/feature) Analysis of atomic protein models

XSSP 2.0.4 (https://github.com/cmbi/xssp) Analysis of protein secondary structure from atomic protein models

PyMOL 2.3 (https://pymol.org/2/) Visualisation of binding pockets

CUDA 8.0.61 and cuDNN 5.1 (https://developer.nvidia.com/rdp/cudnn-archive) Speed-up of deep learning operations
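
Before configuring the environment, it can help to confirm that the command-line tools from the list above are reachable. The following is a minimal sketch and not part of NucleicNet. The executable names (`featurize` for FEATURE, `mkdssp` for XSSP, `pymol`, `nvcc` for CUDA) are assumptions about typical installations and may differ on your system.

```python
import shutil

# Assumed executable names for each dependency; adjust to your installation.
ASSUMED_EXECUTABLES = {
    "Python": "python",
    "Anaconda": "conda",
    "FEATURE": "featurize",
    "XSSP": "mkdssp",
    "PyMOL": "pymol",
    "CUDA": "nvcc",
}


def find_missing(executables=ASSUMED_EXECUTABLES, which=shutil.which):
    """Return the dependency names whose executable is not found on PATH."""
    return [name for name, exe in executables.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All assumed executables were found on PATH.")
```

A tool reported as missing may still be installed under another name or outside PATH, so treat the result as a hint rather than a verdict.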

After installing the prerequisite dependencies, run the following to configure the Python environment.

conda env create -f py3_env.yml

source activate nucleicnet
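
To confirm from within Python that the environment is active, one option is to read the `CONDA_DEFAULT_ENV` variable that conda sets on activation. This is a small sketch; the helper name `active_conda_env` is hypothetical.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    env = active_conda_env()
    if env == "nucleicnet":
        print("The nucleicnet environment is active.")
    else:
        print("Active environment: {}".format(env))
```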

To exit from the environment, run the following.

source deactivate nucleicnet

How to Use NucleicNet

NucleicNet works on protein atomic model(s) written in PDB file format. Further specifications for the input PDB file can be found in "Specification on PDB input files". Users place their PDB file(s) in the "GridData" folder for analysis, then run the following:

# Generate features for protein atomic models

bash command_GenerateFeature.sh

# Analyse features with the deep learning module

bash command_DeepLearningModule.sh

# Organise deep learning predictions into visualisable forms

bash command_AnalysePrediction.sh
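
The three steps above must run in order, since each consumes the results of the previous one. The following is a hedged sketch of a wrapper that stops at the first failing step; the function `run_pipeline` is hypothetical, and it assumes that each bash script returns a non-zero exit status when it fails.

```python
import subprocess

# The three steps, in the order given above.
PIPELINE = [
    "command_GenerateFeature.sh",
    "command_DeepLearningModule.sh",
    "command_AnalysePrediction.sh",
]


def run_pipeline(scripts=PIPELINE, run=subprocess.run):
    """Run each script with bash, stopping at the first failure."""
    for script in scripts:
        print("Running " + script)
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps are not run on incomplete results (assuming the
        # script itself reports failure through its exit status).
        run(["bash", script], check=True)
```

Calling `run_pipeline()` from the directory that contains the three scripts runs them one after another.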

Each Python script called within these bash scripts is annotated with its purpose.

Output

Major results are stored in the "Out" folder. Suppose the input PDB file of the protein is called "GridData/0000.pdb". The purpose of each resulting output file is outlined below:

"Out/0000_pymol.pse": A PyMOL session that reveals the binding pockets of each RNA constituent (e.g. the four bases A/U/C/G and the backbone constituents P/R for phosphate and ribose). Users can open this file with "pymol Out/0000_pymol.pse". (See Fig. 3a-c)

"Out/0000_R_logo_RNACColor.png": Optional. If the binding sites are already known from an RNA-protein complex PDB file, we can also call "NucleicNet_SequenceLogo_RNACcolor.py" to retrieve the NucleicNet-predicted RNA binding specificity at each base location in the form of a sequence logo diagram. Suppose the corresponding RNA-protein complex is stored in "Control/0000.pdb" with RNA chain R. The file "0000_R_logo_RNACColor.png" then contains the NucleicNet-predicted sequence logo indexed by RNA residue on chain R. (See Fig. 3-4)

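The naming of the output files described above can be sketched as a small helper. This is an illustration only: the function `expected_outputs` is hypothetical, and the pattern is generalised from the single "0000" example, which is an assumption.

```python
import os


def expected_outputs(pdb_path, rna_chain=None, out_dir="Out"):
    """Return the output paths expected for one input PDB file."""
    # "GridData/0000.pdb" -> "0000"
    stem = os.path.splitext(os.path.basename(pdb_path))[0]
    paths = [os.path.join(out_dir, stem + "_pymol.pse")]
    if rna_chain is not None:
        # The sequence logo is optional and is named after the RNA chain.
        logo = "{}_{}_logo_RNACColor.png".format(stem, rna_chain)
        paths.append(os.path.join(out_dir, logo))
    return paths
```

For example, `expected_outputs("GridData/0000.pdb", rna_chain="R")` lists the PyMOL session and the sequence logo for chain R.
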
We also include scripts and data to reproduce our study on Argonautes (see "command_AnalyseGridPrediction.sh"):

"ExperimentalSequencing/RipSeq_HMMlogPDifference.png": Uses NucleicNet to score miRNA sequences for Ago binding. The result is compared with the IP-Seq data (*.txt) stored in the "ExperimentalSequencing" folder. (See Fig. 5a)

"ExperimentalSequencing/Knockdown_Relation_All_Positive_publication.png" and "ExperimentalSequencing/Knockdown_Relation_All_Negative_publication.png": Use NucleicNet to evaluate miRNA loading efficiency. The result is compared with the experimental knockdown levels (*.csv) stored in the "ExperimentalSequencing" folder. (See Fig. 5b)
