Getting started

Right now the only things that are needed to run the pipeline are the script parse_and_run.py and the DNAProDB database as downloaded from http://dnaprodb.usc.edu. You can download the latest database with the following command:

wget http://dnaprodb.usc.edu/data/dnaprodb_1.0.7z

then unzip the database using whatever tools you prefer for doing so. It's easiest to just extract the file that DNAProDB recommends on their website:

7z e dnaprodb_1.0.7z Assuming that the downloaded file is in the current directory.

Dependencies

parse_and_run.py currently relies upon keras and tensorflow to work correctly, and has only been tested with Python version 3.6.5.

Run the model

After downloading the DNAProDB database and unzipping it, running the script is as simple as:

python /path/to/parse_and_run.py /path/to/collection.txt

Where collection.txt is the locally downloaded DNAProDB database. This script reshapes the input data and filters DNAProDB for cocrystal structures involving multiple chains and structures with no interacting chains with the DNA. While running, the script will periodically pipe progress messages to sys.stderr .

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
preprocessing		preprocessing
README.md		README.md
extract_specific_features.py		extract_specific_features.py
parse_and_run.py		parse_and_run.py
parse_dnaprodb.py		parse_dnaprodb.py
transfer_annotations.py		transfer_annotations.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Getting started

Dependencies

Run the model

About

Releases

Packages

Languages

ParkerLab/dna_protein_contacts

Folders and files

Latest commit

History

Repository files navigation

Getting started

Dependencies

Run the model

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages