add READMEs, fix setup.py and pre-commit-config.yml
1 parent a4a2b16, commit b89c86e
Showing 4 changed files with 83 additions and 19 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,19 @@ | ||
repos: | ||
- repo: local | ||
hooks: | ||
- id: unittests | ||
name: run unit tests | ||
entry: python -m unittest | ||
language: system | ||
pass_filenames: false | ||
args: ["discover"] | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
rev: v2.3.0 | ||
hooks: | ||
- id: check-yaml | ||
- id: end-of-file-fixer | ||
- id: trailing-whitespace | ||
- repo: https://github.com/psf/black | ||
rev: 24.3.0 | ||
hooks: | ||
- id: black | ||
- repo: local | ||
hooks: | ||
- id: unittests | ||
name: run unit tests | ||
entry: python -m unittest | ||
language: system | ||
pass_filenames: false | ||
args: ["discover"] | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
rev: v2.3.0 | ||
hooks: | ||
- id: check-yaml | ||
- id: end-of-file-fixer | ||
- id: trailing-whitespace | ||
- repo: https://github.com/psf/black | ||
rev: 24.3.0 | ||
hooks: | ||
- id: black |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
# RNA3DB scripts | ||
Below are brief descriptions of the scripts in this folder. | ||
|
||
- `scripts/slurm` is a directory containing useful SLURM scripts. | ||
- `scripts/build_incremental_release_fasta.py` extracts the chains that differ between two `parse.json` files. Useful for incremental releases. | ||
- `scripts/download_pdb_mmcif.sh` is a script for downloading the latest version of the PDB. | ||
- `scripts/fasta_to_json.py` takes a [FASTA format](https://en.wikipedia.org/wiki/FASTA_format) file and creates a [JSON](https://en.wikipedia.org/wiki/JSON) file usable by RNA3DB. | ||
- **Note:** since FASTA files don't contain this information, `release_date` is set to 1970-01-01, `structure_method` to "", and `resolution` to 0.0. | ||
- `scripts/generate_modifications_cache.py` generates a modifications cache. See [Downloading required data](https://github.com/marcellszi/rna3db/wiki/Building-RNA3DB-from-scratch#downloading-required-data) on the RNA3DB Wiki. | ||
- `scripts/get_nohits.py` looks at a FASTA file and `.tbl` file(s) and identifies entries in the FASTA file that get no hits in any of the `.tbl` file(s). Useful for the second `cmscan`. See [Homology Search](https://github.com/marcellszi/rna3db/wiki/Building-RNA3DB-from-scratch#homology-search) on the RNA3DB Wiki. | ||
- `scripts/json_to_fasta.py` converts an RNA3DB [JSON](https://en.wikipedia.org/wiki/JSON) to a [FASTA file](https://en.wikipedia.org/wiki/FASTA_format). | ||
- `scripts/json_to_mmcif.py` is used to build the single-chain [mmCIFs](https://en.wikipedia.org/wiki/Macromolecular_Crystallographic_Information_File). This script re-reads the chains from a `split.json` and writes them to a hierarchical folder structure, with each [mmCIF](https://en.wikipedia.org/wiki/Macromolecular_Crystallographic_Information_File) file containing a single chain. |
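
To illustrate the note on `scripts/fasta_to_json.py`, here is a minimal sketch of how FASTA records might be mapped to entries with placeholder metadata. It is not the script's actual implementation. The keys `release_date`, `structure_method`, and `resolution` and their default values come from the README above; the `sequence` key, the flat layout keyed by FASTA header, and the function name `fasta_to_records` are assumptions made for this example.

```python
import json


def fasta_to_records(fasta_text):
    """Parse FASTA text into a dict of records with placeholder metadata.

    Hypothetical helper: the real RNA3DB JSON layout may differ.
    """
    records = {}
    header = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; use the header text (without ">") as key.
            header = line[1:]
            records[header] = {
                "release_date": "1970-01-01",  # placeholder, absent from FASTA
                "structure_method": "",        # placeholder, absent from FASTA
                "resolution": 0.0,             # placeholder, absent from FASTA
                "sequence": "",                # assumed key name
            }
        elif header is not None:
            # Sequence lines may be wrapped, so append to the current record.
            records[header]["sequence"] += line
    return records


if __name__ == "__main__":
    example = ">chain_1\nGGCU\nAACC\n>chain_2\nAUGC\n"
    print(json.dumps(fasta_to_records(example), indent=2))
```

Because the metadata values are placeholders, anything downstream that filters on release date or resolution should presumably treat entries produced this way with care.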
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# RNA3DB SLURM scripts | ||
|
||
These [SLURM](https://slurm.schedmd.com/documentation.html) scripts will eventually be used to build releases automatically. | ||
|
||
> **Note:** The scripts are experimental as they haven't been rigorously tested. | ||
|
||
## Getting started | ||
The first of these scripts, `build_full_release.slurm`, builds an entire release from scratch. This script does a homology search on all chains found in the PDB, so it takes a long time to run. | ||
|
||
The second script, `build_incremental_release.slurm`, adds new chains (those added to the PDB since the last release) to an existing release. | ||
|
||
Both files start with a number of [sbatch](https://slurm.schedmd.com/sbatch.html) directives: | ||
```sh | ||
#SBATCH -c 64 | ||
#SBATCH -t 0 | ||
#SBATCH -p <insert partition here> | ||
#SBATCH --mem=64000 | ||
#SBATCH -o logs/rna3db_full_release_%j.out | ||
#SBATCH -e logs/rna3db_full_release_%j.err | ||
#SBATCH --mail-user=<insert email address here> | ||
#SBATCH --mail-type=ALL | ||
``` | ||
You will likely need to edit some of these options before using these scripts. See the [SLURM documentation for sbatch](https://slurm.schedmd.com/sbatch.html) for what each option means. At a minimum, you will need to either enter a partition or remove the `-p` option. Similarly, you will need to edit the `--mail-user` option. | ||
|
||
Next, there are a number of paths you need to set in both files: | ||
```sh | ||
# where you want the release to be output to | ||
OUTPUT_DIR="" | ||
# where the latest release is located | ||
OLD_RELEASE="" | ||
|
||
# you set these once and forget | ||
RNA3DB_ROOT_DIR="" | ||
PDB_MMCIF_DIR="" | ||
CMSCAN="" | ||
CMDB="" | ||
``` | ||
- `OUTPUT_DIR` specifies the root directory where the release will be placed. | ||
- `OLD_RELEASE` is the path to the directory of the release you want to add the new PDB chains to. This is only needed when you are building an incremental release. | ||
- `RNA3DB_ROOT_DIR` is the path to the rna3db repository. Scripts are called from `$RNA3DB_ROOT_DIR/scripts/`. | ||
- `CMSCAN` is the path to the `cmscan` executable. | ||
- `CMDB` is the path to the covariance models you want to use for the homology search (`cmscan`). Usually this would come from [Rfam](https://rfam.org/) in the form of `Rfam.cm`. | ||
|
||
Once you have set the required paths and edited the sbatch options as needed, you can submit the jobs with: | ||
```sh | ||
$ sbatch build_full_release.slurm | ||
``` | ||
Or: | ||
```sh | ||
$ sbatch build_incremental_release.slurm | ||
``` |
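
Since the path variables ship as empty strings, it may help to check that they have been filled in before submitting a job. The following is a minimal sketch of such a pre-flight check, assuming the variables are assigned in plain `NAME="value"` form as shown above. The helper name `find_unset_variables` and the `REQUIRED` tuple are hypothetical and not part of RNA3DB; `OLD_RELEASE` is left out of the default list because the README says it is only needed for incremental releases.

```python
import re

# Variables that the README says must be set in both scripts.
REQUIRED = ("OUTPUT_DIR", "RNA3DB_ROOT_DIR", "PDB_MMCIF_DIR", "CMSCAN", "CMDB")


def find_unset_variables(script_text, names=REQUIRED):
    """Return the names that are missing or assigned an empty value."""
    unset = []
    for name in names:
        # Match an assignment such as NAME="value" at the start of a line.
        match = re.search(
            r"^\s*" + re.escape(name) + r"=(.*)$", script_text, flags=re.MULTILINE
        )
        if match:
            # Drop any trailing comment, then surrounding whitespace and quotes.
            raw = match.group(1).split("#", 1)[0]
            value = raw.strip().strip("\"'")
        else:
            value = ""
        if not value:
            unset.append(name)
    return unset


if __name__ == "__main__":
    with open("build_full_release.slurm") as handle:
        missing = find_unset_variables(handle.read())
    if missing:
        print("Please set: " + ", ".join(missing))
```

For an incremental release, the same helper could be called with `names=REQUIRED + ("OLD_RELEASE",)`. Note that this only checks that a value is present, not that the path exists on the cluster.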