Releases: mmagnus/rna-tools
3.22 cif2pdb - convert mmCIF to PDB if a given tool takes only PDBs (smart way)
There are many issues when working with PDB files these days, largely due to the increasing complexity of biological structures. One of the first challenges is the number of chains in a structure, which can be very large.
To accommodate this, two-character chain identifiers were introduced at some point in the mmCIF format. The PDB format, however, allows only a single-character chain name.
(Plus there is this auth (author-assigned) naming, just to make it even more complicated; for the code here I use the auth values, e.g. A5.)
The second problem is that the number of atoms in one structure can be so large that it no longer fits the fixed-width columns of the PDB format (the atom serial number field is five characters wide, so it tops out at 99,999). If I put all the chains into one file, even after renaming them to single-letter codes, the number of atoms breaks the format. Some parsers might be OK with that, but you can also see that the XYZ columns are off, etc.
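To make the column problem concrete, here is a minimal sketch (not rna-tools code) that writes an ATOM record with the fixed column widths of the PDB format and reads the x coordinate back from its columns. The helper names `format_atom` and `read_x` are made up for this illustration, and only the fields up to the coordinates are written.

```python
def format_atom(serial, name, resname, chain, resseq, x, y, z):
    """Write one ATOM record (up to the coordinates) with PDB column widths."""
    fields = [
        "ATOM  ",                      # columns 1-6: record name
        f"{serial:>5d}",               # columns 7-11: atom serial number
        " ",
        f"{name:^4s}",                 # columns 13-16: atom name
        " ",                           # column 17: alternate location
        f"{resname:>3s}",              # columns 18-20: residue name
        " ",
        f"{chain:1s}",                 # column 22: chain identifier
        f"{resseq:>4d}",               # columns 23-26: residue number
        "    ",                        # columns 27-30: insertion code + padding
        f"{x:8.3f}{y:8.3f}{z:8.3f}",   # columns 31-54: coordinates
    ]
    return "".join(fields)


def read_x(line):
    """Read the x coordinate from columns 31-38, as a strict parser would."""
    return float(line[30:38])


ok = format_atom(99999, "P", "G", "A", 2, 12.345, -7.5, 3.0)
bad = format_atom(100000, "P", "G", "A", 2, 12.345, -7.5, 3.0)

print(len(ok), read_x(ok))    # 54 12.345
print(len(bad), read_x(bad))  # 55 12.34 (every later column moved right by one)
```

The format widths are minimums, so a six-digit serial number (or a two-character chain name) silently makes the line one character longer, and a column-based reader then picks up the wrong characters.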
MY SOLUTION
Install rna-tools
$ pip install --upgrade rna-tools
For now, the solution in rna-tools is to parse the CIF file, save each RNA chain into a separate file, and set the chain name to a capital letter.
$ rna_pdb_tools.py --cif2pdb input/4v6x.cif # or a separate tool `rna_cif2pdb.py 4v6x.cif`
Warning: some of the chains in this mmCIF file have chain names with more chars than 1, e.g. AB, and the PDB format needs single-letter code, e.g. A.
rna chain B2 -> A # of atoms: 38377 4v6x_B2_nA_fCIF.pdb
rna chain BC -> B # of atoms: 1604 4v6x_BC_nB_fCIF.pdb
rna chain A5 -> C # of atoms: 84946 4v6x_A5_nC_fCIF.pdb
rna chain A7 -> D # of atoms: 2578 4v6x_A7_nD_fCIF.pdb
rna chain A8 -> E # of atoms: 3334 4v6x_A8_nE_fCIF.pdb
For each RNA chain, a new file is created in the PDB format:
4v6x_B2_nA_fCIF.pdb
# auth B2 chain is renamed to (new chain) A and saved into this file.
[Actually, maybe it would be more convenient if this chain were always simply 'A'. I can change that easily, so the file would be 4v6x_B2_fCIF.pdb, and you would know that the chain inside is simply A.]
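The renaming and file naming scheme visible in the output above can be sketched as follows. This is only an illustration, not the rna-tools implementation: the function name `plan_chain_files` is hypothetical, and it assumes that capital letters are handed out in the order in which the RNA chains are listed.

```python
import string


def plan_chain_files(pdb_id, auth_chains):
    """Map each auth chain name to a capital letter and an output file name."""
    if len(auth_chains) > len(string.ascii_uppercase):
        raise ValueError("more chains than capital letters")
    plan = []
    for auth, new in zip(auth_chains, string.ascii_uppercase):
        # <id>_<auth chain>_n<new chain>_fCIF.pdb, as in the listing above
        plan.append((auth, new, f"{pdb_id}_{auth}_n{new}_fCIF.pdb"))
    return plan


for auth, new, filename in plan_chain_files("4v6x", ["B2", "BC", "A5", "A7", "A8"]):
    print(f"rna chain {auth} -> {new} {filename}")
```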
No single chain in the ribosome exceeds the atom limit, so we should be fine. Ninh, let me know if the tools crash on any of the structures.
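A quick check of that claim for 4v6x, using the atom counts printed above. The limit of 99,999 comes from the five-column atom serial number field of the PDB format; the names `PDB_MAX_ATOMS` and `fits_pdb` are made up for this sketch.

```python
PDB_MAX_ATOMS = 99_999  # largest serial number that fits in five columns

# Atom counts per RNA chain of 4v6x, copied from the output above
atoms_per_chain = {"B2": 38377, "BC": 1604, "A5": 84946, "A7": 2578, "A8": 3334}


def fits_pdb(n_atoms, limit=PDB_MAX_ATOMS):
    """True if n_atoms can be numbered within the PDB serial number field."""
    return n_atoms <= limit


total = sum(atoms_per_chain.values())
print(total, fits_pdb(total))  # 130839 False: all RNA chains in one file do not fit
for chain, n in atoms_per_chain.items():
    print(chain, n, fits_pdb(n))  # each chain on its own fits
```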
In PyMOL you can load all the files at once to see them as if they were one file.
3.21 Add calculation of GDT_TS and GDT_HA
GTSRNA Global Distance Test modified for RNA, with adapted thresholds of 1.5, 3.0, 6.0, and 12 Å (instead of the 1, 2, 4, 8 Å used in protein comparisons, due to the larger average distance between phosphorus atoms than between Cα atoms), is computed on backbone phosphorus atoms instead of Cα. To compute this score, we patched the TMscore program used for computing protein structure similarity scores for protein models [42] (see patch in File S3).
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0078007
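As a rough illustration of how the thresholds enter the score, here is a simplified sketch. It assumes the model and the reference are already superimposed and that the per-residue distances between matching atoms (P atoms for RNA) are given. The full GDT procedure searches over many superpositions to maximize the fraction at each threshold, which is skipped here. This is not the patched TMscore code; the function and constant names are made up.

```python
GDT_TS_THRESHOLDS = (1.0, 2.0, 4.0, 8.0)       # protein, C-alpha atoms
GDT_HA_THRESHOLDS = (0.5, 1.0, 2.0, 4.0)       # protein, high accuracy
GDT_RNA_THRESHOLDS = (1.5, 3.0, 6.0, 12.0)     # RNA, P atoms (quoted above)


def gdt(distances, thresholds):
    """Average percentage of atoms within each distance threshold (in Å)."""
    n = len(distances)
    fractions = [sum(d <= t for d in distances) / n for t in thresholds]
    return 100.0 * sum(fractions) / len(thresholds)


# Toy distances for four residues, in Å
distances = [1.0, 2.5, 5.0, 20.0]
print(gdt(distances, GDT_RNA_THRESHOLDS))  # 56.25
print(gdt(distances, GDT_TS_THRESHOLDS))   # 43.75
```

The same set of distances scores higher with the RNA thresholds, which is the point of widening them for the more widely spaced P atoms.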
Full Changelog: 3.21...3.21
3.20 PyMOL4RNA.py: improve pdb
3.18 PyMOL4RNA: better Sugar edge
3.17 Torsion angle analysis
Examples:
$ rna_torsions.py ./input/4GXY_min.pdb
f, alphaprime, beta
input ./input/4GXY_min.pdb <Residue G het= resseq=2 icode= >, -64.20924484900823, -143.18546007904766
input ./input/4GXY_min.pdb <Residue C het= resseq=3 icode= >, 2.3394112025736815, 70.4052871669199
Comparison:
$ rna_x3dna.py input/4GXY_min.pdb -s
input: input/4GXY_min.pdb
nt id res alpha beta gamma delta epsilon zeta e-z chi phase-angle sugar-type ssZp Dp splay paired
0 1 G A.G2 NaN -143.2 153.7 82.5 -92.3 -31.9 -60(..) -179.0(anti) 19.5(C3'-endo) ~C3'-endo 4.39 4.56 18.32 no paired
1 2 C A.C3 -111.4 70.4 160.0 80.6 NaN NaN NaN -177.6(anti) 11.1(C3'-endo) ~C3'-endo NaN NaN NaN no paired
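Each of these torsion angles is a dihedral angle defined by four consecutive atoms; in the standard nucleic acid convention, beta is P, O5', C5', C4'. A minimal, dependency-free sketch of the dihedral calculation (not the code of rna_torsions.py; the helper names are made up) could look like this:

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy coordinates, not taken from 4GXY
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # 90.0
```

For beta of a residue, `p0` to `p3` would be the coordinates of its P, O5', C5' and C4' atoms.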
rna_pdb_tools.py add --save-single-res --ref-frame-only
$ rna_pdb_tools.py --rpr input/4GXY_min.pdb --save-single-res --ref-frame-only
atoms presets:
--backbone-only used only with --get-rnapuzzle-ready, keep only backbone (= remove bases)
--ref-frame-only used only with --get-rnapuzzle-ready, keep only reference frames, OP1 OP2 P
--no-backbone used only with --get-rnapuzzle-ready, remove atoms of backbone (defined as P OP1 OP2 O5')
--bases-only used only with --get-rnapuzzle-ready, keep only atoms of bases
These options extract specific atoms for each residue and write them to separate PDB files.
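The two presets whose atom lists are spelled out above (--ref-frame-only and --no-backbone) boil down to filtering ATOM lines by atom name. A minimal sketch of that idea, not the rna-tools implementation, with made-up names `PRESETS`, `atom_name` and `filter_atoms`:

```python
REF_FRAME = {"P", "OP1", "OP2"}
BACKBONE = {"P", "OP1", "OP2", "O5'"}  # as defined for --no-backbone above

PRESETS = {
    "ref-frame-only": lambda name: name in REF_FRAME,
    "no-backbone": lambda name: name not in BACKBONE,
}


def atom_name(line):
    """Atom name from columns 13-16 of a PDB ATOM/HETATM line."""
    return line[12:16].strip()


def filter_atoms(pdb_lines, preset):
    """Keep only the ATOM/HETATM lines whose atom name passes the preset."""
    keep = PRESETS[preset]
    return [line for line in pdb_lines
            if line.startswith(("ATOM", "HETATM")) and keep(atom_name(line))]


# Toy residue: only the columns up to the residue number are filled in
names = ["P", "OP1", "OP2", "O5'", "C5'", "N9"]
lines = [f"ATOM  {i:>5d} {n:<4s}   G A   2" for i, n in enumerate(names, start=1)]

print([atom_name(l) for l in filter_atoms(lines, "ref-frame-only")])
print([atom_name(l) for l in filter_atoms(lines, "no-backbone")])
```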
3.15 rna_pdb_tools.py --fetch-fasta 4gxy
3.14 build RNAStructure from lines/scratch
Build an RNAStructure object by parsing text input.
rna = RNAStructure()
l = rna.get_empty_line()
l = rna.set_res_code(l, <res>)
l = rna.set_atom_code(l, <atomname>)
l = rna.set_res_index(l, <resid>)
l = rna.set_atom_coords(l, <x, y, z>)
rna.add_line(l)
rna.write(args.output)
- Still not the prettiest, but it works...
3.13.x Refactor --get-rnapuzzle-ready into rna_standardize.py
$ rna_standardize.py rna_tools/input/comparison/*
Output: rna_tools/input/comparison/4GXY_min_std.pdb
>A:2-3
GC
Output: rna_tools/input/comparison/4GXY_min_reconstruction_std.pdb
>A:1-2
GC
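The summary printed for each file reads as chain, first and last residue number, and then the sequence. A small sketch that produces the same layout from a list of residues; `chain_summary` is a made-up helper, and it assumes a single chain with residues given in order:

```python
def chain_summary(residues):
    """residues: list of (chain, residue number, one-letter code) for one chain."""
    chain = residues[0][0]
    first, last = residues[0][1], residues[-1][1]
    sequence = "".join(code for _, _, code in residues)
    return f">{chain}:{first}-{last}\n{sequence}"


print(chain_summary([("A", 2, "G"), ("A", 3, "C")]))
```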
3.12.x Rename rna_pdb_toolsx.py to rna_pdb_tools.py!
This will break some compatibility, but I think this was needed. This "x" was there to avoid previous problems with importing files when the package name was rna_pdb_tools. Since the package was renamed to rna-tools (and the main folder to rna_tools), this "x" should have been removed years ago ;-)