QRef
is a plugin for the crystallographic software suite Phenix
enabling what in the literature commonly is referred to as "quantum refinement" (QR) in both real and reciprocal space, utilising the (for academic users) free software Orca as the quantum chemistry engine. This first version of QRef
using Phenix
was implemented under the paradigm "correctness and completeness first, performance and adherence to coding standards later".
Refinement of macromolecules in both real and and reciprocal space relies on previous knowledge (i.e. a Bayesian prior) for the structure. This is usually encoded as a (pseudo-energy) penalty term,
where
In this QR implementation
where index 1 in turn indicates small, but interesting, parts of the structure. Additionally another scaling factor,
implies that
where
The directory modules
should be placed under $PHENIX
; qref
will thus be a new directory under modules
, whereas the user should manually overwrite energies.py
in modules/cctbx_project/cctbx/geometry_restraints
and model.py
in modules/cctbx_project/mmtbx/model
, respectively, with the version of the file corresponding to their installation of Phenix
.
There is a commented out guard clause in energies.py
:
# if not os.path.exists('qm.lock') and (os.path.exists('xyz_reciprocal.lock') or os.path.exists('xyz.lock')):
This is the recommended way to use the quantum restraints, as they are not always needed. In order to make this work one has to edit the file $PHENIX/modules/phenix/phenix/refinement/xyz_reciprocal_space.py
and import os
as well as surround the call to mmtbx.refinement.minimization.lbfgs(...)
in the method run_lbfgs
in the class run_all
with
with open('xyz_reciprocal.lock', 'w'):
pass
and
os.remove('xyz_reciprocal.lock')
Likewise the file $PHENIX/modules/phenix/phenix/refinement/macro_cycle_real_space.py
should be edited in a similar manner, i.e. with an added import os
as well as surrounding the calls to self.minimization_no_ncs()
and self.minimization_ncs()
in the method refine_xyz
in the class run
with
with open('xyz.lock', 'w'):
pass
and
os.remove('xyz.lock')
This implementation of QRef
has been verified to work with Phenix
versions 1.20.1-4487, 1.21-5207, 1.21.1-5286 as well as 1.21.2-5419.
The scripts in the folder scripts
should be placed somewhere accessible by $PATH
. The shebang in the scripts might need to be updated to point towards wherever cctbx.python
is located.
Orca
can be found at orcaforum.kofo.mpg.de - follow their guide for installation. QRef has been verified to work with Orca
versions 5.0.4 and 6.0.0.
The general procedure to set up a quantum refinement job consists of
-
Select QM region(s). Best practises for selecting proper QM region(s) can be found at for example:
-
Build a model.
- The model in the QM regions needs to make chemical sense. This for example means that the QM regions should be protonated as well as being complete.
- The model outside of the QM region (as well as the protonation of the carbon link atom) can be incomplete.
-
phenix.ready_set add_h_to_water=True
can be useful for this purpose.
-
Prepare restraint files for unknown residues and ligands. The script
qref_prep.py
will tell you if there are any missing restraint files.- This can be achieved using
phenix.ready_set
andphenix.elbow
.
- This can be achieved using
-
Prepare
syst1
files; these files define the QM regions.- If there is only one QM region the default is to look for a file named
syst1
by the software. For multiple QM regions the recommended, and default, naming scheme issyst11
,syst12
, etc. - Which atoms to include in the QM regions is defined using the serial number from the PDB file describing the entire model.
- While setting
sort_atoms = False
in the input toPhenix
should ensure that the ordering in the input model is preserved, we have encountered instances where this is not adhered to. Thus it is recommended to useiotbx.pdb.sort_atoms
(supplied withPhenix
) which will give you a new PDB file with the suffix_sorted.pdb
where the atoms are sorted in the same order as that whichPhenix
uses internally. It is recommended to use the_sorted.pdb
file as the input model for refinement, as well as the reference when defining thesyst1
files.
- While setting
- The
syst1
files allows for multiple atoms or intervals of atoms to be specified on a single line, where,
orblank
works as delimiters;-
is used to indicate an interval. -
#
and!
can be used to include comments in thesyst1
files. - The second occurence of an atom in the
syst1
files will indicate that this is a link atom, i.e. it will be replaced by a hydrogen at the appropriate position in the QM calculation. - Examples are included in the
examples
folder.
- If there is only one QM region the default is to look for a file named
-
Run
qref_prep.py <model>_sorted.pdb
to generateqref.dat
(a file containing settings for the QR interface), as well as PDB files describing the QM regions.-
The
junctfactor
file needs to be present in the same directory as whereqref_prep.py
is run. Thejunctfactor
file contain ideal QM distances for the$C_L - H_L$ bonds for the link-atoms. If anotherjunctfactor
file is to be used this can be specified with the-j
or--junctfactor
option. -
The theory used for the ideal
$C_L - H_L$ QM distance can be changed with the-l
or--ltype
option. Default is type 12. The options are:- 6: B3LYP/6-31G*
- 7: BP(RI)/6-31G*
- 8: BP(RI)/SVP
- 9: BP(RI)/def2-SV(P)
- 10: PBE(RI)/def2-SVP
- 11: B3LYP(RI)/def2-SV(P)
- 12: TPSS(RI)/def2-SV(P)
- 13: B97-D(RI)/def2-SV(P)
There needs to be a parametrisation in the
junctfactor
file for the bond one intends to cleave; it is recommended that the user inspects thejunctfactor
file to verify that there is support to cleave the intended bond type. In the case parametrisation is lacking another selection for the QM system (and in particular where the link between QM and MM occurs) needs to be made or appropriate parametrisation added to thejunctfactor
file. -
Ideally only the input model is needed as an argument for
qref_prep.py
. If there was a need to prepare restraint files for novel residues or ligands in point 3 above,qref_prep.py
needs to be made aware of these. This can be achieved with the-c
or--cif
option. -
The output from
qref_prep.py
should be$\left(1+2 n_{syst1}\right)$ files as well as recommended selection strings for both real and reciprocal space. Additionally, a warning is given if thesyst1
file covers more than one conformation. In the case that all atoms in thesyst1
definition belong to the same conformation, this will be indicated by analtloc
specifier.-
qref.dat
, which contains the settings for the QR interface. This file can be changed manually and it is a good idea to inspect that the value fororca_binary
is the correct path for the actual Orca binary file (qref_prep.py
tries to locate this file automatically but may sometimes fail). -
mm_i_c.pdb
, which is the model used to calculate$E_{MM1, i}$ . -
qm_i_h.pdb
, which is the model used to calculate$E_{QM1, i}$ .
The output PDB files can, and probably should, be used to inspect that the QM selection is proper.
- Two selection strings are printed on the screen, one for reciprocal space and one for real space. They are intended to be used in regards to which selection of the model to refine when crafting the input to either
phenix.refine
orphenix.real_space_refine
, see point 7 below.
-
-
Harmonic (bond) distance restraints can be added through the
-rd
or--restraint_distance
option, using the syntaxi atom1_serial atom2_serial desired_distance_in_Å force_constant
. Experience has shown that the force constant needs to be$\geq$ 2500 to achieve adherence to the restraint. -
Harmonic (bond) angle restraints can be added through the
-ra
or--restraint_angle
option, using the syntaxi atom1_serial atom2_serial atom3_serial desired_angle_in_degrees force_constant
, whereatom2_serial
defines the angle tip. Experience has shown that the force constant needs to be$\geq$ 10 (?) to achieve adherence to the restraint. -
Symmetry interactions are handled through the
-t
or--transform
option, using the syntaxi "<atoms>" R11 R12 R13 R21 R22 R23 R31 R32 R33 t1 t2 t3
for each of the desired transforms, where<atoms>
follow the same syntax as forsyst1
files (do note the usage of quotation marks, i.e.<atoms>
should be given as a string).R
is the rotation matrix (in Cartesian coordinates) in row-wise order andt
is the translation vector (in Cartesian coordinates). To obtain$R$ and$t$ , find for example the fractional rotation matrix$R_{frac}$ and the fractional translation vector$t_{frac}$ for the symmetry operator of interest (which can be found through using for example Coot), together with the desired fractional unit cell translation vector,$t_{u}$ .$R$ is then calculated as$R = S^{-1}R_{frac}S$ and$t = S^{-1}(t_{frac} + t_{u})$ , where$S$ is the Cartesian to fractional conversion matrix for the space group the crystal belongs to (which can be found in theSCALEn
records in a PDB file). -
All available options for
qref_prep.py
can be seen through-h
or--help
.
-
-
The next step is to prepare the input files for Orca. Examples can be found in the
examples
folder.- The input files should be named
qm_i.inp
, wherei
refers to the i:th QM region; counting starts at 1. - The level of theory should match the junction type; using the default (type 12) the corresponding input to
Orca
then becomes! TPSS D4 DEF2-SV(P)
. - Energy and gradient needs to be written to disk. This is achieved through the keyword
! ENGRAD
. - To read coordinates from a PDB file (
qm_i_h.pdb
) use*pdbfile <charge> <multiplicity> qm_i_h.pdb
, where againi
refers to the i:th QM region. - Custom settings can also be supplied (see the
examples
folder).
- The input files should be named
-
At this point it is possible to start a quantum refinement. It is however recommended to first create an empty file named
qm.lock
(this disables QRef - additionally, ifqref.dat
is not present in the folder QRef will not run, i.e. this step can be run before starting to set up the QR job) through for example thetouch
command, then start a refinement so that an.eff
file is obtained (this file contains all thePhenix
refinement settings for the current experimental data and model). A good idea is to rename this file tophenix.params
or similar, then edit this file and make sure the following options are set:- For reciprocal space refinement (
phenix.refine
):refinement.pdb_interpretation.restraints_library.cdl = False
refinement.pdb_interpretation.restraints_library.mcl = False
refinement.pdb_interpretation.restraints_library.cis_pro_eh99 = False
refinement.pdb_interpretation.secondary_structure.enabled = False
refinement.pdb_interpretation.sort_atoms = False
refinement.pdb_interpretation.flip_symmetric_amino_acids = False
refinement.refine.strategy = *individual_sites individual_sites_real_space rigid_body *individual_adp group_adp tls occupancies group_anomalous
refinement.refine.sites.individual = <reciprocal selection string>
refinement.hydrogens.refine = *individual riding Auto
refinement.hydrogens.real_space_optimize_x_h_orientation = False
refinement.main.nqh_flips = False
- For real space refinement (
phenix.real_space_refine
):refinement.run = *minimization_global rigid_body local_grid_search morphing simulated_annealing adp occupancy nqh_flips
pdb_interpretation.restraints_library.cdl = False
pdb_interpretation.restraints_library.mcl = False
pdb_interpretation.restraints_library.cis_pro_eh99 = False
pdb_interpretation.flip_symmetric_amino_acids = False
pdb_interpretation.sort_atoms = False
pdb_interpretation.secondary_structure = False
pdb_interpretation.reference_coordinate_restraints.enabled = True
pdb_interpretation.reference_coordinate_restraints.selection = <real space selection string>
pdb_interpretation.reference_coordinate_restraints.sigma = 0.01
pdb_interpretation.ramachandran_plot_restraints.enabled = False
- Other options can be set as needed.
- For reciprocal space refinement (
-
To run the quantum refinement job, make sure that the
qm.lock
file has been deleted, then execute eitherphenix.refine phenix.params
orphenix.real_space_refine phenix.params
. If there is a need to restart the job with different settings forPhenix
, make sure to delete the filesettings.pickle
.
When using ORCA/6.0.0 either manually run (before you submit your job) or add to the beginning of your submit script the exports below:
export OMPI_MCA_btl='^uct,ofi'
export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
Add symmetry support for the QM calculations.Done.Add support for distance restraints.Done.Add support for angle restraints.Done.- Refactor code to be OOP.
- Turn QRef into a proper restraint_manager class.
Lundgren, K. J. M., Caldararu, O., Oksanen, E., & Ryde, U. (2024). "Quantum refinement in real and reciprocal space using the Phenix and ORCA software", IUCrJ, 11(6), 921-937.
doi.org/10.1107/S2052252524008406