RAFBL

About

RAFBL is the repository accompanying the manuscript "Reaction-Agnostic Featurization of Bidentate Ligands for Bayesian Ridge Regression of Enantioselectivity". It includes two packages, modsel and moltop.

modsel derives additional ligand features from the base features and handles feature selection for the final models.

moltop generates topological features from molecular structures. The molecular graph is constructed either from xyz coordinates and covalent radii or directly from SMILES.
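To illustrate the idea of building a molecular graph from xyz coordinates and covalent radii, here is a minimal sketch. It is not moltop's actual API: the function name `build_bond_graph`, the `tolerance` factor, and the small radii table are assumptions made for this example. Two atoms are treated as bonded when their distance is at most the sum of their covalent radii scaled by the tolerance.

```python
import math
from itertools import combinations

# Approximate covalent radii in angstrom (illustrative subset, not moltop's table).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def build_bond_graph(symbols, coords, tolerance=1.2):
    """Return an adjacency dict {atom index: set of bonded atom indices}.

    Hypothetical helper: atoms i and j are connected when their distance
    is at most tolerance * (r_i + r_j), with r the covalent radius.
    """
    graph = {i: set() for i in range(len(symbols))}
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = tolerance * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Water molecule: one oxygen and two hydrogens (coordinates in angstrom).
symbols = ["O", "H", "H"]
coords = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.469), (0.000, -0.757, -0.469)]
graph = build_bond_graph(symbols, coords)

# A simple topological descriptor: the degree (number of bonds) of each atom.
degrees = [len(graph[i]) for i in range(len(symbols))]
```

Topological features such as atom degrees or shortest-path counts can then be computed on the adjacency structure, independently of the 3D geometry.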

Ligand features can be visualized on Materials Cloud.

Install

We recommend using conda to install all the required dependencies.

To create the environment, run:

conda env create -f environment.yml

Then activate the environment:

conda activate rafbl

Run

Generate features from Gaussian output files:

To regenerate the features from the Gaussian log files, run:

./feat_csd.sh
./feat_lit.sh

This process takes a long time but only needs to be run once. Beware: once you start regenerating the features, you must let the process run to completion, because regeneration overwrites the existing, ready-to-use feature lists.

The final files containing all the features can be found under ligs/csd_pool.csv for the CSD ligands and under ligs/lit_pool.csv for the literature ligands.
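As a quick sanity check after feature generation, the resulting tables can be inspected with a few lines of Python. This is a sketch only: it assumes the files are comma-separated with a single header row, and the helper name `summarize_feature_table` is made up for this example. No particular column names are assumed.

```python
import csv
from pathlib import Path


def summarize_feature_table(path):
    """Return (column names, number of data rows) for a CSV feature table.

    Hypothetical helper; assumes a comma-separated file with one header row.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)
    return header, n_rows


# Path taken from this README; the check is skipped if the file is absent.
pool_path = Path("ligs/csd_pool.csv")
if pool_path.exists():
    columns, n_rows = summarize_feature_table(pool_path)
    print(f"{pool_path}: {n_rows} rows, {len(columns)} columns")
```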

Run whole model selection pipeline:

# possible modes: 0 -> oa, 1 -> cp, 2 -> cc, 3 -> da_f
python main_models.py 0

Screen for candidate ligands for the OA reaction:

# possible modes: 1 -> csd ligands, 2 -> literature ligands  
python main_pool_cand.py 1

This produces a list of ligands sorted by decreasing Expected Improvement (EI).
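For readers unfamiliar with EI, the sketch below shows the textbook closed form for a Gaussian predictive distribution, as a Bayesian regression model would provide, and how it yields a ranking. It is an illustration under stated assumptions, not the code in main_pool_cand.py: the maximization convention, the function name `expected_improvement`, and the ligand names and numbers are all hypothetical.

```python
import math


def expected_improvement(mean, std, best):
    """Textbook EI for maximization with a Gaussian predictive distribution.

    EI = (mean - best) * Phi(z) + std * phi(z), with z = (mean - best) / std.
    """
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


# Hypothetical predictions: ligand name -> (predictive mean, predictive std).
predictions = {"lig_a": (1.0, 0.1), "lig_b": (0.8, 0.5), "lig_c": (1.2, 0.0)}
best_so_far = 1.0  # hypothetical best value observed so far

# Rank candidates by decreasing EI.
ranked = sorted(
    predictions,
    key=lambda name: expected_improvement(*predictions[name], best_so_far),
    reverse=True,
)
```

EI rewards both a high predicted value and a large predictive uncertainty, so a ligand with a lower mean can still outrank one with a higher mean if the model is less certain about it.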