A Partial Atomic Charge Predicter for Porous Materials based on Graph Convolutional Neural Network (PACMAN)
- DDEC6 (1, 2, 3, 4), Bader, Charge Model 5 (CM5), REPEAT for metal-organic frameworks (MOFs)
- DDEC6 for covalent-organic frameworks (COFs)
Developed by: Guobin Zhao
pip install PACMAN-charge
Git clone
git clone https://github.com/mtap-research/PACMAN-charge.git
cd PACMAN-charge
pip install -r requirements.txt
Jupyter notebook (using pip)
from PACMANCharge import pmcharge
pmcharge.predict(cif_file="./test/Cu-BTC.cif", charge_type="DDEC6", digits=10, atom_type=True, neutral=True, keep_connect=True)
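To run the same call over several structures, a short loop is enough. The sketch below is not part of PACMAN: `predict_folder` is a hypothetical helper name, and the prediction function is passed in as an argument so that the helper only assumes it accepts a `cif_file` keyword, as `pmcharge.predict` does in the call above.

```python
from pathlib import Path
from typing import Callable, List


def predict_folder(folder: str, predict_fn: Callable[..., object], **options) -> List[str]:
    """Call predict_fn once per .cif file in `folder` and return the processed paths.

    Hypothetical helper (not part of PACMAN). predict_fn is assumed to accept a
    `cif_file` keyword argument; all other keyword options are forwarded as given.
    """
    processed = []
    for cif_path in sorted(Path(folder).glob("*.cif")):
        predict_fn(cif_file=str(cif_path), **options)
        processed.append(str(cif_path))
    return processed


# Usage (requires `pip install PACMAN-charge`):
# from PACMANCharge import pmcharge
# predict_folder("./test", pmcharge.predict, charge_type="DDEC6", digits=10,
#                atom_type=True, neutral=True, keep_connect=True)
```

Passing the function in keeps the helper independent of how PACMAN is installed (pip package or cloned repository).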
Terminal
python pmcharge.py folder-name[path] --charge_type[DDEC6/Bader/CM5/REPEAT] --digits[int] --atom_type[bool] --neutral[bool] --keep_connect[bool]
Example command: python pmcharge.py test_file/test-1/ --charge_type DDEC6 --digits 10
Help usage information: python pmcharge.py -h
- folder-name: relative path to a folder containing CIF files without partial atomic charges
- charge_type (default: DDEC6): DDEC6, Bader, CM5, or REPEAT
- digits (default: 6): number of decimal places to print for partial atomic charges. The ML models were trained on a 6-digit dataset
- atom_type (default: True): assign the same partial atomic charge to atoms of the same type (atom types are identified by the similarity of partial atomic charges up to 3 decimal places)
- neutral (default: True): keep the net charge at zero. The system is neutralized with the "mean" method, in which the excess charge is distributed equally across all atoms
- keep_connect (default: True): retain the atomic and connection information (such as _atom_site_adp_type, bond) for the structure.
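The `atom_type`, `neutral`, and `digits` options can be pictured with a few lines of plain Python. This is a minimal sketch under stated assumptions, not PACMAN's internal code: the function names are hypothetical, the grouping rule (same element and same charge after rounding to 3 decimal places) is one possible reading of the `atom_type` description, and the sample charges are made up.

```python
from collections import defaultdict


def unify_by_atom_type(elements, charges, decimals=3):
    """Give atoms of the same element whose charges agree to `decimals` places their group mean.

    Assumed grouping rule for illustration only.
    """
    groups = defaultdict(list)
    for index, (element, q) in enumerate(zip(elements, charges)):
        groups[(element, round(q, decimals))].append(index)
    unified = list(charges)
    for indices in groups.values():
        mean_q = sum(charges[i] for i in indices) / len(indices)
        for i in indices:
            unified[i] = mean_q
    return unified


def neutralize_mean(charges):
    """'Mean' neutralization: remove an equal share of the excess charge from every atom."""
    share = sum(charges) / len(charges)  # excess charge per atom
    return [q - share for q in charges]


def format_charges(charges, digits=6):
    """Render charges with a fixed number of decimal places."""
    return [f"{q:.{digits}f}" for q in charges]


# Illustrative values only (not PACMAN output).
elements = ["Cu", "O", "O", "H"]
raw = [1.02, -0.50012, -0.50018, 0.03]
print(format_charges(neutralize_mean(unify_by_atom_type(elements, raw)), digits=6))
```

Because every atom is shifted by the same amount, atoms that shared a charge before neutralization still share one afterwards.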
- Predict partial atomic charges using an online APP 👉 link
- Full code and dataset can be downloaded from 👉 link
- Note: All future releases will be uploaded to GitHub and pip only
If you use PACMAN charge, please consider citing this paper:
@article{,
title={PACMAN: A Robust Partial Atomic Charge Predicter for Nanoporous Materials based on Crystal Graph Convolution Network},
DOI={10.1021/acs.jctc.4c00434},
journal={Journal of Chemical Theory and Computation},
author={Zhao, Guobin and Chung, Yongchul},
year={2024},
volume = {20},
number = {12},
pages={5368-5380}
}
Databases with partial atomic charges | url | size |
---|---|---|
QMOF | link | 16,779 |
CoRE MOF 2014 DDEC | link | 2,932 |
CoRE MOF 2014 DFT-optimized | link | 502 |
CURATED-COFs | link | 612 |
ARC-MOF | link | 279,118 |
If you encounter any problems while using PACMAN, please email sxmzhaogb@gmail.com
or open an issue on GitHub.
.
├── ..
├── figs # Figures used for introduction
│ ├── toc.jpg # Table of Contents
│ └── workflow.png # Workflow of this project
│
├── model # Python files used for dataset preparation & GCN training
│ ├── GCN_E.py # Networks model for energy/bandgap training
│ ├── GCN_charge.py # Networks model for atomic charge training
│ ├── cif2data.py # Convert QMOF database to dataset
│ ├── data_E.py # Convert cif to graph & target (energy/bandgap)
│ ├── data_charge.py # Convert cif to graph & target (atomic charge)
│ └── utils.py # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── model4pre # Python files used for prediction
│ ├── GCN_E.py # Networks model for energy/bandgap prediction
│ ├── GCN_charge.py # Networks model for atomic charge prediction
│ ├── atom_init.json # a JSON file that stores the initialization vector for each element
│ ├── cif2data.py # Read/write cif file
│ ├── data.py # Convert cif to graph & target (energy/bandgap)
│ ├── data_charge.py # Convert cif to graph & target (atomic charge)
│ └── utils.py # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── pth # Models of this project
│ ├── best_bader # Bader
│ │ ├── bader.pth # Bader charge model
│ │ └── normalizer-bader.pkl # Normalizer of Bader
│ ├── best_bandgap # Bandgap
│ │ ├── bandgap.pth # Bandgap model
│ │ └── normalizer-bandgap.pkl # Normalizer of bandgap
│ ├── best_cm5 # CM5
│ │ ├── bandgap.pth # ///
│ │ └── normalizer-bandgap.pkl # ///
│ ├── best_ddec # ///
│ │ ├── ddec.pth # ///
│ │ └── normalizer-ddec.pkl # ///
│ ├── best_pbe # ///
│ │ ├── pbe-atom.pth # ///
│ │ └── normalizer-pbe.pkl # ///
│ ├── best_repeat # ///
│ │ ├── repeat.pth # ///
│ │ └── normalizer-repeat.pkl # ///
│ ├── chk_bader # Bader
│ │ └── checkpoint.pth # Checkpoint of bader
│ ├── chk_bandgap # Bandgap
│ │ └── checkpoint.pth # Checkpoint of bandgap
│ ├── chk_cm5 # CM5
│ │ └── checkpoint.pth # ///
│ ├── chk_ddec # ///
│ │ └── checkpoint.pth # ///
│ ├── chk_pbe # ///
│ │ └── checkpoint.pth # ///
│ └── chk_repeat # ///
│ └── checkpoint.pth # ///
│
├── pmcharge.py # main python file for atomic charge assignment by command line
├── LICENSE.txt # MIT license
├── README.md # Usage/Source
├── requirements.txt # packages that need to be installed
├── train_E.py # main python file for energy/bandgap training
└── train_charge.py # main python file for atomic charge training
(Elements covered by the model training process; these are not all the elements contained in the database.)
- DDEC6/CM5/Bader Charges
- REPEAT Charges
- DDEC6 Charges
Parity plot of partial atomic charges from DDEC6 and PACMAN on the test set (QMOF).
- CM5 Charges
Parity plot of partial atomic charges CM5 and PACMAN on the test set (QMOF).
- Bader Charges
Parity plot of partial atomic charges from Bader and PACMAN on the test set (QMOF).
For the Bader model, use caution with Th-MOF predictions, because only 2 data points were used in the training set. The large error shown in the figure below is for Th.
- REPEAT Charges
Parity plot of partial atomic charges from REPEAT and PACMAN on the test set (ARC-MOF).
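A parity plot compares each reference charge with the predicted charge for the same atom, and the same comparison can be summarized numerically. The sketch below computes the mean absolute error and R² for two equally long lists of charges. The function name and the sample numbers are made up for illustration; they are not results from PACMAN or the paper.

```python
def parity_metrics(reference, predicted):
    """Return (MAE, R^2) for paired reference and predicted charges."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("reference and predicted must be non-empty and of equal length")
    n = len(reference)
    mae = sum(abs(r - p) for r, p in zip(reference, predicted)) / n
    mean_ref = sum(reference) / n
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))  # residual sum of squares
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)              # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, r2


# Illustrative values only.
reference_charges = [0.95, -0.62, -0.60, 0.12]
predicted_charges = [0.93, -0.60, -0.61, 0.13]
mae, r2 = parity_metrics(reference_charges, predicted_charges)
print(f"MAE = {mae:.4f} e, R2 = {r2:.4f}")
```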
- Guobin Zhao (sxmzhaogb@gmail.com)
- Guobin Zhao (sxmzhaogb@gmail.com): models, training, data preparation
- Yongchul G. Chung (drygchung@gmail.com): supervising