Conformation-Importance-ML-Models

Welcome to the official repository for the paper "Understanding Conformation Importance in Data-driven Property Prediction Models"! This repository provides the data, code, and models used for the analysis.

PQC Dataset

One unique aspect of this study is the creation of carefully controlled datasets for evaluating model performance with respect to conformational diversity and the dependence of the target property on conformation. Existing datasets have limitations for this purpose: for example, the QM9 dataset is limited to very small molecules with 9 or fewer heavy atoms, and many structurally abnormal molecules have been observed in it. We hope that the PQC dataset will be used as a benchmark dataset for predicting properties using machine learning models. Please download the datasets from here.

Installation

Install Miniconda from here
Clone this repository:

git clone https://github.com/YuHamakawa/Conformation-Importance-ML-Models.git

Install required packages:

cd Conformation-Importance-ML-Models
conda env create -f environment.yml

Download the PQC dataset and the APTCs dataset from here
Unzip the downloaded file and place the files directly under the 'Conformation-Importance-ML-Models' directory.
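
As an optional sanity check after the steps above, the following minimal Python sketch reports which expected entries are missing from the repository directory. It is not part of the repository: the helper name `missing_entries` is hypothetical, the default entries are taken from the repository's top-level file listing, and the dataset file names are not stated in this README, so they must be supplied through the `extra` argument.

```python
from pathlib import Path

# Entries from the repository's top-level file listing. Dataset file names
# are not given in this README, so pass them in yourself via `extra`.
REPO_ENTRIES = ("src", "tutorial_notebook", "environment.yml")


def missing_entries(repo_root, extra=()):
    """Return the expected names that do not exist under repo_root.

    Hypothetical helper for illustration only; it checks for existence
    and does not validate file contents.
    """
    root = Path(repo_root)
    expected = list(REPO_ENTRIES) + list(extra)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Run from inside the cloned 'Conformation-Importance-ML-Models' directory.
    missing = missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Repository layout looks complete.")
```

To include the unzipped dataset files in the check, pass their names as `extra`, for example `missing_entries(".", extra=[...])` with the names found in the downloaded archive.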

Tutorial

We provide code to reproduce the analysis in the paper as notebooks. There are three notebooks: one for creating data, one for building a model, and one for visualizing the results. See the tutorial_notebook directory for more details.

Citation

Please cite our paper if you use the data, code, or models.

@dataset{hamakawa_2024_13801221,
  author       = {Hamakawa, Yu and
                  Miyao, Tomoyuki},
  title        = {{Datasets for understanding the importance of 
                   conformation in property prediction models}},
  month        = sep,
  year         = 2024,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.14575682},
  url          = {https://doi.org/10.5281/zenodo.14575682}
}

License

This project is licensed under the terms of the MIT license. See LICENSE for additional details.
