dajtmullaj/Example_Data_ChemPlot

Test Datasets for Chemical Space Visualization

Name Formatting: type_size_name_num_of_classes.csv

  • type: R for regression, C for classification
  • size: Number of instances (rows) in the dataset
  • name: Name of the dataset
  • num_of_classes: Number of classes (classification only; omitted for regression datasets)
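
The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of this repository: the `DatasetInfo` class and the `parse_dataset_filename` function are hypothetical names introduced for illustration, and the sketch assumes that the trailing class count appears only on classification files.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DatasetInfo:
    """Metadata encoded in a file name (hypothetical helper type)."""
    task: str                   # "regression" or "classification"
    size: int                   # number of instances
    name: str                   # dataset name
    num_classes: Optional[int]  # None for regression datasets


_TASKS = {"R": "regression", "C": "classification"}


def parse_dataset_filename(filename: str) -> DatasetInfo:
    """Split type_size_name[_num_of_classes].csv into its fields."""
    parts = Path(filename).stem.split("_")  # stem drops the ".csv" suffix
    if len(parts) < 3 or parts[0] not in _TASKS:
        raise ValueError(f"unexpected file name: {filename!r}")

    task = _TASKS[parts[0]]
    size = int(parts[1])

    if task == "classification":
        if len(parts) < 4:
            raise ValueError(f"missing class count: {filename!r}")
        # Assumption: the last field is the class count; everything
        # between size and class count is the dataset name.
        name = "_".join(parts[2:-1])
        num_classes = int(parts[-1])
    else:
        name = "_".join(parts[2:])
        num_classes = None

    return DatasetInfo(task, size, name, num_classes)


if __name__ == "__main__":
    print(parse_dataset_filename("C_2039_BBBP_2.csv"))
    print(parse_dataset_filename("R_642_SAMPL.csv"))
```

Joining the middle fields back together lets a dataset name contain underscores, at the cost of assuming that a classification file always ends with its class count.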

Datasets and Sources

  1. BBBP dataset [6] (Blood-brain barrier penetration) -> C_2039_BBBP_2.csv
  2. SAMPL dataset [8] (Hydration free energy) -> R_642_SAMPL.csv
  3. AQSOLDB dataset [11] (Aqueous Solubility) -> R_9982_AQSOLDB.csv

Note: Datasets 1 and 2 are edited versions of the corresponding datasets in the MoleculeNet repository [12].
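
Because each file name records the number of instances, a downloaded copy can be sanity-checked against it. The sketch below uses only the standard library. It assumes that each CSV has exactly one header row and one instance per remaining row, which this README does not state, and the function names are hypothetical.

```python
import csv
from pathlib import Path


def expected_size(path: str) -> int:
    """Read the size field from a type_size_name[...].csv file name."""
    return int(Path(path).stem.split("_")[1])


def count_instances(path: str) -> int:
    """Count non-empty data rows, skipping one header row (an assumption)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if row)


def size_matches(path: str) -> bool:
    """True if the row count agrees with the size in the file name."""
    return count_instances(path) == expected_size(path)
```

For example, `size_matches("R_642_SAMPL.csv")` would compare the row count of a local copy of that file against 642. The local path is an assumption about where the file was saved.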

References:

[6] Martins, Ines Filipa, et al. "A Bayesian approach to in silico blood-brain barrier penetration modeling." Journal of chemical information and modeling 52.6 (2012): 1686-1697.

[8] Mobley, David L., and J. Peter Guthrie. "FreeSolv: a database of experimental and calculated hydration free energies, with input files." Journal of computer-aided molecular design 28.7 (2014): 711-720.

[11] Sorkun, Murat Cihan, Abhishek Khetan, and Süleyman Er. "AqSolDB, a curated reference set of aqueous solubility and 2D descriptors for a diverse set of compounds." Scientific data 6.1 (2019): 1-8.

[12] Wu, Zhenqin, et al. "MoleculeNet: a benchmark for molecular machine learning." Chemical science 9.2 (2018): 513-530.

About

Example Datasets for ChemPlot development
