MoleculeHub/MoleculeDatasets.jl

MoleculeDatasets.jl

Code Style: Blue | Aqua QA

A Julia package for easily downloading and accessing popular cheminformatics datasets.

Installation

using Pkg
Pkg.add("MoleculeDatasets")

Quick Start

using MoleculeDatasets

# Download and load a dataset
data = get_mol_dataset("esol")

Available Datasets

The available datasets are the entries of the MOL_DATASETS dictionary in src/dataset_info.jl.

Adding a Dataset

To add a new dataset to the package, edit the MOL_DATASETS dictionary in src/dataset_info.jl. Each dataset entry should include:

For local datasets:

"dataset_key" => Dict(
    "name" => "Dataset Display Name",
    "description" => "Brief description of the dataset",
    "filepath" => "data/filename.csv",
    "format" => "csv",
    "size" => "file size",
    "type" => "local",
    "reference" => "Full citation",
    "doi" => "DOI if available",
    "website" => "URL if available"
)

For remote datasets:

"dataset_key" => Dict(
    "name" => "Dataset Display Name",
    "description" => "Brief description of the dataset",
    "url" => "https://example.com/dataset.csv",
    "format" => "csv",
    "size" => "file size",
    "type" => "remote",
    "reference" => "Full citation",
    "doi" => "DOI if available",
    "website" => "URL if available"
)
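
Before opening a pull request, it can help to check that a new entry carries the fields shown in the two templates above. The following is a minimal sketch, not part of MoleculeDatasets.jl: the helper name `missing_keys` and the constant `COMMON_KEYS` are hypothetical, and treating every key in the templates as required is an assumption, since the package may accept entries with fewer fields.

```julia
# Hypothetical helper, not part of MoleculeDatasets.jl.
# Assumption: every key shown in the README templates is required.
const COMMON_KEYS = ["name", "description", "format", "size",
                     "type", "reference", "doi", "website"]

# Return the template keys that are absent from `entry`.
function missing_keys(entry::AbstractDict)
    required = copy(COMMON_KEYS)
    kind = get(entry, "type", nothing)
    if kind == "local"
        push!(required, "filepath")   # local entries point at a bundled file
    elseif kind == "remote"
        push!(required, "url")        # remote entries point at a download URL
    else
        error("\"type\" must be \"local\" or \"remote\", got $(repr(kind))")
    end
    return [k for k in required if !haskey(entry, k)]
end

# Example entry that mirrors the remote template, with placeholder values.
entry = Dict(
    "name" => "Dataset Display Name",
    "description" => "Brief description of the dataset",
    "url" => "https://example.com/dataset.csv",
    "format" => "csv",
    "size" => "file size",
    "type" => "remote",
    "reference" => "Full citation",
    "doi" => "DOI if available",
    "website" => "URL if available",
)

println(missing_keys(entry))
```

An empty result means the entry matches the template for its type; any key listed in the result still needs to be filled in.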

API Reference

Dataset Functions

  • get_mol_dataset(name; output_dir="data", force_download=false, verbose=true): Download and load a dataset as a DataFrame
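
As an illustration of the signature above, the call below sets every keyword argument explicitly and then inspects the result. This is a sketch under assumptions: the meanings suggested in the comments are inferred from the argument names rather than confirmed by the package, the directory name "datasets" is arbitrary, and DataFrames is loaded explicitly in case the package does not re-export it.

```julia
using MoleculeDatasets
using DataFrames

# Keyword meanings below are inferred from their names (assumption):
#   output_dir      directory where downloaded files are placed
#   force_download  fetch the file again even if it is already present
#   verbose         print progress messages
data = get_mol_dataset("esol";
                       output_dir="datasets",
                       force_download=true,
                       verbose=false)

# Standard DataFrames.jl inspection calls.
println(size(data))      # (rows, columns)
println(names(data))     # column names
println(first(data, 5))  # first five rows
```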
