🌿 a biosynformatic molecular fingerprint tailored to natural product chem- and bioinformatic research 🌿
________________________________________________________________________________________
bi·o·syn·for·ma·tic
/ˌbaɪ oʊ sɪn fərˈ mæt ɪk/
adjective Computers, Biochemistry
relating to biosynthetic information and biochemical logic.
as a concatenation of biosynthetic and bioinformatics, it was coined during the creation of BioSynFoni.
_________________________________________________________________________________________
We have trained a biosynthetic class predictor on biosynfoni fingerprints.
You can try out the predictor on your own molecules here!
Biosynfoni requires Python 3.9 or later. RDKit is installed as a dependency when installing Biosynfoni.
To install the package, you can use pip:
pip install biosynfoni
Now you can import the biosynfoni package in your Python code or use the command line tool.
Convert a SMILES string to a fingerprint:
from biosynfoni import Biosynfoni
from rdkit import Chem
smi = "<SMILES>"  # replace <SMILES> with your own SMILES string
mol = Chem.MolFromSmiles(smi)
fp = Biosynfoni(mol).fingerprint # returns biosynfoni's count fingerprint of the molecule
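When fingerprinting more than one molecule, keep in mind that RDKit's `Chem.MolFromSmiles` returns `None` for a SMILES string it cannot parse. The sketch below shows one possible way to loop over several SMILES strings and record `None` for the ones that fail. The helper names `fingerprint_many` and `biosynfoni_fingerprint` are hypothetical and not part of the biosynfoni package; the only biosynfoni call used is `Biosynfoni(mol).fingerprint`, exactly as in the snippet above.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def fingerprint_many(
    smiles_list: Iterable[str],
    fingerprint_one: Callable[[str], Any],
) -> Dict[str, Optional[Any]]:
    """Map each SMILES string to its fingerprint, or to None if it fails.

    `fingerprint_one` is any function that turns a single SMILES string
    into a fingerprint and raises ValueError for unparsable input.
    """
    results: Dict[str, Optional[Any]] = {}
    for smi in smiles_list:
        try:
            results[smi] = fingerprint_one(smi)
        except ValueError:
            # Keep the failed entry so that it can be inspected afterwards.
            results[smi] = None
    return results


def biosynfoni_fingerprint(smi: str) -> Any:
    """Hypothetical wrapper around the usage shown in this README."""
    # Imports are deferred so that fingerprint_many itself has no
    # dependency on RDKit or biosynfoni.
    from rdkit import Chem
    from biosynfoni import Biosynfoni

    mol = Chem.MolFromSmiles(smi)
    if mol is None:  # RDKit signals a parse failure by returning None
        raise ValueError(f"could not parse SMILES: {smi!r}")
    return Biosynfoni(mol).fingerprint
```

With both packages installed, `fingerprint_many(["CCO", "c1ccccc1"], biosynfoni_fingerprint)` should return a dictionary with one count fingerprint per SMILES string.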
Create a fingerprint from a SMILES string on the command line:
biosynfoni <SMILES>
Create a fingerprint from an InChI string:
biosynfoni <InChI>
Write the fingerprints of all molecules in an SDF file to a CSV file:
biosynfoni <molecule_supplier.sdf>
If you use biosynfoni in your research, please cite our preprint:
@article{nollen2025biosynfoni,
title={Biosynfoni: A Biosynthesis-informed and Interpretable Lightweight Molecular Fingerprint},
author={Nollen, Lucina-May and Meijer, David and Sorokina, Maria and Van der Hooft, Justin J. J.},
journal={chemRxiv},
year={2025}
}
We created several biosynthetic class predictors for our manuscript, which can be downloaded from Zenodo here.
We have used data from the COCONUT natural product database (DOI) and ZINC compound database (DOI). The parsed data used for the analysis in our manuscript can be downloaded from Zenodo here.