mlr-xai-selfies

SELFIES mutation method to obtain atom attributions for any QSAR model.

What are SELFIES?

SELFIES (SELF-referencing Embedded Strings) are a string representation for molecules. More information can be found in the paper, and the code to compute SELFIES from molecules is available on GitHub. SELFIES differ from SMILES in that they are backed by a grammar that ensures chemical validity: any valid SELFIES string corresponds to a valid molecule.
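To make the notion of a SELFIES "character" concrete, here is a minimal sketch that splits a SELFIES string into its bracketed symbols using only the standard library. The benzene string is the one shown in the selfies package documentation; the helper name split_symbols is hypothetical and only a stand-in for the splitter that the selfies package provides itself.

```python
import re
from typing import List

# SELFIES string for benzene, as given in the selfies package documentation.
BENZENE = "[C][=C][C][=C][C][=C][Ring1][=Branch1]"


def split_symbols(selfies: str) -> List[str]:
    """Split a SELFIES string into its bracketed symbols (hypothetical helper)."""
    return re.findall(r"\[[^\]]*\]", selfies)


symbols = split_symbols(BENZENE)
print(symbols)       # each bracketed symbol is one "character" of the string
print(len(symbols))  # 8
```

Each bracketed symbol is one position that a mutation-based method can replace.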

What is the idea behind xai-selfies?

XAI-SELFIES can be viewed as a generalization of the XAI method published in this paper and can be considered an outcome of the Bayer LSC project "Explainable AI".

The general concept is to explain any trained QSAR model by mutating the input string to obtain character-level attribution scores. The overall algorithm is the following:

        Obtain the SELFIES string of the input molecule
        Obtain the prediction from the model to explain
        For each position in the string:
             Mutate the string at the position of interest by replacing the SELFIES character with each
             possible character in the SELFIES vocabulary
             Check for SELFIES validity
             Optionally check for distance to input molecule
             Obtain predictions for all valid mutated strings
             Attribution_for_position_i = original prediction - average(mutated predictions)
        Convert the SELFIES attributions into atom attributions by using SELFIES-to-SMILES correspondences
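The mutation loop above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: position_attributions, predict, and is_valid are hypothetical names, the model and validity check are toy stand-ins, skipping the unchanged symbol is an assumption, and both the optional distance check and the final SELFIES-to-atom mapping are left out.

```python
import re
from statistics import mean
from typing import Callable, List, Optional, Sequence


def split_symbols(selfies: str) -> List[str]:
    """Split a SELFIES string into bracketed symbols (hypothetical helper)."""
    return re.findall(r"\[[^\]]*\]", selfies)


def position_attributions(
    selfies: str,
    vocabulary: Sequence[str],
    predict: Callable[[str], float],
    is_valid: Callable[[str], bool],
) -> List[Optional[float]]:
    """Return one attribution per SELFIES position (None if no valid mutant)."""
    symbols = split_symbols(selfies)
    original = predict(selfies)
    attributions: List[Optional[float]] = []
    for i, current in enumerate(symbols):
        mutated_predictions = []
        for candidate in vocabulary:
            if candidate == current:
                continue  # assumption: the unchanged string is not a mutant
            mutant = "".join(symbols[:i] + [candidate] + symbols[i + 1:])
            if is_valid(mutant):
                mutated_predictions.append(predict(mutant))
        if mutated_predictions:
            # Attribution = original prediction - average(mutated predictions)
            attributions.append(original - mean(mutated_predictions))
        else:
            attributions.append(None)
    return attributions


# Toy stand-ins: the "model" counts oxygen symbols, and every mutant is
# treated as valid. A real setup would use a trained QSAR model and a
# proper SELFIES validity check.
def toy_predict(selfies: str) -> float:
    return float(split_symbols(selfies).count("[O]"))


def toy_is_valid(selfies: str) -> bool:
    return True


scores = position_attributions(
    "[C][C][O]", ["[C]", "[N]", "[O]"], toy_predict, toy_is_valid
)
print(scores)  # [-0.5, -0.5, 1.0]
```

In the toy run, the oxygen position gets the highest score because replacing it always lowers the prediction, which is the behaviour the attribution formula is meant to capture.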

How do I get started?

Create a conda environment with all necessary dependencies using the environment.yml file: conda env create -f environment.yml
Have a look at example.py: running it downloads a public logD dataset, creates a demo QSAR model based on this dataset, and creates attribution vectors for the first 200 molecules of the dataset. It shows what the pretrained model and the featurizer should look like.

How can I visualize attributions computed with XAI-SELFIES?

Several ways!

The first one would be to use the RDKit library, specifically by using the SimilarityMaps functionality as shown here.
Another option is to use the beautiful xSMILES library published by Henry Heberle, which can work as a Jupyter notebook plugin.
Finally, we have also built CIME, a visual analytics platform which integrates xSMILES in a webapp and lets you upload datasets in csv format (i.e., you can just save the pandas dataframe obtained from running XAI-SELFIES as an sdf and move on to CIME to analyze your data). The public version of CIME is available here and can be launched as a docker container.

Acknowledgements

Code developed by Floriane Montanari while employed in the Machine Learning Group at Bayer. Kudos to Linlin Zhao (whose xBCF implementation helped make XAI-SELFIES), Marco Bertolini and Thomas Wolf for contributing ideas!
