Releases: OlivierBeq/Papyrus-scripts

Version 2.0.0

20 Sep 13:04

New feature:

The PapyrusDataset class allows for object-oriented 'pandas-style' querying.

Changes:

  • reader.read_papyrus: raises an error when trying to load the Papyrus++ set with stereochemistry
  • preprocess.keep_source: the source argument now uses regex matching
  • preprocess.keep_organism: the organism argument is now case-insensitive when generic_regex=False
  • download.download_papyrus: now also downloads the README files
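
The difference between regex matching and case-insensitive exact matching can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the helper names, the record layout and the example values below are assumptions made for illustration only.

```python
import re

# Made-up records; the column names and values are illustrative only.
records = [
    {"source": "ChEMBL30", "Organism": "Homo sapiens (Human)"},
    {"source": "ExCAPE-DB", "Organism": "Mus musculus (Mouse)"},
    {"source": "Sharma2016", "Organism": "homo sapiens (human)"},
]


def filter_by_regex(rows, column, pattern):
    """Keep rows whose column value matches the regular expression."""
    compiled = re.compile(pattern)
    return [row for row in rows if compiled.search(row[column])]


def filter_case_insensitive(rows, column, value):
    """Keep rows whose column value equals `value`, ignoring case."""
    wanted = value.casefold()
    return [row for row in rows if row[column].casefold() == wanted]


# A pattern selects every source starting with "ChEMBL" followed by digits.
chembl_like = filter_by_regex(records, "source", r"^ChEMBL\d+")
# Exact comparison that ignores differences in capitalisation.
human = filter_case_insensitive(records, "Organism", "Homo sapiens (Human)")
```

With regex matching, a single pattern can cover several source names, whereas the case-insensitive comparison still requires the whole value to be equal.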

Additions:

  • preprocess.keep_not_match: keep records whose specified column does not match the specified values
  • preprocess.keep_not_contains: keep records whose specified column does not contain the specified value
  • preprocess.keep_dissimilar: keep records whose molecules are not similar to the provided molecule
  • preprocess.keep_not_substructure: keep records whose molecules are not substructures of the provided molecule
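
The first two additions are negations of existing filters. A rough sketch of that idea, using plain Python lists rather than the library's data frames, might look as follows. The function names, the "comment" column and the values are hypothetical, and the similarity and substructure variants are left out because they require a cheminformatics toolkit.

```python
# Made-up records; "comment" is a hypothetical column used for illustration.
records = [
    {"source": "ChEMBL30", "comment": "binding assay"},
    {"source": "ExCAPE-DB", "comment": "functional assay"},
    {"source": "Sharma2016", "comment": "binding assay, kinase panel"},
]


def keep_not_match_sketch(rows, column, values):
    """Keep rows whose column value is NOT one of `values` (exact match)."""
    excluded = set(values)
    return [row for row in rows if row[column] not in excluded]


def keep_not_contains_sketch(rows, column, value):
    """Keep rows whose column value does NOT contain the substring `value`."""
    return [row for row in rows if value not in row[column]]


not_chembl = keep_not_match_sketch(records, "source", ["ChEMBL30"])
not_binding = keep_not_contains_sketch(records, "comment", "binding")
```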

Full Changelog: 1.0.3...2.0.0

Papyrus-scripts v1.0.3

27 Nov 14:17

Bug fixes:

  • keep_source now returns an empty dataframe for chunks in which the desired source does not appear

New features:

  • qsar and pcm: the split_by argument now supports 'custom-cluster', which splits training and test sets according to a custom cluster assignment rather than an assignment that directly specifies train/test (as is the case when its value is 'custom').
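
One plausible reading of a cluster-based split is that every cluster ends up entirely in either the training or the test set. The sketch below illustrates that idea only; the function name, the test_fraction and seed parameters and the assignment strategy are assumptions, not the behaviour of qsar or pcm.

```python
import random


def split_by_cluster(cluster_ids, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) such that no cluster spans both sets."""
    clusters = sorted(set(cluster_ids))
    rng = random.Random(seed)
    rng.shuffle(clusters)
    target = test_fraction * len(cluster_ids)
    test_clusters, test_size = set(), 0
    # Move whole clusters to the test set until it is large enough.
    for cluster in clusters:
        if test_size >= target:
            break
        test_clusters.add(cluster)
        test_size += cluster_ids.count(cluster)
    train_idx = [i for i, c in enumerate(cluster_ids) if c not in test_clusters]
    test_idx = [i for i, c in enumerate(cluster_ids) if c in test_clusters]
    return train_idx, test_idx


# One made-up cluster label per data point.
cluster_ids = ["a", "a", "b", "b", "b", "c", "d", "d"]
train_idx, test_idx = split_by_cluster(cluster_ids)
```

Because clusters are never divided, the resulting test fraction is only approximate when clusters differ in size.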

Papyrus-scripts v1.0.2

16 May 12:20
  • Made the download disclaimer and the errors due to low disk space more evident
  • papyrus_scripts.utils.IO.process_data_version now raises an exception stating: Papyrus data not available (did you download it first?)
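
The fail-early pattern behind the second item can be sketched as below. This is not the code of process_data_version: the helper name, the directory layout, the version string and the choice of FileNotFoundError are all assumptions; only the message text comes from the release note.

```python
import tempfile
from pathlib import Path


def require_data_dir(root, version):
    """Return the data directory for `version`, or fail with a clear message.

    Hypothetical helper: the <root>/<version> layout and the exception
    type are assumptions made for this sketch.
    """
    data_dir = Path(root) / version
    if not data_dir.is_dir():
        raise FileNotFoundError(
            "Papyrus data not available (did you download it first?)"
        )
    return data_dir


with tempfile.TemporaryDirectory() as root:
    # Simulate one downloaded version (placeholder name).
    (Path(root) / "example-version").mkdir()
    found = require_data_dir(root, "example-version").name
    try:
        require_data_dir(root, "missing-version")
        message = None
    except FileNotFoundError as error:
        message = str(error)
```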

Papyrus-scripts v1.0.1

07 Apr 15:05

The Papyrus++ datasets contained duplicated data wrongly associated with multiple assay types (i.e. Ki, KD, EC50, IC50).

The datasets have been updated, and the links in this release and in the db-links branch have been updated accordingly.

Papyrus-scripts v1.0

25 Aug 13:47

Version 1.0 of the Papyrus-scripts library.

Allows one to:

  • download the Papyrus dataset
  • convert it between XZ and GZIP compression
  • match the data to structures of the Protein Data Bank
  • create FPSubSim2 (extension of FPSim2) files for similarity and substructure searches
  • filter the Papyrus data
  • model it with QSAR and PCM models
  • remove the data files
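
As an illustration of the XZ/GZIP conversion listed above, the following sketch recompresses a file with the Python standard library only. It does not use the Papyrus-scripts API; the function names, file names and payload are hypothetical.

```python
import gzip
import lzma
import shutil
import tempfile
from pathlib import Path


def xz_to_gzip(src, dst):
    """Stream-decompress an XZ file and recompress it as GZIP."""
    with lzma.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def gzip_to_xz(src, dst):
    """Stream-decompress a GZIP file and recompress it as XZ."""
    with gzip.open(src, "rb") as fin, lzma.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


# Made-up tab-separated payload standing in for a data file.
payload = b"id\tvalue\n1\t6.5\n"

with tempfile.TemporaryDirectory() as tmp:
    xz_path = Path(tmp) / "data.tsv.xz"
    gz_path = Path(tmp) / "data.tsv.gz"
    back_path = Path(tmp) / "data_back.tsv.xz"
    with lzma.open(xz_path, "wb") as handle:
        handle.write(payload)
    xz_to_gzip(xz_path, gz_path)
    gzip_to_xz(gz_path, back_path)
    with gzip.open(gz_path, "rb") as handle:
        via_gzip = handle.read()
    with lzma.open(back_path, "rb") as handle:
        round_trip = handle.read()
```

Copying through file objects keeps memory use low, which matters for large compressed data files.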