Releases: opentargets/OnToma

v1.1.2: Changed pandas and python versions

07 Aug 15:36
19ae5be

Full Changelog: v1.1.1...v1.1.2

v1.1.1: build: fix dependencies versions

31 Jul 17:12
cc65c27

Full Changelog: v1.1.0...v1.1.1

v1.1.0: Version aware EFO caching

31 Jan 13:02

The primary change is version-aware EFO caching, contributed by @DSuveges in #24. It fixes a bug where the OnToma cache was not refreshed even when the requested EFO version changed.

In addition, OnToma no longer depends on the retry library (and, through it, on the legacy py library), so using OnToma no longer triggers security advisories.
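
To illustrate the idea behind version-aware caching, here is a minimal sketch. It is not OnToma's actual implementation: the function name, the cache file name, and the JSON layout are all hypothetical. The point is only that the cache records which EFO version it was built from, and is rebuilt when a different version is requested.

```python
import json
import tempfile
from pathlib import Path


def load_with_version_check(cache_dir, requested_version, build):
    """Return cached data if it was built for requested_version, else rebuild.

    `build` is a caller-supplied function that produces the data for a version.
    All names here are hypothetical and chosen for illustration only.
    """
    cache_dir = Path(cache_dir)
    cache_file = cache_dir / "efo_cache.json"  # hypothetical file name
    if cache_file.exists():
        cached = json.loads(cache_file.read_text())
        if cached.get("version") == requested_version:
            return cached["data"]  # cache was built for the same version
    # Cache is missing or was built for another version: rebuild and overwrite.
    data = build(requested_version)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps({"version": requested_version, "data": data}))
    return data


# Usage: the second call reuses the cache, the third one triggers a rebuild.
build_calls = []


def fake_build(version):
    build_calls.append(version)
    return {"source_version": version}


with tempfile.TemporaryDirectory() as tmp:
    load_with_version_check(tmp, "v3.31.0", fake_build)
    load_with_version_check(tmp, "v3.31.0", fake_build)
    load_with_version_check(tmp, "v3.36.0", fake_build)

print(build_calls)
```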

Full Changelog: v1.0.3...v1.1.0

v1.0.3: Minor technical improvements

21 Nov 12:48
e78ece6

What's Changed

  • Minor improvements from issue 1808 by @tskir in #23
    • Add --version flag to print version and exit
    • Sort URI_MAPPING and add entry for CHEBI
    • Expand documentation on local installation and testing
    • Update tests to reflect changes in EFO

Full Changelog: v1.0.2...v1.0.3

v1.0.2

25 Jan 15:34
502f5f3

The results of all manual curation efforts within Open Targets were recently consolidated into a single repository. Because OnToma uses these files, its references to them have been updated accordingly.

Full Changelog: v1.0.1...v1.0.2

v1.0.1: Unpin EFO version

02 Nov 07:10

Previously, the EFO version was pinned to v3.31.0 because later releases were missing the efo_otar_slim.owl file, which is essential for OnToma operation (see EBISPOT/efo#1180). This is now resolved, so the latest available EFO version will again be used from now on.

Full Changelog: v1.0.0...v1.0.1

v1.0.0: OnToma rewrite

29 Jul 08:10
0a8e2c2

OnToma has been rewritten with a focus on simplicity and mapping reliability. As a new major version, this release introduces breaking changes to the CLI and Python interfaces, as well as major updates to the processing logic. Most importantly, the mapping results can be expected to change substantially.

Please read these release notes carefully before you consider upgrading. Bug reports and feedback on this release are especially appreciated; please send them to data@opentargets.org.

Mapping approach changes

OnToma has two operation modes, which are now clearly separated based on input type. For ontology input (e.g. OMIM:102900), OnToma attempts the following steps to map to EFO:

  1. Exact identifier match from EFO;
  2. Match terms by cross-references (hasDbXref);
  3. Mapping from the manual cross-reference database;
  4. Request through OxO with a distance of 2.

For string input (e.g. asthma), the following steps are attempted:

  1. Exact name match from EFO;
  2. Exact synonym (hasExactSynonym);
  3. Mapping from the manual string-to-ontology database;
  4. High confidence mapping from ZOOMA with default parameters.
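
The two step lists above can be read as a fallback cascade. The sketch below assumes that the steps are tried in order and that the first step producing any hits ends the search, returning all of its hits. That reading, the function name, and the toy lookup tables (including the synonym entry) are assumptions for illustration, not OnToma's actual code or data.

```python
from typing import Callable, List, Sequence

# A step takes a query and returns a (possibly empty) list of EFO identifiers.
Step = Callable[[str], List[str]]


def run_cascade(query: str, steps: Sequence[Step]) -> List[str]:
    """Try each step in order; return all hits from the first step that finds any."""
    for step in steps:
        hits = step(query)
        if hits:
            return hits
    return []


# Toy stand-ins for the real lookups (hypothetical data, for illustration only).
NAMES = {"asthma": ["EFO_0000270"]}
SYNONYMS = {"bronchial asthma": ["EFO_0000270"]}

string_steps = [
    lambda q: NAMES.get(q, []),     # 1. exact name match
    lambda q: SYNONYMS.get(q, []),  # 2. exact synonym match
    lambda q: [],                   # 3. manual mapping database (empty in this sketch)
    lambda q: [],                   # 4. ZOOMA lookup (stubbed out in this sketch)
]

print(run_cascade("asthma", string_steps))            # matched at step 1
print(run_cascade("bronchial asthma", string_steps))  # matched at step 2
print(run_cascade("unknown term", string_steps))      # no step matched
```

Ordering the steps from most to least reliable means that a weaker source is consulted only when every stronger one has failed.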

Expected changes in the mapping results

All of the approaches listed in the previous section generate mappings which we consider to be of high quality, and they can be used in automated workflows straight out of OnToma. However, this is achieved at a cost of removing some low confidence approaches, such as fuzzy OLS lookup.

Our preliminary benchmarks, comparing the previous OnToma version (v0.0.18) to this release (v1.0.0), demonstrated the following approximate pattern:

  • Sensitivity (the percentage of valid input mappings which are discovered) dropped from 96% to 61%.
  • At the same time, precision (the percentage of the mappings in OnToma output which are actually correct) rose from 75% to 97%.

Hence, after upgrading, a significant drop in the number of results is expected; however, the remaining results will be of significantly higher quality, which we believe is much more important in nearly all applications. We intend to work on increasing sensitivity in further releases.
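
To make the trade-off concrete, the benchmark percentages above can be turned into expected counts. The sketch below assumes a hypothetical batch of 100 inputs that each have a valid mapping, and assumes that sensitivity counts only correctly discovered mappings. The batch size and the function name are illustrative and not part of the benchmark.

```python
def expected_counts(n_valid, sensitivity, precision):
    """Return (correct, incorrect) mapping counts implied by the two metrics."""
    correct = sensitivity * n_valid     # valid mappings that are discovered
    total_output = correct / precision  # precision = correct / total_output
    incorrect = total_output - correct
    return correct, incorrect


# Percentages are taken from the benchmark above; 100 inputs is a hypothetical batch.
for label, sens, prec in [("v0.0.18", 0.96, 0.75), ("v1.0.0", 0.61, 0.97)]:
    correct, incorrect = expected_counts(100, sens, prec)
    print(f"{label}: {correct:.0f} correct, {incorrect:.0f} incorrect")
```

Under these assumptions, the number of incorrect mappings per 100 valid inputs falls from 32 to about 2, while the number of correct ones falls from 96 to 61.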

Other operation changes

The CLI and Python interfaces have been simplified. The verbose and suggest flags have been removed (they might be reimplemented in a more consistent way in future releases).

Importantly, where multiple EFO terms match equally well within a single processing step, OnToma now returns all of them as hits for the query. (Previously, only one hit was selected, more or less arbitrarily.)

Each OnToma result consists of multiple fields.

  • In the Python API, they are accessed as attributes of the result object: for a result returned by OnToma().find_term('asthma'), id_ot_schema will contain EFO_0000270.
  • In the CLI, the list of fields to output can be configured via the --columns flag.

Manually curated mapping sources

Two central resources are currently being set up to store all manually curated ontology-to-EFO mappings (step 3 for ontology input) and string-to-EFO mappings (step 3 for string input). External OnToma users are encouraged to contribute to these resources as well. (More information about that will come in future releases.)

Changes to ontology handling

A new module, ontoma.ontology, was implemented to facilitate conversion between different ways to represent ontology identifiers. For example, ORDO_140162, ORPHA:140162, Orphanet:140162, and http://www.orpha.net/ORDO/Orphanet_140162 all represent the same term. The module implements an algorithm which converts all possible representations into the stable internal normalised representation to make direct comparisons possible.
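
As an illustration of the kind of conversion described above, here is a minimal sketch. It is not the ontoma.ontology algorithm: the function names, the alias table, and the choice of 'Orphanet:140162' as the internal form are all assumptions made for this example.

```python
import re

# Hypothetical alias table: each known prefix spelling maps to one canonical prefix.
PREFIX_ALIASES = {"ORDO": "Orphanet", "ORPHA": "Orphanet", "ORPHANET": "Orphanet"}


def normalise(identifier: str) -> str:
    """Convert a CURIE-like identifier or a URI to the form 'Prefix:local_id'."""
    # For URIs, keep only the last path segment, e.g. 'Orphanet_140162'.
    tail = identifier.rstrip("/").rsplit("/", 1)[-1]
    match = re.fullmatch(r"([A-Za-z]+)[_:](\w+)", tail)
    if match is None:
        raise ValueError(f"Unrecognised identifier: {identifier!r}")
    prefix, local_id = match.groups()
    canonical = PREFIX_ALIASES.get(prefix.upper(), prefix)
    return f"{canonical}:{local_id}"


def to_underscore_form(normalised: str) -> str:
    """Render a normalised identifier in the underscore style, e.g. 'Orphanet_140162'."""
    return normalised.replace(":", "_", 1)


variants = [
    "ORDO_140162",
    "ORPHA:140162",
    "Orphanet:140162",
    "http://www.orpha.net/ORDO/Orphanet_140162",
]
# All four spellings collapse to a single normalised value.
print({normalise(v) for v in variants})
print(to_underscore_form(normalise(variants[0])))
```

Once every identifier is reduced to one form, comparing two terms becomes a plain string equality check.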

The output of OnToma always follows the format specified in the Open Targets JSON schema, for example, Orphanet_140162. This means that you can plug in the output of OnToma directly into the evidence strings.

EFO OT slim is now loaded and parsed more consistently from the OWL file. There is a new option to cache this data to speed up OnToma initialisation in subsequent runs.

Additionally, you can now specify a particular EFO version to use. The version which is used by default in this release is pinned to v3.31.0.

Technical changes

The documentation has been migrated to ReadTheDocs and rewritten. RST build and configuration files have been updated and simplified.

Python 3.7+ is now required and consistently used throughout the code base. Installation has been simplified and now uses plain pip. The tests and CircleCI configuration have been updated to reflect all of the changes.