description
Learn about the new target prioritisation view released in 23.12

Target Prioritisation view

The new target prioritisation page can be accessed by clicking the Target prioritisation factors tab on the target-disease Associations on the Fly page when searching for targets associated with a disease or phenotype.

The view displays target-specific properties in a disease-agnostic way. These properties have been aggregated into four main sections (Precedence, Tractability, Doability and Safety) and individually scored by the Open Targets team.

A "traffic light" system has been designed to visually inform on target prioritisation, with the aim to facilitate target recommendations. Using a colour scale, green indicates potentially positive attributes and red indicates potentially negative attributes, providing information to help users assess the targets for further prioritisation or deprioritisation, respectively.

Precedence section

Target in clinic

Definition: Gene is targeted by available drugs in any clinical phase for any indication.

Source of Data: Platform Known Drugs widget (ChEMBL)

Scoring: Maximum clinical trial phase the target has been reported for, independently of the disease. Phases range from 0 to IV (corresponding to values of 0, 0.25, 0.5, 0.75 and 1 in the tool scores).
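The phase-to-score mapping described above can be sketched as a simple lookup. This is a minimal illustration, not the Platform's implementation: the function name, the integer encoding of phases (0 to 4 standing for phases 0 to IV) and the handling of targets with no reported phase (returned as `None`) are assumptions.

```python
from typing import Iterable, Optional

# Assumed encoding: integers 0-4 stand for clinical phases 0, I, II, III and IV.
PHASE_SCORES = {0: 0.0, 1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}


def target_in_clinic_score(phases: Iterable[int]) -> Optional[float]:
    """Score a target by the maximum clinical phase reported for it, across all diseases."""
    phases = list(phases)
    if not phases:
        # Assumption: no reported phase is treated as missing information.
        return None
    return PHASE_SCORES[max(phases)]


print(target_in_clinic_score([1, 3, 2]))
```

Only the highest phase matters here, so a target with drugs in phases I, II and III gets the phase III value.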

Tractability section

Membrane Protein

Definition: Target is annotated to be located in the cell membrane.

Source of Data: Platform Subcellular location widget [HPA (Human Protein Atlas) and UniProt]

Scoring:

  • 1 = Protein target is located (at least) in the cell or plasma membrane.
  • 0 = Protein target is not located in the cell membrane but some location information is accessible.
  • NA = No location information available.

Secreted protein

Definition: Target is secreted or predicted to be secreted.

Source of Data: Platform Subcellular location widget [HPA (Human Protein Atlas) and UniProt]

Scoring:

  • 1 = Protein target is (at least) secreted or predicted to be secreted.
  • 0 = Not secreted but with location information.
  • NA = No location information available.

{% hint style="info" %} Note: When contradictions between HPA (Human Protein Atlas) and UniProt exist (i.e. target is secreted according to HPA but in membrane according to UniProt), the information from HPA is taken. {% endhint %}
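One way to read the membrane and secreted scoring together with the precedence note is sketched below. It assumes that HPA annotations are used whenever they exist and that UniProt is only a fallback; the function name and the simplified location labels (`"membrane"`, `"secreted"`) are hypothetical and do not reflect the actual annotation vocabulary of either resource.

```python
from typing import Dict, Optional, Set


def location_scores(
    hpa: Optional[Set[str]], uniprot: Optional[Set[str]]
) -> Dict[str, Optional[int]]:
    """Return membrane and secreted scores (1, 0 or None for NA)."""
    # Assumption: HPA wins whenever it has any annotation; UniProt is a fallback.
    locations = hpa if hpa else uniprot
    if not locations:
        # No location information from either source: both scores are NA.
        return {"membrane": None, "secreted": None}
    return {
        "membrane": 1 if "membrane" in locations else 0,
        "secreted": 1 if "secreted" in locations else 0,
    }


# HPA says secreted, UniProt says membrane: the HPA annotation is taken.
print(location_scores({"secreted"}, {"membrane"}))
```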

Ligand binder

Definition: Target binds at least one High-Quality Ligand according to ChEMBL tractability bucket.

Source of Data: Platform tractability widget (Open Targets tractability)

Scoring:

  • 1 = Target has a high-quality ligand reported.
  • 0 = Target does not have high-quality ligand reported.
  • NA = No information available.

Small molecule binder

Definition: Target has been co-crystallised with a small molecule, reported in the Protein Data Bank.

Source of Data: Platform tractability widget (Open Targets tractability)

Scoring:

  • 1 = Target has a small molecule reported.
  • 0 = Target does not have a small molecule reported.
  • NA = No information available.

Predicted pockets

Definition: Target has a DrugEBIlity score equal to or above 0.7, which is predictive of harbouring a high-quality pocket.

Source of Data: Platform tractability widget (Open Targets tractability)

Scoring:

  • 1 = Target contains a high-quality predicted pocket.
  • 0 = Target does not have a high-quality predicted pocket.
  • NA = No information available.
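The binary features in this section share the same 1 / 0 / NA pattern. As an illustration, the predicted-pocket rule can be written as a threshold check on the DrugEBIlity score. The function and constant names are hypothetical, and representing NA as `None` is an assumption.

```python
from typing import Optional

# Threshold taken from the definition above.
POCKET_THRESHOLD = 0.7


def predicted_pocket_score(drugebility: Optional[float]) -> Optional[int]:
    """Return 1 for a high-quality predicted pocket, 0 otherwise, None for NA."""
    if drugebility is None:
        return None  # no information available
    return 1 if drugebility >= POCKET_THRESHOLD else 0


print(predicted_pocket_score(0.82), predicted_pocket_score(0.4), predicted_pocket_score(None))
```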

Doability section

This section covers models, tools and/or reagents that allow a given target to be assessed and explored in preclinical settings.

Mouse ortholog identity

Definition: Mouse orthologs maximum identity percentage. A mouse harbouring an ortholog for the target of interest could be useful for in vivo assaying.

Source of Data: Platform comparative genomics widget (Ensembl Compara)

Scoring:

Targets with at least one mouse ortholog sharing at least 80% identity with the target are scored linearly from 0 to 1.

  • 1 = There is at least one gene in mice that contains a sequence with 100% identity with the target.
  • 0 = There are no genes in mice containing a sequence with at least 80% identity with the target.
  • NA = No ortholog information.

{% hint style="info" %} Note: Here we consider mouse orthologs and display the "query percentage identity" (percentage of the human target sequence that matches to the mouse gene) when there is an 80% identity or more. In the cases of targets with more than one ortholog, we take the one with the maximum query % ID. {% endhint %}
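The linear scoring can be sketched as follows. The text fixes the end points (0 below 80% identity, 1 at 100%) but not the exact rescaling in between, so this sketch assumes the scale runs from 0 at the 80% threshold to 1 at 100%. The function name and the use of `None` for NA are also assumptions.

```python
from typing import Optional, Sequence


def mouse_ortholog_score(
    query_identities: Optional[Sequence[float]], threshold: float = 80.0
) -> Optional[float]:
    """Score the best mouse ortholog by its query percentage identity (0-100)."""
    if not query_identities:
        return None  # no ortholog information
    best = max(query_identities)  # the ortholog with the maximum query % ID is used
    if best < threshold:
        return 0.0
    # Assumed rescaling: threshold maps to 0, 100% identity maps to 1.
    return (best - threshold) / (100.0 - threshold)


print(mouse_ortholog_score([85.0, 90.0]))
```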

Chemical probes

Definition: Target has high quality chemical probes.

Chemical probes are small molecules acting as chemical modulators, binding reversibly to the target.

Source of Data: Platform Chemical probes widget (Probes & Drugs)

Scoring:

  • 1 = Target has high-quality chemical probes.
  • 0 = Target does not have high-quality chemical probes.
  • NA = No information available.

Safety section

Genetic constraint

Definition: Genes that are important for human physiology are seen to be depleted of deleterious variants. The Genome Aggregation Database (gnomAD) has developed a continuous measurement of intolerance to loss of function (LoF) variants per gene, based on observed/expected LoF variant analysis. As recommended by gnomAD and implemented in the Open Targets platform, the rank of genes regarding their loss-of-function observed/expected upper bound fraction (LOEUF) metric is used (LOEUF score).

Source of Data: Platform genetic constraint widget (gnomAD)

Scoring:

A score from -1 to 1 is given to genes depending on their LOEUF metric rank, with -1 being the least tolerant to LoF variation and 1 the most tolerant.
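The text does not state how the rank is converted into the -1 to 1 range. One plausible reading, shown purely as an assumption, is a linear rescaling of the rank in which rank 1 is the least tolerant gene and the last rank is the most tolerant. The function name and arguments are hypothetical.

```python
def constraint_score(rank: int, n_genes: int) -> float:
    """Linearly rescale a LOEUF rank onto [-1, 1].

    Assumption: rank 1 is the gene least tolerant to LoF variation and
    rank n_genes is the most tolerant.
    """
    if n_genes < 2 or not 1 <= rank <= n_genes:
        raise ValueError("rank must be between 1 and n_genes, with n_genes >= 2")
    return 2.0 * (rank - 1) / (n_genes - 1) - 1.0


print(constraint_score(1, 101), constraint_score(51, 101), constraint_score(101, 101))
```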

Mouse models

Definition: The international database Mouse Genome Informatics (MGI) contains information about phenotypes reported when a gene is knocked out in this animal model. These phenotypes are categorised into multiple phenotype classes, using an organ/system classification. We retrieve this information (available in our platform) and score the phenotype classes according to their severity (from 0 to -1). After aggregating all phenotypes with their scores according to the phenotype class they belong to, we use the harmonic sum to build a continuous score, which is normalised from 0 to -1.

Source of Data: Platform mouse phenotypes widget (Mouse Phenotypes, fed from MGI, a reference database for mouse knockouts)

Scoring:

  • Below 0 to -1 = Multiple and severe phenotypes were reported when the target was knocked out in mice, with a score higher than the first quartile.
  • 0 = Either the target has non-severe phenotypes reported or it is in the first quartile of the normalised score.
  • NA = No information available.

{% hint style="info" %} Note: Below you can find how we scored the mouse phenotype classes (-1 being the "most severe" and 0 "non-relevant" phenotypes). {% endhint %}

| id | Phenotype Class | score |
| --- | --- | --- |
| MP:0005370 | liver/biliary system phenotype | -1 |
| MP:0005385 | cardiovascular system phenotype | -1 |
| MP:0010768 | mortality/aging | -1 |
| MP:0003631 | nervous system phenotype | -0.75 |
| MP:0005388 | respiratory system phenotype | -0.75 |
| MP:0005367 | renal/urinary system phenotype | -0.75 |
| MP:0005376 | homeostasis/metabolism phenotype | -0.75 |
| MP:0005386 | behavior/neurological phenotype | -0.75 |
| MP:0005381 | digestive/alimentary phenotype | -0.5 |
| MP:0005379 | endocrine/exocrine gland phenotype | -0.5 |
| MP:0005382 | craniofacial phenotype | -0.5 |
| MP:0005377 | hearing/vestibular/ear phenotype | -0.5 |
| MP:0005384 | cellular phenotype | -0.5 |
| MP:0005380 | embryo phenotype | -0.5 |
| MP:0005394 | taste/olfaction phenotype | -0.5 |
| MP:0002006 | neoplasm | -0.5 |
| MP:0005375 | adipose tissue phenotype | -0.5 |
| MP:0005389 | reproductive system phenotype | -0.5 |
| MP:0005397 | hematopoietic system phenotype | -0.5 |
| MP:0005387 | immune system phenotype | -0.5 |
| MP:0005391 | vision/eye phenotype | -0.5 |
| MP:0005390 | skeleton phenotype | -0.5 |
| MP:0005369 | muscle phenotype | -0.25 |
| MP:0001186 | pigmentation phenotype | -0.25 |
| MP:0005378 | growth/size/body region phenotype | -0.25 |
| MP:0005371 | limbs/digits/tail phenotype | -0.25 |
| MP:0010771 | integument phenotype | -0.25 |
| MP:0002873 | normal phenotype | 0 |
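To illustrate how class scores such as those in the table above could be combined, the sketch below applies a harmonic sum in which the i-th strongest score is divided by i squared. This weighting is an assumption, and the final normalisation to the 0 to -1 range and the first-quartile cut-off are not reproduced here because their exact form is not given in this page. The function name is hypothetical.

```python
from typing import Iterable


def harmonic_sum(class_scores: Iterable[float]) -> float:
    """Combine severity scores so that the most severe ones dominate.

    Assumed form: sort by magnitude (most severe first) and divide the
    i-th score by i**2 before summing.
    """
    ordered = sorted(class_scores, key=abs, reverse=True)
    return sum(score / position**2 for position, score in enumerate(ordered, start=1))


# Example: cardiovascular (-1), nervous system (-0.75) and muscle (-0.25) phenotypes.
raw = harmonic_sum([-0.25, -1.0, -0.75])
print(round(raw, 4))
```

With this weighting, additional mild phenotypes add little to the total, while a single severe phenotype already contributes most of the score.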

Gene essentiality

Definition: The second-generation map of cancer dependencies (Pacini et al., 2024) increased the number of cancer cell lines analysed (930 genome-wide CRISPR-Cas9 knock-out screens, targeting almost 18,000 genes), spanning 27 cancer types and curated patient genomic data, to identify cancer-type-specific and pan-cancer gene dependencies integrated with multi-omic markers.

Candidate anti-cancer therapeutic targets were characterised using prioritisation criteria based on:

- Fitness Score. Strength of the effect on cellular fitness upon target depletion.

- Presence of dependency marker.

- Evidence linking the dependency and marker.

After applying a priority score based on approved drug targets, the authors nominated 370 targets for 27 cancer types; 302 were cancer-type specific, while 196 were pan-cancer. This list of genes is the one used to label a target for gene essentiality.

Source of Data: A comprehensive clinically informed map of dependencies in cancer cells and framework for target prioritization. Pacini et al., 2024, Cancer Cell 42, 301–316. Supplementary Table 6. Gene essentiality widget (Cancer DepMap).

Scoring:

  • -1 = Target reported as essential.
  • 0 = Target not reported as essential.
  • NA = No information available.

Known safety events

Definition: Target is associated with curated adverse events.

Source of Data: Safety liability data from Platform safety widget (Open Targets Safety) and Open Targets downstream analysis of toxicity datasets from PharmGKB.

Scoring:

  • -1 = The target has at least one adverse event.
  • NA = No information available.

Cancer driver gene

Definition: Target is classified as an oncogene and/or tumour suppressor gene.

Source of Data: Platform Cancer Hallmarks widget (COSMIC)

Scoring:

We use the attribute information from the cancer hallmarks in the target profile. Targets considered "cancer driver genes" are those flagged as tumour suppressor, oncogene, or both.

  • -1 = Target is catalogued as driver gene (tumour suppressor, oncogene or both).
  • NA = No information available.

Paralogues

Definition: Paralogue maximum identity percentage.

Source of Data: Platform comparative genomics widget (Ensembl Compara)

Scoring:

  • Below 0 to -1 = Targets with at least one human paralogue sharing at least 60% identity with the target, scored linearly.
  • 0 = Targets whose paralogues all share less than 60% identity with the target.
  • NA = No information available about paralogues for that target.
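The paralogue scoring above mirrors the ortholog case with the sign reversed, since a close paralogue is treated as a potential liability. The sketch below assumes the linear scale runs from 0 at the 60% threshold to -1 at 100% identity, which the text does not state explicitly. The function name and the use of `None` for NA are assumptions.

```python
from typing import Optional, Sequence


def paralogue_score(
    identities: Optional[Sequence[float]], threshold: float = 60.0
) -> Optional[float]:
    """Score the closest human paralogue by its percentage identity (0-100)."""
    if not identities:
        return None  # no paralogue information for the target
    best = max(identities)
    if best < threshold:
        return 0.0
    # Assumed rescaling: threshold maps to 0, 100% identity maps to -1.
    return -(best - threshold) / (100.0 - threshold)


print(paralogue_score([45.0, 80.0]))
```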

Tissue specificity

Definition: HPA assessment on tissue-specific target expression.

Source of Data: Platform baseline expression widget (Expression Atlas, HPA and GTEx). We used the assessment for every target from the RNA expression data from the public version of Human Protein Atlas (proteinAtlasTissue).

Scoring:

| Tissue specificity HPA assessment | Score |
| --- | --- |
| Tissue enriched: >=4 fold higher mRNA in a given tissue compared to any other | 1 |
| Group enriched: >=4 fold higher average mRNA in 2-5 tissues compared to any other | 0.75 |
| Tissue enhanced: >=4 fold higher mRNA level in a given tissue compared to average of all other tissues | 0.5 |
| Low tissue specificity | -1 |
| Not detected | NA |

Tissue distribution

Definition: HPA assessment on any detectable baseline expression for the target across tissues.

Source of Data: Platform baseline expression widget (Expression Atlas, HPA and GTEx). We used the assessment for every target from the RNA expression data from the public version of Human Protein Atlas.

Scoring:

| Tissue distribution HPA assessment | Score |
| --- | --- |
| Detected in single: detected in a single tissue | 1 |
| Detected in some: detected in more than one but less than 1/3 of tissues | 0.5 |
| Detected in many: detected in at least 1/3 but not all tissues | 0 |
| Detected in all | -1 |
| Not detected | NA |
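The category definitions above can be illustrated with a small function that derives the score from tissue counts. HPA supplies the category directly, so this is only a sketch of the definitions. The function name and its arguments are hypothetical, and it assumes at least two tissues were profiled.

```python
from typing import Optional


def tissue_distribution_score(n_detected: int, n_tissues: int) -> Optional[float]:
    """Map the number of tissues with detectable expression to a score."""
    if not 0 <= n_detected <= n_tissues:
        raise ValueError("n_detected must be between 0 and n_tissues")
    if n_detected == 0:
        return None  # Not detected -> NA
    if n_detected == n_tissues:
        return -1.0  # Detected in all
    if n_detected == 1:
        return 1.0  # Detected in single
    if 3 * n_detected < n_tissues:
        return 0.5  # Detected in some: fewer than 1/3 of tissues
    return 0.0  # Detected in many: at least 1/3 but not all


print([tissue_distribution_score(n, 30) for n in (0, 1, 5, 10, 30)])
```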