Optimal query for literature references linking targets and ligands #94

eric-czech · 2023-06-13T01:37:22Z

I would like to know what publications associate targets and ligands such that the publications explicitly note some interaction/relationship between the pair (not just the target or just the ligand). The query in #93 seemed like a reasonable place to start. Is there a better way to do this?

I would also like to run this query infrequently (monthly or quarterly at most) and with no filter, i.e. I'd like to capture all ligand <-> target relationships with citations.

Any suggestions on the best way to accomplish this would be appreciated. Thanks!

KeithKelleher · 2023-06-13T14:01:08Z

That query looks good for fetching the publications that we have on record for each known target-ligand interaction. There are a couple of things to add.

add a field alias for drugs: there's an issue open to fix this, but unless you tell the API that you want drugs AND ligands (i.e. approved and unapproved compounds), it will only give you back the ligands
add a ligandCounts field: a sanity check that the numbers of drugs and ligands you're getting back are consistent

ligandCounts {
  name
  value
}
ligands(isdrug: false) {
  ligid
  name
  description
  isdrug
  activities {
    pubs {
      pmid
      title
      year
    }
  }
}
drugs: ligands(isdrug: true) {
  ligid
  name
  description
  isdrug
  activities {
    pubs {
      pmid
      title
      year
    }
  }
}

If you want to run this query for all targets, you'll probably have to paginate the results, or else it will be slow and return a very large response. It seems you're doing that already, so that's good.
One optimization would be to filter your target list to Tchem and Tclin targets, since having a known chemical interaction is the main criterion for a target to no longer be considered Tdark or Tbio.

"filter": {
  "facets": [
    {
      "facet": "Target Development Level",
      "values": ["Tclin", "Tchem"]
    }
  ]
}
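
For illustration, here is a minimal Python sketch of how that filter could be bundled with paging variables into a GraphQL request body. The variable names (offset, limit, filter) and the helper names are assumptions; how the filter variable is declared and wired into the query itself is not shown in this thread, so treat this as a starting point rather than the exact API contract.

```python
import json

def build_variables(offset, limit, levels=("Tclin", "Tchem")):
    """Variables for one page of targets, restricted by development level.

    The variable names are assumptions; the facet name and values are the
    ones quoted in the filter above.
    """
    return {
        "offset": offset,
        "limit": limit,
        "filter": {
            "facets": [
                {"facet": "Target Development Level", "values": list(levels)}
            ]
        },
    }

def build_payload(query, offset, limit):
    """Serialize a query string and its variables as a JSON request body."""
    return json.dumps({"query": query, "variables": build_variables(offset, limit)})
```

Something like build_payload(query_text, 0, 100) would then give the body for the first page of 100 targets.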

The other thing to consider is that the data in TCRD (and subsequently Pharos) is a subset of ligand activities that come primarily from DrugCentral and ChEMBL, where activities below a threshold are not included.
Here is the blurb on Pharos about the criteria to be included:

Activity Thresholds: Activity values from DrugCentral and ChEMBL must be standardizable to -Log Molar units AND meet the following target-family-specific cutoffs:
GPCRs: <= 100nM
Kinases: <= 30nM
Ion Channels: <= 10μM
Non-IDG Family Targets: <= 1μM
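
To make the cutoffs concrete, here is a rough Python sketch that converts them to -Log Molar units and checks a single activity value against them. The family labels used as dictionary keys and the function names are hypothetical; only the numeric cutoffs come from the blurb above, and this is not how Pharos itself implements the check.

```python
import math

# Cutoffs quoted above, expressed in molar units.
# The family labels used as keys are hypothetical.
CUTOFFS_MOLAR = {
    "GPCR": 100e-9,       # <= 100 nM
    "Kinase": 30e-9,      # <= 30 nM
    "Ion Channel": 10e-6, # <= 10 uM
}
DEFAULT_CUTOFF_MOLAR = 1e-6  # Non-IDG family targets: <= 1 uM

def to_neg_log_molar(conc_molar):
    """Convert a molar concentration to -log10 units (e.g. 1 uM -> 6)."""
    return -math.log10(conc_molar)

def passes_threshold(family, activity_neg_log_molar):
    """True if the activity is at least as potent as the family cutoff.

    A lower concentration means a higher -Log Molar value, so the
    comparison is >= on the log scale (with a small float tolerance).
    """
    cutoff = CUTOFFS_MOLAR.get(family, DEFAULT_CUTOFF_MOLAR)
    return activity_neg_log_molar >= to_neg_log_molar(cutoff) - 1e-9
```

On the log scale the cutoffs work out to roughly 7 for GPCRs, 7.52 for kinases, 5 for ion channels and 6 for everything else.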

If you want data outside those criteria, you'd probably want to get data straight from ChEMBL and DrugCentral.

eric-czech · 2023-06-13T14:09:35Z

Thanks again @KeithKelleher, that's extremely helpful! We'll try those improvements and you can close this if you'd like, otherwise I'll leave it open and report back for the sake of posterity (or if any other questions come up).

KeithKelleher · 2023-06-13T14:25:43Z

Glad to help. Yes, let us know how it goes, and if there's anything else.

Rahkovsky · 2023-06-26T23:00:28Z

@KeithKelleher, thank you very much for your advice. We have run the following query looping over the offset and limit values.

query

query ($offset: Int!, $limit: Int!) {
  targets {
    targets(skip: $offset, top: $limit) {
      name
      sym
      uniprot
      facetValues(facetName: "Target Development Level")
      ligandCounts {
        name
        value
      }
      nonDrugLigands: ligands(isdrug: false, top:10000) {
        ligid
        name
        description
        isdrug
        activities {
          pubs {
            pmid
            title
            year
          }
        }
      }
      DrugLigands: ligands(isdrug: true, top:10000) {
        ligid
        name
        description
        isdrug
        activities {
          pubs {
            pmid
            title
            year
          }
        }
      }
    }
  }
}
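
For anyone following along, the offset/limit loop could look roughly like the Python sketch below. The helper names (iter_targets, make_http_fetcher) are hypothetical, the endpoint URL is left for the caller to supply, and the stopping rule (stop on an empty or short page) is an assumption about how the API behaves at the end of the list. The response path data.targets.targets simply mirrors the shape of the query above.

```python
import json
import urllib.request

def iter_targets(fetch_page, limit=100):
    """Yield targets page by page until an empty or short page is returned.

    fetch_page(offset, limit) must return a list of target dicts.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            break
        yield from page
        if len(page) < limit:
            break  # assumed to be the last page
        offset += limit

def make_http_fetcher(url, query):
    """Build a fetch_page function that POSTs the query to a GraphQL endpoint."""
    def fetch_page(offset, limit):
        body = json.dumps(
            {"query": query, "variables": {"offset": offset, "limit": limit}}
        ).encode("utf-8")
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)
        # Path follows the nesting of the query: targets { targets(...) { ... } }
        return payload["data"]["targets"]["targets"]
    return fetch_page
```

Keeping the paging logic separate from the HTTP call makes it easy to test the loop with a fake fetch_page before pointing it at the real API.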

We found that by default at most 10 ligands are returned per protein, so to override that we need to add a top parameter with a sufficiently large value:

DrugLigands: ligands(isdrug: true, top:10000)

The counts of unique proteins and unique protein-ligid combinations are almost identical. Curiously, we extract slightly more records from the DrugLigands + nonDrugLigands query than the validation counts indicate. Do you know what might be the reason for it?
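
In case it helps to reproduce the comparison, here is a rough Python sketch of the kind of per-target check involved. It assumes each target dict has the shape returned by the query above, and that summing every ligandCounts value is the right number to compare against; that second assumption in particular may not hold. The function names are hypothetical.

```python
def summarize_target(target):
    """Compare fetched ligand rows with the reported ligandCounts for one target."""
    rows = (target.get("nonDrugLigands") or []) + (target.get("DrugLigands") or [])
    unique_ligids = {row["ligid"] for row in rows}
    # Assumption: the reported total is the sum of all ligandCounts values.
    reported = sum(c["value"] for c in (target.get("ligandCounts") or []))
    return {
        "fetched_rows": len(rows),
        "unique_ligids": len(unique_ligids),
        "reported": reported,
    }

def ligand_pub_triples(target):
    """Collect unique (uniprot, ligid, pmid) triples for one target."""
    triples = set()
    rows = (target.get("nonDrugLigands") or []) + (target.get("DrugLigands") or [])
    for row in rows:
        for activity in row.get("activities") or []:
            for pub in activity.get("pubs") or []:
                triples.add((target["uniprot"], row["ligid"], pub["pmid"]))
    return triples
```

If fetched_rows is larger than unique_ligids for some targets, duplicate ligids across the two aliases would be one thing to rule out before comparing against the reported counts.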
