Add ClinicalTrials.gov source #307

Merged
merged 9 commits into from
Jan 14, 2025

cthoyt (Member) commented Jan 14, 2025

This PR adds an initial ClinicalTrials.gov source. Currently, it extracts the following information:

  • ID / Name
  • Classification by study type / allocation (interventional/observational/expanded access and randomized/nonrandomized)
  • References
  • Conditions (when available from the derived section)
  • Interventions (when available from the derived section)
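As a rough illustration of the classification step, the mapping from study type and allocation to the classes used in the example stanzas of this PR could look like the sketch below. The function name `classify_trial` and the upper-case input values (`INTERVENTIONAL`, `RANDOMIZED`, and so on) are assumptions made for illustration and are not necessarily what this PR implements; only the returned class names come from the OBO output.

```python
from __future__ import annotations


def classify_trial(study_type: str | None, allocation: str | None = None) -> str | None:
    """Return an ad-hoc class name for a trial, or None if the study type is unrecognized.

    The input vocabulary is an assumption; the returned names are the
    `instance_of` values that appear in the example OBO stanzas.
    """
    if study_type == "EXPANDED_ACCESS":
        return "expanded-access-study"
    if study_type == "OBSERVATIONAL":
        return "observational-clinical-trial"
    if study_type == "INTERVENTIONAL":
        if allocation == "RANDOMIZED":
            return "randomized-interventional-clinical-trial"
        if allocation == "NON_RANDOMIZED":
            return "non-randomized-interventional-clinical-trial"
        # Allocation missing or not applicable: fall back to the parent class.
        return "interventional-clinical-trial"
    return None
```

Falling back to the generic interventional class when the allocation is unknown keeps every interventional trial classified, even when the more specific subclass cannot be determined.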

There is a lot more content that could be parsed out of these records, but PyOBO is typically not a full data science workflow that also incorporates information extraction, so some things will be left out for now.

The OBO, OFN, and OWL artifacts from the build are available at https://github.com/biopragmatics/obo-db-ingest/tree/main/export/clinicaltrials. Here are some example stanzas from the OBO output:

[Instance]
id: clinicaltrials:NCT00000102
name: Congenital Adrenal Hyperplasia\: Calcium Channels as Therapeutic Targets
property_value: clinicaltrials:has_intervention mesh:D009543 ! has intervention Nifedipine
property_value: clinicaltrials:investigates_condition mesh:D000308 ! investigates condition Adrenocortical Hyperfunction
property_value: clinicaltrials:investigates_condition mesh:D000312 ! investigates condition Adrenal Hyperplasia, Congenital
property_value: clinicaltrials:investigates_condition mesh:D006965 ! investigates condition Hyperplasia
property_value: clinicaltrials:investigates_condition mesh:D047808 ! investigates condition Adrenogenital Syndrome
instance_of: interventional-clinical-trial

[Instance]
id: clinicaltrials:NCT00000104
name: Does Lead Burden Alter Neuropsychological Development?
property_value: clinicaltrials:investigates_condition mesh:D007855 ! investigates condition Lead Poisoning
property_value: clinicaltrials:investigates_condition mesh:D011041 ! investigates condition Poisoning
instance_of: observational-clinical-trial

[Instance]
id: clinicaltrials:NCT00000106
name: 41.8 Degree Centigrade Whole Body Hyperthermia for the Treatment of Rheumatoid Diseases
property_value: clinicaltrials:investigates_condition mesh:D003095 ! investigates condition Collagen Diseases
property_value: clinicaltrials:investigates_condition mesh:D012216 ! investigates condition Rheumatic Diseases
instance_of: randomized-interventional-clinical-trial

[Instance]
id: clinicaltrials:NCT00000250
name: Cold Water Immersion Modulates Reinforcing Effects of Nitrous Oxide - 2
property_value: clinicaltrials:has_intervention mesh:D009609 ! has intervention Nitrous Oxide
property_value: clinicaltrials:investigates_condition mesh:D009293 ! investigates condition Opioid-Related Disorders
property_value: clinicaltrials:investigates_condition mesh:D019966 ! investigates condition Substance-Related Disorders
instance_of: non-randomized-interventional-clinical-trial

[Instance]
id: clinicaltrials:NCT00040625
name: ALIMTA \(Pemetrexed\) Alone or in Combination With Cisplatin for Patients With Malignant Mesothelioma.
property_value: clinicaltrials:has_intervention mesh:D000068437 ! has intervention Pemetrexed
property_value: clinicaltrials:investigates_condition mesh:D000086002 ! investigates condition Mesothelioma, Malignant
property_value: clinicaltrials:investigates_condition mesh:D008654 ! investigates condition Mesothelioma
instance_of: expanded-access-study
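
For anyone who wants to inspect stanzas like these without a full OBO parser, here is a minimal sketch that reads one `[Instance]` stanza into a dictionary. The function `parse_instance_stanza` is hypothetical and only handles the tags shown above; it leaves backslash escapes such as `\:` in names untouched. A real OBO parser should be preferred for anything beyond quick inspection.

```python
EXAMPLE = """\
[Instance]
id: clinicaltrials:NCT00000104
name: Does Lead Burden Alter Neuropsychological Development?
property_value: clinicaltrials:investigates_condition mesh:D007855 ! investigates condition Lead Poisoning
property_value: clinicaltrials:investigates_condition mesh:D011041 ! investigates condition Poisoning
instance_of: observational-clinical-trial
"""


def parse_instance_stanza(text: str) -> dict:
    """Parse a single [Instance] stanza into a dict (hypothetical helper)."""
    record = {"id": None, "name": None, "instance_of": None, "properties": []}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the stanza header
        tag, _, value = line.partition(": ")
        if tag == "property_value":
            # Drop the trailing "! comment", then split predicate from object.
            statement = value.split(" ! ", 1)[0]
            predicate, obj = statement.split(" ", 1)
            record["properties"].append((predicate, obj))
        elif tag in ("id", "name", "instance_of"):
            record[tag] = value
    return record


record = parse_instance_stanza(EXAMPLE)
for predicate, obj in record["properties"]:
    print(record["id"], predicate, obj)
```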

Note that it uses ad-hoc terms and relations; this is being discussed in obi-ontology/obi#1831.

It would also be nice to have a way of annotating phases. The phase values are distributed over the database as follows:

| Phase | Count |
|---|---|
| NA | 196036 |
|  | 122727 |
| PHASE2 | 59412 |
| PHASE1 | 44195 |
| PHASE3 | 39160 |
| PHASE4 | 33129 |
| PHASE1,PHASE2 | 15219 |
| PHASE2,PHASE3 | 6982 |
| EARLY_PHASE1 | 5434 |
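
One possible starting point for phase annotation is to split the combined values (such as `PHASE1,PHASE2`) into individual phases. The sketch below is not part of this PR; the function name `split_phases` is hypothetical, and treating both `NA` and the empty value as "no phase annotation" is an assumption about how those rows should be interpreted.

```python
from __future__ import annotations


def split_phases(raw: str | None) -> list[str]:
    """Split a raw phase value into individual phase labels.

    Assumption: "NA" and empty values carry no phase annotation,
    so they yield an empty list.
    """
    if raw is None:
        return []
    parts = [part.strip() for part in raw.split(",")]
    return [part for part in parts if part and part != "NA"]
```

With this, a trial whose raw value is `PHASE2,PHASE3` would receive two phase annotations, while trials with `NA` or no value would receive none.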

cthoyt enabled auto-merge (squash) January 14, 2025 12:14
cthoyt merged commit 7e4353d into main Jan 14, 2025
10 checks passed
cthoyt deleted the add-clinicaltrials branch January 14, 2025 12:16