Merge pull request #11 from opentargets/il-2701
New burden curation, improved documentation and bug fixes
DSuveges authored Sep 8, 2022
2 parents f6d3be9 + ea70fa5 commit 412f207
Showing 5 changed files with 8,007 additions and 7,919 deletions.
29 changes: 4 additions & 25 deletions README.md
@@ -2,28 +2,7 @@

This repository contains a collection of different manual annotations produced within the [Open Targets organisation](https://www.opentargets.org).

## Content
- **Mappings**. Label to ontology mappings of different nature:
- [biosystem](mappings/biosystem).
- Labels of anatomical structures such as tissues, organs or systems are mapped to the [UBERON ontology](https://uberon.github.io).
- Labels of cell lines are mapped to the [Cell Line Ontology](http://www.clo-ontology.org).
- [disease](mappings/disease). Labels of diseases mapped to the [EFO ontology](https://www.ebi.ac.uk/efo/). This directory is pending to be harmonised under a common file (see [#1400](https://github.com/opentargets/platform/issues/1400) for context).

- **Target Safety annotation**.
Manual curation of experimental data and insights from publications and other well-known sources of target safety and toxicity data, including the Tox21 database.

The annotation related to `safety_risks.tsv` and `adverse_effects.tsv` is currently being fed to the Platform's pipelines, whereas `experimental_toxicity.tsv` has been discontinued in favour of a new pipeline that processes [ToxCast data files](https://www.epa.gov/chemical-research/toxcast-data-accessing-toxcast-data-and-scenarios-exploring-data).

- **Gene Burden annotation**.
Manual curation of burden tests from key publications and resources.

- Schizophrenia Consortium. Associations of 9 genes with schizophrenia. The data is pulled from their downloads page (https://schema.broadinstitute.org/downloads).
- Epi25 Consortium. Associations of 2 genes with epilepsy. The data is pulled from their downloads page (https://epi25.broadinstitute.org/downloads).
- OTAR022 Metabolomics study. Associations of 21 genes with different metabolomics measurements. The data is pulled from 2 sites:
- Table 1 of their manuscript provides the key metrics.
- The sheet named "strongest gene/metabolite" in the Supplementary Table 5 provides annotation of nº samples.
- CKD Publication. Associations of 3 genes with chronic kidney disease. The data is pulled from 2 sites:
- Table 2 of their publication (https://jasn.asnjournals.org/content/jnephrol/30/6/1109.full.pdf?with-ds=yes).
- Supplementary Table 5 provides annotation of nº samples.
- Autism Sequencing Consortium. Associations of 102 genes with autism. The data is pulled from the Supplementary Table 2 in their publication (https://www.sciencedirect.com/science/article/pii/S0092867419313984#mmc2).
- REGENERON. Associations of 497 genes with traits from the UK Biobank. The data has been parsed from the Supplementary Table 2, 3, and 4 of their publication (https://www.nature.com/articles/s41586-021-04103-z) + the mappings done by GWAS Catalog.
See instructions for:
- [Mappings](docs/mappings.md)
- [Gene Burden annotation](docs/gene_burden.md)
- [Target Safety annotation](docs/target_safety.md)
31 changes: 31 additions & 0 deletions docs/gene_burden.md
@@ -0,0 +1,31 @@
## Gene Burden annotation

Manual curation of burden tests from key publications and resources.

### Schizophrenia Consortium
Associations of 9 genes with schizophrenia. The data is pulled from their downloads page (https://schema.broadinstitute.org/downloads).

### Epi25 Consortium
Associations of 2 genes with epilepsy. The data is pulled from their downloads page (https://epi25.broadinstitute.org/downloads).


### OTAR022 Metabolomics study
Associations of 21 genes with different metabolomics measurements. The data is pulled from 2 sources:
- Table 1 of their manuscript provides the key metrics.
- The sheet named "strongest gene/metabolite" in Supplementary Table 5 provides the number of samples.

### CKD Publication
Associations of 3 genes with chronic kidney disease. The data is pulled from 2 locations:
- Table 2 of their publication (https://jasn.asnjournals.org/content/jnephrol/30/6/1109.full.pdf?with-ds=yes).
- Supplementary Table 5 provides the number of samples.

### Autism Sequencing Consortium
Associations of 102 genes with autism. The data is pulled from Supplementary Table 2 of their publication (https://www.sciencedirect.com/science/article/pii/S0092867419313984#mmc2).

### REGENERON
Associations of 497 genes with traits from the UK Biobank. The data has been parsed from Supplementary Tables 2, 3, and 4 of their publication (https://www.nature.com/articles/s41586-021-04103-z), combined with the mappings done by GWAS Catalog.

### Autism Publication from the SPARK cohort
Associations of 60 genes with autism were identified by analysing de novo and rare inherited variants from WES and WGS data. The data is pulled from the results of the meta-analysis described in Supplementary Table 9 of their publication (https://www.nature.com/articles/s41588-022-01148-2). Only associations with a p-value < 2.5 × 10⁻⁶ are included.

### Fat distribution Publication
Associations of 16 genes with fat distribution (BMI-adjusted waist-to-hip ratio, WHR) were identified by analysing missense variants from WES data; most of them have a protective direction of effect. The data is pulled from Table 1 of their publication (https://www.nature.com/articles/s41467-022-32398-7.pdf). Only associations with a p-value < 3.6 × 10⁻⁷ are included.
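
The significance cut-offs quoted for the SPARK and fat distribution studies amount to a strict p-value filter. The following is a minimal sketch of such a filter, not the curation code used in this repository: the column names (`gene`, `pvalue`), the dictionary keys and the example rows are hypothetical, and only the two thresholds come from the text above.

```python
import csv
import io

# Thresholds quoted in the text above; the keys are illustrative labels only.
P_VALUE_THRESHOLDS = {
    "spark_autism": 2.5e-6,
    "fat_distribution": 3.6e-7,
}


def filter_significant(rows, threshold):
    """Keep rows whose p-value is strictly below the threshold."""
    return [row for row in rows if float(row["pvalue"]) < threshold]


# Hypothetical input: gene names and p-values are made up for illustration.
example_tsv = "gene\tpvalue\nGENE_A\t1.0e-8\nGENE_B\t2.5e-6\nGENE_C\t0.01\n"
rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))

significant = filter_significant(rows, P_VALUE_THRESHOLDS["spark_autism"])
print([row["gene"] for row in significant])
```

In this made-up input, `GENE_B` sits exactly at the threshold and is excluded, because the comparison is strict (`<`), matching the wording "p-value < 2.5 × 10⁻⁶".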
7 changes: 7 additions & 0 deletions docs/mappings.md
@@ -0,0 +1,7 @@
## Mappings

Label-to-ontology mappings of different kinds:
- [biosystem](../mappings/biosystem).
  - Labels of anatomical structures such as tissues, organs or systems are mapped to the [UBERON ontology](https://uberon.github.io).
  - Labels of cell lines are mapped to the [Cell Line Ontology](http://www.clo-ontology.org).
- [disease](../mappings/disease). Labels of diseases mapped to the [EFO ontology](https://www.ebi.ac.uk/efo/). The files in this directory are still to be harmonised into a single common file (see [#1400](https://github.com/opentargets/platform/issues/1400) for context).
5 changes: 5 additions & 0 deletions docs/target_safety.md
@@ -0,0 +1,5 @@
## Target Safety annotation

Manual curation of experimental data and insights from publications and other well-known sources of target safety and toxicity data, including the Tox21 database.

The annotations in `safety_risks.tsv` and `adverse_effects.tsv` are currently fed into the Platform's pipelines, whereas `experimental_toxicity.tsv` has been discontinued in favour of a new pipeline that processes [ToxCast data files](https://www.epa.gov/chemical-research/toxcast-data-accessing-toxcast-data-and-scenarios-exploring-data).