Enrichment Server

Developed and Maintained by Julian Müller (julian2.mueller@tum.de).

Usage

The Enrichment Server is currently running at https://enrichment.kusterlab.org/main_enrichment-server/. The implemented services are described below. To use one of them, send a POST request with your input data in JSON format, together with a session ID and a dataset name. Those two values are needed for PTMNavigator; for any other use, arbitrary values will do (defaults may be added at some point).
Pro Tip: If you are preparing your input data as a pandas data frame, an easy way to convert it into the list-of-records input format is df.to_json(orient='records').
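
For readers who want to see what the records layout looks like, here is a minimal standard-library sketch that builds it by hand from a column-oriented table. pandas' df.to_json(orient='records') produces this same list-of-objects layout directly. The helper name columns_to_records and the EGFR row are made up for illustration.

```python
import json

def columns_to_records(columns):
    """Turn {"col": [v0, v1, ...], ...} into [{"col": v0, ...}, {"col": v1, ...}, ...]."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    n_rows = lengths.pop() if lengths else 0
    return [{name: values[i] for name, values in columns.items()} for i in range(n_rows)]

# Hypothetical table: the PSEN1 row is from this README, the EGFR row is invented.
table = {
    "id": ["PSEN1", "EGFR"],
    "Experiment01": [10.0033998489, 12.5],
    "Experiment02": [14.6499004364, 9.25],
}
records = columns_to_records(table)
payload = json.dumps(records, indent=1)  # text that can be saved as input.json
```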

PTM Signature Enrichment Analysis

Description

PTM-Centric Enrichment Analysis using the PTM Signature Database (PTMSigDB). Basically a GSEA that is Single-Site-Centric (ssc).

Endpoint

/ssgsea/ssc

Reference

Code: https://github.com/broadinstitute/ssGSEA2.0
Publication: https://www.mcponline.org/article/S1535-9476(20)31860-0/fulltext

Input

  1. .../ssc/flanking: A list of PTM sites, each given as the modified residue together with its ±7 flanking residues, and their expression in each experiment. E.g.:
 [...,
 {
  "id":"ALLQLDGTPRVCRAA-p",
  "Experiment01": 15.7046003342,
  "Experiment02": 12.9784002304
 },
 ...]
  2. .../ssc/uniprot: Alternatively, encode the sites as Uniprot identifiers and site positions. E.g.:
 [...,
 {
  "id":"Q96MK2;T832-p",
  "Experiment01":15.7046003342,
  "Experiment02":12.9784002304
 },
 ...]
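
The two id formats above can be assembled with a few lines of Python. This is only a sketch under assumptions: the "-p" suffix is copied from the examples, the "_" padding for sites close to a protein terminus is a guess that should be checked against PTMSigDB, and the toy protein sequence is invented (it merely embeds the example window).

```python
def flanking_id(protein_sequence, position, width=7, pad="_", suffix="-p"):
    """Build '<window><suffix>' for a 1-based site position.

    The window is the site residue plus `width` residues on each side.
    Padding with `pad` near the termini is an assumption, not documented here.
    """
    if not 1 <= position <= len(protein_sequence):
        raise ValueError("position is outside the protein sequence")
    center = position - 1
    left = protein_sequence[max(0, center - width):center]
    right = protein_sequence[center + 1:center + 1 + width]
    window = left.rjust(width, pad) + protein_sequence[center] + right.ljust(width, pad)
    return window + suffix

def uniprot_id(accession, residue, position, suffix="-p"):
    """Build '<accession>;<residue><position><suffix>', e.g. 'Q96MK2;T832-p'."""
    return f"{accession};{residue}{position}{suffix}"

# Invented sequence that contains the README's example window around T10.
toy_protein = "MK" + "ALLQLDGTPRVCRAA" + "GH"
example_flanking = flanking_id(toy_protein, 10)
example_uniprot = uniprot_id("Q96MK2", "T", 832)
```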

Example Command

curl -X POST -F file=@fixtures/ptm-sea/input/input_flanking.json -F session_id=ABCDEF12345 -F dataset_name=ptm-sea https://enrichment.kusterlab.org/main_enrichment-server/ssgsea/ssc/flanking -o output_ptmsea_flanking.json

curl -X POST -F file=@fixtures/ptm-sea/input/input_uniprot.json -F session_id=ABCDEF12345 -F dataset_name=ptm-sea https://enrichment.kusterlab.org/main_enrichment-server/ssgsea/ssc/uniprot -o output_ptmsea_uniprot.json
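
An approximate Python counterpart of the curl commands, using only the standard library, might look like the sketch below. The form field names (file, session_id, dataset_name) are taken from the commands above; the helper names, the uploaded file name input.json, and the application/json part type are assumptions. The request is only built here, and the actual network call is left commented out.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://enrichment.kusterlab.org/main_enrichment-server"

def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\nContent-Type: application/json\r\n\r\n'
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def build_request(endpoint, records, session_id, dataset_name):
    """Prepare (but do not send) a POST request for one endpoint."""
    body, content_type = build_multipart(
        {"session_id": session_id, "dataset_name": dataset_name},
        "file", "input.json", json.dumps(records).encode(),
    )
    return urllib.request.Request(
        BASE_URL + endpoint, data=body,
        headers={"Content-Type": content_type}, method="POST",
    )

records = [{"id": "ALLQLDGTPRVCRAA-p", "Experiment01": 15.7046003342, "Experiment02": 12.9784002304}]
request = build_request("/ssgsea/ssc/flanking", records, "ABCDEF12345", "ptm-sea")

# Sending requires network access:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```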

Gene-Centric Pathway Enrichment Analysis

Description
A GSEA against a database of pathway signatures. We use the same algorithm as for PTM-SEA (ssGSEA), but with the MSigDB database (https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp) instead of PTMSigDB. This means that when this endpoint is used on a PTM dataset, the site-specific information cannot be used; the data has to be collapsed to gene level.
We use only the KEGG and WikiPathways signatures (running against the entire MSigDB would take a long time and is strongly discouraged by its creators).

Endpoint

/ssgsea/gc

Reference

Code: https://github.com/broadinstitute/ssGSEA2.0
Publication: https://www.mcponline.org/article/S1535-9476(20)31860-0/fulltext

Input

A list of gene symbols, and their expression in each experiment.

E.g.:

 [...,
 {
  "id":"PSEN1",
  "Experiment01":10.0033998489,
  "Experiment02":14.6499004364
 },
 ...]

Example Command

curl -X POST -F file=@fixtures/ssgsea/input/input.json -F session_id=ABCDEF12345 -F dataset_name=genecentric https://enrichment.kusterlab.org/main_enrichment-server/ssgsea/gc -o output_gc.json

Gene-Centric Redundant Pathway Enrichment Analysis

Description
The only difference from gene-centric enrichment is that a gene is counted once for each of its regulated sites in the data. Krug et al. (2019) showed that, while this does not perform as well as PTM-level enrichment, it works better than counting each gene with regulated sites only once, regardless of the number of regulated sites. Since gene-centric signatures are more comprehensive than site-centric signatures (e.g., they cover all human WikiPathways and KEGG pathways), it is a good compromise between the two approaches.
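
The difference between the two gene-level views can be illustrated on a small site table. This README does not state how site-level data should be turned into gene-level rows, so the sketch below is only one possible reading: the redundant view keeps one row per site labelled with its gene, and the non-redundant view keeps, per gene and experiment, the value with the largest magnitude. That collapsing rule, the helper names, and the numbers are all assumptions.

```python
def redundant_rows(site_rows):
    """One row per site, relabelled with the gene symbol (a gene may repeat)."""
    return [{"id": row["gene"], **row["values"]} for row in site_rows]

def non_redundant_rows(site_rows):
    """One row per gene; per experiment keep the largest-magnitude value (assumed rule)."""
    collapsed = {}
    for row in site_rows:
        best = collapsed.setdefault(row["gene"], {})
        for experiment, value in row["values"].items():
            if experiment not in best or abs(value) > abs(best[experiment]):
                best[experiment] = value
    return [{"id": gene, **values} for gene, values in collapsed.items()]

# Invented site-level data: two regulated sites on PSEN1, one on EGFR.
sites = [
    {"gene": "PSEN1", "values": {"Experiment01": 1.5, "Experiment02": -0.5}},
    {"gene": "PSEN1", "values": {"Experiment01": -2.0, "Experiment02": 0.25}},
    {"gene": "EGFR", "values": {"Experiment01": 0.75, "Experiment02": 1.0}},
]
gcr_input = redundant_rows(sites)      # PSEN1 appears twice
gc_input = non_redundant_rows(sites)   # PSEN1 appears once
```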

Endpoint

/ssgsea/gcr

Reference

Code: https://github.com/broadinstitute/ssGSEA2.0
Publication: https://www.mcponline.org/article/S1535-9476(20)31860-0/fulltext

Input

Identical to Non-Redundant Gene-Centric PEA.
E.g.:

 [...,
 {
  "id":"PSEN1",
  "Experiment01":10.0033998489,
  "Experiment02":14.6499004364
 },
 ...]

Example Command

curl -X POST -F file=@fixtures/ssgsea/input/input.json -F session_id=ABCDEF12345 -F dataset_name=genecentricredundant https://enrichment.kusterlab.org/main_enrichment-server/ssgsea/gcr -o output_gcr.json

KSEA

Description
KSEA uses phosphoproteomics data (usually fold changes) and prior knowledge on kinase-substrate relationships to infer kinase activities. There are multiple implementations of KSEA; we use the one from the kinact package, which compares the mean fold change among the set of substrates of a kinase to an expected value. The implementation is based on a publication by Casado et al. (see below). The prior knowledge we use consists of the kinase-substrate relationships from PhosphoSitePlus, as retrieved via OmniPath on 2024-02-11. The code to update the database is in db/scripts/update_ksea_es_db.py.
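
To make "mean fold change versus an expected value" concrete, here is a minimal sketch of a mean-based KSEA score in the style of Casado et al.: the mean over a kinase's measured substrates is compared with the mean over all sites and scaled by the square root of the substrate count over the standard deviation of all sites. This is not the kinact code; the function name, the use of the population standard deviation, and the site IDs are assumptions.

```python
import math
import statistics

def ksea_mean_score(fold_changes, substrates):
    """Return (mean substrate fold change, z-score against all measured sites).

    fold_changes: {site_id: fold change}; substrates: site IDs of one kinase.
    Substrates that were not measured are ignored.
    """
    observed = [fold_changes[s] for s in substrates if s in fold_changes]
    if not observed:
        raise ValueError("no substrate of this kinase was measured")
    background = list(fold_changes.values())
    mean_substrates = statistics.fmean(observed)
    mean_all = statistics.fmean(background)
    sd_all = statistics.pstdev(background)  # population SD: an assumption
    z = (mean_substrates - mean_all) * math.sqrt(len(observed)) / sd_all
    return mean_substrates, z

# Hypothetical data: four measured sites, two of them substrates of one kinase.
fold_changes = {"A_S1": 1.0, "B_S2": -1.0, "C_T3": 1.0, "D_Y4": -1.0}
mean_fc, z_score = ksea_mean_score(fold_changes, ["A_S1", "C_T3", "X_S9"])
```

A dataset in which all sites have the same value has zero standard deviation, and this sketch would then raise a ZeroDivisionError.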

Endpoint

/ksea

Reference

Code: https://github.com/saezlab/kinact
Publication: https://www.science.org/doi/10.1126/scisignal.2003573

Input

A list of phosphosites, encoded in the format <Uniprot_Acc>_<Res><Position>, and their expression in each experiment. E.g.:

 [...,
 {
  "Site":"O75822_S11",
  "Experiment_1":0.0,
  "Experiment_2":-0.002266224,
  "Experiment_3":0.0
 },
 ...]
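
A quick way to check site identifiers before upload is to parse them with a regular expression. The sketch below assumes that accessions consist of upper-case letters and digits with an optional isoform suffix such as -2, and that the residue is a single upper-case letter; neither detail is specified in this README.

```python
import re

# <Uniprot_Acc>_<Res><Position>; the optional isoform suffix is an assumption.
SITE_PATTERN = re.compile(
    r"^(?P<accession>[A-Z0-9]+(?:-\d+)?)_(?P<residue>[A-Z])(?P<position>\d+)$"
)

def parse_site(site_id):
    """Split a site ID into (accession, residue, position)."""
    match = SITE_PATTERN.match(site_id)
    if match is None:
        raise ValueError(f"not in <Uniprot_Acc>_<Res><Position> format: {site_id!r}")
    return match["accession"], match["residue"], int(match["position"])

parsed = parse_site("O75822_S11")
```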

Example Command

curl -X POST -F file=@fixtures/ksea/input/input.json -F session_id=ABCDEF12345 -F dataset_name=ksea https://enrichment.kusterlab.org/main_enrichment-server/ksea -o output_ksea.json

KSEA with RoKAI

Description
This endpoint uses RoKAI to refine the phosphorylation profiles before using kinact to perform KSEA. RoKAI has been shown to produce more robust results when combined with any kinase activity inference method (see the publication by Yılmaz et al. below). We use all 5 components of RoKAI's functional/structural neighbourhood network as information source (see Fig. 3 in the publication).

Endpoint

/ksea/rokai

Reference

Code: https://github.com/serhan-yilmaz/RokaiApp
Publication: https://www.nature.com/articles/s41467-021-21211-6

Input

Identical to KSEA.
E.g.:

 [...,
 {
  "Site":"O75822_S11",
  "Experiment_1":0.0,
  "Experiment_2":-0.002266224,
  "Experiment_3":0.0
 },
 ...]

Example Command

curl -X POST -F file=@fixtures/ksea/input/input.json -F session_id=ABCDEF12345 -F dataset_name=ksea_rokai https://enrichment.kusterlab.org/main_enrichment-server/ksea/rokai -o output_ksea_rokai.json

PHONEMeS

Description

PHONEMeS uses a prior knowledge network of PPIs and kinase-substrate relationships to reconstruct a signaling network from a phosphoproteomics dataset and a set of perturbation targets. The current version is a wrapper around the causal reasoning tool CARNIVAL. Essentially, it works by trimming away parts of the prior knowledge network until the resulting subnetwork optimally explains the observed data.
This endpoint first runs PHONEMeS on the input data and then uses Cytoscape to assign 2-D coordinates to the protein nodes.
The yFiles plugin (https://www.yworks.com/products/yfiles-layout-algorithms-for-cytoscape) arranges the graph in a hierarchic layout. The result is converted into JSON format and sent back to the user. Note that the phosphosite nodes are trimmed away from the PHONEMeS result; only protein nodes are returned.

Endpoint

/phonemes

Reference

Code: https://github.com/saezlab/PHONEMeS
Publication: https://pubs.acs.org/doi/full/10.1021/acs.jproteome.0c00958

Input

A list of targets, split by experiment and regulation direction, as well as a list of sites, encoded in the format <Uniprot_Acc>_<Res><Position>, together with the expression of each site in each experiment.

E.g.:

{
  "targets": {
    "Experiment01": {
      "up": [
        "RICTOR"
      ],
      "down": [
        "EGFR",
        "MAPKAPK2"
      ]
    },
    "Experiment02": {
      "up": [
        "AHNAK",
        "MTOR"
      ],
      "down": [
        "AKT1S1"
      ]
    }
  },
  "sites": [...,
    {
      "Site":"O75822_S11",
      "Experiment_1":0.0,
      "Experiment_2":-0.002266224,
      "Experiment_3":0.0
    },
    ...]
}
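
The structure above can be assembled and sanity-checked in Python before it is written to a file. In this sketch the helper name is made up, and the checks (every experiment has both an "up" and a "down" list, every site row has a "Site" key) are assumptions derived from the example rather than documented server requirements.

```python
import json

def build_phonemes_payload(targets, sites):
    """Combine targets and sites into one dict after basic shape checks."""
    for experiment, directions in targets.items():
        missing = {"up", "down"} - set(directions)
        if missing:
            raise ValueError(f"{experiment} lacks target list(s): {sorted(missing)}")
    for row in sites:
        if "Site" not in row:
            raise ValueError("every site row needs a 'Site' key")
    return {"targets": targets, "sites": sites}

targets = {
    "Experiment01": {"up": ["RICTOR"], "down": ["EGFR", "MAPKAPK2"]},
    "Experiment02": {"up": ["AHNAK", "MTOR"], "down": ["AKT1S1"]},
}
sites = [{"Site": "O75822_S11", "Experiment_1": 0.0, "Experiment_2": -0.002266224, "Experiment_3": 0.0}]
payload_text = json.dumps(build_phonemes_payload(targets, sites), indent=2)
```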

Example Command

curl -X POST -F file=@fixtures/phonemes/input/input.json -F session_id=ABCDEF12345 -F dataset_name=phonemes https://enrichment.kusterlab.org/main_enrichment-server/phonemes -o output_phonemes.json

Motif Enrichment

Description

Performs a Kinase Motif Enrichment by making use of the Kinase Library (Johnson et al., Nature 2023).
Position-specific scoring matrices are used to score the motif of each kinase against a phosphoproteomics dataset.
The endpoint returns the enrichment values for every scored kinase motif.
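
As a rough illustration of how a position-specific scoring matrix scores a site, the sketch below sums the log2 matrix values of the residues flanking the central phosphoacceptor. The matrix is a toy with invented values, unlisted residues are treated as neutral (value 1.0), and neither the Kinase Library matrices nor the enrichment statistics computed by the endpoint are reproduced here.

```python
import math

def pssm_log_score(pssm, window):
    """Sum log2 PSSM values over the flanking positions of a site window.

    pssm: {offset from the central residue: {amino acid: value}}.
    window: odd-length sequence with the phosphoacceptor in the middle.
    Residues missing from the matrix count as 1.0 (an assumption).
    """
    center = len(window) // 2
    score = 0.0
    for index, residue in enumerate(window):
        offset = index - center
        if offset == 0 or offset not in pssm:
            continue
        score += math.log2(pssm[offset].get(residue, 1.0))
    return score

# Toy matrix: favours R at -3 and -2, disfavours P at +1 (values are invented).
toy_pssm = {-3: {"R": 4.0}, -2: {"R": 2.0}, 1: {"P": 0.5}}
good_match = pssm_log_score(toy_pssm, "RRASAPA")
poor_match = pssm_log_score(toy_pssm, "AAASPAA")
```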

Endpoint

/motif_enrichment

Reference

Code: https://kinase-library.phosphosite.org
Publication: https://www.nature.com/articles/s41586-022-05575-3

Input

A list of modified sequences, the Uniprot accession number(s) of the proteins they reside on, and for each experiment whether the peptide was up- or down-regulated. E.g.:

 [...,
  {
    "Modified sequence": "RDS(ph)ASYR",
    "Proteins": "A0A1X7SBZ2;A0A5H1ZRQ2;Q92841;Q92841-1;Q92841-2;Q92841-3",
    "Experiment01": "down",
    "Experiment02": "up"
  },
 ...]
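
When preparing such input, it can help to check where the phosphorylation sits within each peptide. The following sketch assumes the notation used in the example, in which the tag (ph) directly follows the modified residue; other modification tags are not handled and would be left in the returned sequence.

```python
def phospho_positions(modified_sequence, tag="(ph)"):
    """Return (plain sequence, 0-based indices of the residues carrying the tag)."""
    positions = []
    plain = []
    i = 0
    while i < len(modified_sequence):
        if modified_sequence.startswith(tag, i):
            if not plain:
                raise ValueError("tag has no preceding residue")
            positions.append(len(plain) - 1)  # the tag marks the residue before it
            i += len(tag)
        else:
            plain.append(modified_sequence[i])
            i += 1
    return "".join(plain), positions

sequence, phospho_sites = phospho_positions("RDS(ph)ASYR")
```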

Example Command

curl -X POST -F file=@fixtures/motif_enrichment/input/input.json -F session_id=ABCDEF12345 -F dataset_name=motif_enrichment https://enrichment.kusterlab.org/main_enrichment-server/motif_enrichment -o output_motif_enrichment.json

KEA3

Description

Runs Kinase Enrichment Analysis 3 (KEA3). KEA3 infers upstream kinases whose putative substrates are overrepresented in a user-supplied list of proteins or differentially phosphorylated proteins.
The endpoint calls the API of KEA3 and returns the MeanRank and TopRank tables of the query result.

Endpoint

/kea3

Reference

Code: https://maayanlab.cloud/kea3/templates/api.jsp
Publication: https://academic.oup.com/nar/article/49/W1/W304/6279841

Input

A list of proteins for each experiment. E.g.:

{
  "Experiment01": [
    "FOXM1",
    "SMAD9"
  ],
  "Experiment02": [
    "ZNF264",
    "TMPO",
    "ISL2"
  ]
}

Example Command

curl -X POST -F file=@fixtures/kea3/input/input.json -F session_id=ABCDEF12345 -F dataset_name=kea3 https://enrichment.kusterlab.org/main_enrichment-server/kea3 -o output_kea3.json

KSTAR

Description

Performs Kinase Activity Prediction using the KSTAR algorithm.
Since KSTAR can only test for activity changes in one direction at a time, we only score down-regulations.
As a threshold for retaining phosphorylation sites, we use a fixed value of 0, i.e., we retain all negative values. Thus, the user needs to make sure to filter out non-significant regulations before using the endpoint.
For reasons of performance, this endpoint only performs the hypergeometric tests for calculating enrichment scores and p-values. The subsequent random analysis and Mann-Whitney-U test steps are omitted since they require significantly more processing power and time.
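
The two steps described above, thresholding at 0 and a hypergeometric test, can be sketched with the standard library. This is a simplified illustration and not the KSTAR implementation: which sets play the roles of population, successes, and draws in KSTAR's kinase-substrate networks is not described here, so the parameter names and the toy numbers are assumptions.

```python
from math import comb

def retained_sites(values, threshold=0.0):
    """Sites whose value lies strictly below the threshold (down-regulated)."""
    return {site for site, value in values.items() if value < threshold}

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items of which `successes` are hits."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total

# Toy numbers: 10 sites overall, 4 linked to a kinase, 3 retained, 2 of them linked.
kept = retained_sites({"site_a": -1.2, "site_b": 0.0, "site_c": 0.4})
p_value = hypergeom_upper_tail(population=10, successes=4, draws=3, observed=2)
```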

Endpoint

/kstar

Reference

Code: https://github.com/NaegleLab/KSTAR
Publication: https://www.nature.com/articles/s41467-022-32017-5

Input

A list of modified sequences, the Uniprot accession number(s) of the proteins they reside on, and for each experiment the expression value of the peptide. E.g.:

 [...,
 {
  "Modified sequence":"RS(ph)VGSDE",
  "Proteins":"C9JBX5;E9PAL7;P43307;P43307-2",
  "Experiment01":-1.2895137775,
  "Experiment02":-2.2462854621
 },
 ...]

Example Command

curl -X POST -F file=@fixtures/kstar/input/input.json -F session_id=ABCDEF12345 -F dataset_name=kstar https://enrichment.kusterlab.org/main_enrichment-server/kstar -o output_kstar.json

Hosting

If you would like to host an instance of the Enrichment Server yourself, there are two preliminary steps:

If you don't want to make use of the PHONEMeS or KSTAR endpoint(s), you can also skip these steps.

Now you can just build and run the docker container:
docker build -t enrichment_server .
docker run --network host enrichment_server