pathfindR output for RNAseq and DIA_proteomics data #63
Replies: 5 comments
-
Hello Mathias, I'm not sure about the exact message you're getting, but if there are no active subnetworks identified, the issue is usually a small number of input genes.
Best,
-
The pathfindR version I'm using is pathfindR_1.6.0.
Also, from the differential expression analysis I saw that:
Down-Regulated genes = 10443
Up-Regulated genes = 14386
Genes with no significant change = 8761
At the start of the analysis I get this message:
"## Testing input
The input looks OK
## Processing input. Converting gene symbols,
if necessary (and if human gene symbols provided)
Number of genes provided in input: 33588
Number of genes in input after p-value filtering: 24827
pathfindR cannot handle p values < 1e-13. These were changed to 1e-13"
But then, at the end of the pathfindR analysis, I get this message (and
the results object/data.frame is empty):
"Found 0 active subnetworks
Warning message:
Did not find any enriched terms!"
I run pathfindR with the following command:
rna.res.pathfindR = run_pathfindR(input = rna.pathfindR,
                                  output_dir = "../Results/rna.CRC.pathfindR_results.reactome",
                                  gene_sets = "Reactome",
                                  plot_enrichment_chart = FALSE,
                                  visualize_enriched_terms = FALSE,
                                  score_quan_thr = 0.7,
                                  sig_gene_thr = 0.01)
Here, rna.pathfindR is the pathfindR input data.frame. I'm sending you a
copy of it, saved as a .rds file.
Let me know if you need more information.
Best,
-Mathias
On Thu, Dec 10, 2020 at 1:24 PM Ege Ulgen ***@***.***> wrote:
Hello Mathias,
I'm not sure about the exact message you're getting, but if there are no
active subnetworks identified, the issue is usually a small number of input
genes.
1. To better answer this question: what exactly is the message you
get? How many differentially-expressed genes are there for your RNAseq
data? Also, what version of pathfindR are you using? If you wouldn't mind
sharing the data and the script you used for the RNAseq data, I could
pinpoint any potential issues.
2. We have used pathfindR in many cases using output from RNAseq data.
In fact, any gene-associated p-value data can be analyzed with pathfindR.
3. The default options for filtering active subnetworks were
determined based on analyses of multiple datasets. We're still working on
novel ways to ensure pathfindR is providing high-confidence results. There
is no straightforward answer to your question. Ideally, any subnetwork that
contains at least 2 input genes might be biologically relevant from the
active-subnetwork-oriented enrichment analysis perspective. Hence, you may
even keep all such active subnetworks.
For testing the confidence of your results, you can (a) search for
literature support or (b) perform experimental validation of the enriched
pathways.
Best,
-E
-
Hey @masaver, As @ozanozisik pointed out, you should be filtering for significantly differentially expressed genes, taking logFC into account. Below are two volcano plots, showing p < 0.05 only (left) and |LFC| > 1.5 + p < 0.05 (right). As you can see, many of the genes in your RNAseq data have low logFC values, implying little impact (even if statistically significant). After using the latter filtering approach (i.e., |LFC| > 1.5 + p < 0.05), I obtained 93 enriched pathways for your data (with default subnetwork filtering). Hope these answers help,
-
That was of great help! Indeed, filtering the genes by |LFC| > 1.5 + p < 0.05 did the trick. Best,
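For readers landing on this thread, the filtering step discussed above can be sketched roughly as follows. This is only an illustration on toy data: the column names (`Gene.symbol`, `logFC`, `adj.P.Val`), the gene names, and the helper `filter_degs()` are assumptions made for the example and are not taken from the data shared in this thread. The final `run_pathfindR()` call is left commented out so the snippet runs without pathfindR installed.

```r
# Toy input in a three-column layout (gene symbol, log fold change, p value).
# Column names and values are hypothetical, chosen for illustration only.
rna_input <- data.frame(
  Gene.symbol = c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5"),
  logFC       = c(2.3, -0.4, -1.9, 0.8, 1.6),
  adj.P.Val   = c(0.001, 0.0005, 0.03, 0.2, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep genes that pass BOTH the effect-size and
# the significance threshold (|LFC| > 1.5 and p < 0.05 by default).
filter_degs <- function(df, lfc_thr = 1.5, p_thr = 0.05) {
  keep <- abs(df$logFC) > lfc_thr & df$adj.P.Val < p_thr
  df[keep, , drop = FALSE]
}

rna_filtered <- filter_degs(rna_input)
print(rna_filtered)

# The filtered data frame would then be passed on, for example:
# rna_res <- pathfindR::run_pathfindR(rna_filtered, gene_sets = "Reactome")
```

Filtering before calling pathfindR, rather than relying on the p-value threshold alone, keeps statistically significant genes with negligible fold changes out of the input.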
-
Hello Mathias,
In pathfindR, in line with the scoring in jActiveModules by Ideker et al., a background score distribution is calculated and used to adjust the score of each subnetwork. In your case, almost all genes are significant, so the background scores are themselves high, which prevents any subnetwork from standing out as significant. I suggest using stricter filtering on your gene set, taking log fold change into account.
Best,
Ozan
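The effect described above can be illustrated with a small simulation. This is a simplified sketch of jActiveModules-style score calibration (aggregate z-score of a subnetwork, standardized against random gene sets of the same size). It is not pathfindR's actual implementation, and the gene counts, p-value ranges, and the function name `calibrated_score()` are all assumptions made for the example.

```r
set.seed(1)

# Simplified, hypothetical calibration: aggregate z-score of a gene set,
# standardized against random gene sets of the same size.
calibrated_score <- function(p_all, idx, n_random = 2000) {
  z_all <- qnorm(p_all, lower.tail = FALSE)   # per-gene z-scores
  k <- length(idx)
  z_sub <- sum(z_all[idx]) / sqrt(k)          # aggregate subnetwork score
  background <- replicate(n_random, sum(sample(z_all, k)) / sqrt(k))
  (z_sub - mean(background)) / sd(background)
}

n_genes <- 1000

# Scenario 1: only 5% of genes are significant.
n_sig_few <- 50
p_few <- c(runif(n_sig_few, 1e-6, 1e-3),
           runif(n_genes - n_sig_few, 0.05, 1))

# Scenario 2: about 74% of genes are significant, similar in
# proportion to the input described in this thread.
n_sig_most <- 740
p_most <- c(runif(n_sig_most, 1e-6, 1e-3),
            runif(n_genes - n_sig_most, 0.05, 1))

# In both scenarios the "subnetwork" is ten significant genes.
subnet <- 1:10
score_few  <- calibrated_score(p_few,  subnet)
score_most <- calibrated_score(p_most, subnet)

cat("Calibrated score, few significant genes: ", score_few,  "\n")
cat("Calibrated score, most genes significant:", score_most, "\n")
```

The same ten significant genes receive a much lower calibrated score in the second scenario, because random gene sets drawn from that input are already mostly significant.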