Ranks comparison
The gpseqc_compare script calculates the distance between a pair of centrality rankings obtained with gpseqc_estimate. The workflow behind it is the following:
1. Read and parse the ranking tables.
2. Subset the ranking tables to the same genomic regions.
3. Calculate the pair-wise distance between all the scores of the two tables.
4. Shuffle the tables N times to generate a random distribution of distances.
5. Use the random distribution to calculate a p-value for the original distance.
6. Write the output and plot.
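The subsetting, distance, shuffling and p-value steps above can be sketched in plain Python. This is a minimal illustration, not the gpseqc_compare implementation: the function names, the dict-based tables, the placeholder distance, and the choice to shuffle only one table are all assumptions made for the example.

```python
import random


def subset_common(rank_a, rank_b):
    """Keep only the regions present in both tables, in a shared order."""
    common = sorted(set(rank_a) & set(rank_b))
    return [rank_a[r] for r in common], [rank_b[r] for r in common]


def mean_abs_diff(x, y):
    """Placeholder distance; NOT one of the gpseqc_compare metrics."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def shuffle_test(rank_a, rank_b, distance=mean_abs_diff, n_iter=1000, seed=0):
    """Return the observed distance and the two empirical tail fractions."""
    rng = random.Random(seed)
    a, b = subset_common(rank_a, rank_b)
    observed = distance(a, b)

    # Assumption: shuffling one table is enough to break the region pairing.
    shuffled = list(b)
    random_dists = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        random_dists.append(distance(a, shuffled))

    # Fraction of shuffles at least as similar / at least as different.
    p_similar = sum(d <= observed for d in random_dists) / n_iter
    p_different = sum(d >= observed for d in random_dists) / n_iter
    return observed, p_similar, p_different


# Hypothetical tables: region label -> centrality score.
table_a = {"chr1:0-1": 0.1, "chr1:1-2": 0.2, "chr1:2-3": 0.3, "chr2:0-1": 0.9}
table_b = {"chr1:0-1": 0.1, "chr1:1-2": 0.2, "chr1:2-3": 0.3, "chr3:0-1": 0.5}
observed, p_similar, p_different = shuffle_test(table_a, table_b)
```

Only the three regions shared by both tables enter the comparison, and their scores are identical, so the observed distance is zero and every shuffle is at least as different.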
Use the --no-test option to run only steps 1-3, which reduces both the output and the computation time.
Three distance types are available in gpseqc_compare through the -d option:
- kt: Kendall tau distance.
- ktw: weighted Kendall tau distance.
- emd: Earth Mover's Distance.
More details on the distances, and on how to choose among them, are available on the Distances page.
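As an illustration of the first distance type, the Kendall tau distance counts the pairs of regions that the two tables order differently. The sketch below is a plain-Python version written for this page, with a hypothetical function name and an optional normalization by the number of pairs; it is not necessarily how gpseqc_compare computes the kt distance, and it does not cover the weighted variant or the Earth Mover's Distance.

```python
from itertools import combinations


def kendall_tau_distance(x, y, normalize=True):
    """Count the pairs of items that x and y rank in opposite order."""
    if len(x) != len(y):
        raise ValueError("score lists must have the same length")

    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # A pair is discordant when the two lists disagree on its order.
        # Tied pairs give a product of zero and are not counted here.
        if (x[i] - x[j]) * (y[i] - y[j]) < 0:
            discordant += 1

    if not normalize:
        return discordant
    n_pairs = len(x) * (len(x) - 1) // 2
    return discordant / n_pairs if n_pairs else 0.0


same = kendall_tau_distance([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = kendall_tau_distance([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau_distance([1, 2, 3, 4], [2, 1, 3, 4], normalize=False)
```

With normalization, identical orderings give 0 and fully reversed orderings give 1; a single adjacent swap contributes exactly one discordant pair.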
The random distribution is used to calculate a p-value for the original distance, that is, the probability of obtaining a more extreme distance (more similar or more different rankings) when the ranks are randomly shuffled.
To obtain a reliable p-value, the sample used to build the random distribution must be large enough. You can change its size with the -n (or --niter) option, which is 5000 by default.
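To see why the sample size matters, consider a p-value computed as a simple fraction of the shuffled distances. That is an assumption made for this sketch, since gpseqc_compare may derive its p-value differently, and both function names are hypothetical. Under that assumption, the p-value can only take values that are multiples of 1/N, so N bounds how small a non-zero p-value can be.

```python
def empirical_p_value(observed, random_dists):
    """Fraction of shuffled distances at or below the observed distance."""
    return sum(d <= observed for d in random_dists) / len(random_dists)


def p_value_resolution(n_iter):
    """Smallest non-zero p-value that n_iter shuffles can report."""
    if n_iter < 1:
        raise ValueError("n_iter must be a positive integer")
    return 1 / n_iter


# Toy random distribution: one shuffle out of ten is at or below 0.15.
toy_dists = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_p = empirical_p_value(0.15, toy_dists)

default_resolution = p_value_resolution(5000)
small_resolution = p_value_resolution(100)
```

In this simplified view, the default of 5000 iterations resolves p-values down to 0.0002, while 100 iterations could not distinguish anything below 0.01.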
GPSeqC v2.3.3 is published under the MIT License - Copyright (c) 2017-18 Gabriele Girelli