Skip to content

🔵 Exact Distributions and Performance of some Two-sample Nonparametric Tests for Circular Data

Notifications You must be signed in to change notification settings

SMAC-Group/TwoCircles

Repository files navigation

Travis-CI Build Status Project Status: Active Licence minimal R version CRAN packageversion Last-changedate

TwoCircles Overview

This package implements various nonparametric methods for testing whether two circular samples come from the same population and allows to obtain its associated small-sample distribution. In particular, the following test are considered:

  • dixon: for Dixon test,
  • rao: for Rao test,
  • wilcox: for Wilcoxon test,
  • ww for Wheeler-Watson test.

The code below presents a simple example with the pigeons dataset of Schmidt-Koenig (1958).

# Load dataset
data(pigeons)

# Dixon test (exact pvalue)
circular_test(pigeons$control, pigeons$experimental)
#> 
#>       Dixon Two Sample Test
#> 
#> Data:  pigeons$control and pigeons$experimental
#> Test Statistic: 82
#> Exact P-value: 0.00185
#> Bracketing Points and Pair of Signif. Levels:
#> c1 = 44 (p1 = 0.0537)
#> c2 = 46 (p2 = 0.0479)

# Dixon test (approximated pvalue)
circular_test(pigeons$control, pigeons$experimental, type = "mc")
#> 
#>       Dixon Two Sample Test
#> 
#> Data:  pigeons$control and pigeons$experimental
#> Test Statistic: 82
#> Approx. P-value: 0.0027
#> P-value stand. error: 0.0001731
#> based on 10000 Monte-Carlo replications

It is also possible to compare the exact distribution of the test statistic under (H_0) to the one obtained by simulations. This can be done as follows:

par(mfrow = c(1,2))
plot(circular_test(pigeons$control, pigeons$experimental), cex.main = 0.7)
plot(circular_test(pigeons$control, pigeons$experimental, type = "mc", B = 10^3), cex.main = 0.7)

The simulation-based approach is proposed here as an alternative to reduce the computational burden of then method when the samples are too large to allow for the calculation of the exact sampling distribution through exhaustive enumeration of all possible combinations. When the number of Monte-Carlo replication is relatively large (say around (10^3) or more) the results with the exact and approximated method are very close. In addition, this method also provides an estimate of the standard error associated to the obtained p-value, which is computed by nonparametric bootstrap. The precision of the simulation-based approach is illustrated in the simple simulation presented below where we compared the p-values obtained with the exact and simulation-based method.

# Number of simulations
B = 500

# Initialisation
exact_pval = rep(NA, B)
mc_pval = matrix(NA, B, 3)

for (i in 1:B){
  # Simulate samples
  set.seed(i)
  X = runif(8, 0, 360)
  Y = runif(7, 0, 360)
  
  # Compute exact pvalue
  exact_results = circular_test(X, Y)
  exact_pval[i] = mean(exact_results$stat <= exact_results$cv$dist)
  
  # Compute MC pvalue
  mc_results = circular_test(X, Y, type = "mc", B = 1000)
  mc_pval[i,1] = (1 + sum(mc_results$stat <= mc_results$mc$dist))/(length(mc_results$mc$dist) + 1)
  mc_results = circular_test(X, Y, type = "mc", B = 10000)
  mc_pval[i,2] = (1 + sum(mc_results$stat <= mc_results$mc$dist))/(length(mc_results$mc$dist) + 1)
  mc_results = circular_test(X, Y, type = "mc", B = 100000)
  mc_pval[i,3] = (1 + sum(mc_results$stat <= mc_results$mc$dist))/(length(mc_results$mc$dist) + 1)
}

The difference between the p-values obtained from the two methods are presented in the histograms below:

par(mfrow = c(1,3))
hist(100*(exact_pval-mc_pval[,1]), col = "lightgrey", probability = TRUE, xlab = "Exact p-value - MC p-value (%)", main = "B = 1,000", xlim = c(-0.5, 2))

hist(100*(exact_pval-mc_pval[,2]), col = "lightgrey", probability = TRUE, xlab = "Exact p-value - MC p-value (%)", main = "B = 10,000", xlim = c(-0.5, 2))

hist(100*(exact_pval-mc_pval[,3]), col = "lightgrey", probability = TRUE, xlab = "Exact p-value - MC p-value (%)", main = "B = 100,000", xlim = c(-0.5, 2))

Install Instructions

To install the TwoCircles package, there is currently one option: GitHub.

# Install dependencies
install.packages(c("circular", "devtools"))

# Install the package from GitHub
devtools::install_github("SMAC-Group/TwoCircles")

About

🔵 Exact Distributions and Performance of some Two-sample Nonparametric Tests for Circular Data

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published