A high-dimensional Kolmogorov-Smirnov distance for comparing high dimensional distributions

ddKS - a d-dimensional Kolmogorov-Smirnov Test

Alex Hagen¹, Shane Jackson¹, James Kahn², Jan Strube¹, Isabel Haide², Karl Pazdernik¹, and Connor Hainje¹

¹ Pacific Northwest National Laboratory, ² Karlsruhe Institute of Technology

This code accompanies our paper submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence titled "Accelerated Computation of a High Dimensional Kolmogorov-Smirnov Distance" (arXiv).

As of 6/25/2021, three methods are implemented:

  • ddKS - d-dimensional KS test calculated per
    • Variable splitting of space (all points, subsample, grid spacing)
  • rdKS - ddKS approximation using distances from (d+1) corners
  • vdKS - ddKS approximation calculating the ddKS distance between voxels instead of points
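
For intuition about what the full ddKS test measures, here is a minimal sketch in plain Python. It assumes the usual orthant-counting generalization of the KS statistic: every sample point is tried as an origin, space is split into the 2^d orthants around it, and the largest difference in orthant occupancy between the two samples is the distance. The names `orthant_fractions` and `naive_ddks` are hypothetical and are not part of the ddks package, whose tensor-based implementation may differ in details such as tie handling.

```python
import itertools
import random


def orthant_fractions(origin, sample):
    """Fraction of `sample` falling in each of the 2^d orthants around `origin`."""
    d = len(origin)
    counts = {signs: 0 for signs in itertools.product((False, True), repeat=d)}
    for point in sample:
        # An orthant is identified by which coordinates are >= the origin's.
        signs = tuple(x >= o for x, o in zip(point, origin))
        counts[signs] += 1
    n = len(sample)
    return {signs: c / n for signs, c in counts.items()}


def naive_ddks(pred, true):
    """Largest orthant-occupancy gap over all candidate origins (assumed formulation)."""
    best = 0.0
    for origin in list(pred) + list(true):
        f_pred = orthant_fractions(origin, pred)
        f_true = orthant_fractions(origin, true)
        gap = max(abs(f_pred[s] - f_true[s]) for s in f_pred)
        best = max(best, gap)
    return best


random.seed(0)
p = [[random.random() for _ in range(3)] for _ in range(100)]
t = [[random.random() for _ in range(3)] for _ in range(50)]

same = naive_ddks(p, p)                      # identical samples
different = naive_ddks(p, t)                 # two draws from the same distribution
shifted = naive_ddks(p, [[x + 2.0 for x in row] for row in t])  # disjoint supports
print(same, different, shifted)
```

Identical samples give a distance of 0, and samples with fully separated supports give 1. The loop visits every point and every orthant, so the cost grows quickly with sample size and dimension, which is the motivation for the subsampled, grid-based, and approximate variants listed above.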

Quickstart

Installing ddks should be straightforward; simply run

pip install git+https://github.com/pnnl/DDKS

or, if you want to develop on DDKS, simply clone this repository into a safe spot on your computer and run

pip install -e .

from the top level of the repository.

Then, you can get started by instantiating a ddKS object and computing the distance between any pair of torch tensors of shape (sample_size, dimension). The two tensors may have different sample sizes but must share the same dimension.

import torch
import ddks

p = torch.rand((100, 3))
t = torch.rand((50, 3))

calculation = ddks.methods.ddKS()
distance = calculation(p, t)
print(f"The ddKS distance is {distance}")

To operate on GPU, all you need to do is move the tensors to the device before calculation:

import torch
import ddks

# Move both samples to the same GPU before calling the method.
p = torch.rand((100, 3)).to('cuda:0')
t = torch.rand((50, 3)).to('cuda:0')

calculation = ddks.methods.ddKS()
distance = calculation(p, t)

To use one of the accelerated approximations instead, substitute ddks.methods.rdKS or ddks.methods.vdKS for ddks.methods.ddKS. Note that rdKS and vdKS cannot run on GPU.
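
As a rough illustration of the idea behind rdKS (distances from d+1 corners), the sketch below reduces each sample to its distances from a set of corners and compares those one-dimensional distance distributions with a classical two-sample KS statistic. This is only a sketch under stated assumptions: the corner choice (the origin plus one unit corner per axis), the use of the maximum over corners, and the names `ks_1d` and `corner_distance_ks` are all hypothetical and are not taken from the ddks package.

```python
import math
import random


def ks_1d(a, b):
    """Classical two-sample KS statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    best = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        best = max(best, abs(i / len(a) - j / len(b)))
    return best


def corner_distance_ks(pred, true, corners):
    """Largest 1-D KS statistic over the distances to each corner (assumed rule)."""
    best = 0.0
    for corner in corners:
        d_pred = [math.dist(point, corner) for point in pred]
        d_true = [math.dist(point, corner) for point in true]
        best = max(best, ks_1d(d_pred, d_true))
    return best


d = 3
# Hypothetical corner choice: the origin plus one unit corner per axis (d + 1 total).
corners = [[0.0] * d] + [
    [1.0 if k == axis else 0.0 for k in range(d)] for axis in range(d)
]

random.seed(0)
p = [[random.random() for _ in range(d)] for _ in range(100)]
t = [[random.random() for _ in range(d)] for _ in range(50)]

same = corner_distance_ks(p, p, corners)
different = corner_distance_ks(p, t, corners)
shifted = corner_distance_ks(p, [[x + 2.0 for x in row] for row in t], corners)
print(same, different, shifted)
```

Each corner costs only a sort of the distance values, so this kind of reduction scales far better with dimension than comparing every orthant around every point, at the price of being an approximation.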

Package Structure:

  1. methods - Callable classes for the xdKS methods (x = d, r, v)
  2. data - Contains several data generators to experiment with
  3. run_scripts - Contains an example run script
  4. Unit_tests - Contains unit tests for the repository
