
Discovering Significant Patterns under Sequential False Discovery Control

This repository provides a Julia library that implements the Significant Pattern Association (Spass) algorithm. By combining a binomial redundancy test with a sequentially updated maximum-entropy null model, Spass efficiently discovers concise sets of statistically significantly non-redundant higher-order feature interactions (i.e., patterns). To highlight commonalities and differences between groups, Spass statistically associates each pattern with a subset of groups.

The library is a from-scratch implementation of the algorithms described in the following paper:

Sebastian Dalleiger and Jilles Vreeken. 2022. 
Discovering Significant Patterns under Sequential False Discovery Control. 
(KDD '22), pp. 263–272. https://doi.org/10.1145/3534678.3539398

Please consider citing the paper.

Contributions are welcome.

Installation

To install the library from the REPL:

julia> using Pkg; Pkg.add(url="https://github.com/sdall/spass.git")

To install the library from the command line:

julia -e 'using Pkg; Pkg.add(url="https://github.com/sdall/spass.git")'

To set up the command line interface (CLI) located in bin/spass.jl:

  1. Clone the repository:
git clone https://github.com/sdall/spass
  2. Install the required dependencies, including the library:
julia -e 'using Pkg; Pkg.add(path="./spass"); Pkg.add.(["Comonicon", "CSV", "GZip", "JSON"])'

Usage

A typical use of the library, where X is your dataset, is:

julia> using Spass: spass, FDR, FWER, patterns
julia> p = spass(FWER, X; alpha = 0.01)
julia> patterns(p)

For more information, please see the documentation:

help?> spass

A typical usage of the command line interface is:

chmod +x bin/spass.jl
bin/spass.jl dataset.dat.gz dataset.labels.gz --alpha=0.01 --fdr > output.json

The output contains patterns and executiontime in seconds (cf. --measure-time for details). For more information regarding usage, additional options, or input format, please see the provided documentation:

bin/spass.jl --help
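
The JSON written by the CLI can be post-processed from Julia. The following is a minimal sketch, not part of the library: `summarize_output` is a hypothetical helper, and it assumes that the output has the top-level keys `patterns` and `executiontime` mentioned above. The layout of the individual patterns is not specified here, so the example data below is made up purely for illustration.

```julia
# Hypothetical helper (not part of Spass): summarizes a parsed CLI output.
# Assumes top-level keys "patterns" and "executiontime"; missing keys are tolerated.
function summarize_output(result::AbstractDict)
    pats = get(result, "patterns", [])
    seconds = get(result, "executiontime", nothing)
    return (npatterns = length(pats), seconds = seconds)
end

# Made-up stand-in for a parsed output file; the pattern layout is an assumption.
example = Dict("patterns" => [[1, 2], [3, 5, 8]], "executiontime" => 0.42)

info = summarize_output(example)
println("patterns: ", info.npatterns, ", seconds: ", info.seconds)
```

For a real run, the dictionary could be obtained with `JSON.parsefile("output.json")` from the JSON package installed above and passed to the helper in place of `example`.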
