Releases: giotto-ai/giotto-ph
v0.2.4
What's Changed
- Add Python 3.12 by @raphaelreinauer in #77
Full Changelog: v0.2.3...v0.2.4
Reason for this small release
To allow giotto-tda to be built on Python 3.12.
v0.2.3
What's Changed
- [CPP] Avoid unnecessary binary search when k = 1 (0-simplices) by @ulupo in #51
- Small codebase updates by @MonkeyBreaker in #50
- [PY] Compute sparse matrix when input is a point-cloud with a threshold applied by @MonkeyBreaker in #54
- [DOCS] Specify that parallel computations are only available in dims >= 1 by @ulupo in #58
- Prepare v0.2.1 by @ulupo in #59
- Proposal to deal with explicit infinite edges by @ulupo in #53
- Extend dense matrix backend to allow for nonzero vertex weights by @MonkeyBreaker in #61
- [CPP] Specialized simplex vertices by @ulupo in #38
- Drop Python 3.6, enable Apple Silicon wheels for Python 3.8, 3.9 and 3.10, upgrade to manylinux2014 by @MonkeyBreaker in #62
- Prepare v0.2.2 by @ulupo in #64
- Add macOS x86_64 wheels to cover Python 3.7 on Intel Macs by @ulupo in #66
- Include tests in build distribution by @ulupo in #67
- Remove unnecessary requirements by @ulupo in #69
- pybind11 module updated by @rballeba in #71
- Add support for building Python 3.11 wheels by @raphaelreinauer in #76
New Contributors
- @rballeba made their first contribution in #71
- @raphaelreinauer made their first contribution in #76
Full Changelog: v0.2.0...v0.2.3
giotto-ph version 0.2.2
Minor release bringing bug fixes, performance improvements, wheels for Apple Silicon, and the end of support for Python 3.6.
Major Features and Improvements
- The dense matrix C++ backend has been extended to allow for nonzero vertex weights. This can lead to large speedups when computing weighted Rips filtrations (#61).
- The binary search routine to find the largest-indexed vertex in a simplex (`get_max_vertex` in the C++ backend, as in Ripser) has been replaced with a faster floating-point routine in the case of 1-simplices (edges). This still gives exact results for all cases of interest, and can be substantially faster (#38).
- Wheels for Apple Silicon are now available for Python versions 3.8, 3.9 and 3.10 (#62).
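As an illustration of the vertex weights involved in weighted Rips filtrations, here is a minimal sketch of an empirical distance-to-measure (DTM) computation using scikit-learn; the exact formula and code path used internally by giotto-ph may differ.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def dtm_weights(X, n_neighbors=3, r=2):
    """Sketch of DTM-style vertex weights: the r-mean of the
    distances from each point to its n_neighbors nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    dist, _ = nn.kneighbors(X)  # column 0 is each point itself (distance 0)
    return np.mean(dist[:, 1:] ** r, axis=1) ** (1.0 / r)

X = np.random.default_rng(0).random((100, 3))
w = dtm_weights(X)  # one nonnegative weight per vertex
```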
Bug Fixes
- Bars in the barcode with death at `numpy.inf` are now explicitly treated as essential bars instead of finite bars (#53).
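Downstream code can then separate essential bars from finite ones by checking for infinite death values; a minimal sketch on a hypothetical barcode array:

```python
import numpy as np

# hypothetical H0 barcode: rows are (birth, death); essential bars die at numpy.inf
barcode = np.array([[0.0, 0.7], [0.0, 1.2], [0.0, np.inf]], dtype=np.float32)

essential = barcode[np.isinf(barcode[:, 1])]   # bars treated as essential
finite = barcode[np.isfinite(barcode[:, 1])]   # ordinary finite bars
```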
Backwards-Incompatible Changes
- Python 3.6 is no longer supported, and the manylinux standard has been bumped from `manylinux2010` to `manylinux2014` (#62).
Thanks to our Contributors
This release contains contributions from many people:
Umberto Lupo and Julian Burella Pérez.
We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.
giotto-ph version 0.2.1
Minor release bringing bug fixes and performance improvements.
Major Features and Improvements
- Miscellaneous improvements, including verification that `coeff` is a prime, improvements to the Python bindings (see "Backwards-Incompatible Changes" below), updated C++ preprocessor directives, and code style fixes (#50).
- An unnecessary binary search for 0-dimensional simplices is now avoided in the C++ backend, leading to faster runtimes (#51).
- `scikit-learn`'s `NearestNeighbors` is now used for computing sparse thresholded distance matrices, leading to large benefits in memory consumption and runtime in many cases of interest (#54).
- General improvements to the documentation (#54 and #58).
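The underlying idea can be sketched with scikit-learn's radius-based neighbor graph, which only ever stores distances below the threshold; the library's internal code path may differ.

```python
import numpy as np
from sklearn.neighbors import radius_neighbors_graph

X = np.random.default_rng(42).random((200, 3))
thresh = 0.3

# Sparse CSR matrix holding only pairwise distances within `thresh`;
# pairs farther apart than the threshold are never stored or enumerated.
dm = radius_neighbors_graph(X, radius=thresh, mode="distance")
```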
Bug Fixes
- A bug in computing sparse thresholded distance matrices has been fixed (#54).
Backwards-Incompatible Changes
- Barcodes are now returned as (lists of) arrays of dtype `numpy.float32` instead of `numpy.float64`, since single-precision floats are used internally by the C++ backend (#50).
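Callers that relied on double precision can cast explicitly on their side; a minimal sketch with a hypothetical barcode array:

```python
import numpy as np

# hypothetical barcode as returned since this release: dtype is numpy.float32
dgm_h0 = np.array([[0.0, 0.5], [0.0, np.inf]], dtype=np.float32)

# cast explicitly if downstream code expects double precision
dgm_h0_64 = dgm_h0.astype(np.float64)
```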
Thanks to our Contributors
This release contains contributions from many people:
Julian Burella Pérez and Umberto Lupo.
We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.
giotto-ph version 0.2.0
Major Features and Improvements
- `ripser_parallel` can now return "flag persistence generators", i.e. vertices and edges creating/destroying bars in the barcode (#29). A new Jupyter notebook illustrates usage of this new feature.
- Wheels for Python 3.10 have been added (#47).
- The computation of the enclosing radius has been sped up (#28).
- `ripser_parallel` can now be imported directly from `gph` (#36).
- A garbage collection step which was found to negatively impact runtimes with little memory benefit has been removed (#33).
Bug Fixes
- Sparse input with (signed) 64-bit row and column indices is now correctly dealt with by the Python interface (#34).
- A bug causing segfaults when `maxdim=0` was passed to the C++ backend has been fixed (#40).
- An algorithmic error in dealing with edges with zero weight in the 0-dimensional computation has been fixed (#43).
Backwards-Incompatible Changes
None.
Thanks to our Contributors
This release contains contributions from many people:
Umberto Lupo and Julian Burella Pérez.
We are also grateful to all who filed issues or helped resolve them, asked and answered questions, and were part of inspiring discussions.
giotto-ph version 0.1.0
Major Features and Improvements
Introduction
We introduce `giotto-ph`, a high-performance, open-source software package for the computation of Vietoris–Rips barcodes. `giotto-ph` is based on Morozov and Nigmetov's lock-free (multicore) implementation of Ulrich Bauer's Ripser package. It also contains a re-implementation of Boissonnat and Pritam's "Edge Collapser", previously implemented only in the GUDHI library. Our contribution is twofold: on the one hand, we integrate existing state-of-the-art ideas coherently in a single library and provide Python bindings to the C++ code. On the other hand, we increase parallelization opportunities and improve overall performance by adopting higher-performance data structures. The final implementation of our persistent homology backend establishes a new state of the art, surpassing even GPU-accelerated implementations such as Ripser++ when using as few as 5-10 CPU cores. Furthermore, our implementation of the edge collapser algorithm has reduced dependencies and significantly improved run-times.
Python API
The package exposes a single main function, `ripser_parallel`. To call it in Python, simply do:
```python
from gph.python import ripser_parallel
import numpy as np

# generate your data
data = np.random.rand(100, 3)

# compute the persistence diagram
dgm = ripser_parallel(data)
```
Here is a description of this function's parameters:
Compute persistence diagrams for a data array X using a high-performance, parallel version of Ripser. If X is a point cloud, it will be converted to a distance matrix using the chosen metric.
- `X`: ndarray of shape `(n_samples, n_features)`. A numpy array of either data or a distance matrix. Can also be a sparse distance matrix of type `scipy.sparse`.
- `maxdim`: int, optional, default: `1`. Maximum homology dimension computed. Will compute all dimensions lower than or equal to this value. For `1`, both H_0 and H_1 will be computed.
- `thresh`: float, optional, default: `numpy.inf`. Maximum distance considered when constructing the filtration. If `numpy.inf`, compute the entire filtration.
- `coeff`: int prime, optional, default: `2`. Compute homology with coefficients in the prime field Z/pZ for p = `coeff`.
- `metric`: string or callable, optional, default: `'euclidean'`. The metric to use when calculating distances between instances in a feature array. If set to `'precomputed'`, input data is interpreted as a distance matrix or as the adjacency matrix of a weighted undirected graph. If a string, it must be one of the options allowed by `scipy.spatial.distance.pdist` for its `metric` parameter, or a metric listed in `sklearn.pairwise.PAIRWISE_DISTANCE_FUNCTIONS`, including `'euclidean'`, `'manhattan'` or `'cosine'`. If a callable, it should take pairs of vectors (1D arrays) as input and, for each pair, return a scalar indicating their distance/dissimilarity.
- `metric_params`: dict, optional, default: `{}`. Additional parameters to be passed to the distance function.
- `weights`: `"DTM"`, ndarray or `None`, optional, default: `None`. If not `None`, the persistence of a weighted Vietoris–Rips filtration is computed as described in [3], and this parameter determines the vertex weights in the modified adjacency matrix. `"DTM"` denotes the empirical distance-to-measure function.
- `weight_params`: dict or `None`, optional, default: `None`. Parameters to be used in the case of weighted filtrations, see `weights`. In this case, the key `"p"` determines the power to be used in computing edge weights from vertex weights. It can be one of `1`, `2` or `np.inf`, and defaults to `1`. If `weights` is `"DTM"`, the additional keys `"r"` (default: `2`) and `"n_neighbors"` (default: `3`) are available (see `weights`, where the latter corresponds to `n`).
- `collapse_edges`: bool, optional, default: `False`. Whether to run the edge collapse algorithm described in [2] prior to the persistent homology computation.
- `n_threads`: int, optional, default: `1`. Maximum number of threads to use during the persistent homology computation. When passing `-1`, all threads available on the host machine will be used.
For more details about performance and benchmarks, have a look here.
Bug Fixes
None.
Backwards-Incompatible Changes
None.
Thanks to our Contributors
This release contains contributions from many people:
Julián Burella Pérez, Sydney Hauke, Umberto Lupo, Matteo Caorsi.
We are also grateful to all of those who filed issues or helped resolve them, asked and
answered questions, and were part of inspiring discussions.