@@ -1,140 +1,40 @@
-balltree
-========
+.. image:: https://raw.githubusercontent.com/jlvdb/balltree/main/docs/source/_static/logo.png
+    :width: 800
+    :alt: balltree
 
+|
+
+.. image:: https://img.shields.io/pypi/v/balltree?logo=pypi&logoColor=blue
+    :target: https://pypi.org/project/balltree/
 .. image:: https://github.com/jlvdb/balltree/actions/workflows/python-extension-ci.yml/badge.svg
     :target: https://github.com/jlvdb/yet_another_wizz/actions/workflows/python-extension-ci.yml
 .. image:: https://readthedocs.org/projects/balltree/badge/?version=latest
     :target: https://balltree.readthedocs.io/en/latest/?badge=latest
 
-Fast balltree implementation for 3-dim (weighted) data with an Euclidean
-distance norm. The base implementation is in `C` and there is a wrapper for
-`Python`.
+|
+
+A fast ball tree implementation for three dimensional (weighted) data with an
+Euclidean distance norm. The base implementation is in `C` and there is a
+wrapper for `Python`.
 
 The tree is optimised towards spatial correlation function calculations since
 the query routines are geared towards range queries, i.e. counting pairs with a
 given (range of) separations. Fixed number nearest neighbour search is currently
 not implemented.
 
-Range queries are typically 25-30x faster than the corresponding implementation
-in `scipy.spatial.KDTree` (see below).
-
+|
 
 Installation
 ------------
 
 A `C` library can be built with the provided make file, the python wrapper is
-automatically compiled and installed with ``pip install .``.
+automatically compiled and installed with ``pip install balltree``.
 
 The installation does not require any external `C` libraries, the python wrapper
 requires the ``Python.h`` header (which should be included in a default python
 installation) and `numpy` (including ``numpy/arrayobject.h``).
 
 
-Usage
------
-
-Below are two examples that illustrate how to use the ball tree from `C` and
-`Python`.
-
-Using the `C` library
-^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: c
-
-    #include <stdio.h>
-    #include <stdlib.h>
-
-    #include "point.h"     // point_create, ptbuf_gen_random
-    #include "balltree.h"  // balltree_build, balltree_count_radius
-
-    int main(int argc, char** argv) {
-        // uniform random points with (x, y, z) coordinates in range [-1, 1)
-        int n_data = 1000000;
-        srand(12345);  // seed random generator
-        PointBuffer *buffer = ptbuf_gen_random(-1.0, 1.0, n_data);
-        if (buffer == NULL) return 1;
-
-        // build tree from points with default leaf size
-        BallTree *tree = balltree_build(buffer);
-        if (tree == NULL) return 1;
-
-        // count neighbours of all points
-        double query_radius = 0.2;
-        double count;
-        for (long i = 0; i < buffer->size; ++i) {
-            Point *query_point = buffer->points + i;
-            count += balltree_count_radius(tree, query_point, query_radius);
-        }
-        printf("pairs in r <= %.1f: %.0f\n", query_radius, count);
-
-        // count neighbours of all points using the efficient dual-tree algorithm
-        count = balltree_dualcount_radius(tree, tree, query_radius);
-        printf("pairs in r <= %.1f: %.0f\n", query_radius, count);
-
-        return 0;
-    }
-
-Using the `Python` wrapper
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: python
-
-    import numpy as np
-    from balltree import BallTree
-
-
-    if __name__ == "__main__":
-        # uniform random points with (x, y, z) coordinates in range [-1, 1)
-        n_data = 1_000_000
-        rng = np.random.default_rng(12345)
-        points = rng.uniform(-1.0, 1.0, size=(n_data, 3))
-
-        # build tree from points with default leaf size
-        tree = BallTree(points)
-
-        # count neighbours of all points
-        query_radius = 0.2
-        count = tree.count_radius(points, query_radius)
-        print(f"pairs in r <= {query_radius:.1f}: {count:.0f}")
-
-        # count neighbours of all points using the efficient dual-tree algorithm
-        count = tree.dualcount_radius(tree, query_radius)
-        print(f"pairs in r <= {query_radius:.1f}: {count:.0f}")
-
-
-Comparison to scipy.spatial.KDTree
-----------------------------------
-
-The python package `scipy` implements a popular KDTree in
-``scipy.spatial.KDTree``. The majority of this code is written in `Cython/C++`.
-
-Setup
-^^^^^
-
-- Dataset: ``953,255`` galaxies from the Baryon Oscillation Spectroscopic Survey,
-  converted from sky coordinates *(right ascension, declination)* to points on the
-  3D unit sphere *(x, y, z)*.
-- Counting pairs formed between all objects within a fixed radius of ``r <= 0.2``:
-  - ``balltree.count_radius(...)`` (with unit weights)
-  - ``scipy.spatial.KDTree.query_ball_point(..., return_length=True)`` (no weights)
-- Counting the same pairs using the optimised dualtree algorithm.
-  - ``balltree.dualcount_radius(...)`` (with unit weights)
-  - ``scipy.spatial.KDTree.count_neighbors(...)`` (with unit weights)
-
-Results (single thread, AMD Epyc)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-- Single point query using all points:
-  | ``balltree.count_radius: found 24688969825 pairs in 26.737 sec``
-  | ``KDTree.query_ball_point: found 24688969825 pairs in 630.395 sec``
-
-- Using the dualtree algorithm:
-  | ``balltree.dualcount_radius: found 24688969825 pairs in 11.591 sec``
-  | ``KDTree.count_neighbors: found 24688969825 pairs in 321.993 sec``
-
-This corresponds to a **speed of of 25-30x** given test hardware and dataset.
-
-
 Maintainers
 -----------
 