
Speed up edge bundling #1383

Merged
merged 17 commits into from
Jan 13, 2025

Conversation

@lmcinnes (Contributor) commented Jan 7, 2025

The hammer edge bundling is fantastic, but can be quite time consuming for large graphs (100,000+ edges and upward). I spent some time benchmarking the code, profiling to determine where the time was being spent, and then attempting some improvements. I quickly discovered that, at least on the machines and setups I tried, the dask usage actually made things significantly slower. I have therefore made dask optional, with a parameter use_dask which defaults to False.

I also spent a while trying to wring the most I could out of numba for many of the core or frequently called functions. Primarily this involved adding more annotations to the decorators, and a careful rewrite of the distance function, which is called extremely often. The remainder of the work was re-juggling when and where various computations were done, to avoid duplicate work or to move more loops inside numba where possible.

Lastly, I rewrote _convert_edge_segments_to_dataframe to use a single large numpy allocation followed by zip and chain, rather than a generator with many small allocations (the code is a little less readable, but significantly faster for very large graphs).
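The single-allocation idea described above can be sketched roughly as follows (hypothetical names and a simplified layout, not the PR's actual _convert_edge_segments_to_dataframe): preallocate one array sized to the total point count and fill it in bulk, instead of materializing many small arrays from a generator.

```python
import numpy as np

def segments_to_flat_array(segments):
    """Pack per-edge point arrays into one preallocated buffer.

    segments: list of (n_i, 2) float arrays, one per bundled edge.
    Returns a single (sum(n_i), 3) array of [x, y, edge_id] rows.
    """
    total = sum(len(seg) for seg in segments)
    out = np.empty((total, 3), dtype=np.float64)  # one large allocation
    pos = 0
    for edge_id, seg in enumerate(segments):
        n = len(seg)
        out[pos:pos + n, :2] = seg       # bulk copy of the points
        out[pos:pos + n, 2] = edge_id    # tag each row with its edge
        pos += n
    return out
```

A dataframe can then be built from the flat array in one step, avoiding the per-edge allocation overhead that dominates for very large graphs.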

After all these changes a typical use case for me (a knn-graph) went from 1h 20m to 15s, and scaled-up versions of the random graph examples from the docs (with n=500 and m=100000) went from 1h 13m for circular layout and 1h 39m for force-directed layout to 1m 38s and 60s respectively. This would make edge-bundling and graph drawing with datashader (which I love!) far more practical for a much wider range of datasets.

I'll be happy to discuss the changes as required -- some are more important than others, but I went down the optimization rabbit hole and did all the things I could.

@philippjfr (Member) commented

Woah, nice work! Will have to play around a bit.

@amaloney mentioned this pull request Jan 7, 2025
@amaloney (Collaborator) commented Jan 7, 2025

Thanks @lmcinnes, I created an issue for this PR, issue #1384. The goal of the issue is to document the examples for the speedup, and to discuss other topics that are unrelated to the PR code.

@jbednar (Member) commented Jan 7, 2025

That all sounds really promising. Thanks for the contribution! We'll review and let you know, but as @amaloney suggests, posting some benchmarking examples in the associated issue would be really useful.

codecov bot commented Jan 7, 2025

Codecov Report

Attention: Patch coverage is 95.09804% with 5 lines in your changes missing coverage. Please review.

Project coverage is 88.40%. Comparing base (633f33c) to head (56cd97e).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
datashader/bundling.py 95.09% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1383      +/-   ##
==========================================
- Coverage   88.42%   88.40%   -0.02%     
==========================================
  Files          93       93              
  Lines       18705    18727      +22     
==========================================
+ Hits        16540    16556      +16     
- Misses       2165     2171       +6     


@amaloney (Collaborator) commented

I've added the parallel=True flag in the spots where numba can parallelize loops, as well as type signatures to the other parts of the code that use the jit decorator. I'm not able to squeeze out any more optimization without diving really deep, which should not hold up this PR, as it functions very well right now.

@lmcinnes I would suggest making new PRs if you have other ideas or thoughts on how to optimize this module further. I plan to take what you have done and apply it to other areas of the codebase, as I find using the explicit flags inside the numba.jit decorator much easier to read than the nonstandard @ngjit currently being used.

@amaloney (Collaborator) left a comment

lgtm

Comment on lines +48 to +54
@nb.jit(
nb.float32(nb.float32[::1], nb.float32[::1]),
nopython=True,
nogil=True,
fastmath=True,
locals={"result": nb.float32, "diff": nb.float32},
)
Collaborator:

I'm a big fan of the explicit nature of this decorator. I think we should use this as a good example of why being terse (e.g. using @ngjit) is not always better.

Also, I like the usage of the trailing comma

Comment on lines 55 to +61

Before:

def distance_between(a, b):
    """Find the Euclidean distance between two points."""
    return (((a[0] - b[0]) ** 2) + ((a[1] - b[1]) ** 2)) ** 0.5

After:

def distance_between(a, b):
    """Find the Euclidean distance between two points."""
    diff = a[0] - b[0]
    result = diff * diff
    diff = a[1] - b[1]
    result += diff * diff
    return result
Collaborator:

The previous code took the square root to find the Euclidean distance, while the new addition does not. Computationally it doesn't matter, but we should update the docstring to let the reader know it doesn't matter, in case someone has never come across this before. I'm thinking about future readers that may consider it a bug.
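One way the clarified docstring could read (a sketch in plain Python with the decorator omitted; the wording is a suggestion, not the PR's final text):

```python
def distance_between(a, b):
    """Return the *squared* Euclidean distance between two points.

    The square root is deliberately omitted: this function sits in a
    hot loop and its callers only compare distances against squared
    thresholds, so the ordering is unchanged (sqrt is monotonic) and
    skipping it saves work. This is intentional, not a bug.
    """
    diff = a[0] - b[0]
    result = diff * diff
    diff = a[1] - b[1]
    result += diff * diff
    return result
```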

Comment on lines +68 to +76
locals={
'next_point': nb.float32[::1],
'current_point': nb.float32[::1],
'step_vector': nb.float32[::1],
'i': nb.uint16,
'pos': nb.uint64,
'index': nb.uint64,
'distance': nb.float32
}
Collaborator:

Superb usage of locals; I'm going to use this throughout the codebase.

datashader/bundling.py (outdated comment, resolved)
fastmath=True,
locals={'it': nb.uint8, "i": nb.uint16, "x": nb.uint16, "y": nb.uint16}
)
def advect_and_resample(vert, horiz, segments, iterations, accuracy, squared_segment_length,
Collaborator:

accuracy should be nb.int16, see line 521. To me this says we might want to incorporate static type checking, but that's on us to implement for the future.

Comment on lines +547 to +557
if p.use_dask:
resample_edges_fn = delayed(resample_edges)
draw_to_surface_fn = delayed(draw_to_surface)
get_gradients_fn = delayed(get_gradients)
advect_resample_all_fn = delayed(advect_resample_all)
else:
resample_edges_fn = resample_edges
draw_to_surface_fn = draw_to_surface
get_gradients_fn = get_gradients
advect_resample_all_fn = advect_resample_all

Collaborator:

nice
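The selection above boils down to a small pattern: bind either the direct or the delayed-wrapped callables once, up front, so the main loop never branches on use_dask. A minimal sketch with stand-in steps (not datashader's real functions; the dask import is deferred so the default path runs without dask installed):

```python
def build_steps(use_dask=False):
    # Stand-ins for resample_edges, draw_to_surface, etc.
    def resample(values):
        return [v * 2 for v in values]

    def draw(values):
        return sum(values)

    if use_dask:
        from dask import delayed  # imported only when requested
        resample, draw = delayed(resample), delayed(draw)
    return resample, draw

# Direct (default) path: plain eager calls, no dask required.
resample, draw = build_steps(use_dask=False)
result = draw(resample([1, 2, 3]))  # 12
```

Keeping the flag check out of the inner loop also means the numba-compiled functions are called directly in the common case, with no scheduling overhead.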

Co-authored-by: Andy Maloney <amaloney@mailbox.org>
@amaloney amaloney merged commit d9403a9 into holoviz:main Jan 13, 2025
12 checks passed