Speed up edge bundling #1383
Conversation
…use named tuples for segment lengths and simplify parameters
…smooth segment processing
Woah, nice work! Will have to play around a bit.
That all sounds really promising. Thanks for the contribution! We'll review and let you know, but as @amaloney suggests, posting some benchmarking examples in the associated issue would be really useful.
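As a starting point for such benchmarks, here is a minimal micro-benchmark sketch (hypothetical, not from the PR) that uses `timeit` to compare a square-root distance against the squared-distance rewrite discussed below; function names are illustrative only:

```python
import timeit

# Hypothetical micro-benchmark sketch: compares the original sqrt-based
# distance against the squared-distance rewrite from this PR.
def distance_sqrt(a, b):
    return (((a[0] - b[0]) ** 2) + ((a[1] - b[1]) ** 2)) ** 0.5

def distance_squared(a, b):
    diff = a[0] - b[0]
    result = diff * diff
    diff = a[1] - b[1]
    result += diff * diff
    return result

a, b = (0.0, 0.0), (3.0, 4.0)
t_sqrt = timeit.timeit(lambda: distance_sqrt(a, b), number=100_000)
t_sq = timeit.timeit(lambda: distance_squared(a, b), number=100_000)
print(f"sqrt: {t_sqrt:.4f}s  squared: {t_sq:.4f}s")
```

A real benchmark for the issue would of course time `hammer_bundle` end to end on graphs of varying size; this only isolates the innermost kernel.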
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #1383      +/-   ##
==========================================
- Coverage   88.42%   88.40%   -0.02%
==========================================
  Files          93       93
  Lines       18705    18727      +22
==========================================
+ Hits        16540    16556      +16
- Misses       2165     2171       +6
```

☔ View full report in Codecov by Sentry.
I've had the opportunity to add the …

@lmcinnes I would suggest making new PRs if you have other ideas or thoughts on how to optimize this module further. I plan to take what you have done and apply it to other areas of the codebase, as I find using the explicit flags inside the …
lgtm
```python
@nb.jit(
    nb.float32(nb.float32[::1], nb.float32[::1]),
    nopython=True,
    nogil=True,
    fastmath=True,
    locals={"result": nb.float32, "diff": nb.float32},
)
```
I'm a big fan of the explicit nature of this decorator. I think we should use this as a good example of why being terse (e.g. using `@ngjit`) is not always better.

Also, I like the usage of the trailing comma.
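For context on the terse-vs-explicit point: `ngjit` is, to my understanding (worth verifying against the source), a short alias along the lines of `numba.jit(nopython=True, nogil=True)`. The pure-Python stand-in below (numba not required) illustrates why an alias like that hides its options from call sites:

```python
# Pure-Python stand-in for numba.jit: it records the options instead of
# compiling, just to show how a terse alias fixes them invisibly.
def make_jit(**options):
    def decorator(fn):
        fn.jit_options = options  # real numba would compile with these
        return fn
    return decorator

# Terse alias in the style of `ngjit`: the options no longer appear
# anywhere near the functions that use it.
ngjit = make_jit(nopython=True, nogil=True)

@ngjit
def f(x):
    return x + 1

print(f.jit_options)  # the alias's choices only surface on inspection
```

The explicit decorator in the hunk above makes the signature, threading, and `locals` typing visible right where the function is defined, at the cost of some verbosity.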
The previous implementation:

```python
def distance_between(a, b):
    """Find the Euclidean distance between two points."""
    return (((a[0] - b[0]) ** 2) + ((a[1] - b[1]) ** 2)) ** 0.5
```

Surrounding context in the diff:

```python
@ngjit
def resample_segment(segments, new_segments, min_segment_length, max_segment_length, ndims):
```

The new distance body:

```python
diff = a[0] - b[0]
result = diff * diff
diff = a[1] - b[1]
result += diff * diff
return result
```
The previous code took the square root to find the Euclidean distance, while the new addition does not. Computationally it doesn't matter for how the value is used, but we should update the docstring to let the reader know that, in case someone has never come across this before. I'm thinking about future readers who may consider it a bug.
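A possible docstring along the lines suggested; this is just a sketch, assuming the rewritten function keeps the `distance_between` name:

```python
def distance_between(a, b):
    """Find the squared Euclidean distance between two points.

    Note: the square root is deliberately omitted. Callers only compare
    distances against each other (or against a squared threshold), and
    squaring is monotonic for non-negative values, so the ordering is
    unchanged while the relatively expensive sqrt call is avoided.
    """
    diff = a[0] - b[0]
    result = diff * diff
    diff = a[1] - b[1]
    result += diff * diff
    return result
```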
```python
locals={
    'next_point': nb.float32[::1],
    'current_point': nb.float32[::1],
    'step_vector': nb.float32[::1],
    'i': nb.uint16,
    'pos': nb.uint64,
    'index': nb.uint64,
    'distance': nb.float32
}
```
Superb usage of `locals`; I am going to use this in the codebase everywhere.
```python
    fastmath=True,
    locals={'it': nb.uint8, "i": nb.uint16, "x": nb.uint16, "y": nb.uint16}
)
def advect_and_resample(vert, horiz, segments, iterations, accuracy, squared_segment_length,
```
`accuracy` should be `nb.int16`; see line 521. To me this says we might want to incorporate static type checking, but that's on us to implement in the future.
```python
if p.use_dask:
    resample_edges_fn = delayed(resample_edges)
    draw_to_surface_fn = delayed(draw_to_surface)
    get_gradients_fn = delayed(get_gradients)
    advect_resample_all_fn = delayed(advect_resample_all)
else:
    resample_edges_fn = resample_edges
    draw_to_surface_fn = draw_to_surface
    get_gradients_fn = get_gradients
    advect_resample_all_fn = advect_resample_all
```
nice
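The dispatch pattern in that hunk can be sketched without dask at all; in the stand-in below, `fake_delayed` plays the role of `dask.delayed` (lazily deferring the call), and the function and variable names are illustrative only:

```python
# Pure-Python sketch of the optional-laziness dispatch pattern.
# `fake_delayed` stands in for dask.delayed so this runs without dask.
def fake_delayed(fn):
    """Wrap fn so that calling it returns a thunk instead of computing."""
    def wrapper(*args, **kwargs):
        return lambda: fn(*args, **kwargs)
    return wrapper

def resample_edges_demo(x):
    return x * 2

use_dask = False
resample_edges_fn = fake_delayed(resample_edges_demo) if use_dask else resample_edges_demo

# Call sites stay identical either way; only evaluation timing changes.
print(resample_edges_fn(21))  # → 42
```

The nice property is that the rest of the pipeline never branches on `use_dask`; it just calls `*_fn` names that were bound once up front.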
Co-authored-by: Andy Maloney <amaloney@mailbox.org>
The hammer edge bundling is fantastic, but can be quite time consuming for large graphs (100,000+ edges and upward). I spent some time benchmarking the code, then profiling to determine where the time was being spent, and then attempted to make some minor improvements. I quickly discovered that, at least on the machines and setups I tried, the dask usage actually made it significantly slower. I have therefore made dask optional, with a param `use_dask` which defaults to `False`. I also spent a while trying to wring the most I could out of numba for many of the core or frequently called functions. Primarily this involved adding more annotations to the decorators, and a careful rewrite of the distance function, which is called extremely often. The remainder of the work was re-juggling when and where various computations were done to avoid duplicate work, or to move more loops inside numba where possible. Lastly, I rewrote `_convert_edge_segments_to_dataframe` to use a single large numpy allocation followed by `zip` and `chain` rather than a generator with many small allocations (the code is a little less readable, but significantly faster for very large graphs).

After all these changes, a typical use case for me (a knn-graph) went from 1h 20min to 15s, and scaled-up versions of the random graph examples from the docs (with `n=500` and `m=100000`) went from 1h 13min for circular layout and 1h 39min for force-directed layout to 1min 38s and 60s respectively. This would make edge bundling and graph drawing with datashader (which I love!) far more practical for a much wider range of datasets.

I'll be happy to discuss the changes as required; some are more important than others, but I went down the optimization rabbit hole and did all the things I could.
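The single-allocation idea behind the `_convert_edge_segments_to_dataframe` rewrite can be sketched as follows; the real code uses one large numpy array, but a preallocated list stands in here so the sketch is dependency-free, and the data is made up for illustration:

```python
from itertools import chain

# Hypothetical sketch: flatten per-edge segment lists into one
# preallocated buffer instead of growing many small allocations.
segments = [
    [(0.0, 0.0), (1.0, 1.0)],              # one resampled edge
    [(2.0, 2.0), (3.0, 3.0), (4.0, 4.0)],  # another edge
]

# One allocation up front, sized from the known total point count...
total = sum(len(s) for s in segments)
flat = [None] * total

# ...then a single pass with chain, rather than a generator that
# materializes a fresh small container per edge.
for i, point in enumerate(chain.from_iterable(segments)):
    flat[i] = point

print(len(flat), flat[0], flat[-1])  # → 5 (0.0, 0.0) (4.0, 4.0)
```

With a numpy array in place of `flat`, the same shape of loop avoids per-edge allocation churn entirely, which matches the "less readable but significantly faster" trade-off described above.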