@timothymillar timothymillar commented Sep 26, 2025

@jeromekelleher jeromekelleher left a comment

LGTM!

Is there a time cost for rechunking like this on small inputs? (I'm not entirely clear on what happens when calling rechunk.)

@timothymillar
Collaborator Author

I don't think this rechunk has any cost unless the 'chunks' argument is actually used. By default, it will just 'rechunk' a single-chunk array to another single-chunk array. Dask doesn't even appear to include the operation in the task graph (checked with .visualize(optimize_graph=False)).

I tried rechunking a dummy array to its original size 10_000 times in a loop and didn't notice any slowdown on .compute(). So I'm assuming that this case is recognized at call time and doesn't affect the actual compute graph.
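
For anyone wanting to repeat that check, here is a minimal sketch of the kind of experiment described above. It is not the exact script used for this PR: the array shape, the loop count, and the variable names are all illustrative assumptions. It rechunks a single-chunk Dask array to its existing chunks repeatedly and compares the number of tasks in the graph before and after.

```python
import numpy as np
import dask.array as da

# Hypothetical dummy data: a small array held in a single chunk.
data = np.arange(12.0).reshape(3, 4)
x = da.from_array(data, chunks=data.shape)

# Repeatedly "rechunk" to the chunks the array already has.
y = x
for _ in range(1000):
    y = y.rechunk(x.chunks)

# Count tasks in the unoptimized graphs of the original and rechunked arrays.
n_before = len(dict(x.__dask_graph__()))
n_after = len(dict(y.__dask_graph__()))
print(n_before, n_after)

# The rechunked array should still compute to the original values.
result = y.compute()
```

If the two task counts match, the repeated rechunk calls added nothing to the graph, which is consistent with the assumption that this case is handled at call time.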

@timothymillar
Collaborator Author

Looks like the CI failures are unrelated. Docs failing on 3.11 isn't ideal, but it seems to be an issue with the ipynb tutorials not executing.

@timothymillar timothymillar merged commit 1374138 into sgkit-dev:main Sep 30, 2025
12 of 17 checks passed
Development

Successfully merging this pull request may close these issues.

High memory usage when calculating chunked Hamilton-Kerr relationship matrix
2 participants