Skip to content

feat: move slow jumps and jumpfix to so3g #1081

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 13, 2025
Merged

feat: move slow jumps and jumpfix to so3g #1081

merged 5 commits into from
Feb 13, 2025

Conversation

skhrg
Copy link
Member

@skhrg skhrg commented Dec 26, 2024

Relies on simonsobs/so3g#196
This is a direct rewrite for the most part, no algorithmic changes.

@skhrg skhrg requested review from kwolz and mmccrackan January 8, 2025 23:34
Copy link
Contributor

@kwolz kwolz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked into the changes. I have only two small comments for my ease of understanding, otherwise it looks good to me.

if isinstance(jumps, np.ndarray):
jumps = RangesMatrix.from_mask(np.atleast_2d(jumps))
elif isinstance(jumps, Ranges):
jumps = RangesMatrix.from_mask(np.atleast_2d(jumps.mask()))
if not isinstance(jumps, RangesMatrix):
raise TypeError("jumps not RangesMatrix or convertable to RangesMatrix")
jumps = cast(RangesMatrix, jumps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant to me, but it probably isn't. Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly that was just to make my typechecker shut up...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough!

@@ -310,6 +311,10 @@ def test_jumpfinder(self):
heights = heights[heights.nonzero()].ravel()
self.assertTrue(np.all(np.abs(np.array([10, -13, -8]) - np.round(heights)) < 3))

# Check fixing
ptp_fix = np.ptp(fixed.ravel()[~jumps.buffer(10).mask().ravel()])
self.assertTrue(ptp_fix < 1.1*ptp_orig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really checking the jump fixing? It seems to me this is just checking that the fixed matrix is not messed up in the places where there were no jumps in the first place...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so the jump fixing assumes everything within the jump range is the jump so you end up with an offset still inside the jump range (which we ignore since we gapfill those ranges anyways).
The ptp checks if we fixed because after the jump there is a DC offset that will be gone in the fixed TOD.

See below for an example (blue is before fixing, orange after):
image

If we wanted to not have the errors within the jump ranges in the fixed TOD you just need to lower the buffer size for the ranges matrix (but then your jump could be not in the range). The current range is set by the window size used for jump finding to give a reasonable margin of error.

Copy link
Contributor

@kwolz kwolz Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks for the visual explanation, the ptp matter is clear now. Just for my understanding: is jumps.buffer(10).mask() flagging the jump regions plus a buffer region covering the 10 samples before and the 10 samples after the respective jump region?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, upped the buffer in the test to be conservative so that things never fail in CI. But also I set the PRNG key so I think that is unneeded.

Copy link
Contributor

@mmccrackan mmccrackan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks fine other than the failing checks.

@skhrg
Copy link
Member Author

skhrg commented Jan 15, 2025

simonsobs/so3g#196 is now merged so this should be good to go very soon

from scipy.sparse import csr_array
from skimage.restoration import denoise_tv_chambolle
from so3g import (
matched_jumps,
matched_jumps64,
block_minmax,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to create some backwards-compatibility is to not import these by name. Just import so3g. Then the code will crash when you try to use those functions, rather than when you try to import them.

As written now, everyone's installation will break, if they try to use anything from sotodlib.tod_ops, unless they've updated so3g.

@skhrg
Copy link
Member Author

skhrg commented Feb 11, 2025

Now that the new so3g is out can someone approve this?

Copy link
Member

@mhasself mhasself left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments ... looking forward to seeing this out in the wild.

from numpy.typing import NDArray
from pixell.utils import block_expand, block_reduce, moveaxis
from pixell.utils import moveaxis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the one remaining moveaxis with np.moveaxis.

x_fixed = x
if not inplace:
x_fixed = x.copy()
orig_shape = x.shape
x_fixed = np.atleast_2d(x_fixed)
x_fixed = np.ascontiguousarray(x_fixed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce unnecessary reallocations, make use of np.asarray. To do these two lines, for example, you can just: x_fixed = np.asarray(x, order='C', copy=True)

Comment on lines 200 to 201
heights = heights.astype(x.dtype)
heights = np.ascontiguousarray(heights)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# This will only copy if needed for dtype/order.
heights = np.asarray(heights, dtype=x.dtype, order='C')


orig_shape = x.shape
x = np.atleast_2d(x)
x = np.ascontiguousarray(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user requested inplace, and x is not contiguous, then the change won't be in-place. So you need to check for that.

For example

x = np.asarray(x, order='C', copy=(False if inplace else None))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to raise a value error if inplace is requested but the array is not contiguous?
I could instead raise a warning, make a copy, and then copy back into the original x later. But that may be babying the user too much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually assuming we want to raise the error in that case is the most correct block something like:

x_fixed = np.asarray(np.atleast_2d(x), order="C", copy=(False if inplace else None))

That way the cases are as follows:

  • in place and already contiguous -> x_fixed just references x
  • in place and not contiguous -> ValueError
  • no in place and already contiguous -> x_fixed is a copy of x
  • no in place and not contiguous -> x_fixed is a contiguous copy of x

Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W.r.t. "no in place and already contiguous -> x_fixed is a copy of x" -- actually not true; np.asarray(... copy=None) will not copy if it doesn't need to.

Another way to approach this, to minimize copies and simplify logic:

  • Apply all the conversions you need to get x_fixed, minimizing copies where possible.
  • Check whether you ended up making a copy, or not -- fixed_is_a_copy = np.may_share_memory(x, x_fixed)
  • Take different actions depending on (fixed_is_a_copy, inplace).

I'm ok with your suggestion of making a copy and then updating x at the end.

The reason to ValueError would be that "inplace" is sometimes used not just to compress code, but to minimize copying of the data -- it's sort of implied that doing things inplace will use less RAM, less time. But I think printing a warning is sufficient to help super-optimizers tweak things up.

@mhasself
Copy link
Member

Oh, darn... asarray(copy=...) is numpy>=2 :(

@skhrg
Copy link
Member Author

skhrg commented Feb 12, 2025

Tests failing with TypeError: asarray() got an unexpected keyword argument 'copy', looks like that was added in numpy 2.0.0 (and I believe the PR for numpy 2.0 support in so3g is not yet ready).

I'll redo some logic to sidestep this... but hopefully in the future we can switch to it

@skhrg skhrg force-pushed the more_jump_speedups branch 2 times, most recently from 24bd10f to b007892 Compare February 12, 2025 18:34
@skhrg skhrg force-pushed the more_jump_speedups branch from b007892 to 56dd87d Compare February 12, 2025 18:55
@skhrg
Copy link
Member Author

skhrg commented Feb 12, 2025

...the problem with devving on both my laptop and site computing is sometimes I don't push all my fixes.
Should pass now.

@skhrg skhrg requested a review from mhasself February 12, 2025 19:07
@skhrg skhrg merged commit b81e1d3 into master Feb 13, 2025
3 checks passed
@skhrg skhrg deleted the more_jump_speedups branch February 13, 2025 14:04
@msilvafe
Copy link
Contributor

@skhrg does this close out Issue #975 ?

@msilvafe msilvafe mentioned this pull request May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants