Speeding up reduce #142

@martinjohndyer

Description

I know there's an existing issue #48 that covers reduce with drift mode, but I thought it would be cleaner to open a new one here to keep track of my efforts.

I'm looking at ways to speed up reduce, with two common use cases:

  • Full-frame, unbinned data is very slow to process
  • Ultra-fast drift mode reductions can't keep up with the data

I've started on some profiling for both cases. First off, simply timing how long each frame takes to be reduced. This is only running on my laptop, so the absolute times probably aren't that useful, but it's a benchmark to get started with.
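The dt values below are per-frame wall-clock times (mean and standard deviation). Roughly speaking, a harness of this sort produces numbers like these (a minimal sketch only; the frame source and per-frame reduction are placeholder names, not the actual reduce internals):

import time
import numpy as np

def time_frames(frames, process_frame):
    """Wall-clock the per-frame reduction time.

    `frames` is any iterable of frames and `process_frame` stands in for
    whatever reduces a single frame (hypothetical names for illustration).
    """
    dts = []
    for frame in frames:
        t0 = time.perf_counter()
        process_frame(frame)
        dts.append(time.perf_counter() - t0)
    dts = np.array(dts)
    return dts.mean(), dts.std()

# e.g. mean, std = time_frames(frames, process_frame)
# giving the "dt = mean ± std" figures quoted below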

FF, 1x1 images (using a random standard star observation):

  • Without plotting: dt = 0.64 $\pm$ 0.18 s
  • With plotting: dt = 0.66 $\pm$ 0.13 s
  • Actual time between frames: 3.06 s

Drift mode (using one of Vik's FRB observations):

  • Without plotting: dt = 0.043 $\pm$ 0.026 s
  • With plotting: dt = 0.088 $\pm$ 0.041 s
  • Actual time between frames: 0.013 s

For the FF images we're obviously well within the frame time, but over half a second for each frame feels painfully slow. Plotting makes very little difference.

On the other hand, with drift mode it's well known that reduce can't keep up with the instrument. Here it's very clear: without plotting we're ~3.1x slower than real time, and with plotting we're ~6.5x slower. The code is already pretty well optimised, so finding those speedups might be a challenge.


I've gone through and done some line profiling of the code called in reduce; a sketch of the sort of profiling setup is given after the list below. These are all with plotting disabled for now, since plotting makes no difference for FF runs, and drift-mode runs already need improving before we start considering plotting.

  • ~90% of the time spent in scripts.reduce() goes to the reduction.ProcessCCDs processor class, which is no surprise since that does the actual reduction.
  • 99.7% of the time in reduction.ProcessCCDs.__call__() goes straight to the reduction.ccdproc function.
  • ~90% of the time in reduction.ccdproc() goes to the moveApers function, with only ~10% going to the actual extraction function (e.g. reduction.extractFlux).
  • ~90% of the time in reduction.moveApers() goes to the profile fit (in this case fitting.fitMoffat) via fitting.combFit (which just selects the Moffat or Gaussian fit function).
  • Finally, in fitting.fitMoffat() 90% of the time is spent in the least-squares fit, which uses scipy.optimize.least_squares().
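
For reference, this kind of breakdown can be obtained with line_profiler. A minimal sketch of the setup (assuming the modules are importable as hipercam.reduction and hipercam.fitting; run_reduction is a stand-in for whatever drives a short reduction, not a real entry point):

from line_profiler import LineProfiler
from hipercam import reduction, fitting

def run_reduction():
    # stand-in for driving reduce over a short run; in practice this
    # would be scripts.reduce with suitable arguments
    ...

lp = LineProfiler()
# register the suspected hot spots named above
lp.add_function(reduction.ccdproc)
lp.add_function(reduction.moveApers)
lp.add_function(fitting.fitMoffat)

lp_wrapper = lp(run_reduction)   # wrap the driver so line hits are recorded
lp_wrapper()
lp.print_stats()                 # per-line timings for each registered function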

So there are a few other losses along the way, but the majority (~66%) of the time spent when reducing goes on the Moffat fit used to adjust the aperture positions, which I don't think is unexpected. That's the main place to look for extra speed. The fitting code is already very well optimised by Tom: the actual functions fitting.moffat(), fitting.dmoffat() etc. use Numba JIT compilation, and (at least in the case with ndiv=0) there isn't much obvious to improve. But this is only an initial look.
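
For context, the profile being fitted is a Moffat function plus a constant sky level. A minimal sketch of the sort of Numba-jitted evaluation involved is below (a sketch only, assuming the standard Moffat parameterisation; the real fitting.moffat takes more arguments, e.g. to handle ndiv):

import numpy as np
from numba import njit

@njit(cache=True)
def moffat(x, y, sky, height, xcen, ycen, fwhm, beta):
    """Moffat profile plus constant sky, evaluated on arrays x, y.

    Illustrative only: the real hipercam function also deals with
    sub-pixellation and is shaped to feed scipy.optimize.least_squares.
    """
    # convert FWHM to the alpha parameter of the Moffat profile
    alpha = 4.0 * (2.0 ** (1.0 / beta) - 1.0) / fwhm ** 2
    rsq = (x - xcen) ** 2 + (y - ycen) ** 2
    return sky + height / (1.0 + alpha * rsq) ** beta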


Incidentally, while going through the fitting module I found one awkward bug introduced in 03783f6 (there are actually two errors I traced back to that commit; the minor one is that the Gaussian fit has nfev += nfev rather than nfev += res.nfev). The more major error (I think) is in fitMoffat, where the doubled threshold that is supposed to apply only to the first pass is actually used in all cases:

# reject any above the defined threshold
if first_fit:
    # first fit carried out with higher threshold for safety
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
else:
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1

fitGaussian has the correct code (hipercam/hipercam/fitting.py, lines 1205 to 1209 in 817f2fa):

if first_fit:
    # first fit carried out with higher threshold for safety
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
else:
    sigma[ok & (np.abs(resid) > sfac*thresh)] *= -1

This means that the default RMS rejection threshold fit_thresh=8 in any .red file has effectively been 16 when using the Moffat profile.

I'm bringing this up here because when I tried fixing the bug and removing the wayward 2*, the code slowed down massively, going from 0.6 seconds per loop for FF data to 1.2 s! That's a bit concerning, and I need to dig into it to understand exactly what's going on and why that change makes such a difference.
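
One obvious suspect is that the tighter threshold rejects more pixels after the first pass and so triggers more refit passes. A standalone check of that is easy to set up on a synthetic star; a minimal sketch follows (everything here is a stand-in, not the hipercam fit routine, and the "data" is deliberately a Moffat plus a faint blended companion so that a single-Moffat fit is left with systematic residuals):

import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)

# synthetic star on a small window: a Moffat plus a faint blended companion
ny, nx = 31, 31
y, x = np.mgrid[:ny, :nx].astype(float)

def moffat(sky, height, xcen, ycen, fwhm, beta):
    alpha = 4.0 * (2.0 ** (1.0 / beta) - 1.0) / fwhm ** 2
    rsq = (x - xcen) ** 2 + (y - ycen) ** 2
    return sky + height / (1.0 + alpha * rsq) ** beta

noise = 20.0
data = (
    moffat(100.0, 5000.0, 15.2, 14.8, 6.0, 3.0)
    + moffat(0.0, 1500.0, 18.2, 15.8, 6.0, 3.0)
    + rng.normal(0.0, noise, size=(ny, nx))
)

def resid(p):
    return ((data - moffat(*p)) / noise).ravel()

p0 = np.array([90.0, 4000.0, 15.0, 15.0, 5.0, 2.5])
res = least_squares(resid, p0, method="lm")

# how many pixels would each threshold flag for rejection after this pass?
r = np.abs(resid(res.x))
sfac, thresh = 1.0, 8.0   # stand-ins; thresh plays the role of fit_thresh
print("nfev for the fit      :", res.nfev)
print("rejected with   thresh:", int(np.sum(r > sfac * thresh)))
print("rejected with 2*thresh:", int(np.sum(r > 2 * sfac * thresh)))
# more rejections with the tighter threshold mean more refit passes in
# the real code, which could explain some of the slow-down seen above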
