I know there's an existing issue #48 that covers reduce with drift mode, but I thought it would be cleaner to open a new one here to keep track of my efforts.
I'm looking at ways to speed up reduce, with two common use cases:
- Full-frame, unbinned data is very slow to process
- Ultra-fast drift mode reductions can't keep up with the data
I've started looking at some profiling for both cases. First off, simply timing how long each frame takes to get reduced. This is only running on my laptop, so the absolute times probably aren't that useful, but it's a benchmark to get started with.
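The timing itself is nothing clever: just the wall-clock gap per processed frame, averaged over a run. A minimal, self-contained sketch of the idea, where `process_frame` is a stand-in rather than an actual pipeline call:

```python
import time
import numpy as np

def process_frame(frame):
    # stand-in for the real per-frame work done inside reduce
    time.sleep(0.01)

dts = []
last = time.perf_counter()
for frame in range(50):          # stand-in for frames coming off a run
    process_frame(frame)
    now = time.perf_counter()
    dts.append(now - last)
    last = now

print(f"dt = {np.mean(dts):.3f} +/- {np.std(dts):.3f} s per frame")
```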
FF, 1x1 images (using a random standard star observation):
- Without plotting: dt = 0.64 $\pm$ 0.18 s
- With plotting: dt = 0.66 $\pm$ 0.13 s
- Actual time between frames: 3.06 s

Drift mode (using one of Vik's FRB observations):
- Without plotting: dt = 0.043 $\pm$ 0.026 s
- With plotting: dt = 0.088 $\pm$ 0.041 s
- Actual time between frames: 0.013 s
For the FF images we're obviously well within the frame time, but over half a second for each frame feels painfully slow. Plotting makes very little difference.
On the other hand, it's well known that reduce can't keep up with the instrument in drift mode, and here it's very clear: without plotting we're ~3.1x slower than real time, while with plotting we're ~6.5x slower. The code is already pretty well optimised, so reaching those speedups might be a challenge.
I've gone through and done some line profiling of the code called in reduce. These are all with plotting disabled for now, since for FF runs there's no difference and drift mode runs already need improving before we start considering plotting.
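For anyone wanting to reproduce this kind of measurement, `line_profiler` is one option. A minimal, self-contained illustration of its programmatic API, with dummy functions standing in for the pipeline ones (e.g. `reduction.ccdproc`, `reduction.moveApers`, `fitting.fitMoffat`):

```python
from line_profiler import LineProfiler

def inner(n):
    # dummy standing in for a hot function, e.g. fitting.fitMoffat
    return sum(i * i for i in range(n))

def outer(n):
    # dummy standing in for a caller, e.g. reduction.ccdproc
    total = 0
    for _ in range(10):
        total += inner(n)
    return total

lp = LineProfiler()
lp.add_function(inner)      # register the functions to profile line by line
profiled = lp(outer)        # wrap the top-level driver
profiled(10000)
lp.print_stats()            # per-line hit counts and timings
```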
- ~90% of the time in `scripts.reduce()` is spent calling the `reduction.ProcessCCDs` processor class, which is no surprise since that does the actual reduction.
- 99.7% of the time in `reduction.ProcessCCDs.__call__()` goes straight to the `reduction.ccdproc` function.
- ~90% of the time in `reduction.ccdproc()` goes to the `moveApers` function, with only 10% going to the actual extraction function (e.g. `reduction.extractFlux`).
- ~90% of the time in `reduction.moveApers()` goes to the profile fit (in this case `fitting.fitMoffat`) via `fitting.combFit` (which just selects the Moffat or Gaussian fit function).
- Finally, in `fitting.fitMoffat()` 90% of the time is spent in the least-squares fit, which uses `scipy.optimize.least_squares()`.
So there are a few other losses along the way, but the majority (~66%) of the reduction time goes into the Moffat fit used to adjust the aperture positions, which I think isn't unexpected. That's the main place to look for extra speed. The fitting code is already very well optimised by Tom: the actual model functions `fitting.moffat()`, `fitting.dmoffat()` etc. use Numba JIT compilation, and (at least in the `ndiv=0` case) there isn't much obvious left to improve. But this is only an initial look.
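For context on what that hot path actually does, here's a stripped-down, self-contained sketch of the kind of fit involved: a 2D Moffat profile passed to `scipy.optimize.least_squares`. This is a generic illustration only, not the pipeline's `fitting.fitMoffat`:

```python
import numpy as np
from scipy.optimize import least_squares

def moffat2d(params, x, y):
    # generic 2D Moffat profile on a constant sky:
    # sky + height * (1 + r^2/alpha^2)^(-beta)
    sky, height, xcen, ycen, alpha, beta = params
    rsq = (x - xcen) ** 2 + (y - ycen) ** 2
    return sky + height * (1.0 + rsq / alpha ** 2) ** (-beta)

def residuals(params, x, y, data, sigma):
    # flattened, weighted residuals for least_squares
    return ((moffat2d(params, x, y) - data) / sigma).ravel()

# synthetic 31x31 stamp with noise, standing in for a window around an aperture
ny, nx = 31, 31
y, x = np.mgrid[0:ny, 0:nx]
true = np.array([10.0, 500.0, 15.2, 14.8, 3.0, 2.5])
rng = np.random.default_rng(42)
data = moffat2d(true, x, y) + rng.normal(0.0, 5.0, size=(ny, nx))
sigma = np.full_like(data, 5.0)

p0 = np.array([0.0, 400.0, 14.0, 16.0, 2.0, 2.0])  # rough initial guess
res = least_squares(residuals, p0, args=(x, y, data, sigma), method="lm")
print("fitted params:", res.x)
print("function evaluations:", res.nfev)
```

Most of the solver's time goes into repeated model/Jacobian evaluations, which is presumably why the Numba-compiled `fitting.moffat()` / `fitting.dmoffat()` matter so much here.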
Incidentally, while going through the fitting module I did find one awkward bug introduced in 03783f6 (there are actually two errors I traced back to that commit; the minor one is that the Gaussian fit has `nfev += nfev` rather than `nfev += res.nfev`). The more serious error (I think) is in `fitMoffat`, where the doubled rejection threshold that is only supposed to apply to the first pass is actually used in all cases:
Lines 403 to 408 in 817f2fa:

```python
# reject any above the defined threshold
if first_fit:
    # first fit carried out with higher threshold for safety
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
else:
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
```
`fitGaussian` has the correct code (lines 1205 to 1209 in 817f2fa):

```python
if first_fit:
    # first fit carried out with higher threshold for safety
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
else:
    sigma[ok & (np.abs(resid) > sfac*thresh)] *= -1
```
This means that the default RMS rejection threshold of `fit_thresh=8` in any `.red` file has effectively been acting as 16 when using the Moffat profile.
I'm bringing this up here because when I tried fixing the bug and removing the wayward `2*`, the code slowed down massively, going from 0.6 s per loop for FF data to 1.2 s! That's a bit concerning, but I need to dig into it more to understand exactly what's going on and why that change makes such a difference.
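For reference, the fix I tried simply mirrors the `fitGaussian` branch, i.e. dropping the `2*` from the `else` case:

```python
# reject any above the defined threshold
if first_fit:
    # first fit carried out with higher threshold for safety
    sigma[ok & (np.abs(resid) > 2*sfac*thresh)] *= -1
else:
    sigma[ok & (np.abs(resid) > sfac*thresh)] *= -1
```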