Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New functions for fft_ops #869

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open

New functions for fft_ops #869

wants to merge 26 commits into from

Conversation

17-sugiyama
Copy link
Contributor

@17-sugiyama 17-sugiyama commented May 23, 2024

This PR suggests adding calc_masked_psd and calc_binned_psd to fft_ops.
calc_masked_psd masks hwpss peaks and other peaks in the given PSD. This enables applying 1/f curve fitting to data with hwpss.
calc_binned_psd puts given PSD into bins. This makes fit_noise_model work much faster.
Here is a demonstration for new functions.
https://github.com/17-sugiyama/scratch/blob/main/20250114_fft_ops_PR.ipynb

Copy link
Member

@kmharrington kmharrington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm noticing this PR brings in masked numpy arrays, I think for the first time in sotodlib. I'm worried that will break our IO setup going into and out of hdf5 files.

Is there a reason we can't use Ranges / RangesMatrices to pass around masks here?

I would also like the calc_masked_psd to accept more general masks. The function would be much more useful if it was generalized beyond HWPSS lines. Ex: what if I also want to mask PTC lines?

@17-sugiyama
Copy link
Contributor Author

I was using MaskedArray for no reason, so I modified the functions not to use it.
The change also made the calculation faster. Thanks for your comments!
It's not apparent but the function calc_masked_psd has an option to mask a single peak like PTC lines.
fft_ops.calc_masked_psd(aman, peak=True, peak_freq=peak_freq, peak_width=peak_width) will do that.
Currently peak_freq accepts single frequency, so we need to run the code several times to mask multiple peaks.

# HWP speed in Hz
speed = (np.sum(np.abs(np.diff(np.unwrap(aman.hwp_angle)))) /
speed = (np.sum(np.abs(np.diff(np.unwrap(hwp_angle)))) /
(aman.timestamps[-1] - aman.timestamps[0])) / (2 * np.pi)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you implement below and use here and elsewhere?

def get_hwp_freq(timestamps=None, hwp_angle=None):
    hwp_freq = (np.sum(np.abs(np.diff(np.unwrap(hwp_angle)))) /
            (timestamps[-1] - timestamps[0])) / (2 * np.pi)
    return hwp_freq

if hwpss:
pxx_masked = []
if hwp_freq is None:
hwp_freq = np.median(aman['hwp_solution']['raw_approx_hwp_freq_1'][1])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the code I suggested above.

@@ -457,3 +606,105 @@ def fit_noise_model(
if merge_fit:
aman.wrap(merge_name, noise_fit_stats)
return noise_fit_stats

def hwpss_mask(f, hwp_freq, f_max=100, width_for_1f=(-0.4, +0.6), width_for_Nf=(-0.2, +0.2)):
Copy link
Contributor

@tterasaki tterasaki Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can implement it more straight forward.
Like

def get_mask_for_hwpss(freq, hwp_freq, modes=[1,2,3,4,5,6], width=0.4):
    if isinstance(width, float):
         "apply (-width/2, width/2) for all nf"
    elif isinstance(width, [np.array, list, tuple]):
         width = np.array(width)
         if len(width.shape) == 1:
             "apply symmetric mask for each nf"
         elif len(width.shape) == 2:
             "apply asymmetric mask for each nf"

"""
mask_arrays = [((f < hwp_freq + width_for_1f[0])|(f > hwp_freq + width_for_1f[1]))]
for n in range(int(f_max//hwp_freq-1)):
mask_arrays.append(((f < hwp_freq*(n+2) + width_for_Nf[0])|(f > hwp_freq*(n+2) + width_for_Nf[1])))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use get_mask_for_single_peak defined below.

mask = np.all(np.array(mask_arrays), axis=0)
return mask

def peak_mask(f, peak_freq, peak_width=(-0.002, +0.002)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think get_mask_for_single_peak is more straightforward naming.

@17-sugiyama
Copy link
Contributor Author

Thank you so much for your comments!
Besides applying your comments, I modified several things.

  • calc_pds_mask has a new option, mode. If mode==replace, it will calculate new mask array. If mode==add, newly calculated mask will be stacked on the existing one.
  • fit_noise_model has a new option, fixed_parameter. You can declare a parameter to fix during the PSD fitting.
  • fit_binned_noise_model is a new function to make PSD fittings much faster. besides knee frequency and white noise (fit parameters) are unbiased, the calculation becomes more than 20 times faster.

The demonstration of new features locates here.
https://github.com/17-sugiyama/scratch/blob/main/20240705_fft_ops_PR.ipynb

@17-sugiyama
Copy link
Contributor Author

Extended neglnlike function to apply to binned PSD.
This made fit results of binned and unbinned PSDs almost identical.

def neglnlike(params, x, y, bin_size=1, **fixed_param):
model = noise_model(x, params, **fixed_param)
output = np.sum((np.log(model) + y / model)*bin_size)
if not np.isfinite(output):
return 1.0e30
return output

@tterasaki tterasaki self-requested a review August 8, 2024 10:57
Copy link
Contributor

@tterasaki tterasaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you are almost there to get approved with minor my comments.
Another thing I want to ask you to add is a unit test for the fitting function.

I have already written basic component for the unit test like below.
Can you add the unit test using those at https://github.com/simonsobs/sotodlib/tree/fft_ops_hwpss_mask/tests ?
https://github.com/simonsobs/sotodlib/blob/fft_ops_hwpss_mask/tests/test_subpolyf.py (or some other tests) would be a good reference.

import numpy as np
from sotodlib import core
from numpy.fft import rfftfreq,irfft
from numpy.fft import fftfreq,rfft
from scipy import interpolate
import matplotlib.pyplot as plt
from sotodlib.tod_ops import fft_ops

def model_func(x, sigma, fk, alpha):
    return sigma**2 * (1 + (fk/x)**alpha)

fs = 200.
dets = core.LabelAxis('dets', [f'det{di:003}' for di in range(20)])
nsamps = 200*3600

aman = core.AxisManager(dets)
ndets = aman.dets.count

white_noise_amp_array_input = 50 + np.random.randn(ndets) #W/sqrt{Hz}
white_noise_power_array_input = white_noise_amp_array_input**2 #W^2/Hz
fknee_array_input = 1 + 0.1*np.random.randn(ndets)
alpha_array_input = 3 + 0.2*np.random.randn(ndets)

freqs = rfftfreq(nsamps, d=1/fs)
pxx_input = model_func(freqs, 
           sigma=white_noise_amp_array_input[:, np.newaxis], 
           fk=fknee_array_input[:, np.newaxis],
           alpha=alpha_array_input[:, np.newaxis])

pxx_input[:, 0] = 0
nusamps_input = core.OffsetAxis('nusamps_input', len(freqs))

aman.wrap('freqs_input', freqs, [(0, nusamps_input)])
aman.wrap('wnl_input', white_noise_amp_array_input, [(0, 'dets')])
aman.wrap('fk_input', fknee_array_input, [(0, 'dets')])
aman.wrap('alpha_input', alpha_array_input, [(0, 'dets')])
aman.wrap('Pxx_input', pxx_input, [(0, 'dets'), (1, 'nusamps_input')])

T = nsamps/fs
ft_amps = np.sqrt(pxx_input * T * fs**2 / 2)

ft_phases = np.random.uniform(0, 2*np.pi, size=ft_amps.shape)
ft_coefs = ft_amps * np.exp(1.0j*ft_phases)
realized_noise = irfft(ft_coefs)
timestamps = 1700000000 + np.arange(0, realized_noise.shape[1])/fs
aman.wrap('timestamps', timestamps, [(0, core.OffsetAxis('samps', len(timestamps)))])
aman.wrap('signal', realized_noise, [(0, 'dets'), (1, 'samps')])

if peak:
mask_idx = peak_mask(f, peak_freq, peak_width=peak_width)
f = f[mask_idx]
pxx = pxx[:, mask_idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not good to discard noise spectrum.
I think what Katie meant is to use flags manager for masking.

return output
return output

def get_hwp_freq(timestamps=None, hwp_angle=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this function to sotodlib.hwp.hwp?

else:
raise ValueError('"alpha" or "wn" can be a fixed parameter.')
return
return wn * (1 + (fknee / f) ** alpha)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you modify the definition of white noise level (wn)?
The unit of wn here (=W^2/Hz) is inconsistent with the white noise level in other place(=W/sqrt{Hz}).

return
elif len(fixed_param)==0:
if len(params)==3:
fknee, wn, alpha = params[0], params[1], params[2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you modify the order of parameters?
I think the order of [wn, fknee, alpha] is more natural and less confusing.

fitout = np.zeros((aman.dets.count, 3))
# This is equal to np.sqrt(np.diag(cov)) when doing curve_fit
covout = np.zeros((aman.dets.count, 3, 3))
for i in range(aman.dets.count):
if isinstance(fknee_est, (int, float)):
fknee_est = np.zeros(aman.dets.count)+fknee_est
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.full is a bit smarter.

@17-sugiyama
Copy link
Contributor Author

17-sugiyama commented Sep 24, 2024

I've reflected Tomoki's latest comments and added the unittest for fft_ops.
Investigating test failures

tterasaki
tterasaki previously approved these changes Sep 26, 2024
@tterasaki tterasaki self-requested a review September 26, 2024 05:23
@tterasaki tterasaki dismissed their stale review September 26, 2024 05:24

To check.

@msilvafe msilvafe requested a review from kmharrington December 6, 2024 13:02
@17-sugiyama 17-sugiyama requested review from msilvafe and removed request for msilvafe December 9, 2024 04:28
p = pxx[i]
wnest = np.median(p[((f > fwhite[0]) & (f < fwhite[1]))])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick question on this. Was there another reason that the polyfit parameter guesses were removed, other than due to the addition of the initial guesses in the function inputs? When running this with preprocessing, there isn't a way to add per-detector guesses for the initial parameter values with this branch. Was it found that it didn't really matter too much?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have removed polyfit to reduce the calculations.
White noise values of each detectors can be estimated with calc_wn.
I have confirmed that the initial parameter input of white noise is enough to perform fits as good as the polyfit version.

wn_initial_val = fft_ops.calc_wn(aman)
noise_fit_stats = fft_ops.fit_noise_model(aman, wn_est=wn_initial_val)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants