integrateSingleImage indexing error #56

Closed
pdudenas opened this issue Nov 9, 2022 · 17 comments · Fixed by #78

Comments

@pdudenas
Collaborator

pdudenas commented Nov 9, 2022

When trying to call integrateSingleImage directly, I run into an IndexError here. img.energy is of type xarray.core.dataarray.DataArray, not float, so the expression on line 41 will always evaluate to True. If you pass a single energy DataArray (e.g. imgstack.unstack('system').sel(energy=270)), the array is 0-dimensional and this is where the IndexError arises.

When PyHyper calls integrateSingleImage through integrateImageStack via groupby-progress_apply, img.energy is of length 1 (perhaps because it still has the system multi index?), so indexing its first value is no issue.

Long story short, I think we can get rid of the if-else statement on lines 41-44, and just replace it with en = img.energy.values.ravel()[0]
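The difference between the two call paths can be reproduced without PyHyperScattering. Below is a minimal sketch that uses plain NumPy arrays as stand-ins for what img.energy.values returns in each case. The helper names are hypothetical and 270.0 is just the example energy from above.

```python
import numpy as np

# Stand-ins for img.energy.values in the two call paths (illustrative only).
zero_d = np.array(270.0)    # direct call after .sel(energy=270): 0-dimensional
one_d = np.array([270.0])   # groupby/progress_apply path: length-1 array


def first_by_index(values):
    """Index the first element directly."""
    return values[0]


def first_by_ravel(values):
    """Flatten first, so 0-d and 1-d inputs are handled the same way."""
    return values.ravel()[0]


try:
    first_by_index(zero_d)
    direct_index_failed = False
except IndexError:
    # A 0-d array has no axis to index along.
    direct_index_failed = True

print(direct_index_failed)
print(first_by_index(one_d), first_by_ravel(zero_d), first_by_ravel(one_d))
```

Flattening with ravel() gives both shapes a first element, which is why the one-line replacement works for either call path.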

@pbeaucage
Collaborator

The problem is that Image.energy is not strongly typed. Within progress_apply (just a souped-up apply), if slicing directly on energy it's a float (this may change with the xarray version); if slicing on a multiindex that includes energy, it's a DataArray.

If you look at the analogous lines in the chunked-reduction-dev branch, I might have fixed this while doing the Dask prototyping? I don't see a problem with the .ravel()[0] solution per se, but it is worth keeping the if-else for edge cases. You can also get the value out of a 0D array by casting to float, which might be simpler. Maybe a better version would be to try to cast to float (which handles both the case where it's 0D and the case where it already is a float) and then catch the resulting exception, warn, and use the first value? That still breaks if it's a Dask array where you have to .read() it first, but it is logically cleaner.

I'd also suggest copying the warning/guard I added over on chunked-reduction-dev.
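As a rough illustration of the "normalize, warn, use the first value" idea, here is a sketch of a helper. It is a variation on the suggestion above, not the code on chunked-reduction-dev: instead of casting to float and catching the exception, it flattens the input and checks its size. The name scalar_energy is hypothetical and not part of PyHyperScattering.

```python
import warnings

import numpy as np


def scalar_energy(energy):
    """Return one float energy from a float, NumPy scalar, or array-like.

    Hypothetical helper for illustration; not part of PyHyperScattering.
    """
    # np.asarray accepts Python floats, NumPy scalars, and array-likes;
    # ravel() turns all of them into a 1-D array.
    flat = np.asarray(energy, dtype=float).ravel()
    if flat.size == 0:
        raise ValueError("energy is empty")
    if flat.size > 1:
        warnings.warn(
            f"Using the first of {flat.size} energy values, "
            "check that this is correct.",
            stacklevel=2,
        )
    return float(flat[0])


print(scalar_energy(270.0))
print(scalar_energy(np.float64(270.0)))
print(scalar_energy(np.array(270.0)))
print(scalar_energy(np.array([270.0])))
```

Checking the size explicitly keeps the warning path separate from the error path, at the cost of converting the input to a NumPy array first.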

@pdudenas
Collaborator Author

I'll have to try the chunked-reduction-dev branch, but it does look like you fixed this issue there. Is there a remaining set of to-dos on that branch before merging?

@J-avery32
Contributor

I also have an issue with this. Is there any temporary hack that I should know about or is it on its way to getting fixed?

@J-avery32
Contributor

Actually, I get this when the energy is of type numpy.float64. I can change the energy attribute to be a float, though.

@pbeaucage
Collaborator

Try installing the latest pre-production release with 'pip install -i https://test.pypi.org/simple --pre --upgrade pyhyperscattering' and see if that fixes it? What loader is being used to generate the data -- SST1RSoXSDB or something else?

@J-avery32
Contributor

J-avery32 commented Mar 26, 2023

I now get a different error. And yes this is SST1 RSoXS.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:45, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     44 try:
---> 45     en = img.energy.values[0]
     46     if len(img.energy)>1:

AttributeError: 'numpy.float64' object has no attribute 'values'

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
/home/j/programming/work/scicat_beamline/SST1_PyHyperScattering.ipynb Cell 12 in ()
----> 1 integrated_data = integ.integrateSingleImage(data)
      3 # the way that PyHyperScattering handles the energy dimension/axis is a pain, so we clean it up now
      4 integrated_data = integrated_data.unstack('system')

File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:51, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     49         en = float(img.energy)
     50     except AttributeError:
---> 51         en = img.energy[0]
     52         warnings.warn(f'Using the first energy value of {img.energy}, check that this is correct.',stacklevel=2)
     53 else:

IndexError: invalid index to scalar variable.

It seems like it is not handling the type numpy.float64 properly. It is assuming that the energy attribute is an xarray DataArray.
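The two exceptions in the traceback can be reproduced with NumPy alone. This is a small sketch in which the variable energy stands in for img.energy when it arrives as a numpy.float64; it does not call PyHyperScattering.

```python
import numpy as np

# Stand-in for img.energy when it arrives as a NumPy scalar (illustrative).
energy = np.float64(270.0)

errors = []
try:
    energy.values[0]  # first attempt in the traceback: scalars have no .values
except AttributeError as exc:
    errors.append(type(exc).__name__)
    try:
        energy[0]  # fallback in the traceback: scalars cannot be indexed
    except IndexError as exc2:
        errors.append(type(exc2).__name__)

print(errors)
print(float(energy))  # a plain float() cast succeeds on a NumPy scalar
```

Neither attribute access nor indexing works on a NumPy scalar, while a direct cast to float does.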

@pbeaucage
Collaborator

Gotcha, yes, I agree with the diagnosis. Probably a change in the SST1 file format but we ought to handle this case anyway. I'm hoping to find a minute today to get a patch up for this. Just adding another try/except to the existing tree of possible types of energy...

Can you point me to a dataset (scan number is fine) that I can test with? A copy of how you're invoking the load and integrating steps would also be a big help. Thanks!

@J-avery32
Contributor

J-avery32 commented Mar 27, 2023

The scan id is 36950 I believe.

Here is the code for invoking the load and integrating:

# From a Jupyter notebook by Matt Landsman
import PyHyperScattering
import os
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from PyHyperScattering import __version__
print(f'You are now using PyHyperScattering, version: {__version__}')

# input scan info --> "data_path" represents the master SST1_RSoXS folder where you store everything 
data_path = "/home/j/programming/work/datasets/RSoXS_SST1/ingest_round2/2022-02-07/"
scan_id = '36950/'

# turn flags on to save the .txt files and plots 
flag_save = False

image_path = os.path.join(data_path, scan_id)
save_path = os.path.join(data_path,'SST1_RSoXS_data', scan_id + '_reduced')

os.makedirs(os.path.join(data_path, 'SST1_RSoXS_data' , str(scan_id + '_reduced')), exist_ok=True)

print(image_path)
file_loader = PyHyperScattering.load.SST1RSoXSLoader(corr_mode='none')
data = file_loader.loadSingleImage(image_path+"36950-bw30_snomNa-dark-Wide Angle CCD Detector_image-7.tiff")

if data.rsoxs_config == 'waxs':
    maskmethod = 'nika'
    mask = os.path.join(data_path, 'SST1_RSoXS_masks', 'SST1_WAXS_mask.hdf')
elif data.rsoxs_config == 'saxs':
    maskmethod = 'nika'
    mask = os.path.join(data_path, 'SST1_RSoXS_masks', 'SST1_SAXS_mask.hdf')
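else:
    maskmethod = 'none'
    # Define mask here too so that maskpath=mask below does not raise a NameError.
    # Assumption: the integrator ignores maskpath when maskmethod is 'none'.
    mask = None
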
# set up integration parameters, typically shouldn't need to change anything here 
integ = PyHyperScattering.integrate.PFEnergySeriesIntegrator(maskmethod=maskmethod,
                                                             maskpath=mask,
                                                             geomethod='template_xr',
                                                             template_xr=data,
                                                             integration_method='csr_ocl')

integrated_data = integ.integrateSingleImage(data)

pbeaucage added a commit that referenced this issue Apr 4, 2023

@J-avery32
Contributor

Additionally, I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/j/programming/work/scicat_beamline/SST1_PyHyperScattering_altered.ipynb Cell 12 in ()
----> 1 integrated_data = integ.integrateSingleImage(data)
      3 # the way that PyHyperScattering handles the energy dimension/axis is a pain, so we clean it up now
      4 integrated_data = integrated_data.unstack('system')

File ~/programming/work/scicat_beamline/env/lib/python3.8/site-packages/PyHyperScattering/PFEnergySeriesIntegrator.py:61, in PFEnergySeriesIntegrator.integrateSingleImage(self, img)
     59 res = super().integrateSingleImage(img)
     60 try:
---> 61     if len(self.dest_q)>0:
     62         return res.interp(q=self.dest_q)
     63     else:

AttributeError: 'PFEnergySeriesIntegrator' object has no attribute 'dest_q'

Do you have any idea why this is? I am not giving my own mask to the integrator, so maybe that explains it?

@pbeaucage
Collaborator

So, integrateSingleImage is really more of a worker function. Honestly, it probably should be renamed _integrateSingleImage, because it expects (many) outer features of the integrator to be set up first. The relevant bits are here, but you need to call integrator.setupIntegrators() and integrator.setupDestQ().

Alternately, you could just try .integrateImageStack() on your single image and see if that works -- it ought to, but there are likely to be many other instances where energy is expected to be an array, because in a true energy series measurement it is, by definition, an array. If it's a single-energy measurement, then the appropriate integration machinery is really a PFGeneralIntegrator.

Basically, understanding your use case a little better might be helpful.

That said, integration of freestanding images rather than just stacks is certainly within API scope for the package, so I would like to get this working, but understanding the context might yield a faster work-around.
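To make the "worker function expects setup first" point concrete, here is a toy sketch of how a worker method could report missing setup with a clear message instead of an AttributeError. SketchIntegrator and its methods are hypothetical stand-ins and do not reflect PyHyperScattering's actual implementation.

```python
class SketchIntegrator:
    """Toy stand-in for an integrator; not the PyHyperScattering class."""

    def setup_dest_q(self, q_values):
        # Plays the role of a setup step that must run before integration.
        self.dest_q = list(q_values)

    def integrate_single_image(self, img):
        # getattr with a default avoids AttributeError when setup was skipped.
        dest_q = getattr(self, "dest_q", None)
        if dest_q is None:
            raise RuntimeError(
                "dest_q is not set; run the setup step before integrating"
            )
        return {"q": dest_q, "n_pixels": len(img)}


integ = SketchIntegrator()
try:
    integ.integrate_single_image([1, 2, 3])
except RuntimeError as exc:
    print(exc)

integ.setup_dest_q([0.1, 0.2])
print(integ.integrate_single_image([1, 2, 3]))
```

The guard only changes the error message; the caller still has to run the setup step, which is why calling the stack-level method is usually the simpler route.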

@J-avery32
Contributor

I am basically using loadSingleImage to load one image and then I try to integrate that. I tried using the loadFileSeries with regex to load one image and then integrate but it will throw an error if the regex only matches one file.

I think the best workaround is to load two images with a regex OR pattern, integrate both, and display the one I want.
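For reference, a regex OR (alternation) pattern of the kind described can be sketched with Python's re module. The file names below are hypothetical, and how loadFileSeries applies the pattern internally is not shown here.

```python
import re

# Hypothetical file names, for illustration only.
files = [
    "36950-sample-Wide Angle CCD Detector_image-6.tiff",
    "36950-sample-Wide Angle CCD Detector_image-7.tiff",
    "36950-sample-Wide Angle CCD Detector_image-8.tiff",
]

# (7|8) is an alternation: it matches either frame number.
pattern = re.compile(r"image-(7|8)\.tiff$")

matched = [name for name in files if pattern.search(name)]
print(matched)
```

With this pattern, exactly the two frames named in the alternation are selected and the third file is skipped.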

@pbeaucage
Collaborator

I see! Good news: I think the 56-integratesingleimage-indexing-error branch has a complete fix for this. It was a matter of dealing with some of the scan stacking logic and adding a cut-out for single-image stacks with no indexes at all. Can you install this branch with, e.g., pip install git+https://github.com/usnistgov/PyHyperScattering@56-integratesingleimage-indexing-error and see if that gets it working? In my test, you can now do either an integ.integrateImageStack() or an integ.integrateSingleImage() on a single frame with either a PFEnergySeriesIntegrator or a PFGeneralIntegrator. I tried this with a single frame loaded from SST1 files via SST1RSoXSLoader and with data streaming from Tiled using SST1RSoXSDB. Give it a shot and let me know if it's working and I'll open a PR.

@J-avery32
Contributor

It seems to integrate a single image fine for my case.

@J-avery32
Contributor

J-avery32 commented Apr 7, 2023

One other issue I discovered with loadSingleImage: it does not assign coords automatically to the data like loadFileSeries does.
loadFileSeries has these lines:

if not output_qxy and not output_raw:
    out = out.assign_coords(pix_x=('pix_x',np.arange(0,len(out.pix_x))),pix_y=('pix_y',np.arange(0,len(out.pix_y))))

Actually not sure if this is an issue. It seems inconsistent to me but there might be a reason for it.

@J-avery32
Contributor

J-avery32 commented Apr 7, 2023

> I tried using the loadFileSeries with regex to load one image and then integrate but it will throw an error if the regex only matches one file.

This is wrong on my part as well.

@pbeaucage
Collaborator

This is an interesting one. The assignment of the pix_x and pix_y axes is essentially in loadFileSeries for historical reasons. It might be painless to move into singleimage. I can give it a try.

Just to clarify, regex matching is working if it only matches one file, or failing? I tested this on my machine and it worked but it could be sensitive to the directory structure.

@J-avery32
Contributor

J-avery32 commented Apr 7, 2023

It is working. I grabbed a file name at random to try and didn't realize that it excludes dark images. That's a few hours of my life I'll never get back lol.


3 participants