Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct use of NeighbourhoodProcessing #1893

Closed
ecasellas opened this issue Apr 6, 2023 · 2 comments
Closed

Correct use of NeighbourhoodProcessing #1893

ecasellas opened this issue Apr 6, 2023 · 2 comments
Assignees

Comments

@ecasellas
Copy link

Hello,

First of all, thank you for sharing this Python package, it's extremely useful! At the Meteorological Service of Catalonia we built a probability blending scheme based on Improver tools (neighbourhood, blending and calibration). Now, we're trying to simplify the code and I came across the lead_times parameter (I don't remember if it was available in previous versions) in the NeighbourhoodProcessing class. I have a doubt regarding how to properly use this class.

square_nbhood = NeighbourhoodProcessing(neighbourhood_method='square',
                                        radii=nbhood_sizes,
                                        lead_times=list(cube.coord('forecast_period').points.data))

where

nbhood_sizes = [1000, 2000, 3000, 4000, 5000, 6000]
lead_times = [1, 2, 3, 4, 5, 6]

When I call square_nbhood.process(cube) where cube is <iris 'Cube' of probability_of_precipitation_amount_above_threshold / (1) (time: 6; projection_y_coordinate: 688; projection_x_coordinate: 688)> , I get the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/nowcasting/.conda/envs/lup/lib/python3.10/site-packages/improver/nbhood/nbhood.py", line 376, in process
    check_radius_against_distance(cube, self.radius)
  File "/home/nowcasting/.conda/envs/lup/lib/python3.10/site-packages/improver/nbhood/nbhood.py", line 72, in check_radius_against_distance
    if radius > max_allowed:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

If I call square_nbhood.process(cube) using:

for cube_slice in cube.slices([cube.coord(axis="y"), cube.coord(axis="x")]):
    square_nbhood.process(cube_slice)

Everything seems fine. So my doubt is, the correct way to implement NeighbourhoodProcessing with multiple lead_times is the second one? Because, if I check the source code for BaseNeighbourhoodProcessing (lines 389-395 of improver.improver.nbhood.nbhood.py) it already does the slicing I do before calling square_nbhood.process(cube), so I don't see why I have to previously slice the cube.

result_slices = CubeList()
for cube_slice in cube.slices([cube.coord(axis="y"), cube.coord(axis="x")]):
    cube_slice.data = self._calculate_neighbourhood(
        cube_slice.data, mask_cube_data
    )
    result_slices.append(cube_slice)
neighbourhood_averaged_cube = result_slices.merge_cube()

I hope I've explained myself well, please let me know if I didn't. Thank you very much!

@bayliffe
Copy link
Contributor

Hi Enric,

Apologies for the slow response, but we've had a long weekend here in the UK for Easter. You've hit upon an issue that we don't encounter because of the way we use IMPROVER, but it remains something that it would be nice to correct.

The issue

Our processing using IMPROVER always uses files that contain a single validity time, i.e. there is not a leading time coordinate on the cubes. Most plugins will handle the case of multiple times in a single file, but it seems that neighbourhood processing is not amongst them, with this line being the reason, along with a lack of later handling.

The _find_radii function is calculating the radius to use for the cube's lead-time. For example, we might set up the plugin like this:

square_nbhood = NeighbourhoodProcessing(
    neighbourhood_method='square',
    radii=[10000, 20000],  # Units of metres
    lead_times=[0, 10]  # Units of hours
)

and then provide a cube with a lead-time of 5 hours (T+5). This function will calculate that the radius to use for neighbourhood processing should be 15000m, assuming a linear change in radius with lead-time.

In the case where a cube is provided with multiple times, as in your example, this function will return multiple radii that correspond to each of the relevant lead-times. For example:

cube lead-times: 0, 5, 10
returned radii: 10000, 15000, 20000

The plugin currently then assigns this to self.radius, our use of the singular "radius" here showing our assumption that this will be a single value. When a check is made that the radius is not too large it assumes a single value and it falls over with the error you are seeing.

The immediate solution

The immediate workaround for your case is to slice over the time coordinate, calling the neighbourhooding plugin for each slice, and then recombine the data into a single cube. This is essentially what you've shown above, slicing over the x-y coordinates outside the plugin. Note that the slicing that already exists within the plugin (which you highlight) is to handle multiple threshold values assuming there is a single time; the x-y slices are effectively slicing over a threshold coordinate.

A variation on your own solution is this:

square_nbhood = NeighbourhoodProcessing(
    neighbourhood_method='square',
    radii=[10000, 20000],  # Units of metres
    lead_times=[0, 10]  # Units of hours
)

result_slices = CubeList()
for cube_slice in cube.slices_over("time"):
    result_slices.append(
        square_nbhood.process(cube_slice)
    )
neighbourhood_averaged_cube = result_slices.merge_cube()

A better solution

This issue ought to be resolved to allow the neighbourhood plugin to handle multiple times. I've created the following issue for this to be done, though it is not high-priority work for us. If you felt it was something you could do, we always welcome contributions, but otherwise it will be done at some unknown future date.

#1894

I hope that's helpful and describes why you are seeing this issue. Let me know if you have any further questions, and if not, please close this issue.

Regards,
Ben

@ecasellas
Copy link
Author

Hi Ben,

No worries at all! We also had a long weekend here in Catalonia.

Thank you very much for all your explanations and your rapid response, it was very helpful. It's clear now for me how to use the current NeighbourhoodProcessing class and I'll adopt the immediate solution you propose. However, I'll also try to provide a solution for multi-time cubes, I don't know if I'll manage to do it, we'll see.

Thank you again,

Regards,
Enric

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants