-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated WorldCerealInferenceDataset #103
Conversation
Scattered discussion of the above in older PR (#65). Once we'd merge this one, we can close the other one. |
presto/dataset.py
Outdated
months, | ||
target, | ||
lon, | ||
lat, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need latlons
and lon, lat
here? If I understand correctly, lon == latlons[:, 1]
and lat == latlons[:, 0]
, which means we don't need lon, lat
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not so simple I'm afraid, I struggled with this a lot. After the meshgrid and flattening of latlons
it becomes quite hard to get back to original lon, lat
we need to properly reconstruct the DataArray.
latlons.shape
(2500, 2)
lon.shape
(50,)
So no, lon != latlons[:, 1]
. I've been thinking about easier ways but haven't found them as of yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm okay. I think this is probably (?) easier, especially considering we additionally apply the transformation. The latlons take up quite a bit of RAM, and for large tiles this might become an issue.
Just in case its useful here is some code to go from the flat latlons back to the original ones (but without the transformation):
import numpy as np
from einops import rearrange
org_lat = np.array([1, 2, 3])
org_lon = np.array([4, 5, 6, 7])
def to_flat_latlons(lat, lon):
lon, lat = np.meshgrid(lon, lat)
latlons = rearrange(np.stack([lat, lon]), "c x y -> (x y) c")
return latlons
def from_latlons(latlons):
x = len(np.unique(latlons[:, 0]))
y = len(np.unique(latlons[:, 1]))
latlons = rearrange(latlons, "(x y) c -> c x y", x=x, y=y)
lats, lons = latlons[0], latlons[1]
return lats[:, 0], lons[0, ]
output_lat, output_lon = from_latlons(to_flat_latlons(org_lat, org_lon))
assert np.equal(output_lon, org_lon).all()
assert np.equal(output_lat, org_lat).all()
I wonder if its worth passing around the original lats and lons and only applying that transformation right before the model ingests the values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope i understood this well. Could you check if my latest commit address this the way you suggest it? Feel free to suggest to do it differently. Rest of the day, unfortunately I'm away. Will catch up tomorrow morning.
@gabrieltseng can't get the checks to work on this one :-( flake complains (not locally for me) about missing white space but when I add it, black wants to split the line and this is going endless in loops. Not sure how to fix. |
hmm strange. I will try fixing on my end and if it works merge it in |
Hmm actually @kvantricht would you mind undoing 8ac58df ? I think i prefer the solution before. Perhaps in perhaps |
This reverts commit 8ac58df.
…esto-worldcereal into updated-inferencedatasets
@gabrieltseng so this appears to be all a bit more nasty than anticipated. I reverted the commit and started implementing your suggestion of name changes, but it turns out that with all the changes that were done (had to be done), once we arrive at Not sure if I'm making myself clear, but the only way the code works is by using the original x and y coordinates of the DataArray in the reconstruction, which are not necessarily lat and lon. So for now I called them what they are, Looking forward to what you make of all this ... |
Cool this looks good to me - thanks @kvantricht |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Extractions of test patches are now more in line with actual inference on openEO. WorldCerealInferenceDataset has been updated to cope with these new files.