dataframe with cftime issues #107

Open
marikoll opened this issue Oct 21, 2019 · 7 comments

Comments

@marikoll

marikoll commented Oct 21, 2019

Here is an example that reproduces the issue:
https://abisko.uiogeo-apps.sigma2.no/user/a23bf869-f10d-4e06-b63a-3b201c67dda6/lab/tree/Arctic_haze.ipynb

@daliagachc
Contributor

This happens because the index holds cftime objects rather than pd.Timestamp values.
We have similar issues with xarray as well (see #102).
I've written a function that handles this; it lives in
negi-stuff (the instructions for installing and updating it are there as well).

from negi_stuff.modules import koalas as kl
kl.check_transform_cftime_dim_2_timestamp(df_with_cftime)
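
For readers who want to see what such a conversion involves, here is a minimal sketch. It is not the negi-stuff implementation, which is not shown in this thread. `CFLikeDate` is a hypothetical stand-in for a cftime date (cftime objects expose the same `year`, `month` and `day` attributes), and `to_datetime_index` and the `aod` column are likewise made up for illustration.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class CFLikeDate:
    """Hypothetical stand-in for a cftime date; only year/month/day are used."""
    year: int
    month: int
    day: int


def to_datetime_index(dates):
    """Build a pandas DatetimeIndex from objects with year/month/day attributes."""
    return pd.DatetimeIndex(
        [pd.Timestamp(year=d.year, month=d.month, day=d.day) for d in dates]
    )


# A dataframe whose index holds non-pandas date objects (assumed example data).
df = pd.DataFrame(
    {"aod": [0.10, 0.12]},
    index=[CFLikeDate(2019, 1, 16), CFLikeDate(2019, 2, 15)],
)

# Replace the object index with real pandas timestamps.
df.index = to_datetime_index(df.index)
```

Once the index is a DatetimeIndex, time-based pandas operations such as resampling can be applied to it.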

@daliagachc
Contributor

@marikoll

daliagachc changed the title from "@daliagachc" to "dataframe with cftime issues" Oct 21, 2019
daliagachc reopened this Oct 21, 2019
@marikoll
Author

@daliagachc It now works with cftime.DatetimeNoLeap, but not with cftime.Datetime360Day.

@daliagachc
Contributor

Hi! I looked at the file, but I'm not sure where the error occurs. Could you create a small notebook that opens a dataframe using cftime.Datetime360Day so that I can debug it?
Then send me the shareable file again.
Thanks!

@daliagachc
Contributor

OK, it's fixed now. The problem was that with cftime.Datetime360Day, dates such as (2019, 2, 30) are allowed, but they make no sense as a pd.Timestamp. With middle_of_month=True the day is set to the 15th, the middle of the month, which avoids the error.

new_df = kl.check_transform_cftime_dim_2_timestamp(_dsUKESM, middle_of_month=True)
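
To illustrate the failure and the workaround, here is a small sketch using only the standard library. Every month in a 360-day calendar has 30 days, so February 30 exists there but not in the Gregorian calendar. `to_mid_month` is a hypothetical helper and not the negi_stuff code; it only shows the idea of pinning the day to the 15th.

```python
from datetime import datetime


def to_mid_month(year, month, day=None):
    """Ignore the original day and pin the date to the 15th of the month."""
    return datetime(year, month, 15)


# February 30 is a valid 360-day-calendar date but not a valid Gregorian date.
try:
    datetime(2019, 2, 30)
    feb_30_is_valid = True
except ValueError:
    feb_30_is_valid = False

# The workaround: keep year and month, replace the day with 15.
mid_feb = to_mid_month(2019, 2, 30)
```

The trade-off is that the original day of the month is lost, which is usually acceptable for monthly data.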

Make sure to update negi_stuff.
You may also want to resample to monthly means after running the line above, so
that all monthly data share the same timestamps, e.g.

new_df = new_df.resample('M').mean()
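
As a rough illustration of why the resampling helps, the sketch below aligns two made-up monthly series (`model` and `obs` are assumed names and values) whose timestamps fall on different days of the month. It passes a `pd.offsets.MonthEnd()` object instead of the `'M'` string because, as far as I know, newer pandas versions spell that alias `'ME'`, while the offset object works with either.

```python
import pandas as pd

# Two hypothetical monthly series stamped on different days of the month.
model = pd.Series(
    [1.0, 2.0], index=pd.to_datetime(["2019-01-15", "2019-02-15"]), name="model"
)
obs = pd.Series(
    [1.5, 2.5], index=pd.to_datetime(["2019-01-31", "2019-02-28"]), name="obs"
)

# Without resampling, the timestamps do not match and the rows do not line up.
unaligned = pd.concat([model, obs], axis=1)

# Resampling to month-end means gives both series the same timestamps.
month_end = pd.offsets.MonthEnd()
combined = pd.concat(
    [model.resample(month_end).mean(), obs.resample(month_end).mean()], axis=1
)
```

Here `unaligned` has four rows with gaps, while `combined` has one fully populated row per month.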

@marikoll
Author

Great, now it works! Thank you so much!
