
Add support for use_logical_type in write_pandas. #1706

Closed

Contributor

@dvorst dvorst commented Aug 22, 2023

use_logical_type is a new Snowflake file format option. It is a Boolean that specifies whether Snowflake interprets Parquet logical types during data loading. The default behavior of write_pandas is unchanged. When a user writes a dataframe that contains timezone-aware datetimes without passing use_logical_type=True, a warning is raised (see #1687). Passing this option also fixes issue #1687.
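To illustrate how a three-state Boolean option like this could be turned into a file format fragment, here is a minimal sketch. The helper names `render_use_logical_type` and `build_file_format_clause` are hypothetical and are not taken from the connector's code; the sketch only assumes that leaving the option unset should emit nothing, so that the server default applies.

```python
from typing import Optional


def render_use_logical_type(use_logical_type: Optional[bool]) -> str:
    """Render the option as a file-format fragment (hypothetical helper).

    None means "not specified": emit nothing so the server default applies.
    """
    if use_logical_type is None:
        return ""
    return f" USE_LOGICAL_TYPE = {'TRUE' if use_logical_type else 'FALSE'}"


def build_file_format_clause(use_logical_type: Optional[bool] = None) -> str:
    """Assemble a Parquet FILE_FORMAT clause for a COPY INTO statement.

    Hypothetical helper for illustration; the real statement built by
    write_pandas may contain further options.
    """
    return f"FILE_FORMAT=(TYPE=PARQUET{render_use_logical_type(use_logical_type)})"
```

With the option exposed on write_pandas, a call would presumably look like `write_pandas(conn, df, "MY_TABLE", use_logical_type=True)`, where `conn`, `df`, and the table name are placeholders.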

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-889573: write_pandas incorrectly writes timestamp_tz #1687

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

Snowflake recently released a server-side fix for issue #1688. The fix introduces a new file format parameter, "use_logical_type", which should be set to True when writing data that contains timezones. This pull request adds the option to the write_pandas method and also raises a user warning if the user writes a pandas dataframe that contains timezones without setting use_logical_type to True. A warning is raised rather than an error to preserve the old behaviour and to prevent users from suddenly running into issues when updating their pandas version.
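The warning logic described above can be sketched roughly as follows. This is not the code from the pull request: the function names are hypothetical, and the sketch duck-types on the `tz` attribute that pandas' `DatetimeTZDtype` exposes so that it runs without importing pandas.

```python
import warnings
from typing import Any, Iterable, Optional


def has_tz_aware_dtype(dtypes: Iterable[Any]) -> bool:
    """Return True if any dtype carries a timezone.

    Assumption: timezone-aware columns use pandas' DatetimeTZDtype, which
    has a non-None `tz` attribute; plain NumPy dtypes have no such attribute.
    """
    return any(getattr(dtype, "tz", None) is not None for dtype in dtypes)


def warn_if_tz_without_logical_type(
    dtypes: Iterable[Any], use_logical_type: Optional[bool]
) -> bool:
    """Warn (rather than raise) so existing callers keep working.

    Returns True if a warning was issued, False otherwise.
    """
    if has_tz_aware_dtype(dtypes) and not use_logical_type:
        warnings.warn(
            "Dataframe contains a datetime with timezone column, but "
            "use_logical_type is not True; timestamps may be written "
            "incorrectly.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

With a real dataframe, the check would be called as `warn_if_tz_without_logical_type(df.dtypes, use_logical_type)`.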

@github-actions

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


Dennis Van de Vorst seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request

@dvorst
Contributor Author

dvorst commented Aug 22, 2023

I have read the CLA Document and I hereby sign the CLA

@dvorst
Contributor Author

dvorst commented Aug 22, 2023

recheck

@dvorst dvorst closed this Aug 22, 2023
@dvorst dvorst deleted the add_support_for_use_logical_type branch August 22, 2023 13:41
@github-actions github-actions bot locked and limited conversation to collaborators Aug 22, 2023