SNOW-1672579 Encode DataFrame.to_snowpark_pandas
#2711
Conversation
snowpark_pandas_df = df.to_snowpark_pandas(index_col=["A"], columns=["C", "B"])
# Perform a pandas operation on the result to ensure nothing breaks.
So with AST enabled, should this operation then produce the internal Snowpark calls, or should it not? Just trying to understand the logic here. Perhaps you'll need to add snowpark_pandas_df.groupby("A").to_numpy() or something similar to trigger an eval... Yet our test logic should flush the last batch no matter what.
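A minimal sketch of what this suggestion could look like in the test, assuming df comes from the existing test fixture; the extra .sum() (so that to_numpy() is called on a DataFrame rather than a GroupBy object) is illustrative and not part of the PR:

# Assumed follow-up to the test snippet above; df is the fixture dataframe.
snowpark_pandas_df = df.to_snowpark_pandas(index_col=["A"], columns=["C", "B"])
_ = snowpark_pandas_df.groupby("A").sum().to_numpy()  # forces an eager evaluation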
It won't produce an AST, because I don't think we've added any pandas AST logic yet.
To avoid scope creep, I think it's a good idea to test the pandas AST logic in a different PR. This PR should only focus on whether the to_snowpark_pandas protobuf is generated correctly. Here's the ticket for it: https://snowflakecomputing.atlassian.net/browse/SNOW-1849281
Testing the pandas logic with local testing/nop testing is very janky, so it will have to be in a separate test file.
The modin file changes LGTM, but someone on the IR team should look at the rest
# create a temporary table out of the current snowpark dataframe
temporary_table_name = random_name_for_temp_object(
    TempObjectType.TABLE
)  # pragma: no cover
ast_id = self._ast_id
self._ast_id = None  # set the AST ID to None to prevent AST emission.
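A rough sketch of the save-and-restore pattern this diff is using, assumed to sit inside DataFrame.to_snowpark_pandas; the save_as_table call and the try/finally placement are illustrative and not taken from the PR diff:

# Illustrative fragment; only ast_id/_ast_id and temporary_table_name come from the diff.
ast_id = self._ast_id
self._ast_id = None  # suppress AST emission for the internal write below
try:
    # materialize the current dataframe as a temporary table without emitting AST
    self.write.save_as_table(temporary_table_name, table_type="temporary")
finally:
    self._ast_id = ast_id  # restore so later operations on this dataframe emit AST again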
So setting _emit_ast=False wasn't enough? This seems like a bug.
@@ -3925,7 +3950,7 @@ def write(self, _emit_ast: bool = True) -> DataFrameWriter:
        """

        # AST.
-       if _emit_ast and self._ast_id is not None:
+       if self._ast_id is not None:
Oh, I see. This is a bit of a hack, but probably reasonable given the alternative.
Yeah, since write is a property, I can't use the _emit_ast parameter.
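A toy model of the constraint being described, not Snowpark's actual classes: it shows why a property cannot take _emit_ast, and only the _ast_id check mirrors the diff above.

# Toy classes for illustration only.
class DataFrameWriter:
    def __init__(self, df):
        self.df = df

class DataFrame:
    def __init__(self, ast_id=None):
        self._ast_id = ast_id

    @property
    def write(self) -> "DataFrameWriter":
        # Accessed as df.write with no parentheses, so callers cannot pass
        # _emit_ast=False; the _ast_id check stands in for that flag instead.
        if self._ast_id is not None:
            pass  # record the write accessor in the AST batch (details omitted)
        return DataFrameWriter(self)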
tests/ast/test_ast_driver.py (Outdated)
@@ -94,6 +94,11 @@ def load_test_cases():
    Returns: a list of test cases.
    """
    test_files = DATA_DIR.glob("*.test")
+   if sys.version_info[1] < 9:
Should probably check that [0] == 3. Your grandchildren who will have to maintain this code will thank you ;).
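A small sketch of the suggested guard, assuming the goal is to filter out test cases that need Python 3.9+; comparing against a (major, minor) tuple is one way to satisfy the comment, and the branch body here is a placeholder:

import sys

# Check the major version as well, not just sys.version_info[1];
# a tuple comparison covers both in one check.
if sys.version_info < (3, 9):
    pass  # placeholder: skip or filter the test cases that require Python 3.9+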
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1672579
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Adds AST encoding for DataFrame.to_snowpark_pandas, so that calls to .to_snowpark_pandas are captured in the generated protobuf.
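For reference, a hedged usage sketch of the API being encoded, mirroring the test snippet earlier in this PR; the connection parameters, data, and column names are illustrative:

from snowflake.snowpark import Session

# connection_parameters is a placeholder; fill in account/user/password, etc.
connection_parameters = {}
session = Session.builder.configs(connection_parameters).create()

df = session.create_dataframe([[1, 2, 3], [4, 5, 6]], schema=["A", "B", "C"])
# The call whose AST encoding this PR adds; arguments mirror the test above.
snowpark_pandas_df = df.to_snowpark_pandas(index_col=["A"], columns=["C", "B"])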