SNOW-828211 Support registering vectorized UDTF #940

Merged: 10 commits, Jul 20, 2023

@sfc-gh-stan sfc-gh-stan commented Jul 12, 2023

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

  • Implement support for registering a vectorized UDTF in one of three ways: (1) specify PandasDataFrameType as output_schema, (2) specify PandasDataFrame as the type hint, or (3) specify pd.DataFrame as the type hint together with a StructType(List[StructField]) as output_schema.
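
The three registration paths listed above can be pictured as a dispatch on the output schema and on the handler's return type hint. The following is only a rough sketch under stated assumptions, not the code from this PR: every class here is a local stand-in so the example runs without Snowpark or pandas, and the function name `classify_registration` and the choice to inspect an `end_partition` method are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, get_type_hints


# Local stand-ins (assumptions) so the sketch runs without Snowpark or pandas.
class DataFrame:
    """Stand-in for pandas.DataFrame (pd.DataFrame in the PR text)."""


class PandasDataFrame:
    """Stand-in for the PandasDataFrame type hint named in the PR text."""


@dataclass
class StructField:
    name: str
    datatype: str  # simplified: a string instead of a real data type object


@dataclass
class StructType:
    fields: List[StructField]


@dataclass
class PandasDataFrameType:
    col_types: List[str]
    col_names: List[str]


def classify_registration(handler: type, output_schema: Any) -> Optional[str]:
    """Return which of the three listed paths applies, or None if none does."""
    # Path (1): the output schema itself declares a pandas DataFrame.
    if isinstance(output_schema, PandasDataFrameType):
        return "output_schema=PandasDataFrameType"

    # Paths (2) and (3) depend on the handler's return type hint.
    # Assumption: the hint is read from a method named end_partition.
    method = getattr(handler, "end_partition", None)
    returns = get_type_hints(method).get("return") if method else None

    if returns is PandasDataFrame:
        return "type hint PandasDataFrame"
    if returns is DataFrame and isinstance(output_schema, StructType):
        return "type hint pd.DataFrame + StructType"
    return None  # not one of the three listed paths


# Hypothetical handlers, one per style.
class ByHint:
    def end_partition(self, df: PandasDataFrame) -> PandasDataFrame:
        return df


class ByPandasAndStruct:
    def end_partition(self, df: DataFrame) -> DataFrame:
        return df


class RowWise:
    def process(self, x: int):
        yield (x,)


if __name__ == "__main__":
    struct = StructType([StructField("id", "int")])
    print(classify_registration(RowWise, PandasDataFrameType(["int"], ["id"])))
    print(classify_registration(ByHint, ["id"]))
    print(classify_registration(ByPandasAndStruct, struct))
    print(classify_registration(RowWise, ["id"]))
```

In this sketch a pd.DataFrame hint counts as vectorized only when it is paired with a StructType, which mirrors the wording of path (3); how the actual implementation treats other combinations is not stated in the description.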

@sfc-gh-stan sfc-gh-stan changed the title SNOW-828211 Support registering vectorized UDTF SNOW-828211 Support registering vectorized UDTF (1) Jul 12, 2023
@sfc-gh-stan sfc-gh-stan changed the title SNOW-828211 Support registering vectorized UDTF (1) SNOW-828211 Support registering vectorized UDTF Jul 13, 2023
@sfc-gh-stan sfc-gh-stan marked this pull request as ready for review July 13, 2023 22:43
@sfc-gh-stan sfc-gh-stan requested a review from a team as a code owner July 13, 2023 22:43
codecov bot commented Jul 13, 2023

Codecov Report

Merging #940 (9d2a30c) into main (edd7f29) will increase coverage by 0.05%.
The diff coverage is 95.00%.

❗ Current head 9d2a30c differs from pull request most recent head d2ff232. Consider uploading reports for the commit d2ff232 to get more accurate results

@@            Coverage Diff             @@
##             main     #940      +/-   ##
==========================================
+ Coverage   98.46%   98.52%   +0.05%     
==========================================
  Files          51       50       -1     
  Lines        9079     9019      -60     
  Branches     1626     1621       -5     
==========================================
- Hits         8940     8886      -54     
+ Misses         55       52       -3     
+ Partials       84       81       -3     
Impacted Files Coverage Δ
src/snowflake/snowpark/_internal/udf_utils.py 97.56% <93.33%> (+0.25%) ⬆️
src/snowflake/snowpark/_internal/type_utils.py 98.07% <100.00%> (ø)
src/snowflake/snowpark/functions.py 99.52% <100.00%> (-0.01%) ⬇️
src/snowflake/snowpark/types.py 100.00% <100.00%> (ø)
src/snowflake/snowpark/udtf.py 97.29% <100.00%> (-0.43%) ⬇️

... and 4 files with indirect coverage changes


@sfc-gh-sfan sfc-gh-sfan left a comment

We are gonna have some merge conflict fun :)

src/snowflake/snowpark/_internal/udf_utils.py
output_schema, Iterable
): # with column names instead of StructType. Read type hints to infer column types.
# can we refactor this block to be in process_registration_inputs?
Collaborator

nit: maybe add a TODO?

Collaborator Author

I refactored already, lemme remove this comment haha.

@@ -34,6 +38,36 @@
pytestmark = pytest.mark.udf


@pytest.fixture(scope="module")
def vectorized_udtf_test_table(session) -> str:
Collaborator

You might need to enable the vectorized udtf in the stored proc test file by adding an alter session.

Collaborator Author

The parameter is internal though. Should we set it using alter session or on the account level?

Collaborator

No. I mean you could add the alter session in this test file.
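
For reference, enabling a parameter with an alter session statement from a test file could look roughly like the sketch below. The helper name `enable_session_parameter` is hypothetical, and the thread does not name the internal parameter, so the one in the usage note is a placeholder. The only Snowpark call assumed is `session.sql(...).collect()`.

```python
from typing import Any


def enable_session_parameter(session: Any, parameter_name: str) -> None:
    """Turn on a boolean session parameter with an ALTER SESSION statement.

    `session` is expected to behave like a Snowpark Session, i.e. expose
    sql(query) returning an object with collect().
    """
    # The name is interpolated into SQL, so accept plain identifiers only.
    if not parameter_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected parameter name: {parameter_name!r}")
    session.sql(f"alter session set {parameter_name} = true").collect()
```

A module-scoped fixture such as vectorized_udtf_test_table could call `enable_session_parameter(session, "PLACEHOLDER_PARAMETER_NAME")` before creating its table, with the placeholder replaced by the real parameter name.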

@sfc-gh-stan sfc-gh-stan enabled auto-merge (squash) July 20, 2023 16:58
@sfc-gh-stan sfc-gh-stan merged commit 81e5396 into main Jul 20, 2023
39 checks passed
@sfc-gh-stan sfc-gh-stan deleted the Support-vectorized-UDTF branch July 20, 2023 17:03
@github-actions github-actions bot locked and limited conversation to collaborators Jul 20, 2023
2 participants