SNOW-1805842: Add plotly integ tests and Interoperability doc #2725
Merged: sfc-gh-lmukhopadhyay merged 18 commits into main from lmukhopadhyay-SNOW-1805842-add-plotly-integ-tests on Dec 14, 2024
Changes from 12 commits
539799d SNOW-1805842: add plotly integ tests and interop doc (sfc-gh-lmukhopadhyay)
12ee8aa update setup for plotly dependency (sfc-gh-lmukhopadhyay)
6f27bbd move and update doc (sfc-gh-lmukhopadhyay)
aecb735 Merge branch 'main' into lmukhopadhyay-SNOW-1805842-add-plotly-integ-… (sfc-gh-lmukhopadhyay)
747964e add interop doc to toctree (sfc-gh-lmukhopadhyay)
5f6abc8 Merge branch 'main' into lmukhopadhyay-SNOW-1805842-add-plotly-integ-… (sfc-gh-lmukhopadhyay)
c176fd2 Apply doc change suggestions from code review (sfc-gh-lmukhopadhyay)
fde1806 cleanup doc and test file (sfc-gh-lmukhopadhyay)
ee54d7b update doc and test changes from review (sfc-gh-lmukhopadhyay)
9d4b988 Merge branch 'main' into lmukhopadhyay-SNOW-1805842-add-plotly-integ-… (sfc-gh-lmukhopadhyay)
85d31a5 fix doc title (sfc-gh-lmukhopadhyay)
4c71489 limit plotly version (sfc-gh-lmukhopadhyay)
f56ddc5 review changes (sfc-gh-lmukhopadhyay)
c86f59f Merge branch 'main' into lmukhopadhyay-SNOW-1805842-add-plotly-integ-… (sfc-gh-lmukhopadhyay)
68fb6a5 change setup comment (sfc-gh-lmukhopadhyay)
afdacec Update setup.py (sfc-gh-mvashishtha)
0dcd3e4 update setup comment (sfc-gh-lmukhopadhyay)
25172d3 Merge branch 'main' into lmukhopadhyay-SNOW-1805842-add-plotly-integ-… (sfc-gh-lmukhopadhyay)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
Interoperability with third-party libraries | ||
============================================= | ||
|
||
Many third-party libraries are interoperable with pandas, for example by accepting pandas DataFrame objects as function | ||
inputs. This page gives a non-exhaustive list of third-party library use cases with pandas and notes whether each method | ||
also works in Snowpark pandas. | ||
|
||
Snowpark pandas supports the `dataframe interchange protocol <https://data-apis.org/dataframe-protocol/latest/>`_. Some | ||
libraries consume dataframes through this protocol, which lets them work with Snowpark pandas to the same extent as with pandas. | ||
|
||
The table below has three columns. The first column contains a method name. | ||
The second column flags whether interoperability with Snowpark pandas is guaranteed, and the third column holds notes on | ||
the current implementation. For each of these methods, we validate that passing in a Snowpark pandas dataframe as the | ||
dataframe input parameter behaves equivalently to passing in a pandas dataframe. | ||
|
||
.. note:: | ||
``Y`` stands for yes, i.e., interoperability is guaranteed with this method, and ``N`` stands for no. | ||
|
||
Plotly.express module methods | ||
|
||
.. note:: | ||
Currently only plotly versions <6.0.0 are supported through the dataframe interchange protocol. | ||
|
||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| Method name | Interoperable with Snowpark pandas? (Y/N) | Notes for current implementation | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``scatter`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``line`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``area`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``timeline`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``violin`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``bar`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``histogram`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``pie`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``treemap`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``sunburst`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``icicle`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``scatter_matrix`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``funnel`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``density_heatmap`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``box`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ | ||
| ``imshow`` | Y | | | ||
+-------------------------+---------------------------------------------+--------------------------------------------+ |
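The note above limits support to plotly versions below 6.0.0. A minimal sketch of a runtime guard for that constraint is shown here. The helper names `plotly_major_version` and `is_supported_plotly` are hypothetical and are not part of Snowpark pandas or plotly. The check looks only at the major version number, so a pre-release such as `6.0.0rc1` is treated as unsupported.

```python
import re


def plotly_major_version(version: str) -> int:
    """Return the leading integer of a version string such as '5.24.1'.

    Hypothetical helper for illustration only.
    """
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1))


def is_supported_plotly(version: str) -> bool:
    """Return True when the version falls below the documented 6.0.0 limit.

    Assumption: comparing the major version alone is precise enough here.
    """
    return plotly_major_version(version) < 6
```

In practice the version string could come from `importlib.metadata.version("plotly")`, which reads the installed distribution's metadata without importing plotly itself.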
213 changes: 213 additions & 0 deletions
tests/integ/modin/interoperability/plotly/test_plotly.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,213 @@ | ||
# | ||
# Copyright (c) 2012-2024 Snowflake Computing Inc. All rights reserved. | ||
# | ||
|
||
import modin.pandas as pd | ||
import numpy as np | ||
import plotly.express as px | ||
import pytest | ||
import pandas as native_pd | ||
|
||
import snowflake.snowpark.modin.plugin # noqa: F401 | ||
from tests.integ.utils.sql_counter import sql_count_checker | ||
from tests.integ.modin.utils import eval_snowpark_pandas_result | ||
|
||
# Integration tests for plotly.express module (https://plotly.com/python-api-reference/plotly.express.html). | ||
# To add tests for additional APIs, | ||
# - Call the method with Snowpark pandas and native pandas df input and get the JSON representation with | ||
# `to_plotly_json()`. | ||
# - Assert correctness of the plot produced using `assert_plotly_equal` function defined below. | ||
|
||
|
||
def assert_plotly_equal(expect, got): | ||
    # Adapted from the cudf plotly integration test: | ||
    # https://github.com/rapidsai/cudf/blob/main/python/cudf/cudf_pandas_tests/third_party_integration_tests/tests/ | ||
    # test_plotly.py#L10 | ||
|
||
    assert type(expect) is type(got) | ||
    if isinstance(expect, dict): | ||
        assert expect.keys() == got.keys() | ||
        for k in expect.keys(): | ||
            assert_plotly_equal(expect[k], got[k]) | ||
    elif isinstance(expect, list): | ||
        assert len(expect) == len(got) | ||
        for i in range(len(expect)): | ||
            assert_plotly_equal(expect[i], got[i]) | ||
    elif isinstance(expect, np.ndarray): | ||
        # Compare float arrays with a tolerance; guard against indexing an empty array. | ||
        if expect.size > 0 and isinstance(expect[0], float): | ||
            np.testing.assert_allclose(expect, got) | ||
        else: | ||
            assert (expect == got).all() | ||
    else: | ||
        assert expect == got | ||
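The comparator above walks the JSON-like structure returned by `to_plotly_json()` and compares leaves, using a tolerance for floats. The same recursive idea can be sketched with only the standard library. The function name `assert_nested_equal`, its `path` argument, and its tolerance defaults are assumptions made for illustration; it handles plain Python floats rather than NumPy arrays and is not the helper used in this PR.

```python
import math


def assert_nested_equal(expect, got, rel_tol=1e-9, abs_tol=0.0, path="root"):
    """Recursively compare nested dicts/lists, with a tolerance for floats."""
    # Types must match exactly at every level, as in the comparator above.
    assert type(expect) is type(got), f"{path}: type mismatch"
    if isinstance(expect, dict):
        assert expect.keys() == got.keys(), f"{path}: key mismatch"
        for key in expect:
            assert_nested_equal(
                expect[key], got[key], rel_tol, abs_tol, f"{path}[{key!r}]"
            )
    elif isinstance(expect, (list, tuple)):
        assert len(expect) == len(got), f"{path}: length mismatch"
        for i, (e, g) in enumerate(zip(expect, got)):
            assert_nested_equal(e, g, rel_tol, abs_tol, f"{path}[{i}]")
    elif isinstance(expect, float):
        # Treat two NaNs as equal; otherwise compare within the tolerance.
        both_nan = math.isnan(expect) and math.isnan(got)
        close = math.isclose(expect, got, rel_tol=rel_tol, abs_tol=abs_tol)
        assert both_nan or close, f"{path}: {expect!r} != {got!r}"
    else:
        assert expect == got, f"{path}: {expect!r} != {got!r}"


# Floating-point noise is tolerated; structural differences are not.
assert_nested_equal({"data": [{"x": [0.1 + 0.2]}]}, {"data": [{"x": [0.3]}]})
```

Carrying a `path` string through the recursion is a design choice that makes a failure message point at the offending key or index instead of only reporting that two large structures differ.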
|
||
|
||
@pytest.fixture() | ||
def test_dfs(): | ||
    nsamps = 50 | ||
    rng = np.random.default_rng(seed=42) | ||
    data = { | ||
        "x": rng.random(nsamps), | ||
        "y": rng.random(nsamps), | ||
        "category": rng.integers(0, 5, nsamps), | ||
        "category2": rng.integers(0, 5, nsamps), | ||
    } | ||
    snow_df = pd.DataFrame(data) | ||
    native_df = native_pd.DataFrame(data) | ||
    return snow_df, native_df | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_scatter(test_dfs): | ||
    # test_dfs = dfs() | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.scatter(df, x="x", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_line(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.line(df, x="category", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_area(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.area(df, x="category", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_timeline(): | ||
    native_df = native_pd.DataFrame( | ||
        [ | ||
            dict(Task="Job A", Start="2009-01-01", Finish="2009-02-28"), | ||
            dict(Task="Job B", Start="2009-03-05", Finish="2009-04-15"), | ||
            dict(Task="Job C", Start="2009-02-20", Finish="2009-05-30"), | ||
        ] | ||
    ) | ||
    snow_df = pd.DataFrame(native_df) | ||
    eval_snowpark_pandas_result( | ||
        snow_df, | ||
        native_df, | ||
        lambda df: px.timeline( | ||
            df, x_start="Start", x_end="Finish", y="Task" | ||
        ).to_plotly_json(), | ||
        comparator=assert_plotly_equal, | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_violin(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.violin(df, y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_bar(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.bar(df, x="category", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_histogram(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.histogram(df, x="category").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_pie(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.pie(df, values="category", names="category2").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_treemap(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.treemap(df, names="category", values="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_sunburst(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.sunburst(df, names="category", values="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_icicle(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.icicle(df, names="category", values="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_scatter_matrix(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.scatter_matrix(df, dimensions=["category"]).to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_funnel(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.funnel(df, x="x", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_density_heatmap(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.density_heatmap(df, x="x", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=1) | ||
def test_box(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.box(df, x="category", y="y").to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) | ||
|
||
|
||
@sql_count_checker(query_count=4) | ||
def test_imshow(test_dfs): | ||
    eval_snowpark_pandas_result( | ||
        *test_dfs, | ||
        lambda df: px.imshow(df, x=df.columns, y=df.index).to_plotly_json(), | ||
        comparator=assert_plotly_equal | ||
    ) |
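Every test in this file follows one pattern: apply the same plotting callable to the Snowpark pandas frame and the native pandas frame, then hand both results to a comparator. A minimal stand-in for that pattern is sketched below. The name `eval_on_both`, its argument order, and its return value are hypothetical; this is not the repository's `eval_snowpark_pandas_result`, whose exact behavior is not shown in this diff. Plain Python sequences stand in for dataframes so the sketch runs without a Snowflake session.

```python
from typing import Any, Callable, Tuple


def eval_on_both(
    candidate: Any,
    reference: Any,
    operation: Callable[[Any], Any],
    comparator: Callable[[Any, Any], None],
) -> Tuple[Any, Any]:
    """Run `operation` on both inputs and compare the two results.

    Hypothetical helper: `comparator(expect, got)` should raise
    AssertionError when the results differ.
    """
    candidate_result = operation(candidate)
    reference_result = operation(reference)
    comparator(reference_result, candidate_result)
    return candidate_result, reference_result


def assert_equal(expect: Any, got: Any) -> None:
    """Simplest possible comparator: plain equality."""
    assert expect == got, f"expected {expect!r}, got {got!r}"


# A list and a tuple stand in for the two dataframe implementations.
results = eval_on_both([3, 1, 2], (3, 1, 2), sorted, assert_equal)
```

Passing the comparator as a parameter is what lets the tests above swap in `assert_plotly_equal` for figure JSON while reusing the same harness.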