SNOW-896744: Allow creating dataframes using tuples as schema #1025

Merged
2 changes: 1 addition & 1 deletion src/snowflake/snowpark/_internal/type_utils.py
@@ -325,7 +325,7 @@ def infer_type(obj: Any) -> DataType:


 def infer_schema(
-    row: Union[Dict, List, Tuple], names: Optional[List] = None
+    row: Union[Dict, List, Tuple], names: Optional[Union[List, Tuple]] = None

@sfc-gh-sfan (Collaborator), Aug 28, 2023:

I don't think we could do names.extend when names is a tuple? How about we explicitly copy names into a list first? This way we could support Iterable.

Contributor Author:

You are right. I can convert iterable into a list and then assume list datatype from thereon...
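
The concern in this thread can be reproduced without Snowpark. The following is a minimal sketch, assuming `infer_schema` pads missing column names with `names.extend(...)` as the comment describes and that defaults follow the `_1`, `_2`, ... pattern seen in the diff. `pad_names` is a hypothetical helper for illustration, not the library's code.

```python
from typing import Iterable, List, Optional


def pad_names(names: Optional[Iterable[str]], width: int) -> List[str]:
    """Hypothetical helper: copy names into a list, then pad to `width`."""
    # Copying first means tuples, generators, and other iterables all work,
    # and the caller's object is never mutated.
    result = list(names) if names is not None else []
    # Fill any missing positions with positional defaults such as "_3"
    # (assumed naming scheme).
    result.extend(f"_{i}" for i in range(len(result) + 1, width + 1))
    return result


# A tuple has no `extend` method, which is the failure the reviewer anticipated.
assert not hasattr(("TIME", "ID"), "extend")

print(pad_names(("TIME", "ID"), 3))  # ['TIME', 'ID', '_3']
```

Copying into a list up front keeps the rest of the function unchanged, since everything after the copy can keep assuming a `list`.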

 ) -> StructType:
     if row is None or (isinstance(row, (tuple, list, dict)) and not row):
         items = zip(names if names else ["_1"], [None])
4 changes: 2 additions & 2 deletions src/snowflake/snowpark/session.py
@@ -1865,7 +1865,7 @@ def write_pandas(
     def create_dataframe(
         self,
         data: Union[List, Tuple, "pandas.DataFrame"],
-        schema: Optional[Union[StructType, List[str]]] = None,
+        schema: Optional[Union[StructType, List[str], Tuple[str]]] = None,

Collaborator:

Should we make sure we use schema as an iterable when it is not StructType, and then support Iterable[str]?
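
One way to read this suggestion, sketched under assumptions: treat any schema that is not a `StructType` as an iterable of column names and materialize it once. `StructTypeStub` and `column_names_from_schema` are hypothetical names used only for illustration; the real `StructType` comes from Snowpark. The sketch rejects a bare `str`, because a string is itself an iterable of characters and would otherwise be split into one-letter column names.

```python
import collections.abc
from typing import Iterable, List, Optional, Union


class StructTypeStub:
    """Hypothetical stand-in for Snowpark's StructType."""


def column_names_from_schema(
    schema: Optional[Union[StructTypeStub, Iterable[str]]],
) -> Optional[List[str]]:
    """Return column names for a non-StructType schema, else None."""
    if schema is None or isinstance(schema, StructTypeStub):
        # No names were supplied, or the schema already carries full types.
        return None
    if isinstance(schema, str) or not isinstance(schema, collections.abc.Iterable):
        raise TypeError("schema must be a StructType or an iterable of str")
    # Materialize once so later code can rely on list behavior.
    return list(schema)


print(column_names_from_schema(("TIME", "ID", "V2")))  # ['TIME', 'ID', 'V2']
```

Checking for `StructType` first and falling back to "any iterable" avoids having to enumerate `list`, `tuple`, and other container types in the `isinstance` call.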

     ) -> DataFrame:
         """Creates a new DataFrame containing the specified values from the local data.

@@ -1961,7 +1961,7 @@ def create_dataframe(
         else:
             if not data:
                 raise ValueError("Cannot infer schema from empty data")
-            if isinstance(schema, list):
+            if isinstance(schema, (tuple, list)):
                 names = schema
             new_schema = reduce(
                 merge_type,
7 changes: 7 additions & 0 deletions tests/integ/test_dataframe.py
@@ -2855,6 +2855,13 @@ def test_create_dataframe_special_char_column_name(session):
     Utils.check_answer(df2, [Row(1, 2, 3), Row(1, 2, 3)])


+def test_create_dataframe_with_tuple_schema(session):
+    df = session.create_dataframe(
+        [(20000101, 1, "x"), (20000101, 2, "y")], schema=("TIME", "ID", "V2")
+    )
+    Utils.check_answer(df, [Row(20000101, 1, "x"), Row(20000101, 2, "y")])
+
+
 def test_df_join_suffix(session):
     df1 = session.create_dataframe([[1, 1, "1"], [2, 2, "3"]]).to_df(["a", "b", "c"])
     df2 = session.create_dataframe([[1, 1, "1"], [2, 3, "5"]]).to_df(["a", "b", "c"])