SNOW-896744: Allow creating dataframes using tuples as schema #1025

Merged

sfc-gh-aalam
Contributor

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-896744: creating a DataFrame with a tuple of names will be ignored an generic names will be used #1009

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

@@ -1865,7 +1865,7 @@ def write_pandas(
 def create_dataframe(
     self,
     data: Union[List, Tuple, "pandas.DataFrame"],
-    schema: Optional[Union[StructType, List[str]]] = None,
+    schema: Optional[Union[StructType, List[str], Tuple[str]]] = None,
Collaborator

Should we make sure we use schema as an iterable when it is not StructType, and then support Iterable[str]?
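
A minimal sketch of what this suggestion could look like, not the change that was merged. The helper name `normalize_schema` and the stand-in `StructType` class are hypothetical and exist only to keep the example self-contained; rejecting a bare `str` is also an assumption, added because a string is itself an iterable of characters.

```python
from typing import Iterable, List, Optional, Union


class StructType:
    """Stand-in for the real schema class; hypothetical, for illustration only."""


def normalize_schema(
    schema: Optional[Union[StructType, Iterable[str]]] = None,
) -> Optional[Union[StructType, List[str]]]:
    """Pass StructType (or None) through; copy any other iterable into a list."""
    if schema is None or isinstance(schema, StructType):
        return schema
    # Assumption: a bare string is almost certainly a mistake, since iterating
    # it would yield one "column name" per character.
    if isinstance(schema, str):
        raise TypeError("schema must be a StructType or an iterable of names")
    # Lists, tuples, generators, dict keys, etc. all become a plain list.
    return list(schema)


print(normalize_schema(("a", "b")))           # tuple input
print(normalize_schema(n for n in ("x", "y")))  # generator input
```

With this shape, the annotation could be written as `Iterable[str]` instead of enumerating `List[str]` and `Tuple[str]`. As a side note, in `typing` the form `Tuple[str]` describes a one-element tuple; the variable-length form is `Tuple[str, ...]`.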

@@ -325,7 +325,7 @@ def infer_type(obj: Any) -> DataType:


 def infer_schema(
-    row: Union[Dict, List, Tuple], names: Optional[List] = None
+    row: Union[Dict, List, Tuple], names: Optional[Union[List, Tuple]] = None
Collaborator

@sfc-gh-sfan sfc-gh-sfan Aug 28, 2023

I don't think we could do names.extend when names is a tuple? How about we explicitly copy names into a list first? This way we could support Iterable.

Contributor Author

You are right. I can convert the iterable into a list and then assume a list from there on...
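
For illustration, the copy-then-extend idea discussed here can be sketched as follows. This is not the code from the PR: the helper name `fill_names` is hypothetical, and the `_1`, `_2`, ... placeholder format for the generic names is an assumption.

```python
from typing import Iterable, List, Optional, Sequence


def fill_names(row: Sequence, names: Optional[Iterable[str]] = None) -> List[str]:
    """Return one column name per field in `row`, padding with generic names."""
    # Copy into a list first: tuples have no .extend(), and copying also
    # leaves the caller's object unmodified.
    result = list(names) if names is not None else []
    # Assumed placeholder format: "_<1-based position>" for unnamed fields.
    result.extend(f"_{i}" for i in range(len(result) + 1, len(row) + 1))
    return result


print(fill_names((1, "a", 3.0), ("id", "label")))  # ['id', 'label', '_3']
print(fill_names((1, 2)))                          # ['_1', '_2']
```

Calling `.extend` directly on the tuple `("id", "label")` would raise `AttributeError`, which is the failure mode raised in the review comment above.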

@sfc-gh-sfan
Collaborator

I did not mean to select "Request changes". The change looks good overall, but I wonder if we could extend it to Iterable.

@sfc-gh-aalam sfc-gh-aalam merged commit 98e446a into main Aug 29, 2023
40 checks passed
@sfc-gh-aalam sfc-gh-aalam deleted the aalam-SNOW-896744-create-dataframe-with-tuple-schema branch August 29, 2023 17:18
@github-actions github-actions bot locked and limited conversation to collaborators Aug 29, 2023

Successfully merging this pull request may close these issues.

SNOW-896744: creating a DataFrame with a tuple of names will be ignored an generic names will be used
2 participants