SNOW-1660544: The overwrite parameter does not behave as documented in write_pandas() #2054

Open
chrisglass-slam opened this issue Sep 11, 2024 · 2 comments
Assignees
Labels
bug documentation status-triage_done Initial triage done, will be further handled by the driver team

Comments

@chrisglass-slam

Python version

Python 3.11.0rc1 (main, Apr 21 2024, 22:33:16) [GCC 11.4.0]

Operating system and processor architecture

Linux-6.5.0-1025-azure-x86_64-with-glibc2.35

Installed packages

snowflake==0.12.1
snowflake-connector-python==3.12.0
snowflake-snowpark-python==1.20.0
snowflake._legacy==0.11.0
snowflake.core==0.12.1

What did you do?

I am calling write_pandas() with "overwrite=True" and "auto_create_table=False".

Observation:

The function issues a "CREATE TABLE IF NOT EXISTS" statement even though the table already exists and its schema is maintained externally. As a result, the user running the load must be granted the "CREATE TABLE" privilege on the table's schema.

Possible solution:

Switching the conditional on line 356 of pandas_tools.py as follows would probably solve the issue:


From `if auto_create_table or overwrite:` to `if auto_create_table and overwrite:`
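
To make the difference between the two conditions concrete, here is a minimal sketch that tabulates them for every combination of the two flags. The function names are hypothetical and only model the boolean expressions quoted above; this is not the connector's code.

```python
from itertools import product


def creates_table_current(auto_create_table: bool, overwrite: bool) -> bool:
    # Models the condition quoted above as the current one (hypothetical helper).
    return auto_create_table or overwrite


def creates_table_proposed(auto_create_table: bool, overwrite: bool) -> bool:
    # Models the condition proposed above (hypothetical helper).
    return auto_create_table and overwrite


def truth_table():
    # One row per flag combination:
    # (auto_create_table, overwrite, current result, proposed result)
    return [
        (a, o, creates_table_current(a, o), creates_table_proposed(a, o))
        for a, o in product((False, True), repeat=2)
    ]


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

With auto_create_table=False and overwrite=True, the current condition is true and the proposed one is false, which matches the expectation in this report. Note that the proposed `and` is also false for auto_create_table=True with overwrite=False, so a condition on auto_create_table alone may be worth considering; that is an observation about the boolean logic, not a confirmed fix.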

What did you expect to see?

Expectation:

The function does not attempt to create a table (no "CREATE TABLE", with or without "IF NOT EXISTS").
The function performs a TRUNCATE of the table before loading the data with COPY INTO.
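
The expected behavior can be sketched as a statement plan. This is a rough illustration under stated assumptions: `expected_statements`, the table and stage names, and the simplified SQL strings are all hypothetical, and the `(...)` column list is a placeholder. It models the expectation described here, not what the connector actually emits.

```python
def expected_statements(
    table: str, stage: str, auto_create_table: bool, overwrite: bool
) -> list[str]:
    # Hypothetical helper: returns simplified SQL in the order this report
    # expects it to run. It does not talk to Snowflake.
    statements = []
    if auto_create_table:
        # Only create the table when explicitly asked to.
        statements.append(f"CREATE TABLE IF NOT EXISTS {table} (...)")
    if overwrite:
        # Clear existing rows without needing CREATE TABLE privileges.
        statements.append(f"TRUNCATE TABLE {table}")
    statements.append(f"COPY INTO {table} FROM @{stage}")
    return statements


if __name__ == "__main__":
    for sql in expected_statements("MY_TABLE", "MY_STAGE", False, True):
        print(sql)
```

For auto_create_table=False and overwrite=True, the plan contains only the TRUNCATE and the COPY INTO, so no "CREATE TABLE" privilege would be needed.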

Can you set logging to DEBUG and collect the logs?

import logging

# Send DEBUG-level connector logs to stderr with thread, file, and line context.
for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
@github-actions github-actions bot changed the title from "The overwrite parameter does not behave as documented in write_pandas()" to "SNOW-1660544: The overwrite parameter does not behave as documented in write_pandas()" on Sep 11, 2024
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Sep 12, 2024
@sfc-gh-sghosh

Hello @chrisglass-slam ,

Thanks for raising the issue. We are looking into it and will update.

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added status-triage Issue is under initial triage and removed needs triage labels Sep 16, 2024
@sfc-gh-sghosh

Hello @chrisglass-slam ,

It seems the issue is partly that the documentation is unclear:

The parameter auto_create_table=False does not appear to work as documented.
Ideally, the function should not create the table when 'auto_create_table' is False.

session.write_pandas(
    df,
    table_name='TABLE_OVERWRITE2',
    auto_create_table=False,  # Do not create the table automatically
    overwrite=True,           # Overwrite the existing table contents
)

However, it is creating the table when overwrite=True.

We are working on it and will update.

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh removed their assignment Sep 24, 2024
@sfc-gh-sghosh sfc-gh-sghosh added status-triage_done Initial triage done, will be further handled by the driver team documentation and removed status-triage Issue is under initial triage labels Sep 24, 2024