SNOW-1438313: Invisible UTF-8 BOM char (ufeff) at beginning of script causing error when communicating with Snowflake #1949

Closed
dp-rp opened this issue May 21, 2024 · 5 comments
Assignees
Labels
status-triage_done Initial triage done, will be further handled by the driver team

Comments


dp-rp commented May 21, 2024

Python version

Python 3.9.4 (tags/v3.9.4:1f2e308, Apr 4 2021, 13:27:16) [MSC v.1928 64 bit (AMD64)]

Operating system and processor architecture

Windows-10-10.0.22621-SP0

Installed packages

asn1crypto==1.5.1
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==2.1.1
cryptography==38.0.4
filelock==3.12.2
idna==3.4
Jinja2==3.1.2
MarkupSafe==2.1.3
numpy==1.25.0
oscrypto==1.3.0
pandas==1.5.3
pycparser==2.21
pycryptodomex==3.18.0
PyJWT==2.7.0
pyOpenSSL==22.1.0
python-dateutil==2.8.2
pytz==2023.3
PyYAML==5.4.1
requests==2.31.0
schemachange==3.5.2
six==1.16.0
snowflake-connector-python==2.9.0
typing_extensions==4.7.0
urllib3==1.26.16

What did you do?

1. Save a script with `UTF-8 with BOM` encoding (you can set the encoding in Visual Studio Code and save the file to add the invisible BOM char)
2. Try running the script via schemachange, which uses snowflake-connector-python to execute the script
3. See error (`snowflake.connector.errors.ProgrammingError: 001003 (42000): SQL compilation error: syntax error line 1 at position 0 unexpected '\ufeff-'.`)
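How the BOM can end up at position 0 of the SQL text can be sketched as follows. This is a minimal illustration, assuming the script is read with Python's plain `utf-8` codec; the sample SQL and the temporary file are made up for the example, and this is not a claim about how schemachange or the connector actually read files.

```python
import codecs
import os
import tempfile

# Hypothetical script content; the leading "--" mirrors the '\ufeff-' in the error.
sql = "-- create a demo table\nSELECT 1;\n"

# Write the file the way an editor does when "UTF-8 with BOM" is selected.
with tempfile.NamedTemporaryFile("wb", suffix=".sql", delete=False) as f:
    f.write(codecs.BOM_UTF8 + sql.encode("utf-8"))
    path = f.name

try:
    # Plain "utf-8" keeps the BOM as the first character, U+FEFF.
    with open(path, encoding="utf-8") as f:
        with_bom = f.read()
    # "utf-8-sig" drops a leading BOM if one is present.
    with open(path, encoding="utf-8-sig") as f:
        without_bom = f.read()
finally:
    os.remove(path)

print(repr(with_bom[:2]))     # '\ufeff-'
print(repr(without_bom[:2]))  # '--'
```

Reading with `utf-8-sig` is safe for files without a BOM as well, since the codec only removes the mark when it is present.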

What did you expect to see?

snowflake-connector-python should ignore the zero width no-break space char during SQL compilation.

Alternatively, if UTF-8 (without BOM) encoding is a strict requirement, an error with a message explicitly stating only UTF-8 encoding is supported should be thrown.
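The two behaviours described above could look roughly like this. It is only a sketch of the idea: `strip_bom` and `reject_bom` are hypothetical helper names, not functions that exist in snowflake-connector-python or schemachange.

```python
BOM = "\ufeff"  # U+FEFF, zero width no-break space / byte order mark


def strip_bom(sql_text: str) -> str:
    """Option 1: silently drop a single leading BOM, leave everything else alone."""
    return sql_text[1:] if sql_text.startswith(BOM) else sql_text


def reject_bom(sql_text: str) -> str:
    """Option 2: fail early with a message that names the real cause."""
    if sql_text.startswith(BOM):
        raise ValueError(
            "Script starts with a UTF-8 byte order mark (U+FEFF); "
            "save it as UTF-8 without BOM."
        )
    return sql_text


print(repr(strip_bom("\ufeff-- comment\nSELECT 1")[:2]))  # '--'
```

Either helper would run on the script text before it is handed to the connector, so the choice is between tolerating the BOM and reporting it clearly.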

Can you set logging to DEBUG and collect the logs?

Unfortunately our company policies make it difficult for me to easily test this change. Hopefully the issue should be easy to reproduce by just setting a script's encoding. If not, I can try to request the change temporarily, but I'll need authorization and the process of getting it is non-trivial.
@github-actions github-actions bot changed the title Invisible UTF-8 BOM char (ufeff) at beginning of script causing error when communicating with Snowflake SNOW-1438313: Invisible UTF-8 BOM char (ufeff) at beginning of script causing error when communicating with Snowflake May 21, 2024
@dp-rp
Author

dp-rp commented May 21, 2024

Here is a related issue I opened in the schemachange repository when I first noticed the bug: Snowflake-Labs/schemachange#250

They said it stemmed from a lack of additional validation before utilizing the snowflake-connector-python package in their own project, so I thought I'd open an issue here too to hopefully address the bug at its root for any other projects that may be using snowflake-connector-python 🙂

@sfc-gh-dszmolka sfc-gh-dszmolka self-assigned this May 23, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage Issue is under initial triage and removed needs triage labels May 23, 2024
@sfc-gh-dszmolka
Contributor

Hey, and thanks for opening this issue here in the PythonConnector repo. If the issue stems from this connector, I also think it's the right thing to address it here, if possible.

Taking a look.

@sfc-gh-dszmolka
Contributor

Tried to reproduce the issue on a Windows 10 VM with the same PythonConnector 2.9.0, using the method you mentioned:

[image]

but could not reproduce it yet. Maybe it was right to raise it with Schemachange?

Are you able to reproduce the issue without Schemachange, using only the pure PythonConnector? If so, can you please provide the steps? Otherwise, if the issue cannot be reproduced with PythonConnector, the Schemachange team will take care of issue #250.
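For reference, a connector-only check could be structured along these lines. This is a hedged sketch, not a confirmed reproduction: `run_bom_repro` and the sample statements are hypothetical, and the function takes any callable that executes one SQL string so it can be tried without a live connection.

```python
from typing import Callable, Dict


def run_bom_repro(execute: Callable[[str], object]) -> Dict[str, str]:
    """Run the same statement with and without a leading BOM and record the outcome."""
    statements = {
        "plain": "-- comment\nSELECT 1",
        "with_bom": "\ufeff-- comment\nSELECT 1",
    }
    results = {}
    for label, text in statements.items():
        try:
            execute(text)
            results[label] = "ok"
        except Exception as exc:  # broad on purpose: we only want to report what happened
            results[label] = f"{type(exc).__name__}: {exc}"
    return results
```

With the real connector, the callable would be something like `conn.cursor().execute`, where `conn` comes from `snowflake.connector.connect(...)` with your own account parameters. If both entries come back as `ok`, the BOM is most likely being introduced or mishandled before the text reaches the connector.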

@sfc-gh-dszmolka sfc-gh-dszmolka added status-information_needed Additional information is required from the reporter and removed bug labels May 24, 2024
@dp-rp
Author

dp-rp commented May 29, 2024

Gotcha. Yep will try to reproduce from my end on my local machine (as opposed to the pipeline which won't be as easy to test with) and report back. I'm not familiar with the lib, so first off I'll make sure I can actually replicate it via schemachange locally before wrapping my head around using the lib itself.

@sfc-gh-dszmolka
Contributor

A month has passed and no reproduction is available, so I'm closing this one for now. Please do comment and share a reproduction, and I can keep looking if needed.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team and removed status-information_needed Additional information is required from the reporter status-triage Issue is under initial triage labels Jun 25, 2024

2 participants