SNOW-899773: Allow specification of batch_size for batch-generating functions #1712
Comments
Hope y'all don't mind me chiming in here. A lot of ML frameworks require tightly controlled batch sizes. An example pattern that's compatible with a lot of these frameworks is `to_torch_datapipe`. It might be worth finding an expert who uses TensorFlow/PyTorch/JAX at scale to weigh in.
I found this issue while searching around, thinking I must be misreading the docs or just not finding another command in the connector that fetches up to a specific size. NOTE: I suggest this may be worth breaking into two requests:
To be clear, the difference is that #1 is about memory constraints: if some batches have 1 row and others the max, that's acceptable in that use case.
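The memory-constraint use case (#1) can be approximated client-side today by splitting whatever batches the connector happens to return before handing them on. A minimal sketch, where plain Python lists stand in for the connector's result batches and `cap_batch_size` is a hypothetical helper name, not connector API:

```python
from typing import Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def cap_batch_size(batches: Iterable[List[Row]], max_rows: int) -> Iterator[List[Row]]:
    """Split each incoming batch so no yielded batch exceeds max_rows.

    Matches use case #1: batch sizes may vary (some may hold a single
    row), but memory per batch is bounded. No rows are buffered across
    source batches.
    """
    for batch in batches:
        for start in range(0, len(batch), max_rows):
            yield batch[start:start + max_rows]
```

With Arrow results, the same slicing could be done with zero-copy `Table.slice` calls on each fetched batch.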
Thank you for opening this request with us. We'll consider it for a possible future improvement in the connector.
Is there a way to understand how batch size is being determined?
This would be a very helpful feature. |
What is the current behavior?
Batch size is not controllable from the client when using batch-generating functions, e.g. `get_result_batches()`, `fetch_arrow_batches()`, `fetch_pandas_batches()`.
What is the desired behavior?
Allow specification of a `batch_size` parameter when making batch requests that determines the number of records returned in each batch.
How would this improve `snowflake-connector-python`?
Many applications require tight control over memory usage to operate reliably. This applies to essentially any service running on a remote server, i.e. not a user's laptop. Our application provides connections to multiple databases and cloud storage providers, and the only way we can provide an equivalent level of reliability (every other database and storage provider we support offers this feature in its connector) is for Snowflake to include the ability to control the size of responses to large requests.
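To illustrate the requested semantics, the regrouping below sketches what a `batch_size` parameter could guarantee: every batch has exactly `batch_size` rows except possibly the last. Plain Python lists stand in for result batches, and `fixed_size_batches` is a hypothetical client-side shim, not connector API:

```python
from typing import Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")


def fixed_size_batches(batches: Iterable[List[Row]], batch_size: int) -> Iterator[List[Row]]:
    """Regroup arbitrarily sized batches into batches of exactly
    batch_size rows; only the final batch may be shorter.

    Unlike simple splitting, this buffers rows across source batches,
    which is the cost of guaranteeing uniform sizes client-side.
    """
    buffer: List[Row] = []
    for batch in batches:
        buffer.extend(batch)
        while len(buffer) >= batch_size:
            yield buffer[:batch_size]
            buffer = buffer[batch_size:]
    if buffer:
        yield buffer
```

Doing this natively in the connector would avoid the client-side buffering and copying that such a shim requires.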
References and other background