Sampling on an MSSQL table seems to use the passed fraction to return either the entire table or 0 rows.
I did some local testing against a local Postgres DB, an in-memory duckdb instance and an AzureSQL DB.
The result makes it clear that for mssql backends the sampling returns either the entire data or no data at all, while for other backends (only postgres and duckdb verified) the sampling behaves as expected: it uses the fraction to return a sample of the data.
What version of ibis are you using?
9.4.0
What backend(s) are you using, if any?
MSSQL
Thanks for opening this; this is indeed a bug in our mssql implementation. Apparently in mssql, multiple calls to RAND in a single query all return the same value. See https://web.archive.org/web/20110829015850/http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/sql-server-set-based-random-numbers. In ibis, we expect ibis.random() to return different results for each call in a query (or each row), and the default implementation of sample() is based on random(). The proper fix here is to fix how we implement random for mssql, which will also fix sample. Will take care of this.
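To illustrate why a single shared random value produces the all-or-nothing behaviour reported here, the following is a minimal sketch in plain Python. It is a model of the effect, not ibis or SQL Server code: Python's `random` stands in for `RAND()`, and the function names `sample_shared_draw` and `sample_per_row_draw` are hypothetical. It assumes sampling is a row filter of the form "random value < fraction", which matches the description above that `sample()` is based on `random()`.

```python
import random


def sample_shared_draw(rows, fraction, rng):
    """Keep rows using ONE random value shared by every row.

    Models a query where the random function is evaluated once,
    so every row sees the same value.
    """
    draw = rng.random()  # evaluated once for the whole "query"
    return [row for row in rows if draw < fraction]


def sample_per_row_draw(rows, fraction, rng):
    """Keep rows using a fresh random value for each row."""
    return [row for row in rows if rng.random() < fraction]


rows = list(range(1000))  # same table size as in the report
rng = random.Random(0)    # seeded only to make the sketch repeatable

# Result sizes over 20 simulated queries with fraction=0.5.
shared_sizes = {len(sample_shared_draw(rows, 0.5, rng)) for _ in range(20)}
per_row_sizes = [len(sample_per_row_draw(rows, 0.5, rng)) for _ in range(20)]

print("shared draw sizes:", sorted(shared_sizes))
print("per-row draw sizes: min", min(per_row_sizes), "max", max(per_row_sizes))
```

With a shared draw, the predicate is either true for every row or false for every row, so each query returns all 1000 rows or none. With a per-row draw, the result size clusters around `fraction * 1000`. A commonly cited T-SQL idiom for getting a different value per row is to seed the generator from `NEWID()`, for example `RAND(CHECKSUM(NEWID()))`; whether the ibis fix takes that route is not stated in this thread.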
What happened?
I ran
against a table in my AzureSQL DB and against a table in my local Postgres DB. Both tables contain exactly 1000 rows.
AzureSQL result:
Postgres result:
The result makes it clear that for mssql backends the sampling returns either the entire data or no data at all, while for other backends (only postgres and duckdb verified) the sampling behaves as expected: it uses the fraction to return a sample of the data.