Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: fix aggregate functions compatibility to MySQL8.0 #1116

Open
hidehalo opened this issue Dec 3, 2020 · 3 comments
Open

parser: fix aggregate functions compatibility to MySQL8.0 #1116

hidehalo opened this issue Dec 3, 2020 · 3 comments

Comments

@hidehalo
Copy link

hidehalo commented Dec 3, 2020

Bug Report

On MySQL 8, aggregate functions below do not support distinct options:

  • std
  • stddev
  • stddev_pop
  • stddev_samp
  • var_pop
  • var_samp
  • variance

(ref https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html)

E.g:

Executed result of SQL select std(distinct c) from t in MySQL 8 would get error feedback like:

"ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'distinct c) from t' at line 1"

but, it would execute successfully in TiDB.

@hidehalo hidehalo changed the title var_pop|stddev_pop|stddev_samp with DISTINCT should raise an syntax error parser: fix aggregate functions compatibility to MySQL8.0 Dec 9, 2020
@hidehalo
Copy link
Author

hidehalo commented Dec 9, 2020

@kennytm Shall we label it type/bug?

@kennytm
Copy link
Contributor

kennytm commented Dec 9, 2020

I don't see any point disabling support of DISTINCT. We provide the correct result with DISTINCT, compatible with PostgreSQL.

create table a (a int);
insert into a values (1), (2), (2), (3), (3), (3);
select var_pop(all a), var_pop(distinct a), var_pop(a) from a;
-- both PostgreSQL and TiDB return 0.5555…, 0.6666…, 0.5555…

The additional DISTINCT keyword, although prohibited by ISO 9075*, is well supported in PostgreSQL (and CockroachDB), BigQuery, Snowflake and Microsoft Transact-SQL and IBM DB2. Meanwhile, MySQL (and MariaDB), Oracle, SingleStore (formerly MemSQL) followed ISO 9075 and disallowed DISTINCT. SQLite3 does not have the standard deviation/variance aggregate functions.

ISO/IEC 9075-2:201? §4.16.4: Neither DISTINCT nor ALL are allowed to be specified for VAR_POP, VAR_SAMP, STDDEV_POP, or STDDEV_SAMP; redundant duplicates are not removed when computing these functions.

Edit: Oracle's non-standard VARIANCE and STDDEV functions do allow DISTINCT, but the standard ones with _POP and _SAMP suffixes don't 🤷.

@hidehalo
Copy link
Author

I don't see any point disabling support of DISTINCT. We provide the correct result with DISTINCT, compatible with PostgreSQL.

create table a (a int);
insert into a values (1), (2), (2), (3), (3), (3);
select var_pop(all a), var_pop(distinct a), var_pop(a) from a;
-- both PostgreSQL and TiDB return 0.5555…, 0.6666…, 0.5555…

The additional DISTINCT keyword, although prohibited by ISO 9075*, is well supported in PostgreSQL (and CockroachDB), BigQuery, Snowflake and Microsoft Transact-SQL and IBM DB2. Meanwhile, MySQL (and MariaDB), Oracle, SingleStore (formerly MemSQL) followed ISO 9075 and disallowed DISTINCT. SQLite3 does not have the standard deviation/variance aggregate functions.

ISO/IEC 9075-2:201? §4.16.4: Neither DISTINCT nor ALL are allowed to be specified for VAR_POP, VAR_SAMP, STDDEV_POP, or STDDEV_SAMP; redundant duplicates are not removed when computing these functions.

Edit: Oracle's non-standard VARIANCE and STDDEV functions do allow DISTINCT, but the standard ones with _POP and _SAMP suffixes don't 🤷.

@kennytm Yes, I have notice that TiDB provide the correct result with DISTINCT.
I am not familiar with those DB&standards, this issue come from pingcap/tidb#19499, I think maybe it is not necessary to TiDB.
This change is about function support, I am not suitable for making decisions, maybe we should close those relative issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants