Expand null safety #2067

Closed
wants to merge 2 commits into from

Conversation

mtoy-googly-moogly
Collaborator

@mtoy-googly-moogly mtoy-googly-moogly commented Dec 28, 2024

If we do this, it will fix #1968

NULL fields are a problem

For booleans, and especially for boolean dimensions used as filters (a very common pattern in Malloy), you end up creating expressions that contain nonsense data, because any boolean column might be null. Null is falsey, but because SQL is null-infecting (any computation involving null results in null) ...

  • NOT x is null when x is null so it looks falsey when it should be truthy
  • x != y is null when x or y is null so it looks falsey when it should be truthy
  • x = y is null when x and y are null so it looks falsey when it should be truthy
  • x or y is null when x or y is null so it looks falsey when it should be truthy
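
To make the null-propagation cases above concrete, here is a minimal sketch that evaluates them in SQLite through Python's sqlite3 module. SQLite is only a stand-in (an assumption; the thread does not name a dialect), and the helper name `evaluate` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expr):
    """Evaluate a scalar SQL expression; SQL NULL comes back as Python None."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]


# NULL stands in for a null-valued x (and y where both are null).
results = {
    "NOT NULL": evaluate("NOT NULL"),
    "NULL != 1": evaluate("NULL != 1"),
    "NULL = NULL": evaluate("NULL = NULL"),
    "NULL OR 0": evaluate("NULL OR 0"),
    "NULL OR 1": evaluate("NULL OR 1"),
}

for expr, value in results.items():
    print(f"{expr:12} -> {value!r}")
```

In SQLite the first four expressions come back as NULL (None in Python), while `NULL OR 1` comes back as 1, because OR yields true whenever either operand is true.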

We know for sure we want to protect not. We sort of half-protected the equality comparisons, and now we are in a weird no man's land that we need to get out of.

I would like to offer two possible solutions.

Proposal One: The Malloy Null Safe Truth Tables

Boolean Operations

| Expression | x=null | x=true | x=false |
| --- | --- | --- | --- |
| not x | true | false | true |
| x or y | y | x | y |

Non null to nullable

| Expression | x=null |
| --- | --- |
| x = 0 | false |
| x != 0 | true |
| x ~ 'a' | false |
| x !~ 'a' | true |
| x ~ r'a' | false |
| x !~ r'a' | true |

Compare two nullable

| Expression | x=null, y=null |
| --- | --- |
| x = y | true |
| x != y | false |
| x ~ y | true |
| x !~ y | false |
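
The Proposal One truth tables can be modeled in a few lines of Python, with None standing in for null. This is only a sketch of the semantics in the tables, not the code in this PR. All function names are hypothetical, and treating a null y in `x or y` as false is an assumption the tables do not spell out.

```python
from typing import Optional


def coalesce_false(x: Optional[bool]) -> bool:
    """Treat null (None) as false."""
    return False if x is None else x


def safe_not(x: Optional[bool]) -> bool:
    # not x: null -> true, true -> false, false -> true
    return not coalesce_false(x)


def safe_or(x: Optional[bool], y: Optional[bool]) -> bool:
    # x or y: x when x is true, otherwise y (assumption: a null y counts as false)
    return coalesce_false(x) or coalesce_false(y)


def safe_eq(x, y) -> bool:
    # Two nulls compare equal; a null never equals a non-null value.
    if x is None and y is None:
        return True
    if x is None or y is None:
        return False
    return x == y


def safe_ne(x, y) -> bool:
    # Inequality is the exact negation of the null-safe equality.
    return not safe_eq(x, y)


print(safe_not(None), safe_eq(None, 0), safe_ne(None, 0), safe_eq(None, None))
```

Every function returns a plain boolean, which is the point of the proposal: no null can leak out of a comparison into a filter.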

Proposal Two: ==

In this proposal, we null-protect not and or but leave the existing equality operators unprotected, and we add a separate suite of null-protected equality operators, sort of like TypeScript does ...

Boolean Operations

| Expression | x=null | x=true | x=false |
| --- | --- | --- | --- |
| not x | true | false | true |
| x or y | y | x | y |

Non null to nullable

| Expression | x=null |
| --- | --- |
| x == 0 | false |
| x !== 0 | true |
| x ~~ 'a' | false |
| x !~~ 'a' | true |
| x ~~ r'a' | false |
| x !~~ r'a' | true |

Compare two nullable

| Expression | x=null, y=null |
| --- | --- |
| x == y | true |
| x !== y | false |
| x ~~ y | true |
| x !~~ y | false |
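
For comparison, some SQL dialects already have a null-safe equality next to the bare one. The sketch below uses SQLite's `IS` and `IS NOT` operators as a stand-in for the proposed `==` and `!==`. That mapping is an assumption made for illustration, not something this PR specifies, and the table `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(None, None), (None, 0), (0, 0), (1, 0)],
)

# Columns: x, y, bare equality, null-safe equality, null-safe inequality.
rows = conn.execute(
    "SELECT x, y, x = y, x IS y, x IS NOT y FROM t ORDER BY rowid"
).fetchall()

for x, y, bare_eq, safe_eq, safe_ne in rows:
    print(f"x={x!r:5} y={y!r:5} '='={bare_eq!r:5} IS={safe_eq!r} IS NOT={safe_ne!r}")
```

The bare `=` returns NULL whenever either side is null, while `IS` and `IS NOT` always return 0 or 1, matching the `==` and `!==` rows in the tables above.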

Variant Proposals

Do nothing

This leaves us where we are today, where we protect some operators ...

  • NOT is protected
  • Inequality is protected

.. but not all

  • Equality is not protected
  • OR is not protected

Invert ==

  • The "bare" operators are NOT coalescing, they just write themselves into the SQL stream.
  • The == operators are all NULL coalescing
  • Open question about what to do about OR and NOT since they don't have an easy == syntax
    • Coalesce them, with no "do not coalesce syntax"
    • Don't coalesce them, make users do that
    • Invent a new version of OR and NOT syntax so we can have both coalescing and non coalescing operations

Do not protect equality

Equality protection translates to pick false when a = null and b = null else a=b ?? null, which is going to break many query optimizers. So don't protect equality, but protect everything else.

Only Protect NOT

@lloydtabb
Collaborator

> x = y is null when x and y are null so it looks falsey when it should be truthy

I believe this is wrong for Malloy (and SQL). These comparisons are vector operations, not generally tests like in other languages. The question is 'should I include these records in the result set'. When you are testing for x = y, you want records where they match. If a null is involved, you generally don't want the records. I have never, ever written this in SQL:

 WHERE X = Y or (X IS NULL and y IS NULL)

IMO it would be a mistake to try and do this.
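
To illustrate the row-selection point, here is a small sketch that runs both forms of the WHERE clause in SQLite through Python's sqlite3 module. The table `pairs` and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (id INTEGER, x TEXT, y TEXT)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [(1, "a", "a"), (2, "a", "b"), (3, "a", None), (4, None, None)],
)

# Plain equality: rows where the comparison is NULL are filtered out.
plain = [row[0] for row in conn.execute(
    "SELECT id FROM pairs WHERE x = y ORDER BY id"
)]

# Null-matching equality: also keeps rows where both sides are NULL.
null_matching = [row[0] for row in conn.execute(
    "SELECT id FROM pairs WHERE x = y OR (x IS NULL AND y IS NULL) ORDER BY id"
)]

print(plain)          # ids kept by plain equality
print(null_matching)  # ids kept by the null-matching form
```

Plain equality keeps only row 1. The null-matching form also keeps row 4, where both columns are null, which is the behavior being argued against here.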

@lloydtabb
Collaborator

lloydtabb commented Dec 30, 2024

Generally, in SQL, if there is a null in an expression, then the result is null. In SQL, 'OR' is an exception to this rule. Reading your post, I wasn't sure what you were saying (sorry, my reading skills aren't great).

image

I don't think we need to 'fix' this, unless we want to make all relational operators 'truthy', and in that case it is a coalesce(x, false)
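
As a sketch of what wrapping a relational operator in coalesce would do, the snippet below compares a bare `NOT (x = 1)` with a coalesced version in SQLite. SQLite represents false as 0, so `COALESCE(..., 0)` stands in for `coalesce(x, false)`. The helper `evaluate` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expr, x):
    """Evaluate a scalar SQL expression with :x bound to the given value."""
    return conn.execute(f"SELECT {expr}", {"x": x}).fetchone()[0]


inputs = [None, 1, 2]

# Bare comparison: a null x makes the whole expression NULL.
raw = [evaluate("NOT (:x = 1)", x) for x in inputs]

# Coalesced comparison: the NULL from `x = 1` becomes false before NOT is applied.
coalesced = [evaluate("NOT COALESCE(:x = 1, 0)", x) for x in inputs]

for x, r, c in zip(inputs, raw, coalesced):
    print(f"x={x!r:5} raw={r!r:5} coalesced={c!r}")
```

With a null x the bare form returns NULL, so a filter on it drops the row, while the coalesced form returns 1. For non-null x the two forms agree.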

@mtoy-googly-moogly
Collaborator Author

We have decided NOT to expand null safety.

Development

Successfully merging this pull request may close these issues.

x != y produces wrong value (true) when x and y are both null-valued
2 participants