Unbounded extraneous digits in numeric literals #409

Open
hudlow opened this issue Nov 6, 2024 · 2 comments

@hudlow
Contributor

hudlow commented Nov 6, 2024

In various circumstances, the cel-go implementation ignores an unbounded number of extraneous digits in numeric literals. While this behavior is probably common to most programming languages, it might be unwise for a language designed for execution of untrusted code.

It's hard to contrive a legitimate reason to support these literals, but the spec is not explicit on whether these literals should be supported and how they should be rounded/truncated.

Unbounded number of leading zeroes for an integer:

0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
// 1

Unbounded number of leading zeroes for a hexadecimal integer:

0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
// 1

Unbounded number of leading zeroes for a floating-point:

0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.0
// 1.0
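
The leading-zero cases above can be reproduced outside CEL. The following is a minimal sketch, assuming the parser ultimately hands the literal text to Go's `strconv` package (an assumption about cel-go internals, not something stated in this issue). The helper `parsePadded` is hypothetical and exists only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePadded parses a decimal integer, a hexadecimal integer, and a double,
// each written as `zeros` leading zeros followed by the digit 1.
// Hypothetical helper; it is not part of cel-go.
func parsePadded(zeros int) (int64, int64, float64, error) {
	pad := strings.Repeat("0", zeros)

	dec, err := strconv.ParseInt(pad+"1", 10, 64)
	if err != nil {
		return 0, 0, 0, err
	}
	// With an explicit base of 16, strconv expects bare hex digits (no 0x prefix).
	hex, err := strconv.ParseInt(pad+"1", 16, 64)
	if err != nil {
		return 0, 0, 0, err
	}
	dbl, err := strconv.ParseFloat(pad+"1.0", 64)
	if err != nil {
		return 0, 0, 0, err
	}
	return dec, hex, dbl, nil
}

func main() {
	// 99 zeros plus one significant digit, similar in length to the literals above.
	dec, hex, dbl, err := parsePadded(99)
	fmt.Println(dec, hex, dbl, err)
}
```

None of the three calls caps the number of digits it consumes, so any bound would have to be enforced before the text reaches the number parser.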

Double with more precision than can be stored:

1.0000000000000001 == 1.0
// true

Double with same number of digits as above but different rounding/truncation result:

1.0000000000000002 == 1.0
// false
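
The different results for these two literals follow from round-to-nearest: the doubles adjacent to 1.0 are about 2.2e-16 apart, so 1 + 1e-16 lies less than half a step from 1.0 while 1 + 2e-16 is closer to the next double. A small sketch using Go's `strconv.ParseFloat` (the helper name `parsesEqual` is made up for this example):

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parsesEqual reports whether two decimal literals parse to the same float64.
// Hypothetical helper for illustration only.
func parsesEqual(a, b string) (bool, error) {
	x, err := strconv.ParseFloat(a, 64)
	if err != nil {
		return false, err
	}
	y, err := strconv.ParseFloat(b, 64)
	if err != nil {
		return false, err
	}
	return x == y, nil
}

func main() {
	for _, lit := range []string{"1.0000000000000001", "1.0000000000000002"} {
		eq, err := parsesEqual(lit, "1.0")
		fmt.Printf("%s == 1.0: %v (err: %v)\n", lit, eq, err)
	}
	// Gap between 1.0 and the next representable double above it.
	fmt.Println("spacing above 1.0:", math.Nextafter(1, 2)-1)
}
```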

Unbounded number of floating-point digits:

1.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
// 1.0

Double with unbounded number of precision digits and unbounded number of exponent zeros:

1.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001e000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
// 1.0

Recognizing that existing implementations may have to continue to (optionally) support the existing behavior, I would propose that the spec recommend:

  1. Decimal integer literals and hexadecimal integer literals must not be padded with leading zeros beyond 32 total digits.
  2. Double literal exponents must not be padded with leading zeros beyond 8 total digits.
  3. The sum of whole and fractional digits in the significand of a double literal must not be padded with any combination of leading and trailing zeros beyond 32 total digits.
  4. Ignoring trailing zeros, the significand of a double literal must not require truncation or rounding to deserialize it as an IEEE 754 value. On reflection, this seems entirely impractical. It rubs me the wrong way for 4e-324 == 5e-324 to be true, but maybe I just have to accept that this is ubiquitously true for IEEE 754 floating-point values?
@TristonianJones
Collaborator

Floating-point values under the IEEE 754 specification are imprecise, and behavior can vary based on compiler flags in C++, so bullet 4 is just how things work in most languages. We do, however, normalize expected error behaviors at the edges in the conformance tests.

It happens that digit parsing doesn't require recursion, so this generally isn't an issue for the parser, though it may yield an error due to some underlying limitation of the C++, Java, or Go libraries that parse numbers. I'm not sure you need to set limits here, but documenting that we round according to the precision of IEEE 754 seems perfectly reasonable.

@hudlow
Contributor Author

hudlow commented Nov 12, 2024

@TristonianJones Another part of behavior that may be undefined today is whether nonzero literals that round to zero are tolerated. AFAICT this is not something that is consistent across programming environments.
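
To make the round-to-zero case concrete, here is a sketch of how such literals could be detected in Go. The helper `roundsToZero` is hypothetical, it assumes a plain decimal literal, and it says nothing about what the spec should require. It deliberately tolerates `strconv.ErrRange` so that the result does not depend on whether a given library reports underflow as an error.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// roundsToZero reports whether a decimal literal has a nonzero digit in its
// significand yet parses to a float64 zero. Hypothetical helper.
func roundsToZero(lit string) (bool, error) {
	f, err := strconv.ParseFloat(lit, 64)
	// A range error still comes with a usable value; only other errors abort.
	if err != nil && !errors.Is(err, strconv.ErrRange) {
		return false, err
	}
	significand := lit
	if i := strings.IndexAny(lit, "eE"); i >= 0 {
		significand = lit[:i]
	}
	return f == 0 && strings.ContainsAny(significand, "123456789"), nil
}

func main() {
	for _, lit := range []string{"1e-400", "5e-324", "0.0"} {
		z, err := roundsToZero(lit)
		fmt.Printf("%s rounds to zero: %v (err: %v)\n", lit, z, err)
	}
}
```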
