Skip to content

Conversation

@grhoten
Copy link
Member

@grhoten grhoten commented Nov 5, 2025

This pull request covers the following changes.

  • Prevents exponential parsing with ambiguous rules that use redundant substitution types.
  • As agreed in the ICU TC, the RBNF rules that double up redundant substitution types will return an error upon parsing a string. An error is not returned at rule parse time (the constructor) to allow people that used this type of rules syntax to continue using the same syntax for formatting purposes. So instead of parsing and occasionally returning a bad value with inadequate rules. We now refuse to parse with such by returning an error. Formatting with such rules still works. Using plural rules or the orElse operator are alternatives to use instead.
  • Includes enough of the proposed CLDR fixes to allow the tests to pass. The complete fix with the plural rules will have to be done with the next CLDR sync after all of these pull requests are merged.
  • Fix parsing of "10.000.000." as an ordinal in a language where the grouping separator is a period.
  • All spellout and ordinal rules from CLDR now roundtrip. This is a first! The only exception would be fractional numbers used in ordinals, whose behavior remains undefined with such invalid data.

The provided test case for this issue now completes in a fraction of the time. This is likely due to the syntax being returned as invalid for parsing.

Checklist

  • Required: Issue filed: ICU-NNNNN
  • Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-NNNNN Fix xyz"
  • Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-NNNNN Fix xyz"
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable
  • Approver: Feel free to merge on my behalf

richgillam
richgillam previously approved these changes Nov 5, 2025
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSLGTM

…U4J RBNF. Fix roundtrip parsing of rules that use 2 of the same substitution type. Fix bad parsing of rules with delimiters that can be a part of a number.
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • tools/cldr/cldr-to-icu/src/test/java/org/unicode/icu/tool/cldrtoicu/mapper/RbnfMapperTest.java is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rubber stamp.

@grhoten grhoten merged commit e501c22 into unicode-org:main Nov 5, 2025
107 checks passed
@FrankYFTang
Copy link
Contributor

@FrankYFTang
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants