6 changes: 2 additions & 4 deletions docs/site/downloads/cldr-48.md
@@ -25,9 +25,7 @@ This data is also a factor in determining which languages are supported on mobil

Some of the most significant changes in this release are:
- Updated for Unicode 17, including new names and search terms for new emoji, new sort-order, Han → Latin romanization additions for many characters.
-- Many additions to language data including:
-- Likely Subtags, for deriving the likely script and region from the language (used in many processes).
-- Language populations in countries: significant updates to improve accuracy and maintainability.
+- Many additions to language data supporting Likely Subtags, for deriving the likely script and region from the language (used in many processes).
- Updated to the latest external standards and data sources, such as the language subtag registry, UN M49 macro regions, ISO 4217 currencies, etc.
- New formatting options
- Rational number formats added, allowing for formats like 5½.
@@ -313,7 +311,7 @@ The following files are new in the release:
## Migration

- Number patterns that did not have a specific numberSystem (such as `latn` or `arab`) had been deprecated for many releases, and were finally removed.
-- Additionally, language and territory data in `languageData` and `territoryInfo` data received significant updates to improve accuracy and maintainability [CLDR-18087]
+- Additionally, language and territory data in `languageData` and `territoryInfo` data received significant updates to improve accuracy and maintainability [CLDR-18087] In particular, the `territories` attribute in `languageData` was deprecated and removed, as it was unclear and prone to misunderstanding. Implementations that used this data may need to adjust accordingly, using `territoryInfo`.
Member

Need a period here before "In particular"?

- The likely language for Belarus changed to Russian [CLDR-14479]
- [Using Time Zone Names](https://www.unicode.org/reports/tr35/dev/tr35-dates.html#using-time-zone-names) Removed the "specific location format" and modified the fallback behavior of 'z'.
- [Unit Identifier Normalization](https://www.unicode.org/reports/tr35/dev/#tr35-general.html) Modified the normalization process.
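
The migration note in this hunk points implementations from the removed `territories` attribute to `territoryInfo`. A minimal sketch of that adjustment follows. It assumes the `territoryInfo` layout of `territory` elements containing `languagePopulation` children with an optional `officialStatus` attribute; the class name, method name, filter rule, and sample values are hypothetical, and a real `supplementalData.xml` would also need its DTD resolved.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of CLDR tooling.
public class TerritoriesFromTerritoryInfo {

    /** Maps each language to the territories where it carries an officialStatus. */
    static Map<String, Set<String>> officialTerritoriesByLanguage(String xml) {
        Map<String, Set<String>> result = new TreeMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList infos = doc.getElementsByTagName("territoryInfo");
            for (int i = 0; i < infos.getLength(); i++) {
                NodeList territories =
                        ((Element) infos.item(i)).getElementsByTagName("territory");
                for (int j = 0; j < territories.getLength(); j++) {
                    Element territory = (Element) territories.item(j);
                    String region = territory.getAttribute("type");
                    NodeList languages = territory.getElementsByTagName("languagePopulation");
                    for (int n = 0; n < languages.getLength(); n++) {
                        Element language = (Element) languages.item(n);
                        // Assumed filter: skip entries without an officialStatus attribute.
                        if (language.getAttribute("officialStatus").isEmpty()) {
                            continue;
                        }
                        result.computeIfAbsent(
                                        language.getAttribute("type"), key -> new TreeSet<>())
                                .add(region);
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse territoryInfo sample", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative values only, not real CLDR data.
        String sample =
                "<supplementalData><territoryInfo>"
                        + "<territory type=\"TZ\">"
                        + "<languagePopulation type=\"sw\" populationPercent=\"90\""
                        + " officialStatus=\"official\"/>"
                        + "</territory>"
                        + "<territory type=\"CD\">"
                        + "<languagePopulation type=\"sw\" populationPercent=\"10\"/>"
                        + "</territory>"
                        + "</territoryInfo></supplementalData>";
        System.out.println(officialTerritoriesByLanguage(sample));
    }
}
```

Which entries count as relevant (official status, a population threshold, or everything listed) is a decision for the consuming application; the filter here is only one possibility.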
@@ -352,14 +352,16 @@ private static void writeBasicLanguageData(PrintWriter out, Set<RowData> sortedI
TreeSet.class));
}
if (languageInCountryData.officialStatus.isMajor()) {
-// Output will look like <language type="sw" territories="TZ"/>
+// This is no longer saved to output
+// It used to appear like: <language type="sw" territories="TZ"/>
Member

This is mysterious. Does the code still serve a purpose? There can be a tendency to keep code around for historical reference and/or in case it might be useful again someday. However, that leads to crufty hard-to-maintain code. So I'd rather see the code removed if unnecessary, and otherwise a comment explaining why the code is here, rather than a comment saying what it used to do or doesn't do...

Contributor Author

I agree, it's not great to leave this code in. I started this refactor last year but I punted the more consequential work until I had a proper design to replace the old class completely with a new format.

Removing this code ended up affecting other classes that consume the data and caused a big rewrite that I wasn't prepared for. Would you like me to change the comment to instead say "this is not saved to the output but it is used in the BasicLanguageData class"?

Member

That sounds good. Also, maybe (optional!) there could be a ticket for the larger refactoring, and you could put in a TODO comment referencing that ticket. Unless that's just wishful thinking :-)

status_territories.put(
BasicLanguageData.Type.primary, languageInCountryData.countryCode);
} else if (languageInCountryData.officialStatus.isOfficial()
|| languageInCountryData.getLanguagePopulation()
>= cutoff * languageInCountryData.countryPopulation
|| languageInCountryData.getLanguagePopulation() >= 1000000) {
-// Output will look like <language type="sw" territories="CD" alt="secondary"/>
+// This is no longer saved to output
+// It used to appear like: <language type="sw" territories="CD" alt="secondary"/>
status_territories.put(
BasicLanguageData.Type.secondary, languageInCountryData.countryCode);
}
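
The branch structure in this hunk can be read as a small decision rule: a "major" official status yields a primary territory; otherwise an official status, a population share at or above the cutoff, or at least 1,000,000 speakers yields a secondary one. A standalone sketch of that rule is below. The class, enum, and parameter names are hypothetical, and the `cutoff` value is passed in because it is not shown in the hunk.

```java
// Hypothetical standalone restatement of the branch logic, not the project's code.
public class TerritoryStatusSketch {

    enum Type { PRIMARY, SECONDARY, NONE }

    static Type classify(
            boolean isMajor,
            boolean isOfficial,
            double languagePopulation,
            double countryPopulation,
            double cutoff) {
        if (isMajor) {
            return Type.PRIMARY;
        }
        if (isOfficial
                || languagePopulation >= cutoff * countryPopulation
                || languagePopulation >= 1000000) {
            return Type.SECONDARY;
        }
        // Neither branch applies, so no territory would be recorded.
        return Type.NONE;
    }

    public static void main(String[] args) {
        // Illustrative inputs only; 0.1 is an assumed cutoff.
        System.out.println(classify(false, false, 200, 1000, 0.1));
    }
}
```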
@@ -918,7 +918,7 @@ public LanguageInfo(Factory cldrFactory) throws IOException {
; // nothing
else if ("secondary".equals(alt)) language += "*";
else language += "*" + alt;
-// <language type="af" scripts="Latn" territories="ZA"/>
+// <language type="af" scripts="Latn"/>
addTokens(language, attributes.get("territories"), " ", language_territories);
continue;
}
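
With `territories` gone from the data, a lookup like `attributes.get("territories")` may yield no value for every `language` element. Whether the project's own `addTokens` already tolerates that is not visible in this hunk. The sketch below shows one null-tolerant way to split a space-separated attribute; the class and method names are hypothetical, and a plain `Map<String, Set<String>>` stands in for whatever collection type the real code uses.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not the project's addTokens.
public class NullSafeTokens {

    /** Adds each whitespace-separated token of values under key; ignores null or blank input. */
    static void addTokensNullSafe(String key, String values, Map<String, Set<String>> target) {
        if (values == null || values.trim().isEmpty()) {
            return; // attribute absent, nothing to record
        }
        for (String token : values.trim().split("\\s+")) {
            target.computeIfAbsent(key, k -> new TreeSet<>()).add(token);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> languageTerritories = new TreeMap<>();
        addTokensNullSafe("af", null, languageTerritories);        // no entry is created
        addTokensNullSafe("aa", "DJ ER ET", languageTerritories);  // three tokens recorded
        System.out.println(languageTerritories);
    }
}
```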
@@ -2456,10 +2456,9 @@ private void handleSubdivisionContainment(XPathValue parts) {

private void handleLanguageData(XPathValue parts) {
// <languageData>
-// <language type="aa" scripts="Latn" territories="DJ ER ET"/> <!--
+// <language type="aa" scripts="Latn"/> <!--
// Reflecting submitted data, cldrbug #1013 -->
-// <language type="ab" scripts="Cyrl" territories="GE"
-// alt="secondary"/>
+// <language type="ab" scripts="Cyrl" alt="secondary"/>
String language = parts.getAttributeValue(2, "type");
BasicLanguageData languageData = new BasicLanguageData();
languageData.setType(