Merge pull request #144 from Baptistemontan/reduce_icu
Reduce ICU4X footprint
Baptistemontan authored Oct 12, 2024
2 parents 8ada353 + 67ff54b commit 5c4b1e1
Showing 67 changed files with 1,849 additions and 187 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -115,6 +115,8 @@ jobs:
examples:
[
counter,
counter_icu_datagen,
counter_plurals,
counter_ranges,
interpolation,
namespaces,
27 changes: 21 additions & 6 deletions docs/book/src/06_features.md
@@ -26,12 +26,6 @@ This feature must be enabled when building the client in csr mode

Set a cookie to remember the last chosen locale.

#### `sync`

This feature has no impact on the user.
This feature allow the crate to use sync data types such as `Mutex` or `OnceLock`.
Activated when the `actix` or `axum` feature is enabled.

#### `experimental-islands`

This feature is, as its name says, experimental.
@@ -74,3 +68,24 @@ you may have noticed that if you use `cargo-leptos` with `watch-additional-files
This feature uses a "trick": it calls `include_bytes!()` to declare the use of a file, but I'm a bit sceptical about the impact on build time.
I've already checked, and it does not include the bytes in the final binary, even in debug, but it may slow down compilation.
If you use the `nightly` feature, it uses the [path tracking API](https://github.com/rust-lang/rust/issues/99515) instead, so there is no `include_bytes!` trick and none of the possible compile-time slowdown that comes with it.

#### `icu_compiled_data` (Default)

ICU4X is used as the backend for formatting and plurals, and it brings its own data to know what to do for each locale. This is convenient when starting a project without knowing exactly what you need, which is why the feature is enabled by default: things work right out of the box.
However, this baked data can take quite a lot of space in the final binary, as it carries information for all possible locales. If you want to reduce this footprint, you can disable this feature and provide your own data containing only the information you need. See the datagen section in the reduce binary size chapter for more information.

#### `plurals`

Allows the use of plurals in translations.

#### `format_datetime`

Allows the use of the `date`, `time` and `datetime` formatters.

#### `format_list`

Allows the use of the `list` formatter.

#### `format_nums`

Allows the use of the `number` formatter.
2 changes: 2 additions & 0 deletions docs/book/src/SUMMARY.md
@@ -26,5 +26,7 @@
- [Scoping](./usage/08_scoping.md)
- [More Informations](./infos/README.md)
- [Locale Resolution](./infos/01_locale_resol.md)
- [Reduce Binary Size](./reduce_size/README.md)
- [ICU4X Datagen](./reduce_size/01_datagen.md)
- [Features](./06_features.md)
- [Appendix: `i18n Ally` extension for VSC](./appendix_i18n_ally.md)
4 changes: 4 additions & 0 deletions docs/book/src/declare/03_plurals.md
@@ -95,3 +95,7 @@ If you need multiple counts, for example:
```

There isn't a way to represent this in a single key; you will need `Foreign keys`, which you can read about in a later chapter.

## Activate the feature

To use plurals in your translations, enable the "plurals" feature.
16 changes: 14 additions & 2 deletions docs/book/src/declare/08_formatters.md
@@ -59,6 +59,10 @@ This make the variable needed to be `impl leptos_i18n::formatting::NumberFormatt

> \* Is implemented for convenience, but uses [`FixedDecimal::try_from_f64`](https://docs.rs/fixed_decimal/latest/fixed_decimal/struct.FixedDecimal.html#method.try_from_f64) with the floating precision, you may want to use your own.
The formatter itself doesn't provide formatting options such as maximum significant digits, but those can be customized through `FixedDecimal` before being passed to the formatter.

Enable the "format_nums" feature to use the number formatter.

### Arguments

There are no arguments for this formatter at the moment.
@@ -88,6 +92,8 @@ This make the variable needed to be `impl leptos_i18n::formatting::DateFormatter
`IntoIcuDate` is a trait to turn a value into an `impl icu::datetime::input::DateInput`, which is a trait used by `icu` to format dates. The `IntoIcuDate` trait is currently implemented for `T: DateInput<Calendar = AnyCalendar>`.
You can use `icu::datetime::{Date, DateTime}`, or implement that trait for anything you want.

Enable the "format_datetime" feature to use the date formatter.

### Arguments

There is one argument at the moment for the date formatter: `date_length`, which is based on [`icu::datetime::options::length::Date`](https://docs.rs/icu/latest/icu/datetime/options/length/enum.Date.html), that can take 4 values:
@@ -129,6 +135,8 @@ This make the variable needed to be `impl leptos_i18n::formatting::TimeFormatter
`IntoIcuTime` is a trait to turn a value into an `impl icu::datetime::input::TimeInput`, which is a trait used by `icu` to format time. The `IntoIcuTime` trait is currently implemented for `T: IsoTimeInput`.
You can use `icu::datetime::{Time, DateTime}`, or implement that trait for anything you want.

Enable the "format_datetime" feature to use the time formatter.

### Arguments

There is one argument at the moment for the time formatter: `time_length`, which is based on [`icu::datetime::options::length::Time`](https://docs.rs/icu/latest/icu/datetime/options/length/enum.Time.html), that can take 4 values:
@@ -170,6 +178,8 @@ This make the variable needed to be `impl leptos_i18n::formatting::DateTimeForma
`IntoIcuDateTime` is a trait to turn a value into an `impl icu::datetime::input::DateTimeInput`, which is a trait used by `icu` to format datetimes. The `IntoIcuDateTime` trait is currently implemented for `T: DateTimeInput<Calendar = AnyCalendar>`.
You can use `icu::datetime::DateTime`, or implement that trait for anything you want.

Enable the "format_datetime" feature to use the datetime formatter.

### Arguments

There are two arguments at the moment for the datetime formatter: `date_length` and `time_length`, which behave exactly the same as the ones above.
@@ -209,9 +219,11 @@ Will format the list based on the locale.
This requires the variable to be `impl leptos_i18n::formatting::ListFormatterInputFn`, which is automatically implemented for `impl Fn() -> T + Clone + 'static where T: leptos_i18n::formatting::WriteableList`.
`WriteableList` is a trait to turn a value into an `impl Iterator<Item = impl writeable::Writeable>`.

Enable the "format_list" feature to use the list formatter.

### Arguments

There is two arguments at the moment for the datetime formatter: `list_type` and `list_length`.
There are two arguments at the moment for the list formatter: `list_type` and `list_length`.

`list_type` takes 3 possible values:

@@ -229,7 +241,7 @@ See [`Intl.ListFormat`](https://developer.mozilla.org/fr/docs/Web/JavaScript/Ref

```json
{
"short_and_list_formatter": "{{ list_var, list(list_length: ; time_length: full) }}"
"short_and_list_formatter": "{{ list_var, list(list_type: and; list_length: short) }}"
}
```

111 changes: 111 additions & 0 deletions docs/book/src/reduce_size/01_datagen.md
@@ -0,0 +1,111 @@
# ICU4X Datagen

This library uses ICU4X as a backend for formatters and plurals, and the default baked data provider can take quite a lot of space, as it contains information for _every possible locale_. If you only use a few locales, this is a complete waste.

## Disable compiled data

The first step to remove this excess information is to disable the default data provider. It is activated by the `"icu_compiled_data"` feature, which is enabled by default, so turn off default features or remove this feature.

## Custom provider

Great, we saved a lot of space, but now instead of having too much information we have none. You now need to bring your own data provider, which takes several steps.

## 1. Datagen

First, generate the information. You can use [`icu_datagen`](https://docs.rs/icu_datagen/latest/icu_datagen/) for that, either as a CLI or with a build.rs (we will come back to it later).

## 2. Load

Then you need to load this information, which is as simple as:

```rust
include!(concat!(env!("OUT_DIR"), "/baked_data/mod.rs"));

pub struct MyDataProvider;
impl_data_provider!(MyDataProvider);
```

This is explained in the `icu_datagen` documentation.

## 3. Supply the provider to leptos_i18n

You now just need to tell `leptos_i18n` which provider to use. To do so, you first need to implement `IcuDataProvider` for your provider. You can do it manually, as it is straightforward, but the library comes with a derive macro:

```rust
include!(concat!(env!("OUT_DIR"), "/baked_data/mod.rs"));

#[derive(leptos_i18n::custom_provider::IcuDataProvider)]
pub struct MyDataProvider;
impl_data_provider!(MyDataProvider);
```

Then pass it to the `set_icu_data_provider` function when the program starts.
For CSR apps, do it in the main function:

```rust
fn main() {
    leptos_i18n::custom_provider::set_icu_data_provider(MyDataProvider);
    console_error_panic_hook::set_once();
    leptos::mount::mount_to_body(|| leptos::view! { <App /> })
}
```

and for SSR apps, both on hydrate and on server startup:

```rust
#[wasm_bindgen::prelude::wasm_bindgen]
pub fn hydrate() {
    leptos_i18n::custom_provider::set_icu_data_provider(MyDataProvider);
    console_error_panic_hook::set_once();
    leptos::mount::hydrate_body(App);
}
```

```rust
// example for actix
#[actix_web::main]
async fn main() -> std::io::Result<()> {
    leptos_i18n::custom_provider::set_icu_data_provider(MyDataProvider);
    // ..
}
```

## Build.rs datagen

The documentation for ICU4X datagen can be quite intimidating, but it is actually quite straightforward. Your build.rs can look like this:

```rust
use icu_datagen::baked_exporter::*;
use icu_datagen::prelude::*;
use std::path::PathBuf;

fn main() {
    println!("cargo:rerun-if-changed=build.rs");

    let mod_directory = PathBuf::from(std::env::var_os("OUT_DIR").unwrap()).join("baked_data");

    let exporter = BakedExporter::new(mod_directory, Default::default()).unwrap();

    DatagenDriver::new()
        // Keys needed for plurals
        .with_keys(icu_datagen::keys(&[
            "plurals/cardinal@1",
            "plurals/ordinal@1",
        ]))
        // Used locales, no fallback needed
        .with_locales_no_fallback([langid!("en"), langid!("fr")], Default::default())
        .export(&DatagenProvider::new_latest_tested(), exporter)
        .unwrap();
}
```

Here we generate the information for the locales `"en"` and `"fr"`, with the data needed for plurals.
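
The build.rs of the `counter_icu_datagen` example also clears the output directory before exporting, because ICU4X wants that directory to be empty and Rust Analyzer can trigger the build script without cleaning it. A minimal std-only sketch of that step follows; `clear_output_dir` is a hypothetical helper name used here for illustration, not part of `icu_datagen` or `leptos_i18n`:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: removes `dir` and everything under it if it exists,
/// so that the exporter starts from an empty location.
/// Returns `Ok(true)` if a directory was removed, `Ok(false)` if there was nothing to remove.
pub fn clear_output_dir(dir: &Path) -> io::Result<bool> {
    if dir.exists() {
        fs::remove_dir_all(dir)?;
        Ok(true)
    } else {
        Ok(false)
    }
}
```

In a build script, this would presumably be called on the `baked_data` directory right before the exporter is created.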

## Is it worth the trouble?

YES. With `opt-level = "z"` and `lto = true`, the plurals example weighs 394 kB (at the time of writing). Just by providing a custom provider tailored to the locales in use ("en" and "fr"), it shrinks to 248 kB, which is more than a third off the binary size!
I highly suggest taking the time to implement this.
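
To put the two figures above in perspective, the relative saving can be computed directly. This is only a small arithmetic sketch; the function name is made up for illustration:

```rust
/// Percentage by which `after` is smaller than `before`.
/// Both sizes must use the same unit; `before` must be non-zero.
pub fn size_reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}
```

With the sizes quoted above, `size_reduction_percent(394.0, 248.0)` gives roughly 37%.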

## Example

You can take a look at the `counter_icu_datagen` example; it is a copy of the `counter_plurals` example, but with a custom provider.
3 changes: 3 additions & 0 deletions docs/book/src/reduce_size/README.md
@@ -0,0 +1,3 @@
# How To Reduce Binary Size

This chapter is about the few options you have to reduce the binary footprint of this library, other than compiler options such as `opt-level = "z"` and other things that are common to every build.
14 changes: 7 additions & 7 deletions docs/expanded_macros/load_locales.md
@@ -105,9 +105,9 @@ pub mod i18n {
l_i18n_crate::__private::intern(s)
}

fn as_icu_locale(self) -> &'static l_i18n_crate::__private::locid::Locale {
const EN_LANGID: &'static l_i18n_crate::__private::locid::Locale = &l_i18n_crate::__private::locid::locale!("en");
const FR_LANGID: &'static l_i18n_crate::__private::locid::Locale = &l_i18n_crate::__private::locid::locale!("fr");
fn as_icu_locale(self) -> &'static l_i18n_crate::reexports::icu::locid::Locale {
const EN_LANGID: &'static l_i18n_crate::reexports::icu::locid::Locale = &l_i18n_crate::reexports::icu::locid::locale!("en");
const FR_LANGID: &'static l_i18n_crate::reexports::icu::locid::Locale = &l_i18n_crate::reexports::icu::locid::locale!("fr");
match self {
Locale::en => EN_LANGID,
Locale::fr => FR_LANGID,
@@ -138,14 +138,14 @@ pub mod i18n {
}
}

impl core::convert::AsRef<l_i18n_crate::__private::locid::LanguageIdentifier> for Locale {
fn as_ref(&self) -> &l_i18n_crate::__private::locid::LanguageIdentifier {
impl core::convert::AsRef<l_i18n_crate::reexports::icu::locid::LanguageIdentifier> for Locale {
fn as_ref(&self) -> &l_i18n_crate::reexports::icu::locid::LanguageIdentifier {
l_i18n_crate::Locale::as_langid(*self)
}
}

impl core::convert::AsRef<l_i18n_crate::__private::locid::Locale> for Locale {
fn as_ref(&self) -> &l_i18n_crate::__private::locid::Locale {
impl core::convert::AsRef<l_i18n_crate::reexports::icu::locid::Locale> for Locale {
fn as_ref(&self) -> &l_i18n_crate::reexports::icu::locid::Locale {
l_i18n_crate::Locale::as_icu_locale(*self)
}
}
9 changes: 9 additions & 0 deletions examples/csr/counter_icu_datagen/.gitignore
@@ -0,0 +1,9 @@
Cargo.lock
target
dist
!.vscode
node_modules/
/test-results/
/playwright-report/
/blob-report/
/playwright/.cache/
5 changes: 5 additions & 0 deletions examples/csr/counter_icu_datagen/.vscode/extensions.json
@@ -0,0 +1,5 @@
{
"recommendations": [
"lokalise.i18n-ally",
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
languageIds:
- rust

usageMatchRegex:
- "[^\\w\\d]t!\\(\\s*[\\w.:]*,\\s*([\\w.]*)"
- "[^\\w\\d]td!\\(\\s*[\\w.:]*,\\s*([\\w.]*)"
- "[^\\w\\d]td_string!\\(\\s*[\\w.:]*,\\s*([\\w.]*)"
- "[^\\w\\d]td_display!\\(\\s*[\\w.:]*,\\s*([\\w.]*)"

monopoly: true
5 changes: 5 additions & 0 deletions examples/csr/counter_icu_datagen/.vscode/settings.json
@@ -0,0 +1,5 @@
{
"i18n-ally.keystyle": "nested",
"i18n-ally.localesPaths": "locales",
"rust-analyzer.cargo.buildScripts.enable": true
}
32 changes: 32 additions & 0 deletions examples/csr/counter_icu_datagen/Cargo.toml
@@ -0,0 +1,32 @@
[package]
name = "counter_icu_datagen"
version = "0.1.0"
edition = "2021"

[dependencies]
leptos = { version = "0.7.0-gamma2", features = ["csr"] }
leptos_meta = { version = "0.7.0-gamma2" }
leptos_i18n = { path = "../../../leptos_i18n", default-features = false, features = [
"json_files",
"csr",
"plurals",
] }
serde = { version = "1", features = ["derive"] }
console_error_panic_hook = { version = "0.1" }
wasm-bindgen = { version = "0.2" }

icu = { version = "1.5", default-features = false } # turn off compiled_data
icu_provider = "1.5" # for databake
zerovec = "0.10" # for databake

[package.metadata.leptos-i18n]
default = "en"
locales = ["en", "fr"]

[build-dependencies]
icu = "1.5"
icu_datagen = "1.5"

[profile.release]
opt-level = "z"
lto = true
11 changes: 11 additions & 0 deletions examples/csr/counter_icu_datagen/README.md
@@ -0,0 +1,11 @@
# Counter Plurals Example

This example showcases how you can use plurals to display different text based on a count.

## How to run

Simply use `trunk` to run it:

```bash
trunk serve --open
```
28 changes: 28 additions & 0 deletions examples/csr/counter_icu_datagen/build.rs
@@ -0,0 +1,28 @@
use icu_datagen::baked_exporter::*;
use icu_datagen::prelude::*;
use std::path::PathBuf;

fn main() {
    println!("cargo:rerun-if-changed=build.rs");

    let mod_directory = PathBuf::from(std::env::var_os("OUT_DIR").unwrap()).join("baked_data");

    // This isn't really needed, but ICU4X wants the directory to be empty
    // and Rust Analyzer can trigger the build.rs without cleaning the out directory.
    if mod_directory.exists() {
        std::fs::remove_dir_all(&mod_directory).unwrap();
    }

    let exporter = BakedExporter::new(mod_directory, Default::default()).unwrap();

    DatagenDriver::new()
        // Keys needed for plurals
        .with_keys(icu_datagen::keys(&[
            "plurals/cardinal@1",
            "plurals/ordinal@1",
        ]))
        // Used locales, no fallback needed
        .with_locales_no_fallback([langid!("en"), langid!("fr")], Default::default())
        .export(&DatagenProvider::new_latest_tested(), exporter)
        .unwrap();
}