Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We should support running ICU4X tests with JSON data #5420

Open
sffc opened this issue Aug 21, 2024 · 20 comments
Open

We should support running ICU4X tests with JSON data #5420

sffc opened this issue Aug 21, 2024 · 20 comments
Labels
C-data-infra Component: provider, datagen, fallback, adapters

Comments

@sffc
Copy link
Member

sffc commented Aug 21, 2024

In the not-so-distant past, we ran ICU4X docs tests with the icu_testdata crate, which was backed by a mix of JSON and Baked Data files. In 1.3, we switched over to using compiled data on all locales.

While this simplified our build, I think there is still value in running tests against the JSON data:

  1. It is more understandable and debuggable. With JSON, I can change data and then immediately re-run tests without having to re-compile or re-build anything. If I want to tweak data when running a unit test, there is simply not an easy way to do that with baked data. I currently need to build custom baked data against custom CLDR JSON and then set the ICU4X_DATA_DIR variable and hope that everything is up-to-date / in-sync.
  2. Compiled data is slow to compile. Cumulatively, I've spent many hours waiting for icu_datetime and icu_datetime(test) to build in order to iterate on my semantic skeleta work. JSON data doesn't need to be compiled, so it would lead to substantially faster Edit-Build-Test-Debug loop.
  3. I still maintain that there is value in having the entire lifecycle of test data living in the repo, rather than relying on compiled data that is downloaded from GitHub into /tmp and then immediately converted into a machine-readable data format.

My proposal to fix this is:

  1. Add a datagen format that is a code module API-compatible with compiled data but backed by an FS data provider
  2. Restructure the debug JSON data files in the repo to be in this new code module
  3. Add some cargo aliases that make it easy to locally set ICU4X_DATA_DIR
  4. Run it in CI, either as the default job, an additional job, or piggy-backing on one of our specialized jobs like test-gigo

Additional notes:

  • We probably want to move away from "set of locales compiled into all components" because often we need a test locale for only one component at a time, and applying it to all components leads to a lot of redundant test data. For example, adding Finnish as a testdata locale was rejected in Add Finnish to testdata #5360 because it added 526 new files. It would have been nice to add it for only the datetime component where it was needed at the time.
  • We might be able to remove stubdata if we did this, instead pointing IDEs to the new JSON-based compiled data crates.

CC @robertbastian @Manishearth

@sffc sffc added C-data-infra Component: provider, datagen, fallback, adapters discuss Discuss at a future ICU4X-SC meeting labels Aug 21, 2024
@Manishearth
Copy link
Member

I'm open to this.

I'd love to overall reduce the amount of generated stuff in tree though.

@Manishearth
Copy link
Member

actually I'd like us to run tests on all three primary data formats (baked, blob, json), generated at CI time, so that we don't accidentally e.g. break zerovec postcard deserialization

@Manishearth
Copy link
Member

actually I'd like us to run tests on all three primary data formats (baked, blob, json), generated at CI time, so that we don't accidentally e.g. break zerovec postcard deserialization

Proposal: checked in json data, and make tasks that replace it with postcard or baked data in a way that allows transparently running tests.

We could potentially have a passthrough TestDataProvider or something

@sffc
Copy link
Member Author

sffc commented Aug 21, 2024

We should definitely have a test, if we don't already have one, that JSON data can be converted to Postcard and back and PartialEq works on the round-trip. I don't think we get too much value of running tests on top of that? If the deserialization works and produces the same data as JSON/Baked, then the tests should work.

@Manishearth
Copy link
Member

I don't think we get too much value of running tests on top of that?

There are a lot of different efficient ways of reading the data, especially for complex objects like Patterns. So I think there's some value in testing all those codepaths, but this isn't a strong opinipon of mine.

@robertbastian
Copy link
Member

Add a datagen format that is a code module API-compatible with compiled data but backed by an FS data provider

You've proposed this before and it doesn't work because of singletons.

@sffc
Copy link
Member Author

sffc commented Oct 22, 2024

@robertbastian If we make the singleton functions not be const, then we can change them to call a fn to get a &'static reference to the singleton data, and the postcard-backed compiled data can return &'static from a OnceLock.

@sffc sffc added this to the ICU4X 2.0 ⟨P1⟩ milestone Oct 22, 2024
@robertbastian
Copy link
Member

Your proposal sacrifices constness to make this work. Are you also planning to sacrifice infallibility, or do have a solution for infallibly loading postcard data?

@sffc
Copy link
Member Author

sffc commented Oct 22, 2024

Good point. The data crate could export a SingletonError type which is Infallible if it contains baked and DataError if it contains postcard.

@robertbastian
Copy link
Member

If you want to switch between compiled data and postcard without call-site changes, you cannot use .unwrap_infallible() (which is unstable anyway) at the call sites, because you have to handle the postcard error. In any case, returning Result<T, SingletonError> is far less ergonomic than returning T. I also don't think a function signature should ever depend on compile-time-arguments, which I think is your motivation for this.

@sffc
Copy link
Member Author

sffc commented Oct 22, 2024

Yeah, code that wants to work both ways would need to use plain .unwrap(). I agree this isn't great for ergonomics.

I would like to avoid having to feature-gate functions only available in baked+compiled_data mode.

Maybe the postcard deserializer can run on startup in an initialization step that populates the &'static references?

@robertbastian
Copy link
Member

Maybe the postcard deserializer can run on startup in an initialization step that populates the &'static references?

And when it fails it panics?

I don't see how this type of hardcoded postcard data solves any real-world issue.

@sffc
Copy link
Member Author

sffc commented Oct 23, 2024

Real-world issues this solves:

  1. Testing with JSON data, as justified in the OP.
  2. Giving options in the tradeoff between compile time, run time, and binary size.
  3. Allowing a single code base to be backed by different data without changing the code (ideally such a code base would take a data provider argument, but this is not always practical). If in an environment where postcard data is available, you want to be able to build a library that uses compiled_data constructors to read from the postcard data.

What in those objectives are we not aligned on?

@sffc
Copy link
Member Author

sffc commented Oct 23, 2024

For singletons: I feel it would be in the cards to break our "never panic" policy to support this use case, which I think is an important use case but not the primary one. So, ::new() can panic if there's a problem with the postcard. This at least unblocks 2.0 and we can revisit this if it becomes a more popular build configuration.

The 2.0 blocker is the constness of these functions. I don't immediately see a way to keep them const, which is a bit sad. It's also not clear to me what constness actually buys us; the same code inlining will always happen, right? I guess it helps if there is more than 1 call site.

@Manishearth
Copy link
Member

the same code inlining will always happen, right

not quite: we can't guarantee this, but users can guarantee this by wrapping things in a const or static

the ability to stick this in a static is itself rather useful IMO, outside of its ability to force constness.

@Manishearth
Copy link
Member

Manishearth commented Oct 23, 2024

Testing with JSON data, as justified in the OP.

(I suspect Robert was talking about user-facing issues?)

I think this can also be done by moving most non-docs tests over to a TestDataProvider with toggleable backend. I do agree that being able to hack baked constructors leads to the happiest ICU4X dev experience here.

I also don't find myself wanting to do this that often.

Giving options in the tradeoff between compile time, run time, and binary size.

Allowing a single code base to be backed by different data without changing the code (ideally such a code base would take a data provider argument, but this is not always practical). If in an environment where postcard data is available, you want to be able to build a library that uses compiled_data constructors to read from the postcard data.

Generally I think that we have very carefully designed our data provider architecture to make this possible in a relatively convenient way. I actually do think such codebases should take data provider arguments. I think trying to make ::new() support this type of use case is a somewhat futile endeavor, the data provider architecture is always going to be better for this because it is designed for this use case. I'm generally in favor of adding more overlap between API applicability so that you don't have to rearchitect your code when your use case expands "a little bit", but I'm not sure the cost is worth it here.

(I also consider this in the bucket of "major design change that it is too close to 2.0 to make")

What in those objectives are we not aligned on?

I'm not a fan of this framing, it makes it sound like we are not aligned on those objectives. In general I like those objectives but I think they are not as important as having const here where possible. It is possible for us to be in alignment on the objectives, but not the relative prioritization.

@robertbastian
Copy link
Member

@sffc and I agree that this can be 3.0

@Manishearth
Copy link
Member

That also gives us time to see if folks really like our const APIs or not.

@sffc
Copy link
Member Author

sffc commented Oct 23, 2024

Shane/Rob discussion:

  • @sffc I listed out 3 use cases for this.
  • @robertbastian Compile time, sure. I don't think testing is important enough to impact API ergonomics. For global blob data, the assumptions at call sites need to change: unwraps on functions that previously could not fail are now fallible, and you don't want your app to crash given bad postcard data. Your app needs to be re-written with new assumptions. We have the APIs and tooling for clients to do this themselves.
  • @sffc I still maintain that having a single code base work with both global-bake and global-postcard is a use case. I agree that the code needs to make different assumptions, which likely means different function signatures.
  • @robertbastian If you have your postcard at compile time, you can convert it to baked. There is a solution for postcard at compile time, and there's a solution for postcard at runtime, but global postcard at runtime is not something that we can shoehorn into the same API as baked data.
  • @sffc If the postcard comes from the OS, you don't have it at compile time, but in the JSON testing use case, this would work.

Conclusion:

  • Defer "global postcard" design to 3.0
  • Perhaps add tooling to convert test JSON to test bake for running ICU4X unit tests.

LGTM: @sffc @robertbastian

@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Oct 23, 2024
@Manishearth
Copy link
Member

Your app needs to be re-written with new assumptions.

This is a compelling reason for me to feel more strongly that "allowing a single option to be backed by multiple backends" should not be served by ::new(). I already said I feel that such an exercise is somewhat futile to get right, but "your fallibility requirements in real code change" is a good concrete example of this.

LGTM on the conclusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-data-infra Component: provider, datagen, fallback, adapters
Projects
None yet
Development

No branches or pull requests

3 participants