We should support running ICU4X tests with JSON data #5420
Comments
I'm open to this. Overall, though, I'd love to reduce the amount of generated stuff in tree.
Actually, I'd like us to run tests on all three primary data formats (baked, blob, json), generated at CI time, so that we don't accidentally break, e.g., zerovec postcard deserialization.
Proposal: checked-in JSON data, plus make tasks that replace it with postcard or baked data in a way that allows transparently running tests. We could potentially have a passthrough TestDataProvider or something.
We should definitely have a test, if we don't already have one, that JSON data can be converted to Postcard and back and
There are a lot of different efficient ways of reading the data, especially for complex objects like Patterns. So I think there's some value in testing all those codepaths, but this isn't a strong opinion of mine.
You've proposed this before and it doesn't work because of singletons.
@robertbastian If we make the singleton functions not be const, then we can change them to call a fn to get a &'static reference to the singleton data, and the postcard-backed compiled data can return &'static from a OnceLock.
Your proposal sacrifices constness to make this work. Are you also planning to sacrifice infallibility, or do you have a solution for infallibly loading postcard data?
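For illustration, here is a minimal sketch of the OnceLock idea described above. All names (`DecimalSymbols`, `deserialize`, `singleton_baked`, `singleton_deserialized`) are hypothetical, and the two-byte format is a toy stand-in for postcard, not the real ICU4X data layout or API.

```rust
use std::sync::OnceLock;

/// Hypothetical singleton data struct; stands in for a real data struct.
#[derive(Debug, PartialEq)]
pub struct DecimalSymbols {
    pub decimal_separator: char,
    pub grouping_separator: char,
}

/// Baked mode: the data is an ordinary static built at compile time.
static BAKED: DecimalSymbols = DecimalSymbols {
    decimal_separator: '.',
    grouping_separator: ',',
};

/// Stand-in for embedded serialized bytes (toy format, NOT real postcard).
static SERIALIZED: &[u8] = b".,";

/// Hypothetical deserializer; a real one would call into postcard and may fail.
fn deserialize(bytes: &[u8]) -> Option<DecimalSymbols> {
    match bytes {
        [d, g] => Some(DecimalSymbols {
            decimal_separator: *d as char,
            grouping_separator: *g as char,
        }),
        _ => None,
    }
}

/// Baked accessor: could stay `const fn`, since it only references a static.
pub fn singleton_baked() -> &'static DecimalSymbols {
    &BAKED
}

/// Deserializing accessor: cannot be `const`, because the value is built on
/// first use and cached in a `OnceLock` for the rest of the program.
pub fn singleton_deserialized() -> &'static DecimalSymbols {
    static CELL: OnceLock<DecimalSymbols> = OnceLock::new();
    // Panics if the embedded bytes are malformed.
    CELL.get_or_init(|| deserialize(SERIALIZED).expect("embedded data should deserialize"))
}
```

Both accessors have the same non-const signature, so call sites would not change between modes; the cost is that the deserializing one can panic on first use.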
Good point. The data crate could export a
If you want to switch between compiled data and postcard without call-site changes, you cannot use
Yeah, code that wants to work both ways would need to use plain I would like to avoid having to feature-gate functions only available in baked+compiled_data mode. Maybe the postcard deserializer can run on startup in an initialization step that populates the &'static references?
And when it fails it panics? I don't see how this type of hardcoded postcard data solves any real-world issue.
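A rough sketch of what such a startup initialization step could look like, assuming a hypothetical `init` function and a toy 4-byte format in place of postcard. None of these names exist in ICU4X. Deserialization failure surfaces as a `Result` at startup, and the accessor only panics if initialization was skipped or failed.

```rust
use std::sync::OnceLock;

/// Hypothetical singleton payload.
#[derive(Debug, PartialEq)]
pub struct Singleton {
    pub value: u32,
}

/// Hypothetical error type for the initialization step.
#[derive(Debug, PartialEq)]
pub enum InitError {
    Malformed,
    AlreadyInitialized,
}

/// Populated once at startup, then read through `singleton()`.
static DATA: OnceLock<Singleton> = OnceLock::new();

/// Toy parser standing in for a postcard deserializer: 4 little-endian bytes.
fn parse(bytes: &[u8]) -> Result<Singleton, InitError> {
    match bytes {
        [a, b, c, d] => Ok(Singleton {
            value: u32::from_le_bytes([*a, *b, *c, *d]),
        }),
        _ => Err(InitError::Malformed),
    }
}

/// Fallible startup step: the only place where deserialization can fail.
pub fn init(bytes: &[u8]) -> Result<(), InitError> {
    let parsed = parse(bytes)?;
    DATA.set(parsed).map_err(|_| InitError::AlreadyInitialized)
}

/// Accessor with an infallible signature; panics only if `init` never succeeded.
pub fn singleton() -> &'static Singleton {
    DATA.get()
        .expect("init() must succeed before singleton() is called")
}
```

This moves the failure to one explicit place but does not remove the panic path entirely, which is the tradeoff being debated here.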
Real-world issues this solves:
What in those objectives are we not aligned on?
For singletons: I feel it would be in the cards to break our "never panic" policy to support this use case, which I think is an important use case but not the primary one. So, The 2.0 blocker is the constness of these functions. I don't immediately see a way to keep them const, which is a bit sad. It's also not clear to me what constness actually buys us; the same code inlining will always happen, right? I guess it helps if there is more than 1 call site.
Not quite: we can't guarantee this, but users can guarantee this by wrapping things in a const or static. The ability to stick this in a static is itself rather useful IMO, outside of its ability to force constness.
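To make the const/static point concrete, here is a small sketch with a hypothetical `Formatter` type and `const fn new()` constructor (not a real ICU4X type). Initializers of `static` and `const` items must be constant expressions, so the constructor call is evaluated at compile time rather than left to the optimizer.

```rust
/// Hypothetical formatter with a `const` constructor, analogous in spirit
/// to a compiled-data constructor.
pub struct Formatter {
    separator: char,
}

impl Formatter {
    /// `const fn`, so it may be called in `static`/`const` initializers.
    pub const fn new() -> Self {
        Formatter { separator: ',' }
    }

    /// Joins two numbers with the stored separator.
    pub fn join(&self, a: u32, b: u32) -> String {
        format!("{}{}{}", a, self.separator, b)
    }
}

/// Built at compile time; shared by every call site with no runtime init.
pub static FORMATTER: Formatter = Formatter::new();

/// A `const` item also forces compile-time evaluation of the constructor.
pub const FORMATTER_CONST: Formatter = Formatter::new();
```

If the constructor stopped being `const`, both items above would fail to compile, which is the call-site breakage at stake in this discussion.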
(I suspect Robert was talking about user-facing issues?) I think this can also be done by moving most non-docs tests over to a TestDataProvider with toggleable backend. I do agree that being able to hack baked constructors leads to the happiest ICU4X dev experience here. I also don't find myself wanting to do this that often.
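A toy sketch of what a TestDataProvider with a toggleable backend might look like. The `Provider` trait, the backends, and `from_name` are all hypothetical and far simpler than the real icu_provider traits; the text backend parses `key=value` lines as a stand-in for JSON.

```rust
/// Hypothetical minimal provider trait; NOT the real icu_provider API.
pub trait Provider {
    fn load(&self, key: &str) -> Option<String>;
}

/// Backend with data compiled into the code.
pub struct BakedBackend;

impl Provider for BakedBackend {
    fn load(&self, key: &str) -> Option<String> {
        match key {
            "decimal" => Some(".".to_string()),
            "group" => Some(",".to_string()),
            _ => None,
        }
    }
}

/// Toy text backend; parses "key=value" lines as a stand-in for JSON.
pub struct TextBackend(pub &'static str);

impl Provider for TextBackend {
    fn load(&self, key: &str) -> Option<String> {
        self.0.lines().find_map(|line| {
            let (k, v) = line.split_once('=')?;
            if k == key { Some(v.to_string()) } else { None }
        })
    }
}

/// Passthrough provider that forwards to whichever backend was selected.
pub enum TestDataProvider {
    Baked(BakedBackend),
    Text(TextBackend),
}

impl TestDataProvider {
    /// Hypothetical selector; in CI the name could come from an env variable.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "baked" => Some(TestDataProvider::Baked(BakedBackend)),
            "text" => Some(TestDataProvider::Text(TextBackend("decimal=.\ngroup=,"))),
            _ => None,
        }
    }
}

impl Provider for TestDataProvider {
    fn load(&self, key: &str) -> Option<String> {
        match self {
            TestDataProvider::Baked(b) => b.load(key),
            TestDataProvider::Text(t) => t.load(key),
        }
    }
}
```

Tests written against `TestDataProvider` can then run once per backend name without changing the test body.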
Generally I think that we have very carefully designed our data provider architecture to make this possible in a relatively convenient way. I actually do think such codebases should take data provider arguments. I think trying to make (I also consider this in the bucket of "major design change that it is too close to 2.0 to make")
I'm not a fan of this framing, it makes it sound like we are not aligned on those objectives. In general I like those objectives but I think they are not as important as having
@sffc and I agree that this can be 3.0
That also gives us time to see if folks really like our const APIs or not.
Shane/Rob discussion:
Conclusion:
LGTM: @sffc @robertbastian
This is a compelling reason for me to feel more strongly that "allowing a single option to be backed by multiple backends" should not be served by LGTM on the conclusion.
In the not-so-distant past, we ran ICU4X docs tests with the icu_testdata crate, which was backed by a mix of JSON and Baked Data files. In 1.3, we switched over to using compiled data on all locales.

While this simplified our build, I think there is still value in running tests against the JSON data:
icu_datetime and icu_datetime(test) to build in order to iterate on my semantic skeleta work. JSON data doesn't need to be compiled, so it would lead to a substantially faster Edit-Build-Test-Debug loop.
/tmp and then immediately converted into a machine-readable data format.

My proposal to fix this is:
ICU4X_DATA_DIR
test-gigo
Additional notes:
stubdata if we did this, instead pointing IDEs to the new JSON-based compiled data crates.

CC @robertbastian @Manishearth