Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Export/import of JSON metadata #1622

Merged
merged 10 commits into from
Nov 13, 2024
Merged

feat: Export/import of JSON metadata #1622

merged 10 commits into from
Nov 13, 2024

Conversation

zrho
Copy link
Contributor

@zrho zrho commented Oct 31, 2024

Metadata in hugr-core is attached to nodes and to OpDefs, consisting of a map from string names to serde_json::Values. On top of that, OpDef also has a description field. This PR imports and exports node metadata by serializing it to a JSON string and wrapping that string with a prelude.json constructor. It also exports the metadata of OpDefs, in which case the description field can be exported as a string directly. This PR also introduces string escaping for the text format (#1549).

By wrapping the metadata in a JSON type on the hugr-model side, we leave open the option to have typed metadata via the usual term system in the future.

Closes #1631.

@zrho zrho marked this pull request as ready for review October 31, 2024 17:26
@zrho zrho requested a review from a team as a code owner October 31, 2024 17:26
@zrho zrho requested a review from doug-q October 31, 2024 17:26
Copy link

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 74.24242% with 34 lines in your changes missing coverage. Please review.

Project coverage is 85.51%. Comparing base (9962f97) to head (7eab2e2).
Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
hugr-model/src/v0/text/parse.rs 62.50% 13 Missing and 5 partials ⚠️
hugr-core/src/import.rs 59.09% 4 Missing and 5 partials ⚠️
hugr-core/src/export.rs 86.95% 5 Missing and 1 partial ⚠️
hugr-core/src/extension/op_def.rs 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1622      +/-   ##
==========================================
- Coverage   85.78%   85.51%   -0.28%     
==========================================
  Files         136      136              
  Lines       25135    25252     +117     
  Branches    22061    22164     +103     
==========================================
+ Hits        21561    21593      +32     
- Misses       2425     2455      +30     
- Partials     1149     1204      +55     
Flag Coverage Δ
python 92.42% <ø> (-0.04%) ⬇️
rust 84.54% <74.24%> (-0.31%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@doug-q doug-q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some nits

@@ -697,6 +695,44 @@ impl<'a> ParseContext<'a> {
unreachable!("expected a symbol");
}
}

fn parse_string(&self, token: Pair<'a, Rule>) -> ParseResult<&'a str> {
assert_eq!(token.as_rule(), Rule::string);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_eq!(token.as_rule(), Rule::string);
debug_assert_eq!(token.as_rule(), Rule::string);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to debug_assert that the first and last characters are ", but perhaps that is done elsewhere?

Copy link
Contributor Author

@zrho zrho Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the first and last characters are " is guaranteed by the pest grammar for the string rule.

I've made the assert into a debug assert as suggested.

'n' => string.push('\n'),
'r' => string.push('\r'),
't' => string.push('\t'),
_ => unreachable!(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the criteria for allowed escapes? https://www.ietf.org/rfc/rfc4627.txt says any character can be escaped (i.e. \x should be interpreted as x).

Copy link
Contributor Author

@zrho zrho Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.ietf.org/rfc/rfc4627.txt describes how JSON does string escaping, which is not the only way. Note that we use the string to store JSON but the string is not itself a JSON string. These escape codes are closer to how Rust does escaping (which has a nicer unicode escape sequence). More precisely it's a subset of what Rust does (not deliberately). It's also a subset of what the webassembly text format does (which is very similar to Rust strings but not entirely the same). I don't care that much about which escaping system we use so this was picked rather arbitrarily; we could adjust it to taste if you wish.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, I did not understand that the string is not itself a json string.

'r' => string.push('\r'),
't' => string.push('\t'),
_ => unreachable!(),
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this unreachable is panicking on bad user input? We should throw a ParseError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way that the pest library works, at the point we're doing the match we know that the input was valid for the string_escape rule, which is defined as follows:

string_escape  = @{ "\\" ~ ("\"" | "\\" | "n" | "r" | "t") }

So that case is indeed unreachable. This is similar to the other uses of unreachable in the parser and an idiomatic use of pest.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I agree that the case is unreachable. Nevertheless it is better to throw an error here, because there are two sources-of-truth for allowed escapes, which may diverge over time. If you feel strongly feel free to leave this as is.

Copy link
Contributor Author

@zrho zrho Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When these diverge we have a bug. A parse error would indicate that there's something wrong with the input, when in truth the error is in the code. So I think the panic is appropriate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A panic aborts the program, it's very user hostile. We should aim to minimize the impact of bugs, e.g. throw an error here.

@@ -750,6 +786,16 @@ impl ParseError {
InputLocation::Span((offset, _)) => offset,
}
}

fn custom(message: &str, span: pest::Span) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no coverage for this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what to test about this function. It constructs an error.

@@ -1,19 +1,26 @@
---
source: hugr-core/tests/model.rs
expression: "roundtrip(include_str!(\"fixtures/model-call.edn\"))"
expression: "roundtrip(include_str!(\"../../hugr-model/tests/fixtures/model-call.edn\"))"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a shame, is it possible to arrange things so that this cross-crate reference is removed?

Copy link
Contributor Author

@zrho zrho Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move the tests that are in hugr-model to hugr-core, but arguably they don't really belong there since they only test stuff within hugr-model. We can't move the tests in hugr-core to hugr-model since they use stuff from hugr-core. We could copy the fixtures.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that means yes, we could duplicate them. I prefer duplication, up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this for now but create an issue so we can figure out if there is some elegant approach to this.

@@ -21,6 +21,8 @@ type FxIndexSet<T> = IndexSet<T, fxhash::FxBuildHasher>;

pub(crate) const OP_FUNC_CALL_INDIRECT: &str = "func.call-indirect";
const TERM_PARAM_TUPLE: &str = "param.tuple";
const TERM_JSON: &str = "prelude.json";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps theses should these be public? Downstream clients should not have to memorise the string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point yes, and there should be some extension declarations that define these term constructors. I've not made the names public yet since they'll probably change as we move closer to stabilising.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Their propensity to change is the precise reason they should be public now.

Copy link
Collaborator

@doug-q doug-q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving up to you whether or how you address the remaining issues.

@zrho zrho added this pull request to the merge queue Nov 13, 2024
Merged via the queue into main with commit 935c61b Nov 13, 2024
21 of 22 checks passed
@zrho zrho deleted the zrho/model-metadata branch November 13, 2024 10:17
This was referenced Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert metadata between hugr-model and hugr-core
2 participants