chore(jql-parser): switch to winnow #245

yamafaktory · 2023-12-03T22:35:51Z

Why -> https://docs.rs/winnow/latest/winnow/_topic/why/index.html
Another potential benefit is the performance 🚀

epage · 2023-12-04T15:06:51Z

crates/jql-parser/src/parser.rs

@@ -33,32 +32,29 @@ use crate::{
 };

 /// Parses the provided input and map it to the first matching token.
-fn parse_fragment(input: &str) -> IResult<&str, Token> {
+fn parse_fragment<'a>(input: &mut &'a str) -> PResult<Token<'a>> {
    alt((


When all (or most performance critical) elements of a alt have unique first characters, dispatch! can offer some big performance wins

e.g. see https://github.com/toml-rs/toml/blob/main/crates/toml_edit/src/parser/value.rs#L18

Sounds really good! I've just switched to it in a local branch and it works like a charm.

epage · 2023-12-04T15:07:55Z

crates/jql-parser/src/combinators.rs

-    })(input)
+pub(crate) fn parse_number(input: &mut &str) -> PResult<Index> {
+    digit1
+        .try_map(|index: &str| index.parse::<usize>().map(Index))


nit: we do have Parser::parse_to

Oh, actually I tried it and it failed... now I get my error (simply needed to implement FromStr for Index).

epage · 2023-12-04T15:09:24Z

crates/jql-parser/src/combinators.rs

-{
-    trim(delimited(char('"'), take_until(r#"""#), char('"')))
+pub(crate) fn parse_key<'a>(input: &mut &'a str) -> PResult<&'a str> {
+    trim(delimited(DOUBLE_QUOTE, take_until0(r#"""#), DOUBLE_QUOTE)).parse_next(input)


Since you use take_until*, running cargo add winnow -F simd will re-enable optimizations that we made disabled by-default compared to nom

Good to know, thanks!

Actually, a big prevents it for &str, working on a fix

0.5.22 is now out with the fix

That was fast! Thanks a ton.

epage · 2023-12-04T15:11:02Z

crates/jql-parser/src/combinators.rs

-fn parse_keys(input: &str) -> IResult<&str, Vec<&str>> {
-    separated_list1(trim(tag(",")), trim(parse_key()))(input)
+fn parse_keys<'a>(input: &mut &'a str) -> PResult<Vec<&'a str>> {
+    separated(1.., trim(parse_key), trim(COMMA)).parse_next(input)


You likely only need to trim either parse_key or COMMA and not both and then wrap the entire expression in a trim. That'll reduce the amount of redundant operations being performed which might offer a small speed bump.

Personally, I would move that wrapping trim into the caller.

Another advantage of moving trim up is when you get to any alt if you can move the trim to the outside of the alt so it isn't re-doing a trim on every branch of the alt

Thanks a ton for the feedback! Now that I'm looking at my own code, I see how inefficient it is.

epage · 2023-12-04T15:12:55Z

crates/jql-parser/src/combinators.rs

+    trim(delimited(
+        trim(CURLY_BRACE_OPEN),
+        parse_keys,
+        trim(CURLY_BRACE_CLOSE),
+    ))


Case in point about moving the wrapping trim from parse_keys into here. That'd show that

trim(delimited( trim(CURLY_BRACE_OPEN), trim(parse_key), trim(CURLY_BRACE_CLOSE), ))

is happening which has a bunch of redundant trims.

I'd switch to

delimited( CURLY_BRACE_OPEN, trim(parse_key), CURLY_BRACE_CLOSE, )

with the caller also responsible for doing a trim

epage · 2023-12-04T15:14:05Z

crates/jql-parser/src/combinators.rs

-/// A combinator which parses an object range.
-pub(crate) fn parse_object_range<'a>()
-> impl FnMut(&'a str) -> IResult<&'a str, (Option<Index>, Option<Index>)> {
+pub(crate) fn parse_object_index(input: &mut &str) -> PResult<Vec<Index>> {
    trim(delimited(


btw if you wrap these in trace("object index", ...) calls, then debugging becomes really easy with cargo test -F winnow/debug

epage · 2023-12-04T15:14:25Z

crates/jql-parser/src/combinators.rs

 }

 /// A combinator which parses any lens value.
-pub(crate) fn parse_lens_value<'a>() -> impl FnMut(&'a str) -> IResult<&'a str, LensValue> {
+pub(crate) fn parse_lens_value<'a>(input: &mut &'a str) -> PResult<LensValue<'a>> {
    alt((


This could likely also be a dispatch!

epage · 2023-12-04T15:18:37Z

If you feel up to it, I'd love to hear an experience report of using winnow. This was done by the hcl-edit author (here) and it was a big help

Example topics

What helped out compared to past parser experience
What got in the way
What was confusing
Any suggestions for making things more obvious
Any thoughts on further improving naming

epage · 2023-12-04T21:37:00Z

crates/jql-parser/src/parser.rs

@@ -33,32 +32,29 @@ use crate::{
 };

 /// Parses the provided input and map it to the first matching token.
-fn parse_fragment(input: &str) -> IResult<&str, Token> {
+fn parse_fragment<'a>(input: &mut &'a str) -> PResult<Token<'a>> {


Forgot to call out, if you are able to parse &[u8] instead of &str, that also offers big gains

yamafaktory · 2023-12-04T22:28:04Z

Thanks again @epage, I've created a follow-up PR based on your awesome feedback :).

N.B.: since the parser gets its input from Clap as query, I'm not sure if converting it to &[u8] is a good move or not (though the CLI also accepts to read the query as query_from_file but that not the main usage).

epage · 2023-12-04T22:31:36Z

N.B.: since the parser gets its input from Clap as query, I'm not sure if converting it to &[u8] is a good move or not (though the CLI also accepts to read the query as query_from_file but that not the main usage).

The source doesn't matter (for toml_edit we call as_bytes() on a &str). You will need to re-validate any &[u8] you get out to convert it to &str. For toml_edit, whenever the grammar is ASCII only, I use unsafe to bypass the validation. Depending on you aversion to unsafe and the cost of that validation, I'm not sure how much of a gain it will be.

chore(jql-parser): switch to winnow

5909563

yamafaktory added the enhancement New feature or request label Dec 3, 2023

yamafaktory self-assigned this Dec 3, 2023

yamafaktory merged commit dab3434 into main Dec 3, 2023

yamafaktory deleted the chore/winnow branch December 3, 2023 22:41

epage reviewed Dec 4, 2023

View reviewed changes

yamafaktory mentioned this pull request Dec 4, 2023

chore(jql-parser): improve parser implementation based on @epage 's feedback #247

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(jql-parser): switch to winnow #245

chore(jql-parser): switch to winnow #245

yamafaktory commented Dec 3, 2023

epage Dec 4, 2023

yamafaktory Dec 4, 2023

epage Dec 4, 2023

yamafaktory Dec 4, 2023

epage Dec 4, 2023

yamafaktory Dec 4, 2023

epage Dec 4, 2023

epage Dec 4, 2023

yamafaktory Dec 4, 2023

epage Dec 4, 2023

epage Dec 4, 2023

yamafaktory Dec 4, 2023

epage Dec 4, 2023

epage Dec 4, 2023

epage Dec 4, 2023

epage commented Dec 4, 2023

epage Dec 4, 2023

yamafaktory commented Dec 4, 2023

epage commented Dec 4, 2023

chore(jql-parser): switch to winnow #245

chore(jql-parser): switch to winnow #245

Conversation

yamafaktory commented Dec 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

epage commented Dec 4, 2023

Choose a reason for hiding this comment

yamafaktory commented Dec 4, 2023

epage commented Dec 4, 2023