Significant parser slowdown for Latex with complex parameters #2445
Comments
I should note that the current situation is this: because sty files are included in the LatexIndexableSetContributor, all of them are parsed, so if the parser gets stuck on one of them, indexing gets stuck too. Parsing all these files was never the intention (though perhaps a nice test for the parser...) because it is very slow, unnecessary (our indexers are regex-based), and clutters the stub indices (#2433), so I will try to fix that before the next release anyway. |
I agree that the current solution is less than ideal. I only took a quick glance at the changes you made, so I am not 100% sure yet whether my explanation is correct. I think that by reusing certain rules in the parser you gave the key/val rule a path to the |
Well, I don't think I've worked with external rules yet so I don't know how convenient that is. But wasn't the problem you described in #2447 (comment) caused by the keyval key being too restrictive, which is why we had In any case, I prefer whatever is not too slow and is safe (I mean, small probability of it breaking everything like now) |
The problem I described was a consequence of having the choice in the parser which I considered a cleaner alternative to parsing everything - including things that clearly are no key/val pairs - as key/val pairs. Unfortunately, my fix in #2447 only restricted keys to avoid the recursion, but judging from the stack trace the issue reoccurred in values as well. Why your changes (unintentionally) fixed the issue is puzzling though. |
I don't know what would be best, but I would appreciate if you could have a look into external rules. I think the unintuitive API could be alleviated somewhat by then parsing the key-value pairs at application level manually and then providing all the extension functions to replace the parser information - I think this would be the fallback solution if external rules don't work out, as it's probably the same code but then less nicely integrated with the parser. |
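To make the "parse the key-value pairs at application level manually" fallback concrete, here is a minimal sketch in Python. It is not TeXiFy or GrammarKit code: the function names `split_top_level` and `parse_keyvals` are hypothetical, and the only assumption about the input is that braces group a value so that commas and equals signs inside them do not act as separators.

```python
def split_top_level(text, sep, limit=-1):
    """Split text on sep, ignoring separators nested inside {...} groups.

    limit < 0 means "no limit"; otherwise at most `limit` splits are made.
    """
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)  # tolerate unbalanced closing braces
        if ch == sep and depth == 0 and (limit < 0 or len(parts) < limit):
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_keyvals(text):
    """Return (key, value) pairs; value is None for a key without '='."""
    pairs = []
    for item in split_top_level(text, ","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a trailing comma
        key, *rest = split_top_level(item, "=", limit=1)
        pairs.append((key.strip(), rest[0].strip() if rest else None))
    return pairs
```

Because this works on the raw parameter text, it cannot send the grammar into recursion, but the result lives outside the PSI tree, which is the "less nicely integrated" trade-off mentioned above.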
I will keep you updated about any progress. |
As it turned out, the performance issue was something completely unrelated so that's good. And all of the discussion still stands, because while researching that I found some more characters that are not accepted by the parser where they should be. One example is #2692 but there are more like that. |
Interesting, it looked very much like the original parser issue. I hope for the external rules not just to be more robust, but also for the possibility to "opt-in" to key/val parsing instead of a catch-all approach because the list of commands where we actually want key/vals isn't that long at the moment. |
@PHPirates I just realized I never wrote back on this discussion, so here goes my answer, only one year late. 🤦 I did actually look into external rules, but the problem was that in the hand-written parsing code I couldn't find a way to get the necessary context information. So for example, I was trying to "look back" and figure out the command that was parsed previously, so I could decide whether I want to parse the parameters as key/vals or just regular parameters. I will try to resurrect my notes and post a question in the intellij support/grammarkit repo, but I have little hope that external rules are the solution to the problem |
I found a support question that looks similar to what I suggested: https://intellij-support.jetbrains.com/hc/en-us/community/posts/206123179-How-to-produce-a-slightly-different-psiTree-from-astTree-. Instead of remembering the ID though, we would have to remember the command name and then predicate the parameter parsing on that name. If it is a name known to have keyval parameters, we would parse keyval params, otherwise the regular parameters. The question is quite old though, so I will have to test whether it actually works. The topic of dynamic parsing pops up once in a while and there seem to be ways to do it, but the GrammarKit documentation leaves a lot to be desired. |
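The idea of remembering the command name and predicating the parameter parsing on it can be illustrated with a toy Python sketch. This is not GrammarKit or PsiBuilder code: the regex, the function `parse_optional_params`, and the opt-in set `KEYVAL_COMMANDS` are all hypothetical, and the key/value split is deliberately naive (no brace nesting).

```python
import re

# Hypothetical opt-in list of commands whose optional parameter holds key/vals.
KEYVAL_COMMANDS = {"\\includegraphics", "\\lstinputlisting"}

# A command name followed by one optional parameter in square brackets.
COMMAND_WITH_OPTIONAL = re.compile(r"(\\[A-Za-z]+)\s*\[([^\]]*)\]")


def parse_optional_params(source, keyval_commands=KEYVAL_COMMANDS):
    """Return (command, kind, payload) for each command with an optional parameter.

    kind is "keyval" (payload is a dict) when the command is on the opt-in
    list, otherwise "plain" (payload is the raw parameter text).
    """
    results = []
    for match in COMMAND_WITH_OPTIONAL.finditer(source):
        command, raw = match.group(1), match.group(2)
        if command in keyval_commands:
            pairs = {}
            for item in raw.split(","):  # naive split, ignores nested braces
                if not item.strip():
                    continue
                key, sep, value = item.partition("=")
                pairs[key.strip()] = value.strip() if sep else None
            results.append((command, "keyval", pairs))
        else:
            results.append((command, "plain", raw))
    return results
```

For example, with this sketch `\section[Short, title]{Long}` keeps its optional parameter as plain text, while `\includegraphics[width=3cm, draft]{x}` yields a key/value mapping. Whether the same predicate can be expressed inside a generated parser is exactly the open question here.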
@PHPirates I saw a bit too late that the optional param key/val parsing now just parses everything as key/value pairs. So I guess there is no immediate need for the predicated parsing. I have an experiment working nonetheless that parses the optional parameters either as key/value pair or directly as |
Very interesting, thanks for the link. That does sound very interesting, though not necessarily for key-value pairs parsing. I can't think of a direct application but I have the feeling I was looking for something like that in the past. Maybe you want to share your experiment? |
Thanks, I found the PR after writing the experiment. I think the small number of changes necessary in @slideclimb's PR justifies the approach, especially when it works well.
Sure, I pushed the experiment to a branch in my fork: master...fberlakovich:TeXiFy-IDEA:parser-experiment. Obviously the experiment is ugly and ignores several edge cases, but you get the idea. I also think it could be useful in the future, given the high amount of dynamism in Latex. |
Thanks! I made note of it so I can find it back when needed. |
If I open the source of the tkz-tab package, the parser seems to get stuck (might be an infinite loop) on parsing some strict-keyval-pair. Perhaps you can have a look? You can find the file in your local latex distribution or at https://github.com/tkz-sty/tkz-tab/blob/master/latex/tkz-tab.sty
Originally posted by @PHPirates in #2388 (comment)