chore: lazy load regexes to save memory. #900
I believe @anuraaga can help us get much more interesting numbers.
Codecov Report
Additional details and impacted files:
@@            Coverage Diff            @@
##             main     #900     +/-   ##
==========================================
+ Coverage   81.56%   81.74%   +0.18%
==========================================
  Files         160      162       +2
  Lines        9051     9082      +31
==========================================
+ Hits         7382     7424      +42
+ Misses       1417     1409       -8
+ Partials      252      249       -3
Flags with carried forward coverage won't be shown.
We should look at the FTW benchmark numbers we have to see the effect on runtime. I guess it will be in the noise, though there is some overhead in sync.Once compared to not using it.
More importantly, this puts regex compilation on the hot path of first-request processing. Given how expensive regex compilation is, I think this could basically guarantee a failed first request, for example by missing a deadline. I suspect it's not worth it.
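To make the trade-off concrete, here is a minimal sketch of lazy compilation guarded by sync.Once. The lazyRegexp type and newLazyRegexp constructor are hypothetical names for illustration and are not taken from this PR or from Coraza's API; the sketch uses the standard library regexp package rather than any connector-provided compiler.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// lazyRegexp is a hypothetical helper (not Coraza's API) that defers
// regexp.Compile until the first call to MatchString.
type lazyRegexp struct {
	pattern string
	once    sync.Once
	re      *regexp.Regexp
	err     error
}

func newLazyRegexp(pattern string) *lazyRegexp {
	// Nothing is compiled here, so an unused pattern costs only its string.
	return &lazyRegexp{pattern: pattern}
}

// MatchString compiles the pattern on first use. sync.Once guarantees the
// compile runs at most once, even with concurrent callers; the first caller
// pays the full compilation cost, later callers only pay the Once check.
func (l *lazyRegexp) MatchString(s string) (bool, error) {
	l.once.Do(func() {
		l.re, l.err = regexp.Compile(l.pattern)
	})
	if l.err != nil {
		return false, l.err
	}
	return l.re.MatchString(s), nil
}

func main() {
	r := newLazyRegexp(`(?i)union\s+select`)
	ok, err := r.MatchString("1 UNION SELECT password")
	fmt.Println(ok, err)
}
```

Note that in this sketch an invalid pattern is only reported on the first match attempt, not at construction time, which is the same "cost moves to the first request" concern in another form.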
@anuraaga made some changes specifically to skip compilation details in here, and instead allowed connectors to plug in their own regex compiler, which I think is still beneficial for single-threaded environments (e.g. wasm). If this sounds reasonable to you, I wonder whether, instead of making API changes here, we should apply this logic in https://github.com/corazawaf/coraza-wasilibs/blob/main/rx.go, where most of the regexes are managed. Something like
would do the trick and avoid changes in here.
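The snippet from that comment is not reproduced here. As a rough illustration only of the general idea (keeping laziness inside the operator implementation so callers see no API change), a sketch could look like the following. The Operator interface, lazyOperator, rxOperator and newLazyRX are all hypothetical stand-ins; the real operator interface in Coraza or coraza-wasilibs may have a different signature.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// Operator is a hypothetical stand-in for the operator interface a
// connector library implements; the real signature may differ.
type Operator interface {
	Evaluate(value string) bool
}

// rxOperator is an eagerly compiled regex operator.
type rxOperator struct{ re *regexp.Regexp }

func (o *rxOperator) Evaluate(value string) bool { return o.re.MatchString(value) }

// lazyOperator builds the wrapped operator on the first Evaluate call.
// Callers only see the Operator interface, so no API change is needed.
type lazyOperator struct {
	once  sync.Once
	build func() (Operator, error)
	op    Operator
	err   error
}

func (l *lazyOperator) Evaluate(value string) bool {
	l.once.Do(func() { l.op, l.err = l.build() })
	if l.err != nil {
		// Assumption for this sketch: a failed build means "no match".
		return false
	}
	return l.op.Evaluate(value)
}

// newLazyRX returns an Operator whose regex is compiled on first use.
func newLazyRX(pattern string) Operator {
	return &lazyOperator{build: func() (Operator, error) {
		re, err := regexp.Compile(pattern)
		if err != nil {
			return nil, err
		}
		return &rxOperator{re: re}, nil
	}}
}

func main() {
	op := newLazyRX(`^/admin(/|$)`)
	fmt.Println(op.Evaluate("/admin/users"), op.Evaluate("/public"))
}
```

Whether silently returning false on a build error is acceptable is a design decision; a real implementation would more likely validate the pattern when the rule is parsed and only defer the expensive compiled form.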
This PR aims to lazy-load regexes so that we don't keep in memory regexes that are never used (e.g. those in rules beyond our paranoia level) and only compile them when we need them.
Some numbers I got
main
this branch
Code was: