-
-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: adds memoize implementation for regexes and ahocorasick #836
Merged
Merged
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
18a78cb
chore: adds memoize implementation for regexes.
jcchavezs 23382ac
docs: adds comments to code.
jcchavezs 4dae331
Merge branch 'main' into memoize_regex
jptosso be2fc24
chore: simplify the memoize package by using sync.Map.
jcchavezs b8df0c2
Merge branch 'memoize_regex' of github.com:corazawaf/coraza into memo…
jcchavezs 73be18c
feat: extends memoize to ahocorasick and allow impl for tinygo but no…
jcchavezs d43a91f
tests: covers memoize_builders in tinygo.
jcchavezs 62043be
Merge branch 'main' into memoize_regex
jptosso 3ed990b
chore: fixes nosync for tinygo.
jcchavezs 8d52562
docs: updates docs.
jcchavezs File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -47,3 +47,6 @@ jobs: | |
|
||
- name: Tests | ||
run: tinygo test ./... | ||
|
||
- name: Tests memoize | ||
run: tinygo test -tags=memoize_builders ./... |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ | |
"github.com/corazawaf/coraza/v3/experimental/plugins/macro" | ||
"github.com/corazawaf/coraza/v3/experimental/plugins/plugintypes" | ||
"github.com/corazawaf/coraza/v3/internal/corazarules" | ||
"github.com/corazawaf/coraza/v3/internal/memoize" | ||
"github.com/corazawaf/coraza/v3/types" | ||
"github.com/corazawaf/coraza/v3/types/variables" | ||
) | ||
|
@@ -456,7 +457,12 @@ | |
var re *regexp.Regexp | ||
if len(key) > 2 && key[0] == '/' && key[len(key)-1] == '/' { | ||
key = key[1 : len(key)-1] | ||
re = regexp.MustCompile(key) | ||
|
||
if vare, err := memoize.Do(key, func() (interface{}, error) { return regexp.Compile(key) }); err != nil { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should be worth extracting a function for the two usages There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You mean for the regex and binaryregex? |
||
return err | ||
} else { | ||
re = vare.(*regexp.Regexp) | ||
} | ||
} | ||
|
||
if multiphaseEvaluation { | ||
|
@@ -521,7 +527,11 @@ | |
var re *regexp.Regexp | ||
if len(key) > 2 && key[0] == '/' && key[len(key)-1] == '/' { | ||
key = key[1 : len(key)-1] | ||
re = regexp.MustCompile(key) | ||
if vare, err := memoize.Do(key, func() (interface{}, error) { return regexp.Compile(key) }); err != nil { | ||
return err | ||
} else { | ||
re = vare.(*regexp.Regexp) | ||
} | ||
} | ||
// Prevent sigsev | ||
if r == nil { | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Memoize | ||
|
||
Memoize allows to cache certain expensive function calls and | ||
cache the result. The main advantage in Coraza is to memoize | ||
the regexes and aho-corasick dictionaries when the connects | ||
spins up more than one WAF in the same process and hence same | ||
regexes are being compiled over and over. | ||
|
||
Currently it is opt-in under the `memoize_builders` build tag | ||
as under a misuse (e.g. using after build time) it could lead | ||
to a memory leak as currently the cache is global. | ||
|
||
**Important:** Connectors with *live reload* functionality (e.g. Caddy) | ||
could lead to memory leaks which might or might not be negligible in | ||
most of the cases as usually config changes in a WAF are about a few | ||
rules, this is old objects will be still alive in memory until the program | ||
stops. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
// Copyright 2023 Juan Pablo Tosso and the OWASP Coraza contributors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
//go:build !memoize_builders | ||
|
||
package memoize | ||
|
||
func Do(_ string, fn func() (interface{}, error)) (interface{}, error) { | ||
return fn() | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
// Copyright 2023 Juan Pablo Tosso and the OWASP Coraza contributors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
//go:build tinygo && memoize_builders | ||
|
||
package memoize | ||
|
||
import "sync" | ||
|
||
var doer = makeDoer(new(sync.Map)) | ||
|
||
// Do executes and returns the results of the given function, unless there was a cached | ||
// value of the same key. Only one execution is in-flight for a given key at a time. | ||
// The boolean return value indicates whether v was previously stored. | ||
func Do(key string, fn func() (interface{}, error)) (interface{}, error) { | ||
value, err, _ := doer(key, fn) | ||
return value, err | ||
} | ||
|
||
// makeDoer returns a function that executes and returns the results of the given function | ||
func makeDoer(cache *sync.Map) func(string, func() (interface{}, error)) (interface{}, error, bool) { | ||
return func(key string, fn func() (interface{}, error)) (interface{}, error, bool) { | ||
// Check cache | ||
value, found := cache.Load(key) | ||
if found { | ||
return value, nil, true | ||
} | ||
|
||
data, err := fn() | ||
if err == nil { | ||
cache.Store(key, data) | ||
} | ||
|
||
return data, err, false | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,167 @@ | ||
// Copyright 2023 Juan Pablo Tosso and the OWASP Coraza contributors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
//go:build tinygo && memoize_builders | ||
|
||
// https://github.com/kofalt/go-memoize/blob/master/memoize.go | ||
|
||
package memoize | ||
|
||
import ( | ||
"errors" | ||
"sync" | ||
"testing" | ||
) | ||
|
||
func TestDo(t *testing.T) { | ||
expensiveCalls := 0 | ||
|
||
// Function tracks how many times its been called | ||
expensive := func() (interface{}, error) { | ||
expensiveCalls++ | ||
return expensiveCalls, nil | ||
} | ||
|
||
// First call SHOULD NOT be cached | ||
result, err := Do("key1", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 1, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
// Second call on same key SHOULD be cached | ||
result, err = Do("key1", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 1, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
// First call on a new key SHOULD NOT be cached | ||
result, err = Do("key2", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 2, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
} | ||
|
||
func TestSuccessCall(t *testing.T) { | ||
do := makeDoer(new(sync.Map)) | ||
|
||
expensiveCalls := 0 | ||
|
||
// Function tracks how many times its been called | ||
expensive := func() (interface{}, error) { | ||
expensiveCalls++ | ||
return expensiveCalls, nil | ||
} | ||
|
||
// First call SHOULD NOT be cached | ||
result, err, cached := do("key1", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 1, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := false, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
|
||
// Second call on same key SHOULD be cached | ||
result, err, cached = do("key1", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 1, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := true, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
|
||
// First call on a new key SHOULD NOT be cached | ||
result, err, cached = do("key2", expensive) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 2, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := false, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
} | ||
|
||
func TestFailedCall(t *testing.T) { | ||
do := makeDoer(new(sync.Map)) | ||
|
||
calls := 0 | ||
|
||
// This function will fail IFF it has not been called before. | ||
twoForTheMoney := func() (interface{}, error) { | ||
calls++ | ||
|
||
if calls == 1 { | ||
return calls, errors.New("Try again") | ||
} else { | ||
return calls, nil | ||
} | ||
} | ||
|
||
// First call should fail, and not be cached | ||
result, err, cached := do("key1", twoForTheMoney) | ||
if err == nil { | ||
t.Fatalf("expected error") | ||
} | ||
|
||
if want, have := 1, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := false, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
|
||
// Second call should succeed, and not be cached | ||
result, err, cached = do("key1", twoForTheMoney) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 2, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := false, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
|
||
// Third call should succeed, and be cached | ||
result, err, cached = do("key1", twoForTheMoney) | ||
if err != nil { | ||
t.Fatalf("unexpected error: %s", err.Error()) | ||
} | ||
|
||
if want, have := 2, result.(int); want != have { | ||
t.Fatalf("unexpected value, want %d, have %d", want, have) | ||
} | ||
|
||
if want, have := true, cached; want != have { | ||
t.Fatalf("unexpected caching, want %t, have %t", want, have) | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
// Copyright 2023 Juan Pablo Tosso and the OWASP Coraza contributors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
//go:build !tinygo && memoize_builders | ||
|
||
// https://github.com/kofalt/go-memoize/blob/master/memoize.go | ||
|
||
package memoize | ||
|
||
import ( | ||
"sync" | ||
|
||
"golang.org/x/sync/singleflight" | ||
) | ||
|
||
var doer = makeDoer(new(sync.Map), new(singleflight.Group)) | ||
|
||
// Do executes and returns the results of the given function, unless there was a cached | ||
// value of the same key. Only one execution is in-flight for a given key at a time. | ||
// The boolean return value indicates whether v was previously stored. | ||
func Do(key string, fn func() (interface{}, error)) (interface{}, error) { | ||
value, err, _ := doer(key, fn) | ||
return value, err | ||
} | ||
|
||
// makeDoer returns a function that executes and returns the results of the given function | ||
func makeDoer(cache *sync.Map, group *singleflight.Group) func(string, func() (interface{}, error)) (interface{}, error, bool) { | ||
return func(key string, fn func() (interface{}, error)) (interface{}, error, bool) { | ||
// Check cache | ||
value, found := cache.Load(key) | ||
if found { | ||
return value, nil, true | ||
} | ||
|
||
// Combine memoized function with a cache store | ||
value, err, _ := group.Do(key, func() (interface{}, error) { | ||
data, innerErr := fn() | ||
if innerErr == nil { | ||
cache.Store(key, data) | ||
} | ||
|
||
return data, innerErr | ||
}) | ||
|
||
return value, err, false | ||
} | ||
} |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @anuraaga