-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1/topdown/graphql: Cache GraphQL schema parse results (#5377) #7457
base: main
Are you sure you want to change the base?
v1/topdown/graphql: Cache GraphQL schema parse results (#5377) #7457
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
f9865c2
to
77cd7a2
Compare
…gent#5377) This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once. Queries are not cached. pkg: github.com/open-policy-agent/opa/v1/topdown cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16 15519 100178 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16 371311 3383 ns/op BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16 66594 18093 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16 6429 173773 ns/op BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16 6523 170819 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16 16352 72777 ns/op BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16 16083 73548 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16 14320 83589 ns/op BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16 71486 15463 ns/op BenchmarkGraphQLParse/Trivial_Schema_-_string-16 3380 321490 ns/op BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16 13909 87633 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16 3435 327646 ns/op BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16 13844 85213 ns/op PASS ok github.com/open-policy-agent/opa/v1/topdown 112.465s Resolves: open-policy-agent#5377 Signed-off-by: Rob Myers <1243316+robmyersrobmyers@users.noreply.github.com>
77cd7a2
to
c8871c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! 😃
Just a couple of questions/comments.
// Compute a constant size key for use with the cache | ||
func cacheKeyWithPrefix(bctx BuiltinContext, operands []*ast.Term, prefix string) (string, bool) { | ||
var cacheKey ast.String | ||
var ok bool = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: superfluous type declaration.
|
||
if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_ast"); keyOk { | ||
key = k | ||
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend using a named value cache here. The benefits are:
- Caching is isolated to the GraphQL built-in, and entries aren't "competing" with other built-ins
No need to prefix the key, for above reason(looks like we need internal differentiation, though)- Named caches can be individually configured (complete with individual default settings)
See the io.jwt.* built-ins for inspiration.
var key string | ||
var schemaDoc *gqlast.SchemaDocument | ||
|
||
if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_doc"); keyOk { | ||
key = k | ||
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok { | ||
var isSchema bool | ||
schemaDoc, isSchema = val.(*gqlast.SchemaDocument) | ||
if !isSchema { | ||
return key, nil | ||
} | ||
} | ||
} | ||
return key, schemaDoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: just a bit simplified for readability
var key string | |
var schemaDoc *gqlast.SchemaDocument | |
if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_doc"); keyOk { | |
key = k | |
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok { | |
var isSchema bool | |
schemaDoc, isSchema = val.(*gqlast.SchemaDocument) | |
if !isSchema { | |
return key, nil | |
} | |
} | |
} | |
return key, schemaDoc | |
if key, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_ast"); keyOk { | |
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok { | |
if schemaAST, isAstValue := val.(ast.Value); isAstValue { | |
return key, schemaAST | |
} | |
} | |
return key, nil | |
} | |
return "", nil |
|
||
// Validate the query against the schema, erroring if there's an issue. | ||
schema, err = convertSchema(schemaDoc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, schema
could already be non-nil, e.g. if builtinGraphQLIsValid
has been called? In such case, there is no need to re-parse and re-convert the schema, and we can just create querySchema
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we still need to do the parsing, though, to create schemaASTValue
🤔 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, we're caching the schema doc in other places. But maybe storing all three cache items would make the cache balloon undesirably ..
} | ||
|
||
// Validate the query against the schema, erroring if there's an issue. | ||
schema, err = convertSchema(schemaDoc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for builtinGraphQLParse
.
} | ||
if err != nil { | ||
return iter(ast.InternedBooleanTerm(false)) | ||
// Schemas are only cached if they are valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could store a struct that keeps track of both a possibly valid schema and a boolean validity flag; to not need to re-parse broken schemas.
{"Trivial Schema with cache - string", ast.NewTerm(ast.String(employeeGQLSchema)), valueCache, ast.BooleanTerm(true)}, | ||
{"Trivial Schema - AST object", ast.NewTerm(ast.MustParseTerm(employeeGQLSchemaAST).Value.(ast.Object)), nil, ast.BooleanTerm(true)}, | ||
{"Trivial Schema with cache - AST object", ast.NewTerm(ast.MustParseTerm(employeeGQLSchemaAST).Value.(ast.Object)), valueCache, ast.BooleanTerm(true)}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For readability, it'd help if these were multi-line, as in your other bench tests.
This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once.
Queries are not cached.
Resolves: #5377
Why the changes in this PR are needed?
GraphQL schema parsing can be an expensive operation, so caching can help speed things up.
What are the changes in this PR?
This PR leverages the InterQueryBuiltinValueCache to store parsed GraphQL schema data.
Notes to assist PR review:
The performance improvements are more dramatic on more complex schemas, but the complex schemas were omitted from the included test cases because they are quite large. Those numbers are captured in the issue thread.
Further comments:
The GraphQL builtin code has some repeated patterns which could probably be cleaned up, but I felt that was out of scope for adding a caching layer.