Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1/topdown/graphql: Cache GraphQL schema parse results (#5377) #7457

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

robmyersrobmyers
Copy link

@robmyersrobmyers robmyersrobmyers commented Mar 20, 2025

This commit stores parsed GraphQL schemas to the cache, which improves the performance of GraphQL operations that parse the schema more than once.

Queries are not cached.

pkg: github.com/open-policy-agent/opa/v1/topdown
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16	                   15519            100178 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16               371311              3383 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16                            2133            542355 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16                 3471            528579 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-16                        7105            193325 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16            66594             18093 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16                              6429            173773 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16                   6523            170819 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16                               16352             72777 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16                    16083             73548 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16                                 14320             83589 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16                      71486             15463 ns/op
BenchmarkGraphQLParse/Trivial_Schema_-_string-16                                    3380            321490 ns/op
BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16                        13909             87633 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16                           3435            327646 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16               13844             85213 ns/op
PASS
ok      github.com/open-policy-agent/opa/v1/topdown     112.465s

Resolves: #5377

Why the changes in this PR are needed?

GraphQL schema parsing can be an expensive operation, so caching can help speed things up.

What are the changes in this PR?

This PR leverages the InterQueryBuiltinValueCache to store parsed GraphQL schema data.

Notes to assist PR review:

The performance improvements are more dramatic on more complex schemas, but the complex schemas were omitted from the included test cases because they are quite large. Those numbers are captured in the issue thread.

Further comments:

The GraphQL builtin code has some repeated patterns which could probably be cleaned up, but I felt that was out of scope for adding a caching layer.

Copy link

netlify bot commented Mar 20, 2025

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit 77cd7a2
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/67dc3dbd92aeb40008019b0a
😎 Deploy Preview https://deploy-preview-7457--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

…gent#5377)

This commit stores parsed GraphQL schemas to the cache, which improves
the performance of GraphQL operations that parse the schema more than once.

Queries are not cached.

pkg: github.com/open-policy-agent/opa/v1/topdown
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16	                   15519            100178 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16               371311              3383 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16            66594             18093 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16                              6429            173773 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16                   6523            170819 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16                               16352             72777 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16                    16083             73548 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16                                 14320             83589 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16                      71486             15463 ns/op
BenchmarkGraphQLParse/Trivial_Schema_-_string-16                                    3380            321490 ns/op
BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16                        13909             87633 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16                           3435            327646 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16               13844             85213 ns/op
PASS
ok      github.com/open-policy-agent/opa/v1/topdown     112.465s

Resolves: open-policy-agent#5377

Signed-off-by: Rob Myers <1243316+robmyersrobmyers@users.noreply.github.com>
Copy link
Contributor

@johanfylling johanfylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! 😃

Just a couple of questions/comments.

// Compute a constant size key for use with the cache
func cacheKeyWithPrefix(bctx BuiltinContext, operands []*ast.Term, prefix string) (string, bool) {
var cacheKey ast.String
var ok bool = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: superfluous type declaration.


if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_ast"); keyOk {
key = k
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend using a named value cache here. The benefits are:

  • Caching is isolated to the GraphQL built-in, and entries aren't "competing" with other built-ins
  • No need to prefix the key, for above reason (looks like we need internal differentiation, though)
  • Named caches can be individually configured (complete with individual default settings)

See the io.jwt.* built-ins for inspiration.

Comment on lines +579 to +592
var key string
var schemaDoc *gqlast.SchemaDocument

if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_doc"); keyOk {
key = k
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok {
var isSchema bool
schemaDoc, isSchema = val.(*gqlast.SchemaDocument)
if !isSchema {
return key, nil
}
}
}
return key, schemaDoc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: just a bit simplified for readability

Suggested change
var key string
var schemaDoc *gqlast.SchemaDocument
if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_doc"); keyOk {
key = k
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok {
var isSchema bool
schemaDoc, isSchema = val.(*gqlast.SchemaDocument)
if !isSchema {
return key, nil
}
}
}
return key, schemaDoc
if key, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_ast"); keyOk {
if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok {
if schemaAST, isAstValue := val.(ast.Value); isAstValue {
return key, schemaAST
}
}
return key, nil
}
return "", nil


// Validate the query against the schema, erroring if there's an issue.
schema, err = convertSchema(schemaDoc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, schema could already be non-nil, e.g. if builtinGraphQLIsValid has been called? In such case, there is no need to re-parse and re-convert the schema, and we can just create querySchema, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we still need to do the parsing, though, to create schemaASTValue 🤔 .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, we're caching the schema doc in other places. But maybe storing all three cache items would make the cache balloon undesirably ..

}

// Validate the query against the schema, erroring if there's an issue.
schema, err = convertSchema(schemaDoc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as for builtinGraphQLParse.

}
if err != nil {
return iter(ast.InternedBooleanTerm(false))
// Schemas are only cached if they are valid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could store a struct that keeps track of both a possibly valid schema and a boolean validity flag; to not need to re-parse broken schemas.

{"Trivial Schema with cache - string", ast.NewTerm(ast.String(employeeGQLSchema)), valueCache, ast.BooleanTerm(true)},
{"Trivial Schema - AST object", ast.NewTerm(ast.MustParseTerm(employeeGQLSchemaAST).Value.(ast.Object)), nil, ast.BooleanTerm(true)},
{"Trivial Schema with cache - AST object", ast.NewTerm(ast.MustParseTerm(employeeGQLSchemaAST).Value.(ast.Object)), valueCache, ast.BooleanTerm(true)},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, it'd help if these were multi-line, as in your other bench tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Possible to cache graphql.parse() results?
2 participants