-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
go/vt/callerid: speedup callerid context access #16352
Conversation
Relatively minor, but this is called and used relatively frequently, plus a very easy win. `callerid.NewContext` combines both CallerID and VTGateCallerID into a singular `Context.WithValue`. This is slightly more performant because each `WithValue` wraps a new layer of `Context`, which also means when doing `Context.Value`, this gets searched recursively through each layer. Reducing this to 1 "tuple" means 1 less layer of Context wrapping, and these are always both set together through `NewContext`. As you can see, this ultimately makes `EffectiveCallerIDFromContext` slightly faster due to it being 1 less layer to unwrap. ``` $ benchstat {old,new}.txt goos: darwin goarch: arm64 pkg: vitess.io/vitess/go/vt/callerid │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ CallerIDContext/New-10 52.92n ± 0% 44.20n ± 3% -16.46% (p=0.000 n=10) CallerIDContext/EffectiveCallerIDFromContext-10 7.031n ± 0% 3.435n ± 0% -51.14% (p=0.000 n=10) CallerIDContext/ImmediateCallerIDFromContext-10 6.880n ± 2% 3.527n ± 1% -48.73% (p=0.000 n=10) geomean 13.68n 8.121n -40.63% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ CallerIDContext/New-10 96.00 ± 0% 64.00 ± 0% -33.33% (p=0.000 n=10) CallerIDContext/EffectiveCallerIDFromContext-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ CallerIDContext/ImmediateCallerIDFromContext-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² -12.64% ² ¹ all samples are equal ² summaries must be >0 to compute geomean │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ CallerIDContext/New-10 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=10) ¹ CallerIDContext/EffectiveCallerIDFromContext-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ CallerIDContext/ImmediateCallerIDFromContext-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ geomean ² +0.00% ² ¹ all samples are equal ² summaries must be >0 to compute geomean ``` Signed-off-by: Matt Robenolt <matt@ydekproductions.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
} | ||
return nil | ||
} | ||
|
||
// ImmediateCallerIDFromContext returns the ImmediateCallerID(querypb.VTGateCallerID) | ||
// stored in the Context, if any | ||
func ImmediateCallerIDFromContext(ctx context.Context) *querypb.VTGateCallerID { | ||
im, ok := ctx.Value(immediateCallerIDKey).(*querypb.VTGateCallerID) | ||
if ok && im != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also removed the extraneous != nil
check, which wasn't actually free. But is entirely unnecessary to check. If it's nil, can simply return nil.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #16352 +/- ##
==========================================
- Coverage 68.70% 68.69% -0.02%
==========================================
Files 1547 1547
Lines 198287 198284 -3
==========================================
- Hits 136239 136212 -27
- Misses 62048 62072 +24 ☔ View full report in Codecov by Sentry. |
ef *vtrpcpb.CallerID | ||
im *querypb.VTGateCallerID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd call these effective
and immediate
for (a) readability and (b) so that people can correlate them to concepts they are already familiar with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can do that. I yanked ef
and im
since they were used throughout this file already for function args, etc.
// internal Context key for effective CallerID | ||
effectiveCallerIDKey callerIDKey = 1 | ||
) | ||
type callerIDContextKey struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously the keys were integers. Now we have only one key and it is an empty struct. Is it actually more efficient to use an empty struct as the key versus an integer value of 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An int is 4 or 8 bytes of storage whereas an empty struct is 0 bytes AFAIUI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a zero struct occupies no extra space, it's relatively insignificant here since we're talking about a single integer, but a zero sized struct has a constant pointer in the Go runtime and is effectively a pointer to one address no matter the type. Specifically this runtime.zerobase
.
The comparison has to be done with an interface because of the any
in Context
, so an interface just holds a reference to what type + the value, in which case, it's effectively only a ref to the type without a value, since the value is nothing, and the comparison is just on the type, rather than an interface of {type: int, val: 1}
or whatever.
The docs also suggest using empty structs for keys: https://pkg.go.dev/context#WithValue
To avoid allocating when assigning to an interface{}, context keys often have concrete type struct{}. Alternatively, exported context key variables' static type should be a pointer or interface.
Examples of this can be seen through the stdlib codebase.
https://github.com/golang/go/blob/a71bb570d064f6d0c1cf59cef4b7a044be7a39d3/src/runtime/trace/annotation.go#L14
https://github.com/golang/go/blob/a71bb570d064f6d0c1cf59cef4b7a044be7a39d3/src/net/http/transport.go#L2809
https://github.com/golang/go/blob/a71bb570d064f6d0c1cf59cef4b7a044be7a39d3/src/runtime/pprof/label.go#L25
etc
I doubt in practice this makes a tangible difference, but it is the established pattern in stdlib for context keys. If they were to make compiler optimizations for this, I imagine they'd target this pattern specifically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. TiL
This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:
If no action is taken within 7 days, this PR will be closed. |
This PR was closed because it has been stale for 7 days with no activity. |
Relatively minor, but this is called and used relatively frequently, plus a very easy win.
callerid.NewContext
combines both CallerID and VTGateCallerID into a singularContext.WithValue
.This is slightly more performant because each
WithValue
wraps a new layer ofContext
, which also means when doingContext.Value
, this gets searched recursively through each layer.Reducing this to 1 "tuple" means 1 less layer of Context wrapping, and these are always both set together through
NewContext
.As you can see, this ultimately makes
EffectiveCallerIDFromContext
slightly faster due to it being 1 less layer to unwrap.Checklist