Convert cache to a TTL cache #77
base: db_main
Conversation
}
f.lruCache.Add(queryExpressionNormalized, queryExpressionRangeLength)
Can you confirm Add() is thread-safe? Multiple goroutines from the HTTP server may call this function here concurrently.
Yes, it uses a mutex internally.
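For reference, a minimal concurrency sketch, assuming the cache wraps hashicorp/golang-lru, whose Add serialises access with an internal mutex; this is illustrative, not code from this PR:

package main

import (
	"sync"

	lru "github.com/hashicorp/golang-lru"
)

func main() {
	cache, _ := lru.New(128)

	var wg sync.WaitGroup
	// Simulate concurrent request-handling goroutines updating the cache,
	// the way the HTTP server's handlers would.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			cache.Add(n, n) // safe: the library locks its internal state before mutating it
		}(i)
	}
	wg.Wait()
}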
@@ -159,7 +156,7 @@ func (f *FailedQueryCache) UpdateFailedQueryCache(err error, query url.Values, q
 queryExpressionRangeLength := getQueryRangeSeconds(query)
 // TODO(hc.zhu): add a flag for the threshold
 // The current gateway timeout is 5 minutes, so we cache the failed query running longer than 5 minutes - 10 seconds.
-if queryResponseTime > time.Second * (60 * 5 - 10) {
+if queryResponseTime > time.Second*(60*5-10) {
Can you make the magic number a flag since you are touching this part? :)
I didn't change it; that diff is just from go format (gofmt).
Will do it in a separate PR.
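For illustration, a sketch of what that follow-up could look like: the hard-coded "5 minutes - 10 seconds" becomes a configurable duration. The field and flag names below are hypothetical, not taken from this PR:

package main

import (
	"flag"
	"fmt"
	"time"
)

type HandlerConfig struct {
	// Failed queries running longer than this are added to the failed query cache.
	FailedQueryResponseThreshold time.Duration `yaml:"failed_query_response_threshold"`
}

func (cfg *HandlerConfig) RegisterFlags(f *flag.FlagSet) {
	f.DurationVar(&cfg.FailedQueryResponseThreshold, "frontend.failed-query-response-threshold",
		5*time.Minute-10*time.Second, // current default: gateway timeout (5m) minus 10s
		"Minimum response time before a failed query is cached.")
}

func main() {
	cfg := &HandlerConfig{}
	cfg.RegisterFlags(flag.CommandLine)
	flag.Parse()

	queryResponseTime := 6 * time.Minute
	// Replaces: if queryResponseTime > time.Second*(60*5-10) { ... }
	if queryResponseTime > cfg.FailedQueryResponseThreshold {
		fmt.Println("query would be added to the failed query cache")
	}
}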
@@ -34,19 +34,16 @@ func verifyMetricCount(t *testing.T, reg *prometheus.Registry, expectedCount int

 func TestNewFailedQueryCache(t *testing.T) {
 reg := prometheus.NewRegistry()
-cache, err := NewFailedQueryCache(2, reg)
+cache := NewFailedQueryCache(2, 0, reg)
Can you add test cases where the TTL is not 0?
What does 0 TTL mean? Never expire?
0 means no TTL, i.e. entries never expire.
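A hypothetical test with a non-zero TTL, in the spirit of the request above. It assumes the existing test file's imports plus time, errors, and net/url, that the third UpdateFailedQueryCache argument is the query response time, and that a QueryHitCache-style lookup method exists; none of this is verified against the repository:

func TestFailedQueryCacheWithTTL(t *testing.T) {
	reg := prometheus.NewRegistry()
	cache := NewFailedQueryCache(2, 100*time.Millisecond, reg)

	query := url.Values{"query": []string{"up"}, "start": []string{"0"}, "end": []string{"60"}}
	// A response time above the threshold should get the query cached.
	cache.UpdateFailedQueryCache(errors.New("timeout"), query, 5*time.Minute)

	// Before the TTL elapses the entry should still be in the cache...
	if hit, _ := cache.QueryHitCache(query); !hit {
		t.Fatal("expected cache hit before TTL expiry")
	}

	// ...and after the TTL it should have been evicted.
	time.Sleep(150 * time.Millisecond)
	if hit, _ := cache.QueryHitCache(query); hit {
		t.Fatal("expected cache miss after TTL expiry")
	}
}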
@@ -48,6 +48,7 @@ type HandlerConfig struct {
 QueryStatsEnabled        bool          `yaml:"query_stats_enabled"`
 LogFailedQueries         bool          `yaml:"log_failed_queries"`
 FailedQueryCacheCapacity int           `yaml:"failed_query_cache_capacity"`
+FailedQueryTTL           time.Duration `yaml:"failed_query_ttl"`
 cachedHits    prometheus.Counter
 cachedQueries prometheus.Gauge
 }

-func NewFailedQueryCache(capacity int, reg prometheus.Registerer) (*FailedQueryCache, error) {
+func NewFailedQueryCache(capacity int, ttlDuration time.Duration, reg prometheus.Registerer) *FailedQueryCache {
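A rough sketch of what the new constructor might look like, assuming a TTL-aware LRU such as hashicorp/golang-lru/v2/expirable (where a TTL of 0 effectively disables expiry); the package name, metric registration, and actual implementation in this PR may differ:

package transport // package name is illustrative

import (
	"time"

	"github.com/hashicorp/golang-lru/v2/expirable"
	"github.com/prometheus/client_golang/prometheus"
)

type FailedQueryCache struct {
	lruCache      *expirable.LRU[string, int64]
	cachedHits    prometheus.Counter
	cachedQueries prometheus.Gauge
}

func NewFailedQueryCache(capacity int, ttlDuration time.Duration, reg prometheus.Registerer) *FailedQueryCache {
	return &FailedQueryCache{
		// ttlDuration == 0 keeps entries until normal LRU eviction,
		// matching the "0 means no TTL" behaviour discussed above.
		lruCache: expirable.NewLRU[string, int64](capacity, nil, ttlDuration),
		// Counter/gauge registration against reg omitted for brevity.
	}
}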
I think the eventual goal of a TTL is to cache expensive queries as well: if a query fetches too many time series, add it to the cache. Can you rename the cache to something like ExpensiveQueryCache and make it cache failed queries (usually expensive ones), long-running ones (> 4 minutes), and ones that fetch too many time series?
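To make the suggestion concrete, a rough sketch of the admission logic such a generalised cache could use; every name and threshold here is illustrative, not part of this PR:

package transport // package name is illustrative

import "time"

// ExpensiveQueryCache would admit a query when any of the three criteria
// above is met. The add callback stands in for the underlying LRU/TTL cache.
type ExpensiveQueryCache struct {
	responseTimeThreshold time.Duration // e.g. 4 * time.Minute for long-running queries
	maxFetchedSeries      int           // e.g. 100000 series fetched
	add                   func(key string)
}

func (c *ExpensiveQueryCache) maybeAdd(key string, queryErr error, responseTime time.Duration, fetchedSeries int) {
	failed := queryErr != nil                           // failed queries are usually expensive
	tooSlow := responseTime > c.responseTimeThreshold   // long-running queries
	tooManySeries := fetchedSeries > c.maxFetchedSeries // queries fetching too many time series

	if failed || tooSlow || tooManySeries {
		c.add(key)
	}
}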
Please address the comments.
Changes
Verification
Tests pending.