[RelayMiner] Implement relayminer query caching #1050

Open
wants to merge 12 commits into
base: main

red-0ne
Contributor

@red-0ne red-0ne commented Jan 31, 2025

Summary

Implements a caching layer for query clients to reduce network calls and improve performance.

Primary Changes:

  • Added generic KeyValueCache and ParamsCache interfaces with thread-safe implementations
  • Integrated caching across all query clients (Account, Application, Bank, Service, etc.)
  • Added cache clearing on new blocks via WithNewBlockCacheClearing option

Secondary Changes:

  • Replaced manual sync.Mutex implementation in accQuerier with new cache interface
  • Added cache configuration to integration tests
  • Updated relayer dependencies to include cache initialization
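
For readers without the diff at hand, here is a minimal sketch of what a generic, thread-safe key-value cache of this kind could look like. The method set (`Get`, `Set`, `Delete`, `Clear`), the `inMemoryCache` type, and the `NewKeyValueCache` constructor are assumptions made for illustration; they are not taken from the PR's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// KeyValueCache is a hypothetical minimal interface; the method set is an
// assumption for illustration, not the PR's actual definition.
type KeyValueCache[V any] interface {
	Get(key string) (value V, found bool)
	Set(key string, value V)
	Delete(key string)
	Clear()
}

// inMemoryCache is a thread-safe KeyValueCache backed by a map.
type inMemoryCache[V any] struct {
	mu     sync.RWMutex
	values map[string]V
}

// NewKeyValueCache returns an empty, ready-to-use cache.
func NewKeyValueCache[V any]() KeyValueCache[V] {
	return &inMemoryCache[V]{values: make(map[string]V)}
}

// Get returns the cached value for key and whether it was present.
func (c *inMemoryCache[V]) Get(key string) (V, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	value, found := c.values[key]
	return value, found
}

// Set stores value under key, overwriting any previous entry.
func (c *inMemoryCache[V]) Set(key string, value V) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[key] = value
}

// Delete removes a single entry if it exists.
func (c *inMemoryCache[V]) Delete(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.values, key)
}

// Clear drops every entry, e.g. when a new block is observed.
func (c *inMemoryCache[V]) Clear() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values = make(map[string]V)
}

func main() {
	balances := NewKeyValueCache[int64]()
	balances.Set("addr1", 100)

	if v, found := balances.Get("addr1"); found {
		fmt.Println("hit:", v)
	}

	balances.Clear()
	if _, found := balances.Get("addr1"); !found {
		fmt.Println("miss after clear")
	}
}
```

A read-write mutex lets concurrent readers proceed without blocking each other, which suits a read-heavy query path.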

Issue

The RelayMiner RPC queries are not cached, which puts excessive load on the configured full node, degrading the performance of both off-chain and on-chain components.

Type of change

Select one or more from the following:

Sanity Checklist

  • I have updated the GitHub Issue assignees, reviewers, labels, project, iteration and milestone
  • For docs, I have run make docusaurus_start
  • For code, I have run make go_develop_and_test and make test_e2e
  • For code, I have added the devnet-test-e2e label to run E2E tests in CI
  • For configurations, I have updated the documentation
  • I added TODOs where applicable

@red-0ne red-0ne added the relayminer Changes related to the Relayminer label Jan 31, 2025
@red-0ne red-0ne added this to the Beta TestNet Iteration milestone Jan 31, 2025
@red-0ne red-0ne requested review from Olshansk and adshmh January 31, 2025 04:38
@red-0ne red-0ne self-assigned this Jan 31, 2025
@red-0ne red-0ne added the push-image CI related - pushes images to ghcr.io label Jan 31, 2025
The image is going to be pushed after the next commit.

You can use make trigger_ci to push an empty commit.

If you also want to run E2E tests, please add devnet-test-e2e label.

Member

@Olshansk Olshansk left a comment

@red-0ne I did a first partial review but have a lot of comments & questions.

Here's a high-level summary but PTAL at the actual comments as well:

  1. Need to understand if/how this can build on top of [Off-chain] feat: in-memory query cache(s) #994 w/ @bryanchriswhite
  2. See a few comments (logs + comments) that need to be addressed in multiple places
  3. I’m a bit concerned (and don’t understand) how we’re not using “height” to retrieve things from the cache, especially when values are always changing
  4. When does the cache ever get cleared?
  5. Light on tests
  6. I’d be interested to see numbers of performance improvement

pkg/client/query/accquerier.go (resolved)
pkg/client/query/accquerier.go (resolved)
pkg/client/query/accquerier.go (resolved)
pkg/client/query/appquerier.go (resolved)
pkg/client/query/appquerier.go (outdated, resolved)
pkg/client/query/sessionquerier.go (outdated, resolved)
@@ -49,11 +53,19 @@ func NewSharedQuerier(deps depinject.Config) (client.SharedQueryClient, error) {
// Once `ModuleParamsClient` is implemented, use its replay observable's `#Last()` method
// to get the most recently (asynchronously) observed (and cached) value.
func (sq *sharedQuerier) GetParams(ctx context.Context) (*sharedtypes.Params, error) {
// Get the params from the cache if they exist.
if params, found := sq.paramsCache.Get(); found {
Member

I'm concerned (and don't fully understand) the lack of a "height" param when retrieving things from the cache.

Contributor Author

This cache implementation does not add any new functionality besides caching whatever has been queried.

It does not alter the RelayMiner's current behavior:

  • RelayMiner cold start
  • React to Params change

For those reasons, it does not rely on historical data, which is what would justify using height for cache lookups.

Contributor

My understanding is that the cache implementations here are NOT historical; i.e. ONLY the most recently observed value is cached for each ParamsCache instance (or key, in the case of KeyValueCache).

While #994 does include historical caching as well (via the HistoricalQueryCache interface), that's an additional and distinct feature.

This shouldn't be necessary here because we're clearing the cache on every new block. The end result is somewhat sub-optimal, but still significant, caching. This changes the number of off-chain queries from being a function of API usage to no more than one per block, per cache.
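
To make the "no more than one per block, per cache" claim concrete, here is a small simulation sketch. Only the `Get() (value, found)` shape comes from the diff above; `Set`, `Clear`, the `Querier` wrapper, `OnNewBlock`, `Simulate`, and the `Params` fields are hypothetical names introduced for illustration, and the full-node query is replaced by a local function.

```go
package main

import (
	"fmt"
	"sync"
)

// Params stands in for a module's params type (hypothetical fields).
type Params struct {
	NumBlocksPerSession uint64
}

// ParamsCache holds at most one value: the most recently queried params.
// Get mirrors the call seen in the diff; Set and Clear are assumed.
type ParamsCache struct {
	mu     sync.RWMutex
	params *Params
}

func (c *ParamsCache) Get() (*Params, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.params, c.params != nil
}

func (c *ParamsCache) Set(params *Params) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.params = params
}

func (c *ParamsCache) Clear() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.params = nil
}

// Querier serves GetParams from the cache and falls back to fetch on a miss.
// QueryCount is not synchronized; it is only touched by one goroutine here.
type Querier struct {
	cache      ParamsCache
	fetch      func() *Params // stand-in for the gRPC query to the full node
	QueryCount int
}

func (q *Querier) GetParams() *Params {
	if params, found := q.cache.Get(); found {
		return params
	}
	q.QueryCount++
	params := q.fetch()
	q.cache.Set(params)
	return params
}

// OnNewBlock is what a new-block subscription would call.
func (q *Querier) OnNewBlock() { q.cache.Clear() }

// Simulate runs numBlocks blocks with callsPerBlock GetParams calls each and
// returns how many calls reached the (fake) full node.
func Simulate(numBlocks, callsPerBlock int) int {
	q := &Querier{fetch: func() *Params { return &Params{NumBlocksPerSession: 10} }}
	for block := 0; block < numBlocks; block++ {
		for call := 0; call < callsPerBlock; call++ {
			q.GetParams()
		}
		q.OnNewBlock()
	}
	return q.QueryCount
}

func main() {
	fmt.Println(Simulate(3, 100)) // prints 3: one query per block, not 300
}
```

Under concurrent access, two callers can miss at the same moment and both query the node, so the bound holds exactly only if misses are deduplicated; that is out of scope for this sketch.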

pkg/client/query/sharedquerier.go (outdated, resolved)
pkg/client/query/sharedquerier.go (resolved)
pkg/client/query/sharedquerier.go (resolved)
Contributor

@bryanchriswhite bryanchriswhite left a comment

Nice one @red-0ne! 🙌

Thanks for doing this! ❤️

pkg/client/query/cache/options.go (resolved)
}

// KeyValueCache is an interface for a simple in-memory key-value cache implementation.
type KeyValueCache[V any] interface {
Contributor

This interface is consistent with what I was calling QueryCache[T any] in #994. It should be quite straightforward to refactor #994 to use this instead. I see KeyValueCache[V any] as a subsequent iteration of QueryCache[T any] which includes generalizing the name.

pkg/client/query/interface.go (resolved)
pkg/client/query/cache/paramscache.go (outdated, resolved)
pkg/client/query/cache/paramscache.go (resolved)
clientConn grpc.ClientConn
sessionQuerier sessiontypes.QueryClient
sharedQueryClient client.SharedQueryClient
sessionsCache KeyValueCache[*sessiontypes.Session]
Contributor

👍

pkg/deps/config/suppliers.go (outdated, resolved)
@@ -18,6 +19,12 @@ var _ client.ApplicationQueryClient = (*appQuerier)(nil)
type appQuerier struct {
clientConn grpc.ClientConn
applicationQuerier apptypes.QueryClient
logger polylog.Logger

// applicationsCache caches applicationQueryClient.Application requests
Contributor

Suggested change
- // applicationsCache caches applicationQueryClient.Application requests
+ // applicationsCache caches application.Applications returned from applicationQueryClient.Application requests


// applicationsCache caches applicationQueryClient.Application requests
applicationsCache KeyValueCache[apptypes.Application]
// paramsCache caches applicationQueryClient.Params requests
Contributor

Same as 👆

(seems like other places as well)

Member

@Olshansk Olshansk left a comment

Leaving a partial review.

  1. @bryanchriswhite Can you please prioritize getting [Off-chain] feat: in-memory query cache(s) #994 in? It's the most mature/versatile cache, and I'd like us to just build on top of 1 thing.

  2. @red-0ne See some of my nits/edits, but in particular around using gomock for proper mocks.

Will do a full review after (1) & (2) are done.

Few notes:

  1. If you think we should take a different direction, let's jump on a call.
  2. We have other (large) parallel efforts going on, so there shouldn't be any blockers
  3. I strongly believe we should benchmark in this PR. Seems like something an LLM can help get done in a couple of hours.

c.callCount++
}

// MockServiceQueryServer is a mock implementation of the servicetypes.QueryServer interface
Member

Up until now we've always been using gomock to generate this sort of thing, which has support for call counters.

I feel strongly that we should not be changing patterns now.

config.NewSupplyKeyValueCacheFn[*sessiontypes.Session](cache.WithNewBlockCacheClearing),
config.NewSupplyKeyValueCacheFn[*cosmostypes.Coin](cache.WithNewBlockCacheClearing),

config.NewSupplySharedQueryClientFn(), // leaf
Member

We have some //leaf comments before the new code and some (this one) after the new code.

What's the idea behind this code organization?

config.NewSupplyKeyValueCacheFn[apptypes.Application](cache.WithNewBlockCacheClearing),
config.NewSupplyKeyValueCacheFn[cosmostypes.AccountI](cache.WithNewBlockCacheClearing),
config.NewSupplyKeyValueCacheFn[sharedtypes.Supplier](cache.WithNewBlockCacheClearing),
config.NewSupplyKeyValueCacheFn[*sessiontypes.Session](cache.WithNewBlockCacheClearing),
Member

#PUC

config.NewSupplySharedQueryClientFn(), // leaf

// Setup the params caches and configure them to clear on new blocks.
config.NewSupplyParamsCacheFn[sharedtypes.Params](cache.WithNewBlockCacheClearing),
Member

#PUC

config.NewSupplySharedQueryClientFn(), // leaf

// Setup the params caches and configure them to clear on new blocks.
// TODO_TECHDEBT: Consider a flag to change client queriers caching behavior.
Member

I'm going to push on the fact that this is a TODO_IN_THIS_PR.

It's not that hard and I want to understand the benefit (if any) of this cache.

It feels like we've built a car but aren't actually checking if it works.
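
As a starting point for the benchmark requested here, the following is a rough sketch using `testing.Benchmark` from the standard library, which runs a benchmark function outside of `go test`. Everything else is an assumption: `fetchParams` only sleeps for an arbitrary `simulatedLatency` instead of querying a full node, and `cachedFetcher` is a hypothetical single-value cache, not the PR's implementation. The printed numbers therefore say nothing about the real RelayMiner.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// simulatedLatency is an arbitrary stand-in for a full-node round trip.
const simulatedLatency = 100 * time.Microsecond

// fetchParams pretends to query a full node and returns a dummy value.
func fetchParams() int {
	time.Sleep(simulatedLatency)
	return 42
}

// cachedFetcher memoizes fetchParams until Clear is called.
// It is not safe for concurrent use; the benchmarks below are sequential.
type cachedFetcher struct {
	value  int
	cached bool
	misses int
}

// Get returns the cached value, fetching it first on a miss.
func (c *cachedFetcher) Get() int {
	if !c.cached {
		c.value = fetchParams()
		c.cached = true
		c.misses++
	}
	return c.value
}

// Clear invalidates the cached value, as a new block would.
func (c *cachedFetcher) Clear() { c.cached = false }

// benchmarkUncached pays the simulated latency on every call.
func benchmarkUncached(b *testing.B) {
	for i := 0; i < b.N; i++ {
		fetchParams()
	}
}

// benchmarkCached pays the simulated latency once per benchmark run.
func benchmarkCached(b *testing.B) {
	c := &cachedFetcher{}
	for i := 0; i < b.N; i++ {
		c.Get()
	}
}

func main() {
	fmt.Println("uncached:", testing.Benchmark(benchmarkUncached))
	fmt.Println("cached:  ", testing.Benchmark(benchmarkCached))
}
```

A more representative benchmark would clear the cache at a realistic block interval and exercise the actual query clients against a local node; a flag that disables caching would allow the same binary to produce both measurements.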

}

// KeyValueCache is an interface for a simple in-memory key-value cache implementation.
type KeyValueCache[V any] interface {
Member

I left a comment on discord, but I feel strongly that we should finish #994 and rebase on top of it.

Now is the time to do this right.

[Image: Screenshot 2025-02-11 at 4:31:05 PM]

Comment on lines +7 to +11
// Balance represents a pointer to a Cosmos SDK Coin, specifically used for bank balance queries.
// It is deliberately defined as a distinct type (not a type alias) to ensure clear dependency
// injection and to differentiate it from other coin caches in the system. This type helps
// maintain separation of concerns between different types of coin-related data in the caching
// layer.
Member

Suggested change
- // Balance represents a pointer to a Cosmos SDK Coin, specifically used for bank balance queries.
- // It is deliberately defined as a distinct type (not a type alias) to ensure clear dependency
- // injection and to differentiate it from other coin caches in the system. This type helps
- // maintain separation of concerns between different types of coin-related data in the caching
- // layer.
+ // Balance represents a pointer to a Cosmos SDK Coin used for bank balance queries.
+ // It is defined as a distinct type (not an alias) to:
+ // - Ensure clear dependency injection
+ // - Differentiate from other coin caches in the system
+ // - Maintain separation of concerns between coin-related data in the caching layer
+ type Balance *sdk.Coin

@red-0ne Have you used the code-cleaner Claude project yet?

Comment on lines +3 to +7
// BlockHash represents a byte slice, specifically used for bank balance query caches.
// It is deliberately defined as a distinct type (not a type alias) to ensure clear
// dependency injection and to differentiate it from other byte slice caches in the system.
// This type helps maintain separation of concerns between different types of
// byte slice data in the caching layer.
Member

Suggested change
- // BlockHash represents a byte slice, specifically used for bank balance query caches.
- // It is deliberately defined as a distinct type (not a type alias) to ensure clear
- // dependency injection and to differentiate it from other byte slice caches in the system.
- // This type helps maintain separation of concerns between different types of
- // byte slice data in the caching layer.
+ // BlockHash represents a byte slice used for bank balance query caches.
+ // It is defined as a distinct type (not an alias) to:
+ // - Ensure clear dependency injection
+ // - Differentiate from other byte slice caches
+ // - Maintain separation of concerns between byte slice data in caching layer

@@ -0,0 +1,8 @@
package types
Member

Having separate files for this feels like overkill.

Can we just have a types.go?

return supplier, nil
}

logger.Debug().Msgf("cache miss for key: %s", operatorAddress)
Member

Leaving one comment but please update everywhere. If these logs ever become the source for debugging, you want it to be ULTRA obvious.

Suggested change
- logger.Debug().Msgf("cache miss for key: %s", operatorAddress)
+ logger.Debug().Msgf("cache miss for operator address key: %s", operatorAddress)

Labels
push-image CI related - pushes images to ghcr.io relayminer Changes related to the Relayminer
Projects
Status: 👀 In review
3 participants