Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DB activity based routing #864

Merged
merged 1 commit into from
Dec 22, 2024

Conversation

nadavshatz
Copy link
Contributor

Hi all,

This is an attempt to add support for primary-replica routing based on recent writes on the table level.

We've implemented this as a separate process alongside PgPool-II before.
This will perform much better as it's incorporated into PgCat itself.

You can read all about the approach here: https://medium.com/tailor-tech/using-database-read-replicas-in-distributed-systems-d80eaf6bbf8a

@drdrsh
Copy link
Collaborator

drdrsh commented Dec 14, 2024

I have gotten around to testing the PR. It works as described. I have a few notes

  • There is an inconsistency in the units used for TTL, we should reflect the units being used in the name of the config and variables to avoid putting milliseconds in a second field.. I think unspecified unit name should mean seconds, otherwise unit should be specified
  • Documentation isn't accurate as you need query_parser_enabled as well as query_parser_read_write_splitting to see the correct routing behavior

@drdrsh
Copy link
Collaborator

drdrsh commented Dec 14, 2024

Another related note but not strictly necessary as part of this PR, is about propagation of this information to other PgCat instances in a HA deployment. I don't think it is necessary but I was curious if you had thoughts on this.

@nadavshatz nadavshatz force-pushed the enh/db-activity-routing branch from baf6ffb to e9c763c Compare December 15, 2024 09:23
@nadavshatz
Copy link
Contributor Author

Thanks for the detailed review!

Changes:

  • I've synced the fork and rebased over main.
  • I've updated the configs that are millisecond based to have ms in the their name - I completely agree with you on this.
  • I've updated the config docs about the required enabled options for this to work.

HA Note:
I mention this in the considerations section in the config.md file when enabling this feature. As a whole as mentioned in the article i linked to - you're right, it's an issue. One way to resolve this is to have a centralized cache for the activity but I think that from a performance and reliability standpoint i'd prefer to use sticky sessions and get all of the same DB's client to use the same PGCat instance (we've achieved this with istio).
I also didn't want to add any kind of dependency on an external cache.
There are probably many other way's to resolve this but this has been the best tradeoff i've come up with to get this great feature with limited impact on reliability or performance.

Would love to get this merged - let me know if there's anything else I should do here.

@drdrsh
Copy link
Collaborator

drdrsh commented Dec 21, 2024

That makes sense.

I ran the tests and clippy is complaining about

error: very complex type used. Consider factoring parts into `type` definitions
   --> src/query_router.rs:617:10
    |
617 |     ) -> Option<(Vec<Expr>, Vec<Vec<Ident>>, Option<&'a Vec<Assignment>>)> {
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#type_complexity
    = note: `-D clippy::type-complexity` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::type_complexity)]`

error: could not compile `pgcat` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
error: could not compile `pgcat` (lib test) due to 1 previous error

Exited with code exit status 101

@nadavshatz nadavshatz force-pushed the enh/db-activity-routing branch from e9c763c to 26cff3e Compare December 22, 2024 09:23
@nadavshatz
Copy link
Contributor Author

Not sure why clippy didn't show me this issue 🤔
anyway - updated accordingly 🙏🏼

@drdrsh drdrsh merged commit 3202f56 into postgresml:main Dec 22, 2024
1 of 3 checks passed
@nadavshatz nadavshatz deleted the enh/db-activity-routing branch December 22, 2024 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants