Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spike on extracting connection logic from Dalli::Protocol::Binary #863

Merged
merged 4 commits into from
Dec 10, 2021

Conversation

petergoldstein
Copy link
Owner

Once the connection logic is extracted it will become much easier to implement the meta protocol. The goal is that after this PR, everything that's actually in the Dalli::Protocol::Binary class is related to the binary protocol specifically, and is not reusable for other protocols. This is basically mapping Ruby commands to memcached binary requests and processing responses, and translating back to appropriate values for the Dalli::Client.

Config parameter parsing, connection handling (initialization, state management, retries), and other shared concerns should be extracted. Binary protocol specific logic may be extracted into additional classes in the Dalli::Protocol::Binary class scope.

Once this extraction is complete, we may very well wind up inverting this a bit, where we inject the connection manager into a protocol implementation.

Once the connection logic is extracted it will become much easier to implement the meta protocol.  The goal is that after this PR, everything that's actually in the Dalli::Protocol::Binary class is related to the binary protocol specifically, and is not reusable for other protocols.  This is basically mapping Ruby commands to memcached binary requests and processing responses, and translating back to appropriate values for the Dalli::Client.

Config parameter parsing, connection handling (initialization, state management, retries), and other shared concerns should be extracted.  Binary protocol specific logic may be extracted into additional classes in the Dalli::Protocol::Binary class scope.

Once this extraction is complete, we may very well wind up inverting this a bit, where we inject the connection manager into a protocol implementation.

Start renaming multi related methods/variables to quiet

Some reorganization to make the memcached commands clearer.  Added quiet versions of a number of commands (still need tests).  Some additional comments.

Simplify the pipelined getter behavior by encapsulating bytes to advance
@petergoldstein petergoldstein force-pushed the spike/extract_connection_manager branch from 27d444d to 701edd0 Compare December 9, 2021 17:50
@petergoldstein petergoldstein marked this pull request as ready for review December 9, 2021 19:18
@petergoldstein
Copy link
Owner Author

So I don't this this PR completely accomplishes the specified goals, but it's a big step forward. Writing a meta version should be very straightforward, and not require the implementor to worry about connection details or much else unrelated to the meta request formatting and response parsing.

@petergoldstein petergoldstein merged commit 6495138 into main Dec 10, 2021
@petergoldstein petergoldstein deleted the spike/extract_connection_manager branch December 10, 2021 01:42
@casperisfine
Copy link
Contributor

👋 I just became aware of this for totally unrelated reasons, in case it helps:

I explored this last year with:

The one issue you might run into is getting the CAS token on set: memcached/memcached#546 (comment)

@petergoldstein
Copy link
Owner Author

petergoldstein commented Dec 13, 2021

@casperisfine Thanks for the pointers. I was aware of the conversation, but at that point I didn't have the bandwidth to do the requisite refactoring and implementation.

At this point much (most?) of the refactoring is done, and protocol handling is mostly isolated. I should have a version of Dalli that "speaks" the meta protocol sometime this week. It will contain no new functionality (exposing additional flags, allowing clients to build probabilistic hot caches, stales with timeouts) because that would require changing or extending the Dalli::Client interface but aside from the flushq issue (noted below) it should be feature parity with binary. Assuming all goes well, I'll likely release this as 3.2, removing the current protocol_implementation parameter and replacing it with a different option for selecting the protocol.

Regarding the CAS token on set issue, my testing indicates I can use the c (lower case) argument to get the CAS reflected back at me in the meta set response. It's not clearly documented, but at least on my local system it works. We'll see if it works more generally

Issues that I think may be problematic:

  1. Updating the API of Dalli::Client to allow use of the meta capabilities (this likely waits until 4.x)
  2. The missing flushq equivalent in the meta protocol (Meta / ASCII protocol does not have equivalent to binary protocol flushq memcached/memcached#840)
  3. Base64 key encoding

We'll see how it comes out.

# underlying socket if it is not connected, or there's been a disconnect
# because of timeout or other error. Method raises an error
# if it can't connect
raise_down_error unless ensure_connected!
Copy link

@ghiculescu ghiculescu Dec 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this change broke tests in Rails. It's not a problem - rails/rails#43857 fixes them. I'm curious why you did this vs raise_down_error unless alive? though.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically to preserve the existing error message (which uses the @error and @msg) while encapsulating those values in the ConnectionManager. I'd really like to put all of this in ConnectionManager, but authentication makes that difficult.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but alive? just calls ensure_connected! internally.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ghiculescu alive? catches the error raised by ensure_connected!. That hides the error message if there was one.

By exposing ensure_connected! it can raise that error, exposing the original error message.

For the record, I don't love this. But I was trying to keep original behavior invariant to the largest extent possible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh yep that makes sense.

Thanks for the explanation ❤️

@casperisfine
Copy link
Contributor

Regarding the CAS token on set issue, my testing indicates I can use the c (lower case) argument to get the CAS reflected back at me in the meta set response.

Yeah maybe it was fixed in memcached, that info was from over a year ago.

removing the current protocol_implementation parameter and replacing it with a different option for selecting the protocol.

Hum, could we keep it though? It could be useful to also support the v1 protocol (e.g. if you use twemproxy), or to use slightly modified protocol implementations with e.g. added instrumentation, or things like that.

@petergoldstein
Copy link
Owner Author

@casperisfine

Yeah maybe it was fixed in memcached, that info was from over a year ago.

Yep. There still appears to be an issue in append/prepend mode. I'll file a ticket, although it doesn't impact Dalli's current implementation

Hum, could we keep it though? It could be useful to also support the v1 protocol (e.g. if you use twemproxy), or to use slightly modified protocol implementations with e.g. added instrumentation, or things like that.

In my view there's no real need for it (this isn't a pluggable interface with lots of different possibilities, there are basically 2.5 implementations), there's no well-defined interface, and it's basically a recipe for slowing development because of tight, invisible coupling causing issues for users (this is already more of a problem than it ideally would be - #865). Clients of the library can either contribute back (i.e. instrumentation interfaces) or use a forked version. Moreover, it's not clear that anyone is using it in any kind of production environment.

I only incorporated the protocol_implementation originally because it was unclear to me that I (or anyone else) would have time to put the meta interface into Dalli. Given that binary is officially deprecated I considered it desirable that Dalli at least allow meta usage. Now that the bulk of issues with the binary implementation have been addressed, and we expect to have a meta-compatible implementation in the near future, the support cost for such an interface doesn't make sense.

So I'm going to remove it in 3.2.

@casperisfine
Copy link
Contributor

it's basically a recipe for slowing development because of tight, invisible coupling causing issues for users [...] support cost

Note that I'm not asking for it to be a public and stable interface, I'm totally fine with a "at your own risk" API. It's just a bit cleaner than having to monkey patch dalli to plug instrumentation and things like that.

@petergoldstein
Copy link
Owner Author

@casperisfine At the end of the day it's either an exposed and supported API or it isn't in the code base. There's no use "at your own risk" API.

There are well established techniques for maintaining your own versions of open source projects. You can obviously fork a repo and build off that. Or, as you note, since this is Ruby you can monkey patch. Both of those options are available to you. And as noted in my earlier comments, if there's a need that you have that you believe the community might as well (i.e. instrumentation interface) you can discuss in the "Discussions" section or submit a PR.

@dzuelke
Copy link

dzuelke commented Mar 14, 2022

@petergoldstein may I ask what the intention is behind the close_on_fork change? Dalli used to re-connect when it detected a fork (e.g. when the connection was already up at the time Puma forks its worker children); now it instead closes and raises, presumably to simply let the next request to the worker child re-connect... but why not just re-connect like before?

@petergoldstein
Copy link
Owner Author

@dzuelke I don't really understand the question. Do you have a specific case that you see behaving differently in the two scenarios?

While the method used to be called reconnect! that's not actually what it did -

def reconnect!(message)
. What it did was:

  1. Close the connection
  2. Potentially wait a period
  3. Raise an exception

That's the current behavior.

@dzuelke
Copy link

dzuelke commented Mar 15, 2022

Ah, interesting. Okay, thanks @petergoldstein, will keep digging then!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants