
[NEW] MULTI Idempotency Keys for safely retrying non-idempotent commands #1087

Open
byroot opened this issue Sep 29, 2024 · 17 comments
Labels
major-decision-pending Major decision pending by TSC team

Comments

@byroot

byroot commented Sep 29, 2024

For context, I'm the maintainer of https://rubygems.org/gems/redis and https://rubygems.org/gems/redis-client, and I was chatting with @stockholmux a few days ago about what I think is a recurrent pain point for users.

The problem/use-case that the feature addresses

When a client performs a command and it times out or hits a network error, it's pretty much impossible to know whether the command was executed on the server side or never reached it. Because of this, only idempotent commands are safe to retry; for any other command, retrying is unsafe, which makes it hard to implement clients that are resilient to various network issues.

This isn't specific to Valkey or the RESP protocol; the same sort of issue is common with SQL clients and most datastores.

But this is particularly common in the Ruby community because the historical redis/valkey client, initially developed by Antirez, always defaulted to blindly retrying anything in case of a network error. That's now very hard to change, because users, especially in cloud environments, would then notice a lot more network errors that were previously swept under the rug. I'm trying to course-correct in the lower-level redis-client gem by making retry opt-in and requiring users to be more intentional about whether a retry should happen for a given command or transaction, but that only goes so far.

So I believe it would be very valuable if there was a way to safely retry non-idempotent commands.

Description of the feature

I think one way this could be achieved would be for the MULTI command to accept an "idempotency key", somewhat inspired by what the Stripe and PayPal APIs do.

e.g.

MULTI KEY <random-id> TTL 30
LPUSH mylist 1 2 3
EXEC

The semantics would be that if that transaction succeeds and another transaction is performed with the same random-id within ttl seconds, the repeated transaction isn't actually executed; instead it returns the same results as the previous transaction, without any side effects.
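To make those semantics concrete, here is a minimal sketch (purely illustrative; the class and method names are mine, not a proposed implementation) of what the server could do with the key and TTL:

```python
import time

class IdempotencyCache:
    """Sketch of the proposed server-side behavior: cache a transaction's
    result under a client-supplied key for TTL seconds, and replay the
    cached result (with no side effects) on a duplicate submission."""

    def __init__(self):
        self._cache = {}  # key -> (expires_at, cached_result)

    def execute(self, key, ttl, run_transaction):
        now = time.monotonic()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            # Repeated transaction within TTL: return the stored
            # result without re-executing the commands.
            return entry[1]
        result = run_transaction()
        self._cache[key] = (now + ttl, result)
        return result
```

A retried `MULTI KEY <random-id> TTL 30` would then hit the cached entry instead of re-running the queued commands.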

One potential issue I can see with this is that if it's used with commands that return large output, it has the potential to require saving a whole lot more data on the server. But my assumption is that, generally speaking, the non-idempotent queries for which this is useful tend to return smaller responses. This is also in part counteracted by the TTL, which means the server doesn't need to hold on to the result for too long. It might also be possible to specify that only one such result set can be held per connection.

Another possibility is to not return the original response on a repeated transaction, and just have MULTI return some error instead of QUEUED. But that's much less convenient for the client, as sometimes you're not just interested in the side effects of the transaction. Typically, if you need to retry an LPOP, the popped item is lost. But perhaps that too could be an option.

Alternatives you've considered

Another possibility could be to bake this capability in the protocol itself, so it's not limited to transactions, but I don't think it's desirable to change the protocol.

@PingXie
Member

PingXie commented Sep 29, 2024

Thanks @byroot!

This is definitely a pain point, and I think your proposal is a good solution: assigning a unique ID to each request (via the form of a MULTI transaction) and caching the result for a short period. The alternative of modifying the RESP protocol that you mentioned would indeed introduce implementation-level complexity. I agree that it's not ideal for two reasons: first, it brings compatibility risks, and second, I would guess that a large portion of the workload—especially commands like SET/GET—is already idempotent. Therefore, making this mitigation opt-in via this MULTI command extension is a smart approach to avoid unnecessary overhead.

+@valkey-io/core-team to chime in.

@PingXie PingXie added the major-decision-pending Major decision pending by TSC team label Sep 29, 2024
@zuiderkwast
Contributor

Cool idea.

You would need to wrap all non-idempotent commands in MULTI then?

It would be great to solve this problem. I think we should try to come up with more ways to solve it.

If a protocol level feature could solve this for all commands, then I would even consider it as RESP4 (client opt-in via HELLO).

As a start, we could mark all non-idempotent write commands as such in the COMMAND response.

@byroot
Author

byroot commented Oct 2, 2024

You would need to wrap all non-idempotent commands in MULTI then?

It would be purely optional, but if you wanted to be able to retry in all circumstances, yes. But it would be easy, even for a "dumb client" like redis-client, to do so transparently, or to offer a helper that makes it very easy. It could even blindly do it for all commands and let the server do nothing special if the command is idempotent. But I see how that could cause some overhead, and it isn't very "clean".
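As a sketch of what such transparent wrapping could look like client-side (the `pipeline_exec` method and overall API are hypothetical, not redis-client's real interface), the key point is that the idempotency key is generated once and reused across retries:

```python
import uuid

def call_with_retry(client, command, retries=3, ttl=30):
    """Hypothetical client helper: wrap a single command in the proposed
    MULTI KEY ... TTL ... form and retry with the SAME key, so a retry
    after an ambiguous network error cannot apply the command twice."""
    key = uuid.uuid4().hex  # generated once, reused across all attempts
    last_error = None
    for _ in range(retries):
        try:
            return client.pipeline_exec([
                ("MULTI", "KEY", key, "TTL", str(ttl)),
                command,
                ("EXEC",),
            ])
        except ConnectionError as error:
            # Ambiguous failure: the command may or may not have run.
            last_error = error
    raise last_error
```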

I think we should try to come up with more ways to solve it.

Of course. I wanted to come up with a concrete proposal, so that it sparks a constructive discussion, but I'm not married to a specific solution.

The reason I suggested this one in particular is that it's backward compatible.

If a protocol level feature could solve this for all commands, then I would even consider it as RESP4 (client opt-in via HELLO).

Right, baking this into the protocol would indeed be way more elegant. The reason I'm not very enthusiastic about it is that my clients were among the first to require RESP3, and it took years for the ecosystem of cloud providers and proxies to catch up. Even today, the fairly popular Envoy proxy still isn't RESP3 compatible.

So why not a RESP4, but I'd suggest to wait for more changes you'd like to make to the protocol before going with it.

@zuiderkwast
Contributor

OK, good point with the proxies. I wasn't aware that proxies need to understand the RESP protocol. Are those primarily for clusters? I guess a proxy still can't make a cluster appear like a standalone node, because cross-node commands can't be supported in any sane way.

So, maybe you're right that extending MULTI with this new syntax would be the simplest way.

A problem I can think of, regardless of how it's done, is that the server can end up using a lot of memory for these cached replies in the worst case. Clients can be evicted, but these replies are not tied to a client id, because when a client reconnects with a new id, it may want to retry the command and get the cached reply. I can't see that we can evict them either. If it turns out to be a problem, it would be good if a client could ACK the reply in some way so the server can delete it ASAP.

A concrete idea for how to encode this in RESP3+ would be to allow the client to set RESP3 attributes on a command. A command is an array of strings, AKA multibulk, but the client could add an attribute to it if the server supports it. Example:

|2<CR><LF>
    +idem-key<CR><LF>
    +98usd98s<CR><LF>
    +idem-ttl<CR><LF>
    :5000<CR><LF>
*5<CR><LF>
    $5<CR><LF>
    LPUSH<CR><LF>
    $6<CR><LF>
    mylist<CR><LF>
    $1<CR><LF>
    1<CR><LF>
    $1<CR><LF>
    2<CR><LF>
    $1<CR><LF>
    3<CR><LF>
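For illustration, a small encoder that produces a frame like the one above. Note this wire format is hypothetical: RESP3 only defines attributes on replies, so attributes on commands would be a protocol extension, and the attribute names are placeholders:

```python
def encode_command_with_attributes(attrs, args):
    """Serialize a command as a RESP3-style attribute map followed by the
    usual multibulk array of bulk strings."""
    out = [f"|{len(attrs)}\r\n".encode()]
    for name, value in attrs.items():
        out.append(f"+{name}\r\n".encode())
        if isinstance(value, int):
            out.append(f":{value}\r\n".encode())  # integer attribute value
        else:
            out.append(f"+{value}\r\n".encode())  # simple-string value
    out.append(f"*{len(args)}\r\n".encode())
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)
```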

Another feature for a future RESP version could be multiplexing, i.e. the possibility to use multiple commands-reply streams, possibly with separate clients IDs, over the same connection. It was mentioned in #54 and one of the linked redis issues.

@byroot
Author

byroot commented Oct 2, 2024

Are those primarily for clusters?

Not as far as I know, they are used for a whole lot of things.

the server can end up using a lot of memory for these cached replies in the worst case.

Absolutely, hence the TTL, and also why I think the semantics could be that at most one such reply is saved.

I also think it would be fine to clear the saved response on any subsequent command from the same client (whether it's a MULTI or anything else).

In addition, we could make saving the response purely opt-in, e.g. <random-id> TTL 10 RECORD; if you don't specify RECORD, on a retry you'd get some specific error telling you the command was already performed.

Recording the response is useful when information would otherwise be lost (e.g. LPOP), but in many cases, such as INCR, it's not that useful: you can just read the key again.

A concrete idea for how to encode this in RESP3+ would be to allow the client to set RESP3 attributes on a command.

That too I'm afraid would likely break some proxies.

I think you could achieve the same thing by introducing a command to hold this information:

RETRY <random-id> TTL 30 RECORD
LPOP mylist

It's maybe a little unusual because it's a stateful command, but it's essentially the same as the MULTI extension, just a bit more lightweight and composable.

@hwware
Member

hwware commented Oct 3, 2024

Two questions about your idea:

  1. When a client sends and retries non-idempotent commands, how should the random-id be set? If it's really random, how do we guarantee the ids are not duplicated within a short time?
  2. How can we decide whether or not the command output is huge?

@byroot
Author

byroot commented Oct 3, 2024

how to set the random-id? If this is really random, how to guarantee the ids are not duplicated in short time?

It's the responsibility of the client to use a sufficiently random source. Typically a UUID would be suggested (though I don't think it needs to be enforced).

How we can decide if the command output is not a huge output?

That is indeed something that could cause problems if the feature is abused. My reasoning is that in the large majority of cases, non-idempotent commands don't generate large output, or if they do, it's output that has been removed from an existing data structure, so the memory usage should be mostly neutral (e.g. LPOP).

But I can't really think of a way to guarantee this, especially given that extension modules can do whatever they want, and that a transaction with multiple commands may also include read commands that return a lot of data.

If saving the response is really deemed too problematic, as mentioned before, it's also possible to just return a specific error and let the client know it has to reconcile. That's more work for the client and doesn't cover some use cases, but it would already be a huge improvement.

@zuiderkwast
Contributor

Regarding LPOP specifically, it appears that LMOVE was designed to solve exactly this problem of unreliable connections. It puts the responsibility on the user to use this command though, rather than the client lib. From the man page:

NAME
       lmove - Returns an element after popping it from one list and pushing it to another.  Deletes the
       list if the last element was moved.

SYNOPSIS
       LMOVE source destination <LEFT | RIGHT> <LEFT | RIGHT>

DESCRIPTION
       Atomically returns and removes the first/last element (head/tail depending on the wherefrom argu‐
       ment)  of  the list stored at source, and pushes the element at the first/last element (head/tail
       depending on the whereto argument) of the list stored at destination.

...

   Pattern: Reliable queue
       Valkey  is  often  used as a messaging server to implement processing of background jobs or other
       kinds of messaging tasks.  A simple form of queue is often obtained pushing values into a list in
       the producer side, and waiting for this values in the consumer side using RPOP  (using  polling),
       or BRPOP if the client is better served by a blocking operation.

       However  in  this context the obtained queue is not reliable as messages can be lost, for example
       in the case there is a network problem or if the consumer crashes just after the message  is  re‐
       ceived but it is still to process.

       LMOVE  (or  BLMOVE  for  the  blocking  variant) offers a way to avoid this problem: the consumer
       fetches the message and at the same time pushes it into a processing list.  It will use the  LREM
       command  in  order  to  remove  the  message  from  the processing list once the message has been
       processed.

       An additional client may monitor the processing list for items that remain  there  for  too  much
       time, and will push those timed out items into the queue again if needed.

@byroot
Author

byroot commented Oct 3, 2024

Yes, absolutely. I'm using LPOP purely as an example here, not as the sole reason I want this. And this wouldn't even replace LMOVE, because LMOVE gives even more guarantees (e.g. maybe the client literally crashes and will never retry, so other clients should discover the abandoned item that has been moved).

If you want a real world example that happened to us not long ago, and prompted me to think about this feature again, it was with LPUSH: Shopify/ci-queue#287

@zuiderkwast
Contributor

Yeah, I know the problem is real. I'm just trying to think of related topics to get the bigger picture. It's interesting that this problem is mentioned in LMOVE's docs; it means this is not the first time it has been discussed.

@madolson
Member

madolson commented Oct 4, 2024

Idempotency is a long standing issue with Redis and Valkey, and is a hard problem generally within computer science. It's why streams have the consumer groups so that you can do acknowledged reads/writes. I think it's definitely worth trying to make it better, but some of my initial thoughts:

  1. How to handle the idempotency during slot migrations in cluster mode. If you commit something, and then the data is migrated to another node, the other node won't have the idempotency information.
  2. Replicating the idempotency tokens through replication seems fairly straightforward if we are just erroring on a conflict. If we are also caching the error response, we would also need to replicate the response to replicas, which is less ideal.
  3. We might also bake in the ability for clients to acknowledge that they got the response. It increases the network chatter, but it reduces the amount of memory that the server has to keep track of.
  4. We have https://valkey.io/topics/command-tips/, we should probably document the idempotent vs non-idempotent commands.
  5. As a distributed systems engineer, I feel a lot more comfortable with just erroring on idempotency failures in the database layer. In the ElastiCache service, we handle idempotency failures all over the place by checking the state after a failure, not having the database keep track of it. Redis was always known for being easy to use, but often ignored correctness to achieve that.
  6. This feels intuitively very similar to client side tracking in terms of structure. The client indicates that it wants some state to happen on the next command. We could implement CLIENT IDEMPOTENT <token> <ttl> and CLIENT CLEAR-IDEMPOTENT <token>. Everything gets attached to the client then. Then it's decoupled from multi.
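As a rough sketch of how the CLIENT IDEMPOTENT variant from point 6 could behave (the command names and semantics are hypothetical, modeled on the proposal above): the token and TTL become per-connection state armed before the next command, decoupled from MULTI.

```python
import time

class ClientState:
    """Per-connection sketch: CLIENT IDEMPOTENT arms a token/TTL that
    applies to the next command only, backed by a shared server cache."""

    def __init__(self, server_cache):
        self.server_cache = server_cache  # token -> (expires_at, result)
        self.pending = None               # (token, ttl) armed for next command

    def client_idempotent(self, token, ttl):
        self.pending = (token, ttl)

    def run(self, execute_command):
        if self.pending is None:
            return execute_command()
        token, ttl = self.pending
        self.pending = None  # the state applies to one command only
        now = time.monotonic()
        entry = self.server_cache.get(token)
        if entry is not None and entry[0] > now:
            return entry[1]  # duplicate: replay the cached result
        result = execute_command()
        self.server_cache[token] = (now + ttl, result)
        return result
```

Because the cache is keyed by token rather than client id, a reconnecting client (with a new id) can still replay its earlier result.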

@barshaul @avifenesh Any thoughts about this from the client perspective?

@byroot
Author

byroot commented Oct 4, 2024

I feel a lot more comfortable with just erroring on idempotency failures in the database layer.

To be clear, I totally see how saving the response can be challenging, so if it's deemed too complicated/risky, I'll understand. Just having a specific error that lets you know the command was already performed would already be a huge leap forward.

The big benefit of recording the response would be to allow "dumb clients" like mine to retry transparently.

@zuiderkwast
Contributor

Redis was always known for being easy to use, but often ignored correctness to achieve that.

This was probably a key to success.

To keep it simple, we could continue this tradition. We could recommend folks to work around it:

  • use streams (with ACK) instead of lists.
  • wrap INCR (etc.) in MULTI-EXEC where you also write to some client-specific key that you can check later to find out if the transaction succeeded or not.
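The second workaround can be done with existing commands today. A sketch assuming a redis-py-style client (the helper name and marker-key scheme are mine, purely illustrative):

```python
import uuid

def incr_with_receipt(client, counter_key, ttl=60):
    """Workaround sketch: pair the INCR with a write to a per-attempt
    marker key inside MULTI/EXEC. After an ambiguous failure, check the
    marker to learn whether the transaction actually committed."""
    attempt_id = uuid.uuid4().hex
    marker = f"receipt:{attempt_id}"
    try:
        pipe = client.pipeline(transaction=True)
        pipe.incr(counter_key)
        pipe.set(marker, "done", ex=ttl)  # expires with the receipt window
        new_value, _ = pipe.execute()
        return new_value
    except ConnectionError:
        # Reconnect, then: if the marker exists, the INCR was applied
        # exactly once and must not be retried.
        if client.exists(marker):
            return int(client.get(counter_key))
        raise  # the transaction never committed; safe to retry
```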

@madolson
Member

madolson commented Oct 4, 2024

wrap INCR (etc.) in MULTI-EXEC where you also write to some client-specific key that you can check later to find out if the transaction succeeded or not.

I've done this a lot. I did a talk at Kubecon where we applied changes from a change stream and used a lua script + idempotency key to check if something was applied. We were using the non-idempotent commands like ZINCRBY. It works "ok" for CME workloads.

@avifenesh

Speaking from the client side (Glide), and the client side only:
It does sound like a nice feature that would be helpful, and as with everything else, if it's introduced in Valkey we will support it. BUT:
It feels a bit like a half-baked feature that will create a lot of work for clients, potentially more than it helps.
If it's delivered, the client will need to create the command block, create wrappers so users have a pleasant UX, implement the retries, handle configuration, generate and track IDs, and many more tasks.
Plus, there's what I call configuration hell, where a client has so much optional configuration that you need a degree to get the right client setup.
I see it every day helping users (ElastiCache users) manage to create the right cocktail of configuration so they stop having 10 minutes of downtime.
This feature introduces yet more optional configuration: how many retries (since it differs from a usual command), what the TTL is, and probably more.
If the feature ends up as a MULTI-EXEC block with a TTL, clients could actually have implemented a non-efficient version of it a long time ago, for users willing to sacrifice a bit of memory and performance for some commands they want to be safe about.

If it ends up implemented, I would actually hope for a kind of transparent API, where you call something like SAFER_LPUSH, accept TTL and ID as extra parameters, and clients just take care of retries.

@PingXie
Member

PingXie commented Oct 7, 2024

Redis was always known for being easy to use, but often ignored correctness to achieve that.

While ignoring correctness may have been acceptable in the past, I don't think we can continue operating that way if we want Valkey to be the default choice for application developers.

Idempotency handling is a difficult problem, and it has come up in my experience multiple times too. If we find ourselves discussing workarounds at conferences for handling it client-side, I think it’s time to seriously consider a server-side solution. Offloading idempotency handling to clients shifts complexity from the server to the clients, and there are far more clients out there. That said, while this is a real problem worth addressing, I don't believe it’s an "80% problem" either. Therefore, making significant changes to the RESP protocol feels excessive. Extending MULTI would be a much more natural and backward-compatible solution.

  1. How to handle the idempotency during slot migrations in cluster mode.

That’s a great point. I think we’d need to track the new states at a per-slot level. It might make sense to sequence this after the RDB-based Atomic Slot Migration work so that we can easily pass along additional metadata.

As for the other comments on the issue, I believe they're mostly tactical: concerns about memory usage efficiency, caching error responses, or the exact command structure. These are important, but I don't think they invalidate the overall problem statement.

It feels a bit like a half-baked feature that will create a lot of work to clients, potentially more than help.
If it will be delivered, the client will need to create the command block, create wrappers for users to have a pleasant UX, implement the retries, handle configurations, generate ID and track it and many more tasks.

@avifenesh, I agree that this would increase client-side complexity, but do you think it’s still preferable to having application developers implement their own idempotency error handling? There are orders of magnitude more applications than clients, and far more clients than the server, which is just one.

@avifenesh

@PingXie You're right; I opened by saying that it's a good idea.

My point is that it isn't whole. As a client dev, I'd like fully baked, ready-to-eat features that don't require me to create workarounds and special configurations.

If it were a fix for some bug, it is what it is: you deliver fast.
But if it's a new feature, why not take a bit more time and make the UX awesome?
