fix(limit-req): Make Redis path atomic via EVAL + use hash key with TTL by falvaradorodriguez · Pull Request #12605 · apache/apisix

falvaradorodriguez · 2025-09-09T15:19:10Z

Description

The current limit-req Redis implementation uses two separate keys (excess and last) and updates them with multiple GET/SET operations.

Under concurrent load, this leads to race conditions:

- Several workers may read stale values in parallel and overwrite each other.
- As a result, the plugin allows more requests than expected, effectively bypassing the intended rate limit.
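
The lost update described above can be reproduced deterministically. The following is a minimal Python sketch, not the plugin's code: the `store` dict stands in for the two Redis keys, `next_excess` is a hypothetical helper modeled on a leaky-bucket update in units scaled by 1000 (the scaling mentioned in the review comments of this PR), and the interleaving is written out by hand instead of using real workers.

```python
def next_excess(excess, last, now_ms, rate):
    """Leaky-bucket update: drain by elapsed time, then add one request (1000 units)."""
    elapsed_ms = abs(now_ms - last)
    return max(excess - rate * elapsed_ms / 1000 + 1000, 0)


def serial(store, now_ms, rate, requests):
    """Each request reads, computes, and writes before the next one starts."""
    for _ in range(requests):
        store["excess"] = next_excess(store["excess"], store["last"], now_ms, rate)
        store["last"] = now_ms
    return store["excess"]


def interleaved(store, now_ms, rate):
    """Two workers both read (GET) before either of them writes (SET)."""
    seen_a = (store["excess"], store["last"])  # worker A reads
    seen_b = (store["excess"], store["last"])  # worker B reads the same stale state
    store["excess"] = next_excess(*seen_a, now_ms, rate)  # A writes
    store["last"] = now_ms
    store["excess"] = next_excess(*seen_b, now_ms, rate)  # B overwrites A's update
    store["last"] = now_ms
    return store["excess"]


RATE = 1000  # hypothetical: 1 req/s in scaled units
serial_excess = serial({"excess": 0, "last": 0}, now_ms=0, rate=RATE, requests=2)
racy_excess = interleaved({"excess": 0, "last": 0}, now_ms=0, rate=RATE)
print(serial_excess, racy_excess)
```

In the serial run both requests are counted (excess 2000), while in the interleaved run only one is recorded (excess 1000), so a later request sees more headroom than it should.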

On Redis Cluster, the current approach is also problematic: atomic EVAL cannot be executed across two different keys located on different slots.
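
To illustrate the slot constraint: Redis Cluster maps each key to one of 16384 slots using CRC16, and when a key contains a non-empty `{...}` hash tag only the tag is hashed. The sketch below uses Python's `binascii.crc_hqx` for the CRC. `key_hash_slot` is an illustrative helper, not an APISIX or Redis API, and the key names are examples (the `limit_req:{test_key}:state` form is taken from the tests in this PR; the others are hypothetical).

```python
import binascii

SLOTS = 16384


def key_hash_slot(key: str) -> int:
    """Compute the cluster slot for a key, honoring a non-empty {hash tag}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # tag must contain at least one character
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOTS


state_slot = key_hash_slot("limit_req:{test_key}:state")
tag_slot = key_hash_slot("test_key")
other_slot = key_hash_slot("limit_req:{test_key}:anything")
print(state_slot == tag_slot == other_slot)
```

Keys that share a hash tag always land on the same slot, whereas two untagged keys give no such guarantee. Keeping the whole state under one key sidesteps the question entirely.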

Solution

- Store both values (excess and last) under a single Redis hash key, so the state is managed as one unit.
- Use a single EVAL script that performs read → compute → write atomically inside Redis, removing race conditions. This approach is consistent with how the limit-count plugin already works.
- Add a TTL to the key to avoid buildup of stale state.
- Preserve existing semantics: the first request with no prior state does not consume tokens.
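
The read → compute → write step can be modeled as a single function. This is a Python sketch under stated assumptions rather than the script from this PR: `state` stands in for the Redis hash, `limit_req_step` is a hypothetical name, rate and burst are in units scaled by 1000, and a `threading.Lock` plays the role of Redis running one script at a time. The TTL is left out for brevity.

```python
import threading

_lock = threading.Lock()  # stands in for Redis executing one script at a time


def limit_req_step(state, rate, burst, now_ms, commit=True):
    """Return (allowed, excess) and update `state` in one indivisible step."""
    with _lock:
        if "excess" in state:
            elapsed_ms = abs(now_ms - state["last"])
            excess = max(state["excess"] - rate * elapsed_ms / 1000 + 1000, 0)
            if excess > burst:
                return False, excess  # rejected: stored state is left untouched
        else:
            excess = 0  # first hit with no prior state costs nothing
        if commit:
            state["excess"] = excess
            state["last"] = now_ms
        return True, excess


state = {}
RATE, BURST = 1000, 1000  # hypothetical: 1 req/s with a burst of 1, scaled by 1000
results = [limit_req_step(state, RATE, BURST, now_ms=0) for _ in range(3)]
print(results)
```

With these numbers the first two requests at the same instant are allowed and the third is rejected, because its excess (2000) would exceed the burst (1000).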

Which issue(s) this PR fixes:

Fixes #12592

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

- Switch Redis storage to a single hash key (cluster compatibility)
- Perform read/compute/write atomically with EVAL
- Keep first-hit behavior (no cost on missing state)
- Add EX-based TTL to avoid key buildup

Baoyuantop · 2025-09-10T09:39:52Z

Hi @falvaradorodriguez, we need to add a test case for this fix.

Baoyuantop · 2025-09-22T08:38:10Z

Hi @falvaradorodriguez, there are failed CI checks that need fixing.

falvaradorodriguez · 2025-09-22T17:27:51Z

> Hi @falvaradorodriguez, there are failed CI checks that need fixing.

Hi @Baoyuantop!

The failures are not related to the current changes. The tests for the modified plugin seem to be fine.

Could you please review?

Thanks

[15:23:12] t/plugin/limit-req-redis-cluster.t ......... ok    15865 ms ( 0.01 usr  0.01 sys +  0.91 cusr  2.03 csys =  2.96 CPU)
[15:23:29] t/plugin/limit-req-redis.t ................. ok    16682 ms ( 0.02 usr  0.00 sys +  0.95 cusr  2.23 csys =  3.20 CPU)
2025/09/22 15:23:37 Processed 0 requests
[15:23:41] t/plugin/limit-req.t ....................... ok    11856 ms ( 0.02 usr  0.00 sys +  0.83 cusr  1.78 csys =  2.63 CPU)
[15:23:46] t/plugin/limit-req2.t ...................... ok     5010 ms ( 0.01 usr  0.00 sys +  0.81 cusr  0.42 csys =  1.24 CPU)
[15:23:47] t/plugin/limit-req3.t ...................... ok     1290 ms ( 0.00 usr  0.00 sys +  0.39 cusr  0.22 csys =  0.61 CPU)

Baoyuantop · 2025-09-24T08:15:52Z

apisix/plugins/limit-req/util.lua

+
+  if commit == 1 then
+    redis.call("HMSET", state_key, "excess", new_excess, "last", now)
+    redis.call("EXPIRE", state_key, 60)


Can the TTL be derived dynamically from the rate-limiting window period, or be made user-configurable?

In the limit-req plugin, there is no time window like the one in the limit-count plugin.

The TTL is there simply to prevent dead keys from lingering in Redis. Currently, if a consumer stops making requests, its key remains in Redis until something outside APISIX deletes it.

It could be made configurable, but in my opinion that would not add much value to this plugin. It would also be an extra feature, unrelated to this fix.

In my opinion, since limit-req always works with a per-second rate, a 1-minute margin in Redis is acceptable. If the consumer makes requests again after that minute, they are handled the same way as a first request.

luarx · 2025-10-08T13:03:46Z

When will this PR be merged? I am having the same issue that this PR fixes 🙏

Baoyuantop · 2025-10-11T03:55:46Z

I will promptly urge other community maintainers to review.

Precompute the Lua script SHA1 locally and always execute scripts via EVALSHA to avoid repeated SCRIPT LOAD operations. Add a robust NOSCRIPT fallback to EVAL to ensure compatibility with both resty.redis and resty.rediscluster, especially in Redis Cluster setups where scripts are cached per node. This improves performance and makes script execution resilient to Redis node restarts, failovers, and resharding.
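
The EVALSHA-first flow with a NOSCRIPT fallback can be sketched as follows. This is an illustrative Python model, not the code in `util.lua`: `FakeRedisNode` is a hypothetical in-memory stand-in for one node's script cache, the `(result, error)` return pairs imitate the style of the Lua Redis clients, `SCRIPT` is a placeholder body, and the `use_evalsha` flag borrows the option name that appears in the per-file summary of this PR.

```python
import hashlib

SCRIPT = "return 1"  # placeholder body; the real limiter script is not reproduced here
SCRIPT_SHA1 = hashlib.sha1(SCRIPT.encode()).hexdigest()  # computed locally, no SCRIPT LOAD


class FakeRedisNode:
    """Hypothetical in-memory stand-in for one Redis node's script cache."""

    def __init__(self):
        self.cache = {}
        self.calls = []

    def evalsha(self, sha):
        self.calls.append("EVALSHA")
        if sha not in self.cache:
            return None, "NOSCRIPT No matching script. Please use EVAL."
        return 1, None

    def eval(self, script):
        self.calls.append("EVAL")
        # EVAL also caches the script, so later EVALSHA calls on this node succeed.
        self.cache[hashlib.sha1(script.encode()).hexdigest()] = script
        return 1, None


def is_noscript_error(err):
    return isinstance(err, str) and err.startswith("NOSCRIPT")


def run_script(node, use_evalsha=True):
    """Try EVALSHA first; fall back to EVAL when the node has not cached the script."""
    if not use_evalsha:
        return node.eval(SCRIPT)
    res, err = node.evalsha(SCRIPT_SHA1)
    if res is None and is_noscript_error(err):
        return node.eval(SCRIPT)
    return res, err


node = FakeRedisNode()
run_script(node)  # cold cache: EVALSHA misses, then EVAL runs and caches the script
run_script(node)  # warm cache: EVALSHA succeeds on its own
print(node.calls)

cluster_node = FakeRedisNode()
run_script(cluster_node, use_evalsha=False)  # always EVAL, no cache dependency
print(cluster_node.calls)
```

Falling back to EVAL (rather than SCRIPT LOAD followed by another EVALSHA) means the request is served by whichever node receives it, regardless of what that node has cached.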

falvaradorodriguez · 2025-12-24T10:49:44Z

> Hi @falvaradorodriguez, it appears there are test failures in CI that need to be fixed.

There was a node management issue with Redis Cluster. It seems that loading the script does not guarantee that it will be available for the next EVALSHA call. I have changed the fallback to EVAL to ensure that the script is executed. Once it has been executed with EVAL, it can be invoked by its SHA1.

Baoyuantop · 2025-12-26T02:58:58Z

Hi @falvaradorodriguez, could you please fix the lint bug?

shreemaan-abhishek · 2026-01-28T03:58:52Z

apisix/plugins/limit-req/util.lua

+
+  if commit == 1 then
+    redis.call("HMSET", state_key, "excess", new_excess, "last", now)
+    local ttl = math.ceil(burst / rate) + 1
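
As a worked example of this TTL expression: `rate` and `burst` are scaled by the same factor, so `burst / rate` is the number of seconds a full burst takes to drain, and the extra second presumably acts as a safety margin. The Python below only evaluates the formula for a few hypothetical configurations; `state_ttl_seconds` is an illustrative name, not a function from the PR.

```python
import math


def state_ttl_seconds(rate, burst):
    """Seconds to keep the state key: time to drain a full burst, plus one."""
    return math.ceil(burst / rate) + 1


# (scaled rate, scaled burst) pairs; the configurations are hypothetical
ttl_burst_heavy = state_ttl_seconds(2000, 5000)   # 2 req/s, burst 5
ttl_no_burst = state_ttl_seconds(1000, 0)         # 1 req/s, burst 0
ttl_fast_rate = state_ttl_seconds(10000, 1000)    # 10 req/s, burst 1
print(ttl_burst_heavy, ttl_no_burst, ttl_fast_rate)
```

Compared with the fixed 60-second expiry discussed earlier in the thread, this ties the key lifetime to the configured limits, so most configurations hold idle keys for only a few seconds.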


shreemaan-abhishek · 2026-01-28T04:00:20Z

apisix/plugins/limit-req/util.lua

+
+    local s
+    if type(err) == "table" then
+        s = tostring(err[1] or err.err or err.message or err.msg or err)


Last time I checked, the type of err was always a string. Did you notice something different that led you to add err.err or err.message or err.msg or err?

~~Thanks for the note. In resty.redis the error is indeed typically returned as a plain string.~~

However, since this code path is meant to work with both resty.redis and resty.rediscluster, I added a small defensive handling for the case where errors may be propagated as a structured table (e.g. containing err/message fields) instead of a raw string.

This doesn’t change the behavior for the common case, but makes the NOSCRIPT detection more robust across Redis standalone and Redis Cluster setups, especially during redirects/failovers where error formats may differ.

With the changes you mentioned here, what I wrote above no longer makes sense. I have replicated the error handling you use in the limit-conn plugin.

shreemaan-abhishek · 2026-01-28T04:06:51Z

apisix/plugins/limit-req/util.lua

-    else
-        excess = 0
+    -- If the script isn't cached on this Redis node, fall back to EVAL.
+    if not res and is_noscript_error(err) then


I observed that the reliability of EVALSHA with Redis Cluster was very low.
#12872 (comment)

So I decided not to use EVALSHA when the policy is set to redis-cluster. You can refer to the PR I just linked to see how I did it.

Let me know if you have any better ideas, though.

Makes sense, and it is also good practice to keep the logic of the plugins aligned. I have tried to replicate the functionality you mention here: 05657f6

shreemaan-abhishek

please resolve the conflicts too.

Copilot

Pull request overview

This PR fixes race conditions in the limit-req Redis/Redis-Cluster policies by moving the limiter state into a single Redis hash key and updating it atomically via a single Lua EVAL script (with EVALSHA fast-path for standalone Redis), aligning behavior with the intended rate-limit semantics under concurrency.

Changes:

- Replaces multi-key GET/SET updates with an atomic Redis Lua script operating on one hash key (excess + last) plus TTL.
- Enables EVALSHA optimization for the standalone Redis policy; uses EVAL for Redis Cluster for reliability.
- Adds test cases for Redis and Redis Cluster to validate hash structure usage, TTL presence, and basic limiting behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `apisix/plugins/limit-req/util.lua` | Implements atomic Redis script (hash-based state + TTL) with `EVALSHA` fallback logic. |
| `apisix/plugins/limit-req/limit-req-redis.lua` | Enables `use_evalsha` for standalone Redis limiter instances. |
| `apisix/plugins/limit-req/limit-req-redis-cluster.lua` | Disables `use_evalsha` for cluster limiter instances (uses `EVAL`). |
| `t/plugin/limit-req-redis.t` | Adds tests for hash-key state + TTL and a rapid-request rejection case for Redis policy. |
| `t/plugin/limit-req-redis-cluster.t` | Adds analogous tests for Redis Cluster policy. |

Copilot · 2026-03-11T09:24:37Z

apisix/plugins/limit-req/util.lua

+  local rate       = tonumber(ARGV[1])   -- req/s
+  local now        = tonumber(ARGV[2])   -- ms
+  local burst      = tonumber(ARGV[3])   -- req/s


The Lua script comments label rate/burst as "req/s", but in this limiter implementation they are passed around in the same scaled units used by the original algorithm (rate = configured_rate * 1000, burst = configured_burst * 1000). Updating these comments to reflect the actual units would reduce the risk of future logic changes introducing subtle math/TTL bugs.

Suggested change

-  local rate       = tonumber(ARGV[1])   -- req/s
-  local now        = tonumber(ARGV[2])   -- ms
-  local burst      = tonumber(ARGV[3])   -- req/s
+  local rate       = tonumber(ARGV[1])   -- scaled request rate (configured_rate * 1000)
+  local now        = tonumber(ARGV[2])   -- ms
+  local burst      = tonumber(ARGV[3])   -- scaled burst (configured_burst * 1000)
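
A small numeric illustration of the scaling described in this comment. It is a Python sketch with hypothetical helper names; the only facts taken from the thread are that rate and burst are the configured values multiplied by 1000 and that `now` is in milliseconds.

```python
SCALE = 1000  # one whole request is represented as 1000 units


def to_scaled(configured):
    """Convert a configured rate (req/s) or burst (requests) to scaled units."""
    return configured * SCALE


def drained_units(scaled_rate, elapsed_ms):
    """Units that leak out of the bucket over `elapsed_ms` milliseconds."""
    return scaled_rate * elapsed_ms / 1000


rate = to_scaled(2)    # hypothetical config: 2 req/s
burst = to_scaled(3)   # hypothetical config: burst of 3 requests
leak = drained_units(rate, elapsed_ms=250)
print(rate, burst, leak)
```

With a millisecond clock, a quarter of a second at 2 req/s drains 500 units, that is, half a request. Mislabeling these values as plain req/s makes it easy to drop or double-apply the factor of 1000 in later changes.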

Copilot · 2026-03-11T09:24:37Z

t/plugin/limit-req-redis.t

+            local vals = red:hmget("limit_req:{test_key}:state", "excess", "last")
+            if vals[1] and vals[2] then


resty.redis:hmget returns (vals, err). This test only captures the first return value and then unconditionally indexes vals[1]/vals[2], which will throw a Lua runtime error if hmget fails and returns nil. Capture err and guard for not vals (and ideally print the error) to make the test robust and failures diagnosable.

Suggested change

-            local vals = red:hmget("limit_req:{test_key}:state", "excess", "last")
-            if vals[1] and vals[2] then
+            local vals, err = red:hmget("limit_req:{test_key}:state", "excess", "last")
+            if not vals then
+                ngx.say("failed to hmget: ", err)
+            elseif vals[1] and vals[2] then

moonming

Hi @falvaradorodriguez, thank you for making the Redis limit-req path atomic via EVAL!

Using a Lua script with EVAL to ensure atomicity of the rate limiting operation is the correct approach — the current non-atomic multi-command flow can have race conditions under high concurrency. With 12 reviews, this has been thoroughly discussed.

To confirm readiness:

- Are all 12 review comments addressed?
- Has the EVAL script been tested under concurrent load to verify it resolves the race condition?
- The hash key with TTL approach: does it handle key expiration correctly for sliding windows?

This is an important correctness fix. Let's get it finalized! Thank you.

Baoyuantop · 2026-03-25T06:49:33Z

Hi @falvaradorodriguez, I see no problems with the code here. Could you please fix the failing test?

falvaradorodriguez force-pushed the fix/limit-req-plugin-redis-atomic branch from 25d03d5 to 3530220 Compare September 9, 2025 15:23

fix: Make Redis path atomic via EVAL + use hash key with TTL

d60f099

falvaradorodriguez force-pushed the fix/limit-req-plugin-redis-atomic branch from 3530220 to d60f099 Compare September 9, 2025 15:27

falvaradorodriguez changed the title ~~fix: Make Redis path atomic via EVAL + use hash key with TTL~~ fix (limit-req): Make Redis path atomic via EVAL + use hash key with TTL Sep 9, 2025

falvaradorodriguez changed the title ~~fix (limit-req): Make Redis path atomic via EVAL + use hash key with TTL~~ fix(limit-req): Make Redis path atomic via EVAL + use hash key with TTL Sep 9, 2025

falvaradorodriguez marked this pull request as ready for review September 10, 2025 09:04

dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. bug Something isn't working labels Sep 10, 2025

Baoyuantop added this to ⚡️ Apache APISIX Roadmap Sep 10, 2025

Baoyuantop moved this to 👀 In review in ⚡️ Apache APISIX Roadmap Sep 10, 2025

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Sep 11, 2025

Add tests

74d8587

falvaradorodriguez force-pushed the fix/limit-req-plugin-redis-atomic branch from 88ed813 to 74d8587 Compare September 11, 2025 10:56

Baoyuantop self-requested a review September 12, 2025 08:34

Baoyuantop assigned falvaradorodriguez Sep 12, 2025

falvaradorodriguez force-pushed the fix/limit-req-plugin-redis-atomic branch 3 times, most recently from 40a7f03 to af4d670 Compare September 22, 2025 12:30

falvaradorodriguez added 2 commits September 22, 2025 15:04

Fix linter issues

1332808

Fix test

6399925

falvaradorodriguez force-pushed the fix/limit-req-plugin-redis-atomic branch from af4d670 to 6399925 Compare September 22, 2025 13:05

Baoyuantop reviewed Sep 24, 2025

Baoyuantop requested review from AlinsRan and nic-6443 September 26, 2025 02:42

Baoyuantop added the wait for update wait for the author's response in this issue/PR label Dec 24, 2025

github-actions bot added the user responded label Dec 24, 2025

Baoyuantop added awaiting review and removed wait for update wait for the author's response in this issue/PR user responded labels Dec 25, 2025

Baoyuantop added wait for update wait for the author's response in this issue/PR and removed awaiting review labels Dec 26, 2025

falvaradorodriguez added 3 commits December 26, 2025 11:26

Add missing local declarations for lint

3a562e6

Merge branch 'master' into fix/limit-req-plugin-redis-atomic

9bc4c94

feat: redis ttl support

9079326

falvaradorodriguez requested review from Baoyuantop and membphis January 27, 2026 08:33

shreemaan-abhishek reviewed Jan 28, 2026

falvaradorodriguez added 2 commits January 28, 2026 11:39

Merge remote-tracking branch 'upstream/master' into fix/limit-req-plu…

48c93ee

…gin-redis-atomic

Use evalsha only with redis and not with redis-cluster

05657f6

falvaradorodriguez requested a review from shreemaan-abhishek January 28, 2026 12:35

Fix linter errors

b28ca4d

Baoyuantop removed the wait for update wait for the author's response in this issue/PR label Feb 24, 2026

Baoyuantop requested a review from Copilot March 11, 2026 09:16

Baoyuantop added the awaiting review label Mar 11, 2026

Copilot started reviewing on behalf of Baoyuantop March 11, 2026 09:18 View session

Copilot AI reviewed Mar 11, 2026

moonming requested changes Mar 16, 2026

