🚨 CRITICAL BUG - VPP Process Termination
CONFIRMED: VPP aborts when it receives an ICMP echo request addressed to a local interface address. The crash is 100% reproducible and can be triggered remotely, which makes it a DoS vector.
Crash Evidence
💥 VPP Process Aborted with Assertion Failure
Full stack trace:
#0 0x00007ffff6a66eef in ?? () from /usr/lib64/libc.so.6
#1 0x00007ffff6a1ad36 in raise () from /usr/lib64/libc.so.6
#2 0x00007ffff6a06177 in abort () from /usr/lib64/libc.so.6
#3 0x0000000000407763 in os_panic () at /test/fdio-vpp/src/vpp/vnet/main.c:456
#4 0x00007ffff6cf4ed9 in debugger () at /test/fdio-vpp/src/vppinfra/error.c:84
#5 0x00007ffff6cf4c92 in _clib_error (how_to_die=2, function_name=0x0, line_number=0,
fmt=0x7ffff6ec519c "%s:%d (%s) assertion `%s' fails")
at /test/fdio-vpp/src/vppinfra/error.c:143
#6 0x00007ffff6e14e65 in vlib_node_runtime_get_next_frame (vm=0x7fffb68958c0,
n=0x7fffb7570d00, next_index=14) at /test/fdio-vpp/src/vlib/node_funcs.h:370
^ next_index=14 is INVALID (garbage from wrong pool)
#7 0x00007ffff6e14be6 in vlib_get_next_frame_internal (vm=0x7fffb68958c0,
node=0x7fffb7570d00, next_index=14, allocate_new_next_frame=0)
at /test/fdio-vpp/src/vlib/main.c:341
...
#12 0x00007ffff782c44f in ip4_load_balance_node_fn_hsw (vm=0x7fffb68958c0,
node=0x7fffb7570d00, frame=0x7fffb8e6d300)
at /test/fdio-vpp/src/vnet/ip/ip4_forward.c:261
^^^ CRASH HERE
Crash Point: src/vnet/ip/ip4_forward.c:261 in ip4_load_balance_node_fn_hsw()
Assertion: Invalid next_index=14 from corrupted load_balance object
Root Cause: Type Confusion Between DPO Pools
The Bug
vnet_buffer(b)->ip.adj_index[VLIB_TX] is used to store two different types of indices in different contexts:
- In ip4-lookup: stores dpo0->dpoi_index, which may be an index into receive_dpo_pool (for local addresses)
- In ip4-load-balance: reads it as an index into load_balance_pool
These are completely different memory pools → memory corruption → crash.
Detailed Flow
Step 1: Incoming Ping Request (ip4-lookup)
─────────────────────────────────────────────────
dst = 192.168.1.1 (local interface)
lbi0 = ip4_fib_forwarding_lookup() → returns 50 (load_balance_pool index)
lb0 = load_balance_get(50) → valid load_balance object
dpo0 = lb0->buckets[0] → {type=DPO_RECEIVE, index=7}
^^^^^^^^^^^^^^^^^^
index to receive_dpo_pool
❌ BUG: Store wrong index type
vnet_buffer(b)->ip.adj_index[VLIB_TX] = 7; ← receive_dpo_pool index!
NOT load_balance_pool index!
Source: src/vnet/ip/ip4_forward.h:355
Step 2: ICMP Echo Request Processing
─────────────────────────────────────────────────
ip4-icmp-echo-request node:
- Swaps src/dst IP addresses
- Does NOT update adj_index[VLIB_TX] (still = 7)
- next_node = "ip4-load-balance"
Source: src/plugins/ping/ping.c:506-508
Step 3: 💥 CRASH in ip4-load-balance
─────────────────────────────────────────────────
lbi0 = vnet_buffer(b)->ip.adj_index[VLIB_TX]; // = 7
lb0 = load_balance_get(7); // ❌ WRONG! Treats 7 as load_balance_pool index!
// Gets random/invalid memory
// lb0 now contains garbage data
dpo0 = load_balance_get_bucket_i(lb0, 0); // Read garbage
next[0] = dpo0->dpoi_next_node; // next_index = 14 (invalid)
vlib_buffer_enqueue_to_next(...); // 💥 ASSERTION FAILURE → abort()
Source: src/vnet/ip/ip4_forward.c:225-227, 261
Visual Representation
Memory Layout:
┌───────────────────────────────┐
│ receive_dpo_pool │
│ [0] = {...} │
│ [1] = {...} │
│ ... │
│ [7] = {sw_if_index=1, ...} │ ← Valid receive_dpo object
│ ... │
└───────────────────────────────┘
┌───────────────────────────────┐
│ load_balance_pool │
│ [0] = {...} │
│ [1] = {...} │
│ ... │
│ [7] = GARBAGE or unrelated │ ← ❌ Wrong interpretation!
│ ... │
│ [50] = {n_buckets=1, ...} │ ← Should use THIS one
└───────────────────────────────┘
Bug: index 7 is valid for receive_dpo_pool
but INVALID/GARBAGE for load_balance_pool!
Impact Assessment
| Aspect | Severity |
|---|---|
| Crash Severity | 🔴 CRITICAL - Process termination (abort) |
| Reproducibility | 🔴 100% - Every ping to local address |
| Affected Feature | ICMP echo reply (ping) to local interfaces |
| Security Impact | 🔴 DoS Attack Vector - Remote crash trigger |
| Data Loss | All in-flight packets dropped |
| Service Impact | Complete VPP outage; restart required |
Reproduction Steps
# 1. Start VPP with local interface
vpp# create loopback interface
vpp# set interface ip address loop0 192.168.1.1/24
vpp# set interface state loop0 up
# 2. Ping the local address
$ ping 192.168.1.1
# Result: 💥 VPP crashes immediately with assertion failure
GDB Debug Evidence
Before crash:
(gdb) p *dpo0
$15 = {{{dpoi_type = DPO_RECEIVE, ← Correct: this is a receive DPO
dpoi_proto = DPO_PROTO_IP4,
dpoi_next_node = 12,
dpoi_index = 7}, ← This goes to adj_index[VLIB_TX]
as_u64 = 30065557516}}
The crash then occurs when ip4-load-balance reads load_balance_pool[7] instead of receive_dpo_pool[7].
Proposed Fix (Immediate Patch Needed)
Solution: Re-lookup in FIB
File: src/plugins/ping/ping.c
Function: ip4_icmp_echo_request()
Location: After swapping addresses (around line 454), before vlib_put_next_frame
/* After swapping src/dst addresses */
ip0->src_address.data_u32 = dst0;
ip0->dst_address.data_u32 = src0;
/* ✅ FIX: Perform FIB lookup for reply packet */
ip_lookup_set_buffer_fib_index (i4m->fib_index_by_sw_if_index, p0);
u32 lbi0 = ip4_fib_forwarding_lookup (vnet_buffer (p0)->ip.fib_index,
&ip0->dst_address);
vnet_buffer (p0)->ip.adj_index[VLIB_TX] = lbi0; // Store correct load_balance index
/* Update checksums... */
Alternative: Change Next Node
// In src/plugins/ping/ping.c:506-508
.n_next_nodes = 1,
.next_nodes = {
[0] = "ip4-lookup", // Changed from "ip4-load-balance"
},
Code References
| File | Line | Description |
|---|---|---|
| src/vnet/ip/ip4_forward.c | 261 | 💥 Crash location |
| src/vnet/ip/ip4_forward.c | 225-227 | Wrong pool access |
| src/vnet/ip/ip4_forward.h | 355 | Stores wrong index type |
| src/plugins/ping/ping.c | 506-508 | Wrong next node config |
| src/vlib/node_funcs.h | 370 | Assertion failure |
| src/vnet/dpo/receive_dpo.h | 62 | receive_dpo_pool definition |
| src/vnet/dpo/load_balance.h | 172 | load_balance definition |
Request for Urgent Action
- ✅ Confirmed crash with 100% reproducibility
- ✅ DoS attack vector (external trigger)
- ✅ Process termination (complete service outage)
- ✅ Root cause identified with fix proposal
Questions for Maintainers:
- Can you reproduce on your setup?
- Which branch should the fix target?
- Should we also fix IPv6 (similar pattern)?
- Any existing test infrastructure for local ping?
Environment: Linux x86_64
Reporter: Available for testing patches
Priority: 🔴 CRITICAL
Workaround: Do not ping local interface addresses