Q/BUG: process stuck in audit_backlog_wait for a longtime #144

tuxoko · 2023-06-15T21:59:13Z

System: centos 7
Kernel: Upstream 5.10.168 with some custom change unrelated to audit
auditd: 2.8.5-4.el7.x86_64

Hi, we had an issue where we were trying to kill a process to be able to umount some filesystem.
However the process seemed to stuck in audit_backlog_wait for more than 20 seconds.
We have some code triggered a panic because it couldn't umount the filesystem.

Here's the stack trace

PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"
 #0 [ffffa6218ab6bc68] __schedule at ffffffff95a00af6
 #1 [ffffa6218ab6bcf8] schedule at ffffffff95a00f3f
 #2 [ffffa6218ab6bd18] schedule_timeout at ffffffff95a049bf
 #3 [ffffa6218ab6bd98] audit_log_start at ffffffff9516e338
 #4 [ffffa6218ab6be20] audit_log_start at ffffffff9516e5af
 #5 [ffffa6218ab6be48] audit_log_name.constprop.0 at ffffffff95174b00
 #6 [ffffa6218ab6be98] audit_log_exit at ffffffff951751db
 #7 [ffffa6218ab6bee0] __audit_syscall_exit at ffffffff95176ba0
 #8 [ffffa6218ab6bf18] syscall_exit_to_user_mode at ffffffff959f7c62
 #9 [ffffa6218ab6bf38] do_syscall_64 at ffffffff959f3ae5
#10 [ffffa6218ab6bf50] entry_SYSCALL_64_after_hwframe at ffffffff95c000a9

crash> ps -m | grep rsync
[0 00:00:21.270] [UN]  PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"

The audit_queue seems quite empty at the time of panic

crash> p audit_queue.qlen
$5 = 4
crash> p audit_backlog_limit
audit_backlog_limit = $6 = 640

I think there were some issue with fairness at play here.
When processes enter audit_log_start and audit_queue.qlen is large, it will then decides to wait.
Then while kauditd is consuming the audit_queue, other threads entering audit_log_start might see audit_queue.qlen small and bypass the wait. So there's no guarantee when the process in the wait will be able to queue.

Another part of this issue is that kauditd will only wake up one process in each iteration when it process the whole queue. The comment says wake everyone but it uses wake_up not wake_up_all even though waiter uses add_wait_queue_exclusive. If the intention is wake everyone then should we change it to wake_up_all? I think if it is wake_up_all then the chances of our process stuck for 20 seconds would probably be lower.

static int kauditd_thread(void *dummy)
{
...
		/* we have processed all the queues so wake everyone */
		wake_up(&audit_backlog_wait);

The text was updated successfully, but these errors were encountered:

rgbriggs · 2023-06-16T18:38:13Z

On 2023-06-15 14:59, Chunwei Chen wrote: System: centos 7 Kernel: Upstream 5.10.168 with some custom change unrelated to audit auditd: 2.8.5-4.el7.x86_64

That's pretty old...

Hi, we had an issue where we were trying to kill a process to be able to umount some filesystem.

"umount some filesystem"... Would that happen to be a user or network filesystem? #100

…

However the process seemed to stuck in audit_backlog_wait for more than 20 seconds. We have some code triggered a panic because it couldn't umount the filesystem.

tuxoko · 2023-06-16T18:59:03Z

@rgbriggs
It's a local filesystem and it's not fuse. We know the the process stuck in audit is holding fd in the filesystem because it's shown in the vmcore dump.

pcmoore · 2023-06-16T19:03:54Z

Unfortunately it's hard to draw any conclusions using a stacktrace from a custom kernel. Are you able to reproduce the problem on a modern kernel, or at the very least one without any custom patches?

tuxoko · 2023-06-16T19:17:38Z

@pcmoore
I can try that, but there's really not much difference and nothing related to audit.
And I have the vmcore so if you think there's something to look there I can look it up.
Also I'm curious about the intention of the wake_up in kauditd_thread was it to wake everyone or only one at a time.

pcmoore · 2023-06-16T20:52:03Z

@pcmoore I can try that, but there's really not much difference and nothing related to audit.

Sometimes unrelated changes can have a surprising impact :) For that reason, as a general rule we don't provide any upstream support for Linux Kernels with custom patches.

Beyond the custom kernel changes, I'm also curious if a modern kernel resolves your problem as there have been some changes related to queue management since v5.10 was released. As Richard already mentioned, that kernel is quite old.

And I have the vmcore so if you think there's something to look there I can look it up.

At the very least the line numbers of the stacktrace would be helpful.

Also I'm curious about the intention of the wake_up in kauditd_thread was it to wake everyone or only one at a time.

There was some discussion about waking blocked processes recently on the audit mailing list:

https://lore.kernel.org/audit/20230511052116.19452-1-eiichi.tsukata@nutanix.com

tuxoko · 2023-06-16T21:06:33Z

crash> bt -l 188152
PID: 188152  TASK: ffff8f6e30ecdac0  CPU: 8   COMMAND: "rsync"
 #0 [ffffa6218ab6bc68] __schedule at ffffffff95a00af6
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/sched/core.c: 3797
 #1 [ffffa6218ab6bcf8] schedule at ffffffff95a00f3f
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/include/asm/bitops.h: 206
 #2 [ffffa6218ab6bd18] schedule_timeout at ffffffff95a049bf
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/time/timer.c: 1878
 #3 [ffffa6218ab6bd98] audit_log_start at ffffffff9516e338
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/audit.c: 1886
 #4 [ffffa6218ab6be20] audit_log_start at ffffffff9516e5af
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/audit.c: 1856
 #5 [ffffa6218ab6be48] audit_log_name.constprop.0 at ffffffff95174b00
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1404
 #6 [ffffa6218ab6be98] audit_log_exit at ffffffff951751db
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1644
 #7 [ffffa6218ab6bee0] __audit_syscall_exit at ffffffff95176ba0
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/kernel/auditsc.c: 1794
 #8 [ffffa6218ab6bf18] syscall_exit_to_user_mode at ffffffff959f7c62
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/include/linux/audit.h: 340
 #9 [ffffa6218ab6bf38] do_syscall_64 at ffffffff959f3ae5
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/entry/common.c: 56
#10 [ffffa6218ab6bf50] entry_SYSCALL_64_after_hwframe at ffffffff95c000a9
    /usr/src/debug/kernel-5.10.168/linux-5.10.168-1.nutanix.20230215.el7.x86_64/arch/x86/entry/entry_64.S: 132

pcmoore · 2023-06-16T21:50:57Z

Ah, I see this is a Nutanix kernel build. Do you work for Nutanix or are you a customer/user? If the former, you should definitely read the mail archive link I posted above.

tuxoko · 2023-06-16T23:00:13Z

Yeah, I noticed just now. I work for different team than Eiichi and we use different build, but it does seem very similar.

When unbind and bind the device again, kernel will dump below warning: [ 173.972130] sysfs: cannot create duplicate filename '/devices/platform/soc/4c010010.usb/software_node' [ 173.981564] CPU: 2 UID: 0 PID: 536 Comm: sh Not tainted 6.12.0-rc6-06344-g2aed7c4a5c56 #144 [ 173.989923] Hardware name: NXP i.MX95 15X15 board (DT) [ 173.995062] Call trace: [ 173.997509] dump_backtrace+0x90/0xe8 [ 174.001196] show_stack+0x18/0x24 [ 174.004524] dump_stack_lvl+0x74/0x8c [ 174.008198] dump_stack+0x18/0x24 [ 174.011526] sysfs_warn_dup+0x64/0x80 [ 174.015201] sysfs_do_create_link_sd+0xf0/0xf8 [ 174.019656] sysfs_create_link+0x20/0x40 [ 174.023590] software_node_notify+0x90/0x100 [ 174.027872] device_create_managed_software_node+0xec/0x108 ... The '4c010010.usb' device is a platform device created during the initcall and is never removed, which causes its associated software node to persist indefinitely. The existing device_create_managed_software_node() does not provide a corresponding removal function. Replace device_create_managed_software_node() with the device_add_software_node() and device_remove_software_node() pair to ensure proper addition and removal of software nodes, addressing this issue. Fixes: a9400f1 ("usb: dwc3: imx8mp: add 2 software managed quirk properties for host mode") Cc: stable@vger.kernel.org Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20241126032841.2458338-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pcmoore added the question label Jun 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Q/BUG: process stuck in audit_backlog_wait for a longtime #144

Q/BUG: process stuck in audit_backlog_wait for a longtime #144

tuxoko commented Jun 15, 2023

rgbriggs commented Jun 16, 2023 via email

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023

Q/BUG: process stuck in audit_backlog_wait for a longtime #144

Q/BUG: process stuck in audit_backlog_wait for a longtime #144

Comments

tuxoko commented Jun 15, 2023

rgbriggs commented Jun 16, 2023 via email

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023

pcmoore commented Jun 16, 2023

tuxoko commented Jun 16, 2023