Description of problem:
1. This is a three-way replicated environment. I could not write to one specific file from two of the three nodes, even though all bricks were up. Every file except this one can be read and written normally from all nodes.
2. The affected file is a QEMU disk image. On the two nodes where writes fail, even reading basic metadata with "qemu-img info" hangs for tens of minutes before any output appears (the access pattern is sketched below).
3. I am sure the read/write failures started at the same time on both nodes.
The exact command to reproduce the issue:
There's no way to reproduce it.
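For reference, this is roughly the kind of access that hangs or fails; the mount point and image path are placeholders, not the real ones:
# Hypothetical FUSE mount point and image name, shown only to illustrate the access pattern.
qemu-img info /mnt/engine/images/disk.qcow2
# On the two problem nodes the command above sits for tens of minutes before printing anything.
dd if=/dev/zero of=/mnt/engine/images/disk.qcow2 bs=1M count=1 conv=notrunc,fsync
# Writes through the same mount fail with the inode-lock error shown in the logs below.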
The full output of the command that failed:
Expected results:
The QEMU disk image should be readable and writable from all three nodes.
Mandatory info:
- The output of the gluster volume info command: All ok.
- The output of the gluster volume status command: All ok.
- The output of the gluster volume heal command: All ok.
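For completeness, these are the health checks behind "All ok"; the volume name "engine" is inferred from the "0-engine-replicate-0" prefix in the logs below:
gluster volume info engine
gluster volume status engine
gluster volume heal engine info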
- Provide logs present on following locations of client and server nodes: /var/log/glusterfs/
Every time I try to write to this file, the volume log on both problem nodes reports this error:
W [MSGID: 108019] [afr-lk-common.c:262:afr_log_locks_failure] 0-engine-replicate-0: Unable to do inode lock with lk-owner:d876d81a6e550000 on any subvolume while attempting WRITE on gfid:749becc7-c6b8-4a0b-8aff-68819af15c85. [Transport endpoint is not connected]
After I executed "gluster volume stop" and "gluster volume start", reading and writing the file returned to normal on all three nodes.
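For clarity, the workaround was roughly the following (volume name again inferred from the logs; stopping the volume is of course disruptive to running VMs):
gluster --mode=script volume stop engine
gluster --mode=script volume start engine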
During the volume restart, multiple "releasing lock on 749becc7-c6b8-4a0b-8aff-68819af15c85" messages appeared on all three nodes:
[2024-10-10 07:16:21.929053 +0000] W [inodelk.c:617:pl_inodelk_log_cleanup] 0-engine-server: releasing lock on 749becc7-c6b8-4a0b-8aff-68819af15c85 held by {client=0x55e4ad758888, pid=92863 lk-owner=3895e9e7ba550000}
[2024-10-10 07:16:21.929063 +0000] W [inodelk.c:617:pl_inodelk_log_cleanup] 0-engine-server: releasing lock on 749becc7-c6b8-4a0b-8aff-68819af15c85 held by {client=0x55e4ad758888, pid=92863 lk-owner=3895e9e7ba550000}
[2024-10-10 07:16:21.930042 +0000] W [inodelk.c:617:pl_inodelk_log_cleanup] 0-engine-server: releasing lock on 749becc7-c6b8-4a0b-8aff-68819af15c85 held by {client=0x55e4ad758978, pid=43623 lk-owner=4839ce1a27560000}
These log entries appear on the two problem nodes but not on the healthy node:
[2024-10-10 07:14:01.800654 +0000] E [inodelk.c:504:__inode_unlock_lock] 0-engine-locks: Matching lock not found for unlock 0-9223372036854775807, by 1870012b76550000 on 0x55e4ad7585b8 for gfid:749becc7-c6b8-4a0b-8aff-68819af15c85
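In case it helps, this is a sketch of how the stale lock could be inspected on a brick; the statedump location below is the default /var/run/gluster, which may differ on this system:
gluster volume statedump engine
# Look for inodelk entries on this gfid in the brick statedump files.
grep -B2 -A4 '749becc7-c6b8-4a0b-8aff-68819af15c85' /var/run/gluster/*.dump.*
There is also a "gluster volume clear-locks" command that might release such a stale lock without a full stop/start, but I have not tried it here.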
- Is there any crash? Provide the backtrace and coredump:
No
Additional info:
- The operating system / glusterfs version:
glusterfs 10.0
Note: Please hide any confidential data which you don't want to share in public like IP address, file name, hostname or any other configuration