rwx volume: use soft mode with long timeout for nfs client
When an RWX volume is attached, a share-manager pod with an embedded userspace NFS server is created and the volume is exported. Longhorn hard mounts the remote exported share and then provides it to the workload. When the share-manager pod or the embedded NFS server crashes or becomes unreachable, the 'hard' mount option keeps the client retrying its connection to the NFS server, which prevents data loss.

The root cause is that the Linux kernel tries to maintain filesystem stability. The kernel will not allow a filesystem to be unmounted until all its pending IO is written back to storage, and the system can't shut down until all file systems are unmounted. This issue is currently unresolved.

Longhorn 6655

Signed-off-by: Derek Su <derek.su@suse.com>
derekbit authored and David Ko committed Sep 20, 2023
1 parent a66539f commit 09bef58
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions csi/node_server.go
@@ -240,11 +240,11 @@ func (ns *NodeServer) nodeStageSharedVolume(volumeID, shareEndpoint, targetPath
 	"vers=4.1",
 	"noresvport",
 	// "sync", // sync mode is prohibitively expensive on the client, so we allow for host defaults
-	"intr",
-	"hard",
-	//"soft", // for this release we use soft mode, so we can always cleanup mount points
-	//"timeo=30", // This is tenths of a second, so a 3 second timeout, each retrans the timeout will be linearly increased, 3s, 6s, 9s
-	//"retrans=3", // We try the io operation for a total of 3 times, before failing, max runtime of 18s
+	// "intr",
+	//"hard",
+	"soft", // for this release we use soft mode, so we can always cleanup mount points
+	"timeo=150", // This is tenths of a second, so a 15 second timeout, each retrans the timeout will be linearly increased, 15s, 30s, 45s
+	"retrans=3", // We try the io operation for a total of 3 times, before failing
 }
 }
