Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(csi): add node info to failed mount logging. #2686

Merged
merged 1 commit into from
Mar 29, 2024

Conversation

james-munson
Copy link
Contributor

Which issue(s) this PR fixes:

Issue # longhorn/longhorn#7931

What this PR does / why we need it:

Log with kernel release and OS info when an NFS mount fails, to aid in troubleshooting.

Special notes for your reviewer:

Additional documentation or context

@james-munson james-munson requested a review from a team as a code owner March 9, 2024 00:17
csi/util.go Outdated Show resolved Hide resolved
Copy link
Member

@innobead innobead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log looks good to me, but we already have similar util functions in go-common-libs. Please use them directly.

P.S. Please also add distro name as well.

/sys/sys.go#L18-L55

func GetKernelRelease() (string, error) {
	utsname := &syscall.Utsname{}
	if err := syscall.Uname(utsname); err != nil {
		logrus.WithError(err).Warn("Failed to get kernel release")
		return "", err
	}

	// Extract the kernel release from the Utsname structure
	release := make([]byte, 0, len(utsname.Release))
	for _, b := range utsname.Release {
		if b == 0x00 {
			logrus.Trace("Found end of kernel release string [0x00]")
			break
		}
		release = append(release, byte(b))
	}
	return string(release), nil
}

// GetOSDistro reads the /etc/os-release file and returns the ID field.
func GetOSDistro(osReleaseContent string) (string, error) {
	var err error
	defer func() {
		err = errors.Wrapf(err, "failed to get host OS distro")
	}()

	lines := strings.Split(osReleaseContent, "\n")
	for _, line := range lines {
		if strings.HasPrefix(line, "ID=") {
			id := strings.TrimPrefix(line, "ID=")
			id = strings.Trim(id, `"`)
			logrus.Tracef("Found OS distro: %v", id)
			return id, nil
		}
	}

	return "", errors.Errorf("failed to find ID field in %v", types.OsReleaseFilePath)
}

@james-munson
Copy link
Contributor Author

james-munson commented Mar 11, 2024

Right, the first draft of this PR used sys.GetKernelRelease() from go-common-libs for the kernel release.
But to get the OS distro was a little trickier. ns.GetOSDistro() does an nsenter using the path /host/proc/ns/1/mnt which exists in a "normal" pod, but not in the longhorn-csi-plugin, so the call fails. There the right path would be /rootfs/proc/ns/1/mnt. So we can't use the ns pkg.
But sys.GetOSDistro(osReleaseContent string) isn't really directly usable. It is just a utiilty method for ns.GetOsDistro() which feeds it the contents of the /etc/os-release file to parse.

I figured I would just use the result from uname -v which would be something like

 #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023.

and so I borrowed the parsing code from sys.GetKernelRelease and did the equivalent of uname -rv

If I read the contents of the local /etc/os-release without doing the namespace stuff, and use that to call sys.GetOSDistro(), I'll get the info for the pod itself, which is

longhorn-csi-plugin-bchgv:/ # cat /etc/os-release 
NAME="SLES"
VERSION="15-SP5"
VERSION_ID="15.5"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp5"
DOCUMENTATION_URL="https://documentation.suse.com/"

which is clearly wrong, so I'll need to do an equivalent nsenter but to the /rootfs/proc path to get this for the host:

root@jbm-u22-pool2-ca5a5aa4-25fzl:~# cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

BTW, GetOSDistro returns the value of the ID= tag, but for purposes of this logging, I would think that PRETTY_NAME would be more informative.

@james-munson
Copy link
Contributor Author

james-munson commented Mar 15, 2024

Revised to use the go-common-libs/sys utilities. I had to make a local copy of RunFunc to use the CSI plugin's different location for /proc and some slightly modified flavors of the common-libs/ns utilities.

@james-munson
Copy link
Contributor Author

james-munson commented Mar 28, 2024

One other option: Mount the host's / on /host for the CSI deployment(s). The path /rootfs is set in this code:
https://github.com/longhorn/longhorn-manager/blame/103bcaaae5e080dc93af66bf0f8355c31d2f1b0e/csi/deployment.go#L417
and I'm not sure whether the comment is asserting that this specific path is required. It would be handy if it could follow the same pattern as other host root mounts.

It appears that it is, though, looking at, say,
https://github.com/kubernetes/utils/blob/4693a0247e57ebeacb9f261f07a03c9ecd569f44/nsenter/nsenter.go#L37
I'm not quite sure of the scope of that requirement.

@innobead
Copy link
Member

innobead commented Mar 29, 2024

One other option: Mount the host's / on /host for the CSI deployment(s). The path /rootfs is set in this code: 103bcaa/csi/deployment.go#L417 (blame) and I'm not sure whether the comment is asserting that this specific path is required. It would be handy if it could follow the same pattern as other host root mounts.

I think we can change to /host directly, since /rootfs was introduced in the older RWX implementation? You can run RWX related test cases to see if there will be any regression.

@james-munson
Copy link
Contributor Author

That's right. Manual and regression testing are happy with the change to mount host's / on /host. I have pushed up the change.

ejweber
ejweber previously approved these changes Mar 29, 2024
Copy link
Contributor

@ejweber ejweber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for an inconsequential nit.

I like this a lot better than the previous implementation. It's good to standardize our mounting conventions.

csi/node_server.go Outdated Show resolved Hide resolved
…t of host root conformant to standard and ns utils

Signed-off-by: James Munson <james.munson@suse.com>
@innobead innobead merged commit ffe359d into longhorn:master Mar 29, 2024
5 checks passed
@innobead
Copy link
Member

@mergify backport v1.6.x v1.5.x

Copy link

mergify bot commented Mar 29, 2024

backport v1.6.x v1.5.x

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants