40.20241019.3.0 has invalid cpu devices when running on c7g.16xlarge #1828

Open
abhinavdahiya opened this issue Nov 7, 2024 · 1 comment
@abhinavdahiya

Describe the bug

When using FCOS 40.20241019.3.0 to create nodes in a k8s cluster on AWS (instance type c7g.16xlarge), the kubelet is unable to start.
Please see the attached logs for the failure to read the CPU information.

Reproduction steps

  1. Launch an EC2 instance in AWS with instance type c7g.16xlarge
  2. Run the following command:
ls -lah /sys/devices/system/node/node0/cpu* | wc -l

Expected behavior

There should be 64 entries, one for each of the 64 cores on the machine.

Actual behavior

There are currently only 16 entries, and sometimes even fewer.
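
Note that the `cpu*` glob in the reproduction command also matches the `cpulist` and `cpumap` files in the node directory, so the raw `wc -l` result is two higher than the number of CPUs. A minimal sketch of a stricter count follows; the function name `count_node_cpus` is made up for illustration and is not part of any tool mentioned in this issue.

```shell
# Hypothetical helper: count only the cpuN entries under a NUMA node directory.
# The body runs in a subshell (parentheses) so its variables stay local.
count_node_cpus() (
    node_dir=${1:-/sys/devices/system/node/node0}
    count=0
    for entry in "$node_dir"/cpu[0-9]*; do
        # cpu[0-9]* skips cpulist and cpumap; -L also counts dangling symlinks.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
)

# Prints 0 if the directory does not exist or has no cpuN entries.
count_node_cpus /sys/devices/system/node/node0
```

On a c7g.16xlarge this would be expected to print 64 on a healthy image, and comparing it with `nproc` gives a quick sanity check.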

System details

  • AWS (instance type c7g.16xlarge)
  • rpm-ostree status -b just hangs and does not return anything
  • Fedora CoreOS 40.20241006.3.0

Butane or Ignition config

No response

Additional information

Error logs from kubelet binary

Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601584    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu0, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu0/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601611    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu1, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu1/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601628    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu10, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu10/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601641    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu11, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu11/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601656    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu12, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu12/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601668    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu13, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu13/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601681    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu14, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu14/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601693    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu15, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu15/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601705    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu2, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu2/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601718    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu3, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu3/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601730    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu4, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu4/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601742    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu5, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu5/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601754    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu6, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu6/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601767    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu7, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu7/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601779    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu8, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu8/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.601791    9258 sysinfo.go:434] Cannot read core id for /sys/devices/system/node/node0/cpu9, core_id file does not exist, err: open /sys/devices/system/node/node0/cpu9/topology/core_id: no such file or directory
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602006    9258 machine.go:65] Cannot read vendor id correctly, set empty.
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602447    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu0/topology/core_id, assuming 0 for core_id of CPU 0
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602461    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu0/topology/physical_package_id, assuming 0 physical_package_id of CPU 0
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602477    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu1/topology/core_id, assuming 0 for core_id of CPU 1
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602493    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu1/topology/physical_package_id, assuming 0 physical_package_id of CPU 1
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602512    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu10/topology/core_id, assuming 0 for core_id of CPU 10
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602524    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu10/topology/physical_package_id, assuming 0 physical_package_id of CPU 10
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602539    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu11/topology/core_id, assuming 0 for core_id of CPU 11
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602549    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu11/topology/physical_package_id, assuming 0 physical_package_id of CPU 11
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602562    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu12/topology/core_id, assuming 0 for core_id of CPU 12
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602572    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu12/topology/physical_package_id, assuming 0 physical_package_id of CPU 12
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602585    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu13/topology/core_id, assuming 0 for core_id of CPU 13
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602595    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu13/topology/physical_package_id, assuming 0 physical_package_id of CPU 13
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602608    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu14/topology/core_id, assuming 0 for core_id of CPU 14
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602618    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu14/topology/physical_package_id, assuming 0 physical_package_id of CPU 14
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602631    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu15/topology/core_id, assuming 0 for core_id of CPU 15
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602641    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu15/topology/physical_package_id, assuming 0 physical_package_id of CPU 15
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602655    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu2/topology/core_id, assuming 0 for core_id of CPU 2
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602665    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu2/topology/physical_package_id, assuming 0 physical_package_id of CPU 2
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602679    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu3/topology/core_id, assuming 0 for core_id of CPU 3
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602689    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu3/topology/physical_package_id, assuming 0 physical_package_id of CPU 3
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602703    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu4/topology/core_id, assuming 0 for core_id of CPU 4
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602714    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu4/topology/physical_package_id, assuming 0 physical_package_id of CPU 4
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602732    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu5/topology/core_id, assuming 0 for core_id of CPU 5
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602743    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu5/topology/physical_package_id, assuming 0 physical_package_id of CPU 5
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602760    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu6/topology/core_id, assuming 0 for core_id of CPU 6
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602770    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu6/topology/physical_package_id, assuming 0 physical_package_id of CPU 6
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602784    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu7/topology/core_id, assuming 0 for core_id of CPU 7
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602793    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu7/topology/physical_package_id, assuming 0 physical_package_id of CPU 7
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602807    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu8/topology/core_id, assuming 0 for core_id of CPU 8
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602817    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu8/topology/physical_package_id, assuming 0 physical_package_id of CPU 8
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602830    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu9/topology/core_id, assuming 0 for core_id of CPU 9
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.602840    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu9/topology/physical_package_id, assuming 0 physical_package_id of CPU 9
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603276    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu0/topology/physical_package_id, assuming 0 for physical_package_id of CPU 0
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603288    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu0/topology/physical_package_id, assuming 0 physical_package_id of CPU 0
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603301    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu1/topology/physical_package_id, assuming 0 for physical_package_id of CPU 1
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603312    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu1/topology/physical_package_id, assuming 0 physical_package_id of CPU 1
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603325    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu10/topology/physical_package_id, assuming 0 for physical_package_id of CPU 10
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603335    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu10/topology/physical_package_id, assuming 0 physical_package_id of CPU 10
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603348    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu11/topology/physical_package_id, assuming 0 for physical_package_id of CPU 11
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603360    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu11/topology/physical_package_id, assuming 0 physical_package_id of CPU 11
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603376    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu12/topology/physical_package_id, assuming 0 for physical_package_id of CPU 12
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603394    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu12/topology/physical_package_id, assuming 0 physical_package_id of CPU 12
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603410    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu13/topology/physical_package_id, assuming 0 for physical_package_id of CPU 13
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603420    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu13/topology/physical_package_id, assuming 0 physical_package_id of CPU 13
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603433    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu14/topology/physical_package_id, assuming 0 for physical_package_id of CPU 14
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603443    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu14/topology/physical_package_id, assuming 0 physical_package_id of CPU 14
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603458    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu15/topology/physical_package_id, assuming 0 for physical_package_id of CPU 15
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603468    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu15/topology/physical_package_id, assuming 0 physical_package_id of CPU 15
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603481    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu2/topology/physical_package_id, assuming 0 for physical_package_id of CPU 2
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603491    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu2/topology/physical_package_id, assuming 0 physical_package_id of CPU 2
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603503    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu3/topology/physical_package_id, assuming 0 for physical_package_id of CPU 3
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603513    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu3/topology/physical_package_id, assuming 0 physical_package_id of CPU 3
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603527    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu4/topology/physical_package_id, assuming 0 for physical_package_id of CPU 4
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603538    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu4/topology/physical_package_id, assuming 0 physical_package_id of CPU 4
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603552    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu5/topology/physical_package_id, assuming 0 for physical_package_id of CPU 5
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603562    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu5/topology/physical_package_id, assuming 0 physical_package_id of CPU 5
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603575    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu6/topology/physical_package_id, assuming 0 for physical_package_id of CPU 6
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603586    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu6/topology/physical_package_id, assuming 0 physical_package_id of CPU 6
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603604    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu7/topology/physical_package_id, assuming 0 for physical_package_id of CPU 7
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603615    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu7/topology/physical_package_id, assuming 0 physical_package_id of CPU 7
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603631    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu8/topology/physical_package_id, assuming 0 for physical_package_id of CPU 8
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603642    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu8/topology/physical_package_id, assuming 0 physical_package_id of CPU 8
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603654    9258 sysfs.go:512] Cannot open /sys/bus/cpu/devices/cpu9/topology/physical_package_id, assuming 0 for physical_package_id of CPU 9
Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: W1107 18:03:59.603664    9258 sysfs.go:518] Cannot open /sys/bus/cpu/devices/cpu9/topology/physical_package_id, assuming 0 physical_package_id of CPU 9



....




Nov 07 18:03:59 ip-10-191-136-65 kubelet[9246]: E1107 18:03:59.640184    9258 kubelet.go:1511] "Failed to start ContainerManager" err="invalid Node Allocatable configuration. Resource \"cpu\" has an allocatable of {{1000 -3} {<nil>}  DecimalSI}, capacity of {{-1000 -3} {<nil>}  DecimalSI}"

New AMIs –

$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="40.20241019.3.0 (CoreOS)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora CoreOS 40.20241019.3.0"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='40.20241019.3.0'
ls -lah /sys/devices/system/node/node0/cpu* | wc -l
18

## this is a 64 core machine, and it does not have cpuinfo for all the cores!!
ls -lah /sys/devices/system/node/node0/cpu0/
total 0
drwxr-xr-x.  4 root root    0 Nov  7 18:06 .
drwxr-xr-x. 24 root root    0 Nov  7 18:06 ..
-r--------.  1 root root 4.0K Nov  7 17:55 crash_notes
-r--------.  1 root root 4.0K Nov  7 17:55 crash_notes_size
lrwxrwxrwx.  1 root root    0 Nov  7 17:55 driver -> ../../../../bus/cpu/drivers/processor
lrwxrwxrwx.  1 root root    0 Nov  7 17:55 firmware_node -> ../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0007:00
drwxr-xr-x.  2 root root    0 Nov  7 17:55 hotplug
lrwxrwxrwx.  1 root root    0 Nov  7 17:55 node0 -> ../../node/node0
-rw-r--r--.  1 root root 4.0K Nov  7 17:55 online
drwxr-xr-x.  2 root root    0 Nov  7 17:55 power
lrwxrwxrwx.  1 root root    0 Nov  7 17:44 subsystem -> ../../../../bus/cpu
-rw-r--r--.  1 root root 4.0K Nov  7 17:44 uevent


### Additionally, the topology directory, which the kubelet relies on, is missing in this bad AMI.
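
To see at a glance which CPUs lack the file the kubelet warnings complain about, something like the sketch below can be run on the node. The function name `missing_core_id` is hypothetical; the default path `/sys/bus/cpu/devices` is taken from the kubelet log lines above.

```shell
# Hypothetical helper: print every cpuN directory that has no topology/core_id.
# The body runs in a subshell (parentheses) so its variables stay local.
missing_core_id() (
    cpu_root=${1:-/sys/bus/cpu/devices}
    for dir in "$cpu_root"/cpu[0-9]*; do
        # Skip the unexpanded glob and anything that is not a directory.
        [ -d "$dir" ] || continue
        # Report only the final path component, e.g. "cpu3".
        [ -e "$dir/topology/core_id" ] || echo "${dir##*/}"
    done
)

# Prints nothing when every CPU exposes topology/core_id.
missing_core_id /sys/bus/cpu/devices
```

An empty result is the expected outcome on a working image; on the bad AMI it would presumably list every CPU that is still visible.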

Old AMIs, which work fine –

$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="40.20241006.3.0 (CoreOS)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora CoreOS 40.20241006.3.0"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='40.20241006.3.0'
ls -lah /sys/devices/system/node/node0/cpu* | wc -l
66

## this looks more correct, there are 64 core infos (plus additional 2 files)
ls -lah /sys/devices/system/node/node0/cpu0/
total 0
drwxr-xr-x.  7 root root    0 Nov  7 18:07 .
drwxr-xr-x. 72 root root    0 Nov  6 17:31 ..
drwxr-xr-x.  6 root root    0 Nov  6 17:33 cache
-r--r--r--.  1 root root 4.0K Nov  7 18:07 cpu_capacity
-r--------.  1 root root 4.0K Nov  7 18:07 crash_notes
-r--------.  1 root root 4.0K Nov  7 18:07 crash_notes_size
lrwxrwxrwx.  1 root root    0 Nov  7 18:07 driver -> ../../../../bus/cpu/drivers/processor
lrwxrwxrwx.  1 root root    0 Nov  7 18:07 firmware_node -> ../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0007:00
drwxr-xr-x.  2 root root    0 Nov  7 18:07 hotplug
lrwxrwxrwx.  1 root root    0 Nov  7 18:07 node0 -> ../../node/node0
-rw-r--r--.  1 root root 4.0K Nov  7 18:07 online
drwxr-xr-x.  2 root root    0 Nov  7 18:07 power
drwxr-xr-x.  3 root root    0 Nov  7 18:07 regs
lrwxrwxrwx.  1 root root    0 Nov  6 17:31 subsystem -> ../../../../bus/cpu
drwxr-xr-x.  2 root root    0 Nov  6 17:33 topology
-rw-r--r--.  1 root root 4.0K Nov  6 17:31 uevent
@dustymabe
Member

I think this is a regression in kernel 6.11+. I bisected the rawhide stream and found:

BISECT TEST RESULTS:
Last known good build: 41.20240720.91.0
First known bad build: 42.20240822.91.0

which had this transition:

kernel 6.10.0-64.fc41.aarch64 → 6.11.0-0.rc3.20240814git6b0f8db921ab.32.fc42.aarch64

So something in the early rc1/rc2/rc3 days of 6.11.

I did verify it is still a problem in the very latest rawhide with kernel-6.12.0-0.rc7.59.fc42.aarch64, so I don't think the issue has been found and fixed upstream yet.
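
If the regression does track the kernel version as the bisect suggests, a node can be flagged as potentially affected by comparing `uname -r` against 6.11. This is only a rough sketch under that assumption: the function name `kernel_at_least` is made up, and it looks at the major and minor numbers only.

```shell
# Hypothetical helper: succeed if a kernel release string is >= major.minor.
# Usage: kernel_at_least MAJOR MINOR [RELEASE]; RELEASE defaults to `uname -r`.
# The body runs in a subshell (parentheses) so its variables stay local.
kernel_at_least() (
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}        # text before the first dot, e.g. "6"
    rest=${release#*.}          # text after the first dot, e.g. "11.0-0.rc3..."
    minor=${rest%%[!0-9]*}      # leading digits of that, e.g. "11"
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
)

if kernel_at_least 6 11; then
    echo "kernel $(uname -r) is 6.11 or newer: possibly affected"
else
    echo "kernel $(uname -r) is older than 6.11"
fi
```

A version check like this only narrows things down; the sysfs contents shown earlier in the issue remain the actual symptom to look for.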
