resource: fallback to sysconf when failed to detect memory size from hwloc for branch-2025.1 #8

Merged

syuu1228
Contributor

@syuu1228 syuu1228 commented Feb 3, 2025

This is a backport of scylladb/seastar#2624.


On the Fedora 41 AMI, on some aarch64 instances such as m7gd.16xlarge, Seastar programs such as Scylla fail to start up with the following error message:

$ /opt/scylladb/bin/scylla --log-to-stdout 1
WARNING: debug mode. Not for benchmarking or production
hwloc/linux: failed to find sysfs cpu topology directory, aborting linux discovery.
scylla: seastar/src/core/resource.cc:683: resources seastar::resource::allocate(configuration &): Assertion `!remain' failed.

It seems that hwloc fails to initialize because /sys/devices/system/cpu/cpu0/topology/ is not available on the instance.
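To check whether a given machine is affected, one can test for that sysfs directory directly. The following is a minimal sketch, not what hwloc does internally; the helper name `has_cpu_topology_dir` and its `sysfs_root` parameter are hypothetical and exist only for illustration.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not part of hwloc or Seastar): reports whether the
// cpu0 topology directory exists under the given sysfs root.
inline bool has_cpu_topology_dir(const std::string& sysfs_root = "/sys") {
    namespace fs = std::filesystem;
    std::error_code ec;  // non-throwing overload: any error is treated as "missing"
    return fs::is_directory(fs::path(sysfs_root) / "devices/system/cpu/cpu0/topology", ec);
}
```

On an affected instance, `has_cpu_topology_dir()` would be expected to return `false`.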

I debugged src/core/resource.cc to find out why the assertion fired, and found that alloc_from_node() fails because node->total_memory is 0. This is likely caused by the hwloc initialization failure described above.

I also found that calculate_memory() goes wrong because machine->total_memory is also 0.

To avoid the error in such environments, we need to fix up the memory size in both machine->total_memory and node->total_memory. We can use sysconf(_SC_PAGESIZE) * sysconf(_SC_PHYS_PAGES) for this, just as we do in the non-hwloc version of allocate().
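The idea of the fallback can be sketched as follows. This is an illustration only, not the actual patch: the helper names `sysconf_physical_memory` and `memory_or_fallback` are hypothetical, and the real change operates on the hwloc objects in src/core/resource.cc. It assumes a Linux/glibc system, where `_SC_PHYS_PAGES` is available.

```cpp
#include <cstdint>
#include <unistd.h>  // sysconf, _SC_PAGESIZE, _SC_PHYS_PAGES

// Hypothetical helper: physical memory in bytes as reported by the OS,
// or 0 if either sysconf query fails.
inline std::uint64_t sysconf_physical_memory() {
    const long page_size = ::sysconf(_SC_PAGESIZE);
    const long phys_pages = ::sysconf(_SC_PHYS_PAGES);
    if (page_size <= 0 || phys_pages <= 0) {
        return 0;
    }
    return static_cast<std::uint64_t>(page_size) *
           static_cast<std::uint64_t>(phys_pages);
}

// Hypothetical helper: keep the size detected by the topology library when it
// is nonzero; otherwise fall back to the sysconf-based value.
inline std::uint64_t memory_or_fallback(std::uint64_t detected) {
    return detected != 0 ? detected : sysconf_physical_memory();
}
```

In this sketch the same fixup would be applied twice, once to the machine-level total and once to the node-level total, since both are reported as 0 when hwloc discovery is aborted.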

Fixes scylladb/scylladb#22382
Related scylladb/scylla-pkg#4797

(cherry picked from commit b0a9f89)

@avikivity avikivity merged commit a350b5d into scylladb:branch-2025.1 Feb 3, 2025
15 checks passed
@avikivity
Member

Submodule update queued for 2025.1.
