Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

T6917: fix RPS ethernet settings for CPUs with more than 32 cores #4215

Merged
merged 1 commit into from
Dec 3, 2024

Conversation

sever-sever
Copy link
Member

@sever-sever sever-sever commented Dec 2, 2024

Change Summary

The maximum value that could be written for the 'rpc_cpu' is 4294967295 or 0xffffffff
Ensure that the bitmask size doesn't exceed 32 bits, even if the system has more CPUs.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

Related PR(s)

Component(s) name

ethernet, rps

Proposed changes

How to test

The system with 96 CPUs

vyos@r-wan# lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          40 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   96
  On-line CPU(s) list:    0-95

Try to configure offload RPS

# set interfaces ethernet eth1 offload rps
[edit]
# commit
[ interfaces ethernet eth1 ]
VyOS had an issue completing a command.

OSError: [Errno 75] Value too large for defined data type

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces_ethernet.py", line 433, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces_ethernet.py", line 410, in apply
    e.update(ethernet)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 430, in update
    self.set_rps(dict_search('offload.rps', config) != None)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/ethernet.py", line 328, in set_rps
    self._write_sysfs(f'/sys/class/net/{self.ifname}/queues/rx-{i}/rps_cpus', f'{rps_cpus:x}')
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 130, in _write_sysfs
    write_file(filename, str(value))
  File "/usr/lib/python3/dist-packages/vyos/utils/file.py", line 69, in write_file
    raise e
  File "/usr/lib/python3/dist-packages/vyos/utils/file.py", line 61, in write_file
    with open(fname, 'w' if not append else 'a') as f:
OSError: [Errno 75] Value too large for defined data type

noteworthy:
cmd 'ethtool --features eth1 lro on'
returned (out):

returned (err):
Could not change any device features

[[interfaces ethernet eth1]] failed
Commit failed

Try to configure 33 CPUs and get the error Value too large for defined data type

>>> hex((1 << 33) -1)
'0x1ffffffff'
>>> 

root@r-wan:/home/vyos# echo 1ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus
bash: echo: write error: Value too large for defined data type
root@r-wan:/home/vyos# 

Try 32 CPUs (for the chunk):

>>> hex((1 << 32) -1)
'0xffffffff'
>>> 
>>> (1 << 32) -1
4294967295
>>> 

root@r-wan:/home/vyos# echo ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus
root@r-wan:/home/vyos# 
root@r-wan:/home/vyos# cat /sys/class/net/eth0/queues/rx-0/rps_cpus
00000000,00000000,ffffffff
root@r-wan:/home/vyos# 

After the fix:

vyos@r-wan# set interfaces ethernet eth0 offload rps 
[edit]
vyos@r-wan# commit
[edit]
vyos@r-wan# 
[edit]
vyos@r-wan# cat /sys/class/net/eth0/queues/rx-0/rps_cpus 
fffffffe,ffffffff,ffffffff
[edit]
vyos@r-wan# 

Smoketest result

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@sever-sever sever-sever requested a review from a team as a code owner December 2, 2024 15:42
Copy link

github-actions bot commented Dec 2, 2024

👍
No issues in PR Title / Commit Title

@c-po
Copy link
Member

c-po commented Dec 2, 2024

What happens on:

echo 00000000,00000001,ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus

@sever-sever
Copy link
Member Author

What happens on:

echo 00000000,00000001,ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus

Interesting thing

root@r-wan:/home/vyos# echo 00000000,00000001,ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus
root@r-wan:/home/vyos# 
root@r-wan:/home/vyos# cat /sys/class/net/eth0/queues/rx-0/rps_cpus
00000000,00000001,ffffffff
root@r-wan:/home/vyos# 

The maximun value theat could be written for the 'rpc_cpu'
is 4294967295 or 0xffffffff in the chunk splitted by commas
@sever-sever
Copy link
Member Author

@c-po I updated PR and logic

vyos@r-wan# cat /sys/class/net/eth0/queues/rx-0/rps_cpus 
00000000,00000000,00000000
[edit]
vyos@r-wan# 
[edit]
vyos@r-wan# set interfaces ethernet eth0 offload rps 
[edit]
vyos@r-wan# commit
[edit]
vyos@r-wan# 
[edit]
vyos@r-wan# 
[edit]
vyos@r-wan# cat /sys/class/net/eth0/queues/rx-0/rps_cpus 
fffffffe,ffffffff,ffffffff
[edit]
vyos@r-wan# 

Copy link

github-actions bot commented Dec 2, 2024

CI integration ❌ failed!

Details

CI logs

  • CLI Smoketests (no interfaces) ❌ failed
  • CLI Smoketests (interfaces only) ❌ failed
  • Config tests ❌ failed
  • RAID1 tests ❌ failed
  • TPM tests ❌ failed

@@ -310,22 +310,35 @@ def set_rps(self, state):
rps_cpus = 0
queues = len(glob(f'/sys/class/net/{self.ifname}/queues/rx-*'))
if state:
cpu_count = os.cpu_count()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.cpu_count() still returns numbers un-adjusted for SMT like Intel Hyper-Threading. We have a function that returns physical CPUs only in (vyos.utils.cpu).

I'm not completely sure if the distinction can be truly important here, but maybe use that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we do not know which cores are real cores it is useless
In this case we interested in total cores and calculate a bit mask for CPU’s
And it is not new, we used this before

this way it could be the worse thing if you have 4 CPU and 2 threads
For example real cores could be 0,2,4,6 and logical cores 1,3,5,7. What will be if you calculate only 4 cores (which is mix in this case of real/logical cores)?
In this case it will process by 2 real and 2 logical cores instead of all cores.

@sever-sever sever-sever requested a review from dmbaturin December 3, 2024 09:22
@dmbaturin dmbaturin merged commit 03a664a into vyos:current Dec 3, 2024
15 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

5 participants