T7488: add utility for automatic rollback of section on apply stage error #4552
Pull Request Overview
This PR introduces an automatic rollback mechanism for configuration sections when an apply-stage error occurs. It adds a new error flag, emits a hint file on failure, and provides a helper script to roll back the affected section.
- Introduce `ERROR_COMMIT_APPLY` in vyshim and configd to signal apply-stage failures
- Create `reset_section.py` helper to roll back or retry a failed section based on a hint file
- Extend `ConfigSession` with a `shared` mode to prevent premature session teardown
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
src/shim/vyshim.c | Add ERROR_COMMIT_APPLY flag, parse session PID, and write a hint file |
src/services/vyos-configd | Add ERROR_COMMIT_APPLY response code and separate commit vs apply logic |
src/helpers/reset_section.py | New CLI helper for reloading or rolling back a section using the hint |
python/vyos/configsession.py | Add shared parameter to skip teardown in shared-session scenarios |
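The separation of commit and apply failures listed in the table can be pictured roughly as follows. This is a minimal sketch, not the code from `vyos-configd`: only the names `SUCCESS` and `ERROR_COMMIT_APPLY` and the `apply` stage appear in this PR and its logs, while `ERROR_COMMIT`, the numeric values, `run_stages`, and `DemoScript` are assumptions made for illustration.

```python
from enum import Enum


class Response(Enum):
    # SUCCESS and ERROR_COMMIT_APPLY are named in the PR and its logs;
    # ERROR_COMMIT and every numeric value are placeholders for this sketch.
    SUCCESS = 1
    ERROR_COMMIT = 2
    ERROR_COMMIT_APPLY = 3


def run_stages(script) -> Response:
    """Run one config-mode script and classify where it failed."""
    try:
        conf = script.get_config()
        script.verify(conf)
    except Exception:
        # Assumed: nothing has touched the running system yet.
        return Response.ERROR_COMMIT
    try:
        script.generate(conf)
        script.apply(conf)
    except Exception:
        # The system may be partially changed, so a rollback may be needed.
        return Response.ERROR_COMMIT_APPLY
    return Response.SUCCESS


class DemoScript:
    """Hypothetical stand-in for a conf_mode script such as vpp.py."""

    def __init__(self, fail_in=None):
        self.fail_in = fail_in

    def _stage(self, name):
        if self.fail_in == name:
            raise RuntimeError(f'{name} failed')

    def get_config(self):
        self._stage('get_config')
        return {}

    def verify(self, conf):
        self._stage('verify')

    def generate(self, conf):
        self._stage('generate')

    def apply(self, conf):
        self._stage('apply')
```

Calling `run_stages(DemoScript('apply'))` returns `Response.ERROR_COMMIT_APPLY`, which is the case the rollback helper is meant to handle.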
Comments suppressed due to low confidence (3)
src/helpers/reset_section.py:54
- [nitpick] The variable name 'reload' shadows a built-in and the 'rollback' variable is never used. Rename 'reload' (e.g., to 'is_reload') and remove or utilize the unused 'rollback' flag.
reload = args.reload
python/vyos/configsession.py:149
- The new 'shared' parameter alters teardown behavior but isn't documented. Please update the constructor docstring to explain its purpose and the effect on session cleanup.
def __init__(self, session_id, app=APP, shared=False):
src/helpers/reset_section.py:1
- Consider adding automated tests for the reset_section helper to verify both reload and rollback flows, including cases where the hint file is present or absent.
#!/usr/bin/env python3
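As an illustration of the docstring requested for the `shared` parameter above, the sketch below shows one way the flag and its documentation could fit together. `SessionSketch` and its `close()` method are hypothetical; the real `ConfigSession` teardown mechanism is not shown in this PR page, so only the parameter name and its stated purpose (skipping teardown in shared-session scenarios) are taken from the source.

```python
class SessionSketch:
    """Hypothetical stand-in for ConfigSession; not the real class.

    :param session_id: identifier of the config session to attach to.
    :param shared: if True, the session is also used by another process
        (for example an interactive CLI session), so close() leaves it
        in place instead of tearing it down.
    """

    def __init__(self, session_id, shared=False):
        self.session_id = session_id
        self.shared = shared
        self.torn_down = False

    def close(self):
        """Tear the session down unless it is shared."""
        if self.shared:
            # Another owner still needs the session; do nothing.
            return
        self.torn_down = True
```

With `shared=True`, calling `close()` leaves `torn_down` as `False`, which models the "prevent premature session teardown" behavior described in the overview.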
Leave a hint if vyos-configd encounters an error in the generate/apply stages. This only detects 'first-order' differences, meaning those originating from the called config mode script and not from its dependencies. It is useful for supporting automatic rollback in certain cases of apply-stage error.
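The 'first-order' restriction might be sketched as below. The hint format (JSON with `pid` and `script` keys), the file name, and the functions `leave_hint` and `read_hint` are assumptions for illustration; the PR page does not show the actual hint layout or where it is written.

```python
import json
import tempfile
from pathlib import Path


def leave_hint(hint_path: Path, session_pid: int, called: str, failed: str) -> bool:
    """Write a hint only for a first-order failure.

    `called` is the script the commit invoked; `failed` is the script that
    raised. A failure inside a dependency is not recorded.
    """
    if failed != called:
        return False
    hint_path.write_text(json.dumps({'pid': session_pid, 'script': called}))
    return True


def read_hint(hint_path: Path):
    """Return the hint as a dict, or None when no hint was left."""
    if not hint_path.exists():
        return None
    return json.loads(hint_path.read_text())


# Usage: a throwaway directory stands in for the real (unspecified) location.
with tempfile.TemporaryDirectory() as tmp:
    hint = Path(tmp) / 'apply_hint.json'
    leave_hint(hint, 8472, called='vpp', failed='vpp')
    print(read_hint(hint))
```

A helper such as `reset_section.py` could then treat a missing hint as "nothing to roll back" and a present hint as the section to reset.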
CI integration 👍 passed! Details
This is a good start that will eventually be generalized to all sections.
Works as expected:
vyos@r14# set vpp settings buffers page-size 1G
[edit]
vyos@r14#
[edit]
vyos@r14# commit
[ vpp ]
Traceback (most recent call last):
File "/usr/libexec/vyos/services/vyos-configd", line 156, in run_script
script.apply(c)
File "/usr/libexec/vyos//conf_mode/vpp.py", line 676, in apply
vpp_control = VPPControl(attempts=20, interval=500)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/vyos/vpp/control_vpp.py", line 109, in __init__
raise VPPIOError(2, 'Cannot connect to VPP API')
vpp_papi.vpp_papi.VPPIOError: [Errno 2] Cannot connect to VPP API
[[vpp]] failed
Commit failed
[edit]
vyos@r14# run show vpp interfaces
Kernel Dataplane Type IP Address MAC MTU State
-------- ----------- ------ ------------- ----------------- ----- -------
eth1 dpdk 100.64.0.1/24 52:54:00:28:23:f1 1500 up
eth1.11 dpdk 00:00:00:00:00:00 1500 up
eth1.12 dpdk 00:00:00:00:00:00 1500 up
eth1.14 dpdk 00:00:00:00:00:00 1500 up
local0 local 00:00:00:00:00:00 0 down
eth1 tap4096 virtio 02:fe:84:20:15:ae 9000 up
eth1.11 tap4096.11 virtio 00:00:00:00:00:00 0 up
eth1.12 tap4096.12 virtio 00:00:00:00:00:00 0 up
eth1.14 tap4096.14 virtio 00:00:00:00:00:00 0 up
[edit]
vyos@r14# run show conf com | match vpp
set vpp settings interface eth1 driver 'dpdk'
set vpp settings ipv6 heap-size '32G'
set vpp settings physmem max-size '100G'
set vpp settings unix poll-sleep-usec '222'
[edit]
vyos@r14# compare
No changes between working and active configurations.
[edit]
vyos@r14#
Logs:
Jun 12 19:38:48 r14 vyos-configd[8114]: commit_scripts: ['vpp']
Jun 12 19:38:48 r14 vyos-configd[8114]: Received message: {"type": "node", "last": true, "data": "/usr/libexec/vyos/conf_mode/vpp.py"}
Jun 12 19:38:48 r14 systemd[1]: Reloading.
Jun 12 19:38:48 r14 vpp[20698]: received signal SIGTERM, PC 0x7ff9aee93545
Jun 12 19:38:48 r14 vpp[20698]: received SIGTERM from PID 1 UID 0, exiting...
Jun 12 19:38:48 r14 systemd[1]: Stopping vector packet processing engine...
Jun 12 19:38:48 r14 systemd[1]: vpp.service: Deactivated successfully.
Jun 12 19:38:48 r14 systemd[1]: Stopped vector packet processing engine.
Jun 12 19:38:48 r14 systemd[1]: vpp.service: Consumed 3.693s CPU time.
Jun 12 19:38:48 r14 systemd[1]: Starting vector packet processing engine...
Jun 12 19:38:48 r14 systemd[1]: Started vector packet processing engine.
Jun 12 19:38:48 r14 vpp[21202]: vpp[21202]: vlib_physmem_shared_map_create: clib_pmalloc_create_shared_arena: unsupported page size (1048576KB)
Jun 12 19:38:48 r14 vpp[21202]: vpp[21202]: vlib_buffer_main_init: failed to allocate buffer pool(s)
Jun 12 19:38:48 r14 vpp[21202]: vlib_physmem_shared_map_create: clib_pmalloc_create_shared_arena: unsupported page size (1048576KB)
Jun 12 19:38:48 r14 vpp[21202]: vlib_buffer_main_init: failed to allocate buffer pool(s)
Jun 12 19:38:48 r14 systemd[1]: vpp.service: Deactivated successfully.
Jun 12 19:38:48 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:49 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:49 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:50 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:50 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:51 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:51 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:52 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:52 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:53 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:53 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:54 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:54 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:55 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:55 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:56 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:56 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:57 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:57 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:58 r14 python3[8114]: VPP API connection timeout: [Errno 111] Connection refused
Jun 12 19:38:58 r14 vyos-configd[8114]: Traceback (most recent call last):
Jun 12 19:38:58 r14 vyos-configd[8114]: File "/usr/libexec/vyos/services/vyos-configd", line 156, in run_script
Jun 12 19:38:58 r14 vyos-configd[8114]: script.apply(c)
Jun 12 19:38:58 r14 vyos-configd[8114]: File "/usr/libexec/vyos//conf_mode/vpp.py", line 676, in apply
Jun 12 19:38:58 r14 vyos-configd[8114]: vpp_control = VPPControl(attempts=20, interval=500)
Jun 12 19:38:58 r14 vyos-configd[8114]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 12 19:38:58 r14 vyos-configd[8114]: File "/usr/lib/python3/dist-packages/vyos/vpp/control_vpp.py", line 109, in __init__
Jun 12 19:38:58 r14 vyos-configd[8114]: raise VPPIOError(2, 'Cannot connect to VPP API')
Jun 12 19:38:58 r14 vyos-configd[8114]: vpp_papi.vpp_papi.VPPIOError: [Errno 2] Cannot connect to VPP API
Jun 12 19:38:58 r14 vyos-configd[8114]: Sending reply: ERROR_COMMIT_APPLY with output
Jun 12 19:38:58 r14 vyos-configd[8114]: scripts_called: ['vpp']
Jun 12 19:38:59 r14 systemd[1]: opt-vyatta-config-tmp-new_config_8472.mount: Deactivated successfully.
Jun 12 19:39:01 r14 vyos-configd[8114]: Received message: {"type": "init"}
Jun 12 19:39:01 r14 vyos-configd[8114]: config session pid is 8472
Jun 12 19:39:01 r14 vyos-configd[8114]: config session sudo_user is vyos
Jun 12 19:39:01 r14 vyos-configd[8114]: commit_scripts: ['vpp']
Jun 12 19:39:01 r14 vyos-configd[8114]: Received message: {"type": "node", "last": true, "data": "/usr/libexec/vyos/conf_mode/vpp.py"}
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: [1af4:1041] type 00 class 0x020000
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: reg 0x14: [mem 0xfdc80000-0xfdc80fff]
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: reg 0x20: [mem 0x383800000000-0x383800003fff 64bit pref]
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: reg 0x30: [mem 0xfdc00000-0xfdc7ffff pref]
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: BAR 6: assigned [mem 0xfdc00000-0xfdc7ffff pref]
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: BAR 4: assigned [mem 0x383800000000-0x383800003fff 64bit pref]
Jun 12 19:39:01 r14 kernel: pci 0000:07:00.0: BAR 1: assigned [mem 0xfdc80000-0xfdc80fff]
Jun 12 19:39:01 r14 vyos_net_name[21264]: Started with arguments: ['/lib/udev/vyos_net_name', 'eth1', '52:54:00:28:23:f1']
Jun 12 19:39:01 r14 vyos_net_name[21264]: boot configuration complete
Jun 12 19:39:01 r14 vyos_net_name[21264]: Finished
Jun 12 19:39:01 r14 (udev-worker)[21263]: Network interface NamePolicy= disabled on kernel command line.
Jun 12 19:39:01 r14 kernel: 8021q: adding VLAN 0 to HW filter on device eth1
Jun 12 19:39:01 r14 vyos-configd[8114]: Sending reply: SUCCESS with output
Jun 12 19:39:01 r14 vyos-configd[8114]: scripts_called: ['vpp']
Jun 12 19:39:02 r14 systemd[1]: opt-vyatta-config-tmp-new_config_8472.mount: Deactivated successfully.
Jun 12 19:39:03 r14 commit[21465]: Successful change to active configuration by user vyos on /dev/pts/0
Jun 12 19:39:03 r14 vyos-configd[8114]: Received message: {"type": "init"}
Jun 12 19:39:03 r14 vyos-configd[8114]: config session pid is 8472
Jun 12 19:39:03 r14 vyos-configd[8114]: config session sudo_user is vyos
Jun 12 19:39:03 r14 vyos-configd[8114]: commit_scripts: ['vpp']
Jun 12 19:39:03 r14 vyos-configd[8114]: Received message: {"type": "node", "last": true, "data": "/usr/libexec/vyos/conf_mode/vpp.py"}
Jun 12 19:39:03 r14 systemd[1]: Reloading.
Jun 12 19:39:03 r14 systemd[1]: Starting vector packet processing engine...
Jun 12 19:39:03 r14 systemd[1]: Started vector packet processing engine.
Change summary
This needs the corresponding PR for vyatta-cfg (vyos/vyatta-cfg#102) to have effect within a config session. The current PR will need to be merged first.
Provide a utility for automatic rollback of a config section in case of an apply stage error.
This is the required tool for the VPP restart work of PR vyos/vyos-vpp#34.
Under the modern backend, this will simply be a post-commit hook. For current use under the legacy backend, however, it requires some workarounds, notably because the legacy locking mechanism prevents a post-commit hook from calling commit. This is already possible under vyconf, which uses distinct locks for data and session.
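The locking constraint can be shown with a toy model. This is not how the legacy backend or vyconf is implemented; `legacy_commit`, `split_commit`, and the use of `threading.Lock` are assumptions chosen only to show why a single lock held across the hook blocks a nested commit, while a lock that covers only the data write does not.

```python
import threading


def legacy_commit(lock: threading.Lock, hook=None) -> bool:
    """Toy model: one lock covers the whole commit, hooks included."""
    if not lock.acquire(blocking=False):
        return False  # a commit is already in progress
    try:
        if hook is not None:
            hook()  # runs while the lock is still held
        return True
    finally:
        lock.release()


def split_commit(data_lock: threading.Lock, hook=None) -> bool:
    """Toy model: the data lock covers only the write; hooks run after it."""
    if not data_lock.acquire(blocking=False):
        return False
    try:
        pass  # placeholder for writing the new active configuration
    finally:
        data_lock.release()
    if hook is not None:
        hook()  # the data lock is free again, so a nested commit can run
    return True


# A post-commit hook that tries to commit again, e.g. to roll back a section.
single = threading.Lock()
legacy_nested = []
legacy_commit(single, hook=lambda: legacy_nested.append(legacy_commit(single)))

data = threading.Lock()
split_nested = []
split_commit(data, hook=lambda: split_nested.append(split_commit(data)))
```

In this model the nested call fails under the single lock (`legacy_nested` holds `False`) and succeeds when the data lock is released before the hook runs (`split_nested` holds `True`).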
Related PR(s)
vyos/vyatta-cfg#102
vyos/vyos-vpp#34
How to test / Smoketest result
Tested by @natali-rs1985 in the context of vyos/vyos-vpp#34