Towards runtime configuration of global storage & storage routing #3834

Open · nigoroll opened this issue Aug 4, 2022 · 2 comments

nigoroll commented Aug 4, 2022

This issue is intended as a platform for initial discussion towards a VSV. Please note that parts of this proposal have been suggested and discussed before; the phrase "I propose" does not imply a claim to have come up with the idea first.

Today, varnish storage can be provided in two ways:

a) configured globally using the -s ${ident}=${stevedore},... command line argument to varnishd. Any such storage is available to VCL as storage.${ident}. The special ident Transient identifies the storage to use for transient objects. If no ident is given, s0, s1, etc. are used. An example follows below.

b) provided by vmods by means of a function/method returning the STEVEDORE type.

b) is not used in varnish-cache core code today, but has been implemented in vmod code to provide "dynamic" stevedores which are initialized upon first use from VCL and self-destruct when empty and no longer used by any VCL.

Storage provided by method b) is only visible to VCL and not through the storage.list command. By definition, its scope is the VCL, though stored objects might (and most likely will) outlive the VCL.
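
For illustration, case a) might look like this today (idents, paths and sizes are arbitrary examples):

```
varnishd -s main=file,/var/lib/varnish/cache.bin,10g -s Transient=malloc,256m
```

and a VCL loaded into this instance can then route objects explicitly:

```
sub vcl_backend_response {
    # store this object in the storage configured as "main"
    set beresp.storage = storage.main;
}
```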

Proposed new interface

I propose to support runtime addition and removal of storage through the CLI, with the same global scope (and CLI visibility) as storage configured from the command line (case a):

  • storage.add ${ident} ${stevedore},... to have the same effect as the -s command line argument (with a space in place of the =), but at runtime. The storage.${ident} VCL object becomes visible to any VCL loaded after the storage.add command has been issued.
  • storage.remove ${ident} ${stevedore-specific arguments} to request that the storage be removed. The storage.${ident} VCL object is no longer visible to newly loaded VCLs. Any further policy on how to implement removal is up to the stevedore: it might stay intact as long as it contains any objects, or it might remove or migrate all objects. See below for more details on the proposed implementation.
  • storage.tell ${ident} ${stevedore-specific commands} to interact with storages, analogous to the interface implemented for vmod objects in Add tell/ear interface for CLI access to vmod objects #3729. A hypothetical example session is sketched below.
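
If this were implemented, a CLI session might look as follows. This is a hypothetical transcript of the proposed commands, not of anything existing; the idents and stevedore arguments are arbitrary:

```
varnish> storage.add mem1 malloc,1g
200

varnish> storage.list
200
Storage devices:
	storage.Transient = malloc
	storage.s0 = file
	storage.mem1 = malloc

varnish> storage.remove mem1
200
```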

Obviously, this interface requires reference counting from VCLs to storages. In its simplest form, a VCL would take a reference on all storages visible at the time it is loaded (those compiled into the VCL); alternatively, references could be taken only on those storages actually used by the VCL.

Regarding storage.add of persistent storage, I would suggest that we limit it, for now, to adding empty storage, or at least that we do not support loading vampire objects from storage into a running varnish instance.

storage.remove would be implemented as a two-stage process (sketched in code after the list):

  • When the command is issued, the storage is signaled to begin any cleanup it deems necessary (e.g. removing or migrating objects), optionally refusing new allocations (see health state below).
  • When the last VCL reference is dropped, the storage is signaled again that it may unconfigure itself once no objects are in use any more.
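
A schematic C sketch of the reference counting and the two removal stages; all names are hypothetical and none of this is actual varnish-cache code:

```c
#include <assert.h>

struct stv_sketch {
	unsigned	vcl_refcnt;	/* references held by loaded VCLs */
	unsigned	removing;	/* storage.remove has been issued */
};

/* Stevedore-specific hooks; the policy behind them is up to the
 * stevedore implementation. */
static void
stv_signal_cleanup(struct stv_sketch *stv)
{
	(void)stv;	/* e.g. remove or migrate objects, set itself sick */
}

static void
stv_signal_unconfigure(struct stv_sketch *stv)
{
	(void)stv;	/* release resources once no objects are in use */
}

/* Each VCL takes a reference on the storages it can see when loaded. */
static void
stv_vcl_ref(struct stv_sketch *stv)
{
	stv->vcl_refcnt++;
}

/* Called when a referencing VCL is discarded. */
static void
stv_vcl_rel(struct stv_sketch *stv)
{
	assert(stv->vcl_refcnt > 0);
	if (--stv->vcl_refcnt > 0 || !stv->removing)
		return;
	/* stage two: the last VCL reference is gone */
	stv_signal_unconfigure(stv);
}

/* Stage one: triggered by the proposed storage.remove CLI command. */
static void
stv_remove(struct stv_sketch *stv)
{
	stv->removing = 1;
	stv_signal_cleanup(stv);
	if (stv->vcl_refcnt == 0)
		stv_signal_unconfigure(stv);
}
```

The stricter variant mentioned above would simply call stv_vcl_ref() only for the storages a VCL actually uses.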

Health state for storage & management of the admin health

Loosely related, I also propose to add a health state to storages, analogous to that of directors:

  • a health state set by the storage
  • an admin health to override any health state set by the storage

A sick state signals that the storage does not want to take on additional objects/allocations, for example because it encountered a high error rate on its I/O devices. A sick storage may nevertheless still receive allocation requests, which may or may not fail.
The health state has no relevance for access to already stored objects; such accesses may fail or succeed on a sick or healthy storage alike.

The health state would be output by the storage.list CLI command and could be managed with storage.set_health, analogous to backend.set_health.
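
Schematically, the health resolution could mirror the directors' admin health (again, hypothetical names, not varnish-cache code):

```c
enum stv_health {
	STV_AUTO,	/* admin value only: defer to the storage's own state */
	STV_HEALTHY,
	STV_SICK
};

struct stv_health_sketch {
	enum stv_health	health;		/* set by the storage itself */
	enum stv_health	admin_health;	/* set via storage.set_health */
};

/* Effective health, as storage.list would report it: an explicit
 * admin setting overrides whatever the storage set itself.  Sick
 * only means "do not place new allocations here"; access to already
 * stored objects is unaffected. */
static enum stv_health
stv_effective_health(const struct stv_health_sketch *stv)
{
	if (stv->admin_health != STV_AUTO)
		return (stv->admin_health);
	return (stv->health);
}
```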


nigoroll commented Aug 8, 2022

Good feedback from bugwash:

  • mgt would need to track storages to enable worker restarts
  • do we want to support vampire loads at runtime? If so, we need to modify the ban system
  • should we move persisted bans to a storage of their own (like a file /var/run/varnishd.bans)?
    • this has further implications; for example, we would need to keep the (persisted) ban list for as long as we want to be able to re-add an old storage...


bsdphk commented Aug 8, 2022

Thinking about it, it may be a good idea to store bans (in the filesystem) separately from the objects in stevedores.

Something like:

Adding a ban appends "{timestamp} {banspec}" to /var/run/varnishd.bans

When the child (re)starts, it reads /var/run/varnishd.bans, then loads the persistent stevedores, and writes a new /var/run/varnishd.bans containing only the surviving bans (a ban survives if objects older than the ban's timestamp have been loaded).
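
A minimal C sketch of that restart step, assuming the "{timestamp} {banspec}" line format above (paths and helper names are illustrative only, not varnish-cache code):

```c
#include <stdio.h>

#define BANS		"/var/run/varnishd.bans"
#define BANS_NEW	BANS ".new"

/*
 * After the persistent stevedores have been loaded, rewrite the ban
 * file with only the surviving bans: a ban survives if objects older
 * than its timestamp have been loaded, i.e. it may still apply.
 */
static void
ban_rewrite(double oldest_loaded_ts)
{
	char line[1024];
	double ts;
	FILE *in, *out;

	in = fopen(BANS, "r");
	if (in == NULL)
		return;
	out = fopen(BANS_NEW, "w");
	if (out == NULL) {
		fclose(in);
		return;
	}
	while (fgets(line, sizeof line, in) != NULL) {
		if (sscanf(line, "%lf", &ts) != 1)
			continue;		/* skip malformed lines */
		if (ts > oldest_loaded_ts)
			fputs(line, out);	/* ban survives */
	}
	fclose(in);
	fclose(out);
	(void)rename(BANS_NEW, BANS);
}
```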
