State management #4

Open
jesteria opened this issue May 14, 2022 · 5 comments
Assignees
Labels
enhancement New feature or request measurements
Milestone

Comments

@jesteria
Member

Some measurement modules might require some form of state to be preserved across iterations.

(For example, speed tests might track their consumed bandwidth to avoid over-consumption.)

Questions:

  • How should this be stored by the framework?
    ➞ Most simply: serialized to a file at {venv prefix}, {user prefix} or /var/lib/netrics, plus /state/{measurement}, or something similar; but it's unclear what's best.
  • How should this be provided to modules?
    ➞ Could simply provide the above path to the measurement process via its environment (NETRICS_STATE) and let the measurement read and write it directly.
    ➞ Could in the same manner provide the path to a FIFO into which the state data has been written, and from which the framework will update state (moderating access to the actual state file or whatever backs it).
  • What about shared state? Needed? Allowable?
    ➞ If needed, but with moderation, could make global state available but read-only. (In this case, if implemented as above, would perhaps need either a FIFO or, better, just a pipe [os.pipe]: the framework can provide the measurement with the read descriptor of a pipe containing the aggregate state.)
    (This might, say, allow multiple separate measurements that care about bandwidth consumption to each record their own in a consistent manner and abide by the total they aggregate from shared state.)
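A minimal sketch of the simplest option above, assuming JSON serialization and one file per measurement. Only NETRICS_STATE and the /state/{measurement} layout come from the discussion; the helper names and the .json suffix are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def state_path(prefix, measurement):
    """Return {prefix}/state/{measurement}.json (the .json suffix is an assumption)."""
    return Path(prefix) / 'state' / f'{measurement}.json'


def read_state(path):
    """Load a measurement's state, treating a missing file as empty state."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def write_state(path, state):
    """Write state to a temporary file, then rename it over the old one."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix='.tmp')
    with os.fdopen(fd, 'w') as tmp:
        json.dump(state, tmp)
    os.replace(tmp_name, path)


def measurement_environ(prefix, measurement):
    """Framework side: the environment to hand to the measurement process."""
    return {**os.environ, 'NETRICS_STATE': str(state_path(prefix, measurement))}
```

Under this sketch a measurement would call read_state(os.environ['NETRICS_STATE']) at start-up and write_state(...) before exiting. The FIFO and pipe variants would keep the same environment hand-off but replace direct file access.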
jesteria added the question label on May 14, 2022
@kyle-macmillan
Collaborator

I'm not familiar enough with the different serialization steps to know which is best so I will defer on that until I read more.

If the framework can save data in a manner that is agnostic to everything except the measurement environment that it is writing to then I think that's a fine option.

I think shared state is likely needed. I also agree that measurements should only ever read shared variables and let the controller update them. I don't think this will arise, but if in the future the framework (maybe we can refer to it as the controller) works in parallel, we'll have to think about how to handle that.

@jesteria
Member Author

jesteria commented Jun 2, 2022

(Agreed on parallelization. In current conception this could perhaps be handled simply with file locks….)
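As a rough illustration of the file-lock idea, here is a sketch that assumes state lives in a JSON file and that every writer goes through the same helper. The function name update_state is hypothetical. fcntl.flock is advisory and POSIX-only, so it only protects against processes that also take the lock.

```python
import fcntl
import json
import os


def update_state(path, update):
    """Apply update(state) -> state to the JSON file at path under an exclusive lock."""
    # Open for reading and writing, creating the file if needed, without truncating.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, 'r+') as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            text = handle.read()
            state = update(json.loads(text) if text else {})
            # Rewrite the file in place while still holding the lock.
            handle.seek(0)
            handle.truncate()
            json.dump(state, handle)
            handle.flush()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    return state
```

Because the read, the modification and the write all happen while the lock is held, two measurements finishing at the same moment would apply their updates one after the other rather than overwrite each other.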

@kyle-macmillan
Collaborator

If we've decided to run the scheduler as a service, then it will have to interact with some internal record anyway. In my mind, that makes saving state for modules entirely feasible. I think we still have to decide how the controller and modules interact with state. In other words, if some state variable determines whether or not a test runs, does the controller pass that variable to the module and let it decide, or does the controller make that call on its own?

To give a concrete example, we regulate the execution of speed tests based on how much data has been consumed in a month. Assuming there is some user-defined cap, either the controller will compare the cap with the total data consumption or the module will. If the controller handles it and sees that we've hit the cap, we need to be mindful that the test is marked as complete and doesn't get shoved into the retry queue. I don't have a complete picture of how that would work because we haven't created the scheduler/controller yet but something to keep in mind. If the module handles it, it would have to exit with a special exit code to indicate that it should not be re-run.

The other thing to keep in mind is that the state variable needs to be reset periodically. This probably means designing each state variable as a triple (value, reset, period), with some function that checks whether we are past the reset deadline and sets the next deadline according to period. There could also be an initial value, in case value is reset to something other than 0.
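One way the (value, reset, period) triple might look, as a sketch only: the class and method names are hypothetical, and period is assumed to be a fixed timedelta, which cannot express calendar months exactly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ResettableValue:
    value: float
    reset: datetime      # deadline after which value is reset
    period: timedelta    # spacing between successive deadlines
    initial: float = 0   # what value is reset to

    def refresh(self, now):
        """Reset value if now is past the deadline, then advance the deadline."""
        if now >= self.reset:
            self.value = self.initial
            # Skip any deadlines missed while nothing was running.
            while self.reset <= now:
                self.reset += self.period
        return self
```

Calling refresh(now) before each comparison against the cap would keep the check and the reset in one place, whichever of controller or module ends up owning it.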

@jesteria
Member Author

Yes, I think recent conversations pushing the framework to operate as a service – or anyway maintaining its own internal state to some degree – only support this use-case of maintaining state on behalf of modules, (even though these are separate concerns).

Regarding how we might use state, I imagine two options. One, the basic feature, is outlined by this Issue: state is generated by modules, recorded by the framework, and made available again to modules during execution. This might cover a great many use-cases.

(And, yes, a module should certainly not be re-run in this case. I have imagined that re-runs must be opted into – e.g. with the exit code 42. So a module that decides that it should exit without doing anything should abide by some convention, but it needn't really matter to the framework, so long as it doesn't return code 42 – say, it exits with code 1 but doesn't make a big deal about it. Or something….)
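A sketch of how the framework side of that convention could read, assuming 42 really is the opt-in retry code (it is only floated as an example above) and with hypothetical function and label names:

```python
import subprocess
import sys

RETRY_CODE = 42  # example value from the discussion, not a settled convention


def classify(returncode):
    """Map a module's exit code to what the framework should do next."""
    if returncode == 0:
        return 'ok'
    if returncode == RETRY_CODE:
        return 'retry'
    # Any other code: the module failed or declined to run; do not re-run it.
    return 'no-retry'


def run_module(argv):
    """Run a module as a child process and classify its exit code."""
    return classify(subprocess.run(argv).returncode)
```

For instance, run_module([sys.executable, '-c', 'raise SystemExit(1)']) would be classified as 'no-retry', which is the behavior wanted for a speed test that exits early because the cap has been reached.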

Another option, perhaps a further enhancement, is to provide configuration hooks, e.g. a "should this module actually run?" hook, evaluated by the framework right before executing the module:

ping:
  schedule: 0 */6 * * *
  unless: state.thats_enough

I.e. perhaps this expression is evaluated with a reference, state, to the "ping" module's recorded state, into which it's written something like: {"thats_enough": false}. (And ultimately such a hook could support all kinds of things – "unless" the user has placed some lock file somewhere, "unless" load is high, etc.)

For framework-wide monthly bandwidth consumption, this is annoyingly complex, but nonetheless conceivable and a useful proof of concept:

speedtest-ookla:
  schedule: 0 */6 * * *
  unless:
    STATE |
    values |
    sum(
      "bandwidth.consumption.%s" | format(datetime.today().strftime('%Y-%m')),
      start=0
    ) >= 1024

(The expression – which I'm imagining as a Jinja templating library expression – needn't be written precisely that way; but, I think this helps to make it as clear as possible.) Anyway, STATE might refer to global state object, e.g.

{
  "ping": {"thats_enough": false},
  "speedtest-ookla": {
    "bandwidth": {
      "consumption": {
        "2022-05": 204,
        "2022-06": 201,
        "2022-07": 20
      }
    }
  }
}
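Written out in plain Python rather than as a template expression, the check that the unless hook performs over this STATE object amounts to something like the following sketch. The function names are hypothetical, and modules that record no bandwidth consumption are assumed to count as 0.

```python
from datetime import date


def monthly_consumption(shared_state, month_key):
    """Sum bandwidth.consumption[month_key] over every module that records it."""
    total = 0
    for module_state in shared_state.values():
        consumption = module_state.get('bandwidth', {}).get('consumption', {})
        total += consumption.get(month_key, 0)
    return total


def over_cap(shared_state, cap, today=None):
    """True once this month's total consumption has reached the cap."""
    month_key = (today or date.today()).strftime('%Y-%m')
    return monthly_consumption(shared_state, month_key) >= cap
```

With the STATE object above, monthly_consumption(STATE, '2022-05') is 204, so a cap of 1024 would not yet block the speed test.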

I.e. speedtest modules may simply accumulate their consumption by the current month, as a matter of course, as a feature, e.g.:

from datetime import date

state = {}  # placeholder: acquire state from framework

consumption_now = 0  # placeholder: calculate this test iteration's consumption

month_key = date.today().strftime('%Y-%m')

consumption_total = state.setdefault('bandwidth', {}).setdefault('consumption', {})
consumption_total[month_key] = consumption_total.get(month_key, 0) + consumption_now

# write state back to framework

This way, we can enforce that modules may write only to their own state but may read each other's. So long as speedtest modules follow a common convention of writing their bandwidth consumption to the key path bandwidth.consumption.YEAR-MONTH, expressions may be constructed which sum all bandwidth consumption for the month.
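The write-your-own, read-everyone's rule could be enforced on the framework side roughly as follows. This is a sketch with a hypothetical SharedState class, assuming the framework holds global state in memory as a dict keyed by module name.

```python
import copy


class SharedState:
    """Framework-side store: each module writes its own entry and reads all."""

    def __init__(self, data=None):
        self._data = data or {}

    def snapshot(self):
        """Return a copy of global state; changes to it never reach the store."""
        return copy.deepcopy(self._data)

    def commit(self, module, state):
        """Record the state a module handed back, under that module's key only."""
        self._data[module] = copy.deepcopy(state)
```

The framework would pass snapshot() to a module before it runs and call commit(name, ...) with whatever that module returns, so a module has no path by which to alter another module's entry.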

All that said:

  • I'm curious what prospective users might think of these options
  • The bandwidth consumption logic is both complex enough and fundamental enough that it likely should be built into speedtest modules. Rather than forcing users to put such complex unless logic into their configuration, they might take advantage of this built-in logic, controlled instead via module-level configuration (parameters passed via stdin):
    speedtest-ookla:
      schedule: 0 */6 * * *
      param:
        cap-monthly: 1024

(And certainly there should be shared utilities for use by Python-language modules: #12.)
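A built-in cap check of the kind described above might be sketched like this. The function names are hypothetical, the parameters arriving on stdin are assumed to be JSON, and only the module's own recorded consumption is consulted here.

```python
import io
import json
from datetime import date


def read_params(stream):
    """Parse module parameters from stdin (the JSON encoding is an assumption)."""
    text = stream.read()
    return json.loads(text) if text.strip() else {}


def under_cap(params, state, today):
    """True unless this month's recorded consumption has reached cap-monthly."""
    cap = params.get('cap-monthly')
    if cap is None:
        return True  # no cap configured, so never skip
    month_key = today.strftime('%Y-%m')
    used = state.get('bandwidth', {}).get('consumption', {}).get(month_key, 0)
    return used < cap
```

A speedtest module could call under_cap(read_params(sys.stdin), state, date.today()) first and exit without testing when it returns False, using an exit code that does not request a retry.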

jesteria added the enhancement and measurements labels on Aug 18, 2022
jesteria self-assigned this on Mar 14, 2023
jesteria added this to the Beta milestone on Mar 14, 2023
jesteria removed the question label on Mar 14, 2023
@jesteria
Member Author

See: internet-equity/fate#19.

jesteria added a commit that referenced this issue Apr 12, 2023
Adds measurement `dev` (executable `netrics-dev`) to scan local network
and report on the number of connected devices.

Notably, this change implements Fate task state persistence, applied to
this measurement.

part of #3

part of #4

resolves #29