
Reload config file without restart #352

Open
ulrfa opened this issue Sep 22, 2020 · 3 comments

Comments

@ulrfa
Contributor

ulrfa commented Sep 22, 2020

Sometimes I would like to change bazel-remote’s configuration, but restarting bazel-remote would interfere with ongoing builds.

It would be nice if a configuration file could be re-loaded by a SIGHUP signal.

Allowing any configuration parameter to change at any time might be too complicated, but perhaps reloading could be supported for a few specific parameters, with a warning logged if other parameters also changed in the configuration file?
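
The idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `Config`, `applyReload` and `watchSIGHUP` are hypothetical names, not bazel-remote's actual code, and the choice of which fields are reloadable is made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// Config is a hypothetical subset of settings, not bazel-remote's real config type.
type Config struct {
	AccessLogEnabled bool   // treated as reloadable in this sketch
	HTTPProxyURL     string // treated as reloadable in this sketch
	Dir              string // treated as NOT reloadable: needs a restart
}

// applyReload copies only the reloadable fields from next into cur and
// returns one warning per changed field that was ignored.
func applyReload(cur *Config, next Config) []string {
	var warnings []string
	cur.AccessLogEnabled = next.AccessLogEnabled
	cur.HTTPProxyURL = next.HTTPProxyURL
	if next.Dir != cur.Dir {
		warnings = append(warnings,
			fmt.Sprintf("dir changed from %q to %q; ignored until restart", cur.Dir, next.Dir))
	}
	return warnings
}

// watchSIGHUP calls reload each time the process receives SIGHUP.
// A real reload callback would re-read the config file and call applyReload,
// with appropriate synchronization around the shared Config.
func watchSIGHUP(reload func()) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	go func() {
		for range ch {
			reload()
		}
	}()
}

func main() {
	// Demonstrate the allow-list logic directly, without sending a signal.
	cur := Config{Dir: "/data"}
	next := Config{AccessLogEnabled: true, Dir: "/other"}
	for _, w := range applyReload(&cur, next) {
		fmt.Fprintln(os.Stderr, "warning:", w)
	}
	fmt.Println("access log enabled:", cur.AccessLogEnabled)
}
```

With this shape, `kill -HUP <pid>` would trigger the callback registered through `watchSIGHUP`, and anything outside the allow-list only produces a warning.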

Use cases:

  • Enable/disable the access log. Today I redirect stderr to a file and the access log on stdout to /dev/null, since the latter produces too much data. But occasionally I would like to enable the access log for a short period of time, without restarting bazel-remote.

  • Change the Prometheus label configuration without restarting bazel-remote (see discussion in Add AC hit rate metrics with prometheus labels #350).

  • Set, unset and change the HTTP proxy URL without restarting bazel-remote, e.g. if the parent cache goes down or is moved to another URL. Or when migrating users from one bazel-remote instance to another, by temporarily configuring the instances as proxies for each other, for a smooth migration. (Ongoing remote execution builds fail if input files uploaded by the client to the old instance are not available to the remote executor on the new instance.)

  • Change configuration dynamically to reduce proxy load on the parent cache instance if the parent cache becomes overloaded. (I'm not sure whether that makes sense; the other use cases are more important. See Configure http proxy uploaders #351)

@mostynb
Collaborator

mostynb commented Sep 22, 2020

We could implement this naively, but that would mean adding more mutexes around the settings, which could then change at any time (or using sync/atomic to access a struct).
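
For illustration, the sync/atomic variant might look something like the sketch below. It assumes Go 1.19 or later for `atomic.Pointer`, and `Settings`, `Load` and `Store` are hypothetical names rather than anything in bazel-remote. Readers take a snapshot without locking, and a reload swaps in a whole new struct instead of mutating fields in place.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Settings is a hypothetical immutable snapshot of the reloadable options.
type Settings struct {
	AccessLogEnabled bool
}

// current holds the active snapshot. Snapshots are never modified after
// being stored, so readers need no further synchronization.
var current atomic.Pointer[Settings]

// Load returns the snapshot that is active right now.
func Load() *Settings { return current.Load() }

// Store publishes a new snapshot to all subsequent readers.
func Store(s *Settings) { current.Store(s) }

func main() {
	Store(&Settings{AccessLogEnabled: false})
	fmt.Println(Load().AccessLogEnabled) // false

	Store(&Settings{AccessLogEnabled: true}) // e.g. called from a reload handler
	fmt.Println(Load().AccessLogEnabled) // true
}
```

The cost on the hot path is one atomic load per request, at the price of having to treat each snapshot as read-only.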

I wonder if there's a way to avoid that overhead at the cost of some slight downtime when reloading, for example by restarting the http/grpc servers? If that's too difficult then maybe we can implement a fast-reload option: stop accepting requests, dump the index to disk, restart bazel-remote and import the index.

@ulrfa
Contributor Author

ulrfa commented Sep 23, 2020

Thanks Mostyn,

I'm thinking of, for example, allowing cache.Proxy instances to be created and replaced at runtime, and protecting the reference to the current proxy instance in disk.go with a mutex.

And perhaps, in a similar way, creating and replacing instances of the metrics.Metrics interface:

    type Metrics interface {
        // TODO Document interface
        IncomingRequestCompleted(kind Kind, method Method, status Status, headers map[string][]string, protocol Protocol)
    }

As long as the replaced parts have well-defined interfaces and not too many dependencies, I think they could be replaced at runtime without too much added complexity.
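
A minimal sketch of what I mean by a mutex-protected reference. The `Proxy` interface here is a hypothetical stand-in with a single made-up method (the real cache.Proxy interface looks different), and `proxyHolder` is an invented name.

```go
package main

import (
	"fmt"
	"sync"
)

// Proxy is a hypothetical stand-in for a backend interface such as
// cache.Proxy; the real interface has different methods.
type Proxy interface {
	Name() string
}

// namedProxy is a trivial Proxy used only for the demo.
type namedProxy string

func (p namedProxy) Name() string { return string(p) }

// proxyHolder guards the reference to the current proxy so that it can be
// replaced at runtime while requests are in flight.
type proxyHolder struct {
	mu    sync.RWMutex
	proxy Proxy // nil means no proxy is configured
}

// Get returns the proxy that is active at the time of the call. A request
// keeps using the instance it got, even if Set is called afterwards.
func (h *proxyHolder) Get() Proxy {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.proxy
}

// Set replaces (or, with nil, unsets) the current proxy and returns the old
// one, so the caller can shut it down once in-flight requests have finished.
func (h *proxyHolder) Set(p Proxy) Proxy {
	h.mu.Lock()
	defer h.mu.Unlock()
	old := h.proxy
	h.proxy = p
	return old
}

func main() {
	var h proxyHolder
	h.Set(namedProxy("parent-a"))
	old := h.Set(namedProxy("parent-b"))
	fmt.Println(old.Name(), "->", h.Get().Name()) // parent-a -> parent-b
}
```

The lock is only held while copying the reference, not during the proxied request itself, so the added contention should be small. The same holder pattern would work for a metrics implementation.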

For me, even a slight downtime when reloading would not be OK, since that could cause ongoing remote execution builds to fail. (Downtime could also cause failed builds in pure cache scenarios for those using “builds-without-the-bytes”, unless bazelbuild/bazel#10880 is resolved.)

I will not have time to implement any of the above now, but I wanted to raise this as background to the discussion in #350 about whether the Prometheus label configuration should be in a separate configuration file.

@mostynb
Collaborator

mostynb commented Sep 23, 2020

I haven't thought this through, but if you want to avoid downtime, could a specialized proxy work? I.e., one that receives requests from clients and forwards them on to bazel-remote, with retries if bazel-remote temporarily stops accepting requests.
