TERM signal handling configuration #38

Open
mac-chaffee opened this issue Nov 17, 2022 · 1 comment

mac-chaffee commented Nov 17, 2022

When running coraza-spoa in an environment like Kubernetes, handling SIGTERM becomes important for zero-downtime upgrades. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

There are two main ways to deploy coraza-spoa: as a standalone pod, or as a sidecar in an haproxy pod.

Standalone coraza-spoa

When coraza-spoa is running standalone, it should respond to SIGTERM by processing existing messages but refusing new ones. In Kubernetes, since there can be a slight delay between when a container enters the Terminating state and when traffic stops being routed to it, it's also good to allow for a configurable delay before new messages start being denied (see --shutdown-grace-period here).
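
For illustration, here is a minimal sketch of that behaviour in Go. It is not taken from coraza-spoa itself; the listen address is a placeholder and the --shutdown-grace-period flag is borrowed from the linked example, not an existing option here.

```go
// Minimal sketch of "drain after a grace period", not actual coraza-spoa code.
package main

import (
	"flag"
	"log"
	"net"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

func main() {
	grace := flag.Duration("shutdown-grace-period", 5*time.Second,
		"delay after SIGTERM before refusing new SPOP connections")
	flag.Parse()

	ln, err := net.Listen("tcp", ":12345") // placeholder SPOA bind address
	if err != nil {
		log.Fatal(err)
	}

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)

	go func() {
		<-sigs
		// Keep accepting for a bit so haproxy instances that haven't yet seen
		// the endpoint change are not refused immediately.
		time.Sleep(*grace)
		ln.Close() // now stop accepting new SPOP connections
	}()

	var wg sync.WaitGroup
	for {
		conn, err := ln.Accept()
		if err != nil {
			break // listener closed after the grace period
		}
		wg.Add(1)
		go func(c net.Conn) {
			defer wg.Done()
			defer c.Close()
			// SPOP frame handling for this connection would go here.
		}(conn)
	}
	wg.Wait() // finish processing in-flight messages before exiting
}
```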

Sidecar coraza-spoa

When running coraza-spoa as a sidecar inside an existing haproxy pod, rolling out a new version of that haproxy+coraza-spoa Pod means the old haproxy and coraza-spoa containers are sent SIGTERM simultaneously. Haproxy will typically begin draining existing connections, but those connections still need to be serviced by Coraza. In this situation, coraza-spoa should probably be configured to ignore SIGTERM entirely, or to set a very long --shutdown-grace-period, so that Coraza still processes all of the remaining requests until the bitter end (SIGKILL). But you also don't want to delay shutdown of the pod for longer than necessary, so maybe there's something in SPOP where we can detect when there are no more connected haproxy instances? At that point it would be safe to shut down.
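
A similarly rough sketch of the sidecar variant, where SIGTERM only arms shutdown and the process exits once no haproxy connection remains. Treating "no open SPOP connections" as "no connected haproxy instances" is an assumption on my part; I don't know of an explicit SPOP mechanism for detecting this.

```go
// Rough sketch only: SIGTERM is noted but acted on only once the last
// connection is gone (or SIGKILL arrives first).
package main

import (
	"log"
	"net"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

// Open SPOP connections, used here as a stand-in for connected haproxy instances.
var connected int64

func main() {
	ln, err := net.Listen("tcp", ":12345") // placeholder SPOA bind address
	if err != nil {
		log.Fatal(err)
	}

	go func() {
		sigs := make(chan os.Signal, 1)
		signal.Notify(sigs, syscall.SIGTERM)
		<-sigs // arm shutdown, but keep serving until haproxy is gone
		for atomic.LoadInt64(&connected) > 0 {
			time.Sleep(time.Second)
		}
		os.Exit(0) // nothing left to serve; safe to stop before SIGKILL
	}()

	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		atomic.AddInt64(&connected, 1)
		go func(c net.Conn) {
			defer atomic.AddInt64(&connected, -1)
			defer c.Close()
			// SPOP frame handling for this connection would go here.
		}(conn)
	}
}
```

(A real implementation would probably also want a minimum delay, so a SIGTERM that lands while haproxy is briefly disconnected doesn't cause an immediate exit.)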

Somewhat related to #19, which might require some signal handling as well.

Tristan971 commented Nov 24, 2022

Fwiw, while a configurable grace period is still preferable, you can always emulate it at the k8s level using a preStop lifecycle hook on your pod that does something like sleep N.

The pod is immediately put into an unready state when the preStop hook begins (just like with TERM by default), so it is removed from the Service endpoints, but the TERM to pid 1 itself is delayed until the end of the preStop hook (N seconds in this case).

That way you can truly tune your grace period based on how long your requests can last, regardless of whether the underlying process has graceful shutdown support.
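
For example (an illustrative pod spec fragment, not from this repo; the container name, image, and durations are placeholders, and it assumes the image ships a sleep binary):

```yaml
spec:
  # Must be longer than the preStop sleep plus the time needed to drain.
  terminationGracePeriodSeconds: 60
  containers:
    - name: coraza-spoa
      image: example/coraza-spoa:latest   # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "30"]      # delay TERM to pid 1 by 30s
```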

@jcchavezs added the enhancement label Dec 2, 2022