feat: Support graceful shutdown (#407)
* feat: Support graceful shutdown
* update docs
* docs
* changelog
* link code in docs
* increase default of datanodes to 30 min
* move into constants
* use new operator-rs
* docs: Format 15 minutes
* Use new operator-rs
* improve docs
* fix link
* use operator-rs 0.55.0
* fixup
* improve docs
* set error context
* Added a high level description of graceful shutdown
* Revert "Added a high level description of graceful shutdown"
  This reverts commit 7733ec1. Moved to stackabletech/documentation#473

Co-authored-by: Jim Halfpenny <jim@source321.com>
Showing 9 changed files with 134 additions and 10 deletions.
36 changes: 33 additions & 3 deletions
docs/modules/hdfs/pages/usage-guide/operations/graceful-shutdown.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,36 @@ | ||
= Graceful shutdown | ||
|
||
Graceful shutdown of HDFS nodes is either not supported by the product itself | ||
or we have not implemented it yet. | ||
You can configure the graceful shutdown as described in xref:concepts:operations/graceful_shutdown.adoc[]. | ||
|
||
Outstanding implementation work for the graceful shutdowns of all products where this functionality is relevant is tracked in https://github.com/stackabletech/issues/issues/357 | ||
== JournalNodes | ||
|
||
By default, JournalNodes have `15 minutes` to terminate gracefully. | ||
|
||
The JournalNode process always runs as PID `1` and receives a `SIGTERM` when Kubernetes terminates the Pod. | ||
It logs the received signal as shown in the log below and initiates a graceful shutdown. | ||
If the process has not exited once the graceful shutdown timeout has elapsed, Kubernetes issues a `SIGKILL` to force-kill it. | ||
|
||
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java#L272[This] is the relevant code that gets executed in the JournalNodes as of HDFS version `3.3.4`. | ||
|
||
[source,text] | ||
---- | ||
2023-10-10 13:37:41,525 ERROR server.JournalNode (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM | ||
2023-10-10 13:37:41,526 INFO server.JournalNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG: | ||
/************************************************************ | ||
SHUTDOWN_MSG: Shutting down JournalNode at hdfs-journalnode-default-0/10.244.0.38 | ||
************************************************************/ | ||
---- | ||
|
||
== NameNodes | ||
|
||
By default, NameNodes have `15 minutes` to terminate gracefully. | ||
They go through the same mechanism as documented for the <<_journalnodes>> above. | ||
|
||
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java#L1080[This] is the relevant code that gets executed in the NameNodes as of HDFS version `3.3.4`. | ||
|
||
== DataNodes | ||
|
||
By default, DataNodes have `30 minutes` to terminate gracefully. | ||
They go through the same mechanism as documented for the <<_journalnodes>> above. | ||
|
||
https://github.com/apache/hadoop/blob/a585a73c3e02ac62350c136643a5e7f6095a3dbb/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L2004[This] is the relevant code that gets executed in the DataNodes as of HDFS version `3.3.4`. |
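
The per-role defaults documented above (15 minutes for JournalNodes and NameNodes, 30 minutes for DataNodes) could be captured as constants roughly like this. This is only a sketch: `HdfsRoleSketch`, the constant names, and `default_graceful_shutdown_timeout` are hypothetical and are not taken from the operator's code, which may organize its constants differently.

```rust
use std::time::Duration;

/// Hypothetical role enum, used for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HdfsRoleSketch {
    JournalNode,
    NameNode,
    DataNode,
}

const SECONDS_PER_MINUTE: u64 = 60;

// Values mirror the documented defaults; the names are assumptions.
pub const DEFAULT_JOURNAL_NODE_TIMEOUT: Duration = Duration::from_secs(15 * SECONDS_PER_MINUTE);
pub const DEFAULT_NAME_NODE_TIMEOUT: Duration = Duration::from_secs(15 * SECONDS_PER_MINUTE);
pub const DEFAULT_DATA_NODE_TIMEOUT: Duration = Duration::from_secs(30 * SECONDS_PER_MINUTE);

impl HdfsRoleSketch {
    /// Returns the documented default graceful shutdown timeout for the role.
    pub fn default_graceful_shutdown_timeout(self) -> Duration {
        match self {
            HdfsRoleSketch::JournalNode => DEFAULT_JOURNAL_NODE_TIMEOUT,
            HdfsRoleSketch::NameNode => DEFAULT_NAME_NODE_TIMEOUT,
            HdfsRoleSketch::DataNode => DEFAULT_DATA_NODE_TIMEOUT,
        }
    }
}
```

Keeping the values in one place makes it easy to check the documentation against the code when a default changes.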
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
use snafu::{ResultExt, Snafu}; | ||
use stackable_hdfs_crd::MergedConfig; | ||
use stackable_operator::builder::PodBuilder; | ||
|
||
#[derive(Debug, Snafu)] | ||
pub enum Error { | ||
#[snafu(display("Failed to set terminationGracePeriod"))] | ||
SetTerminationGracePeriod { | ||
source: stackable_operator::builder::pod::Error, | ||
}, | ||
} | ||
|
||
pub fn add_graceful_shutdown_config( | ||
merged_config: &(dyn MergedConfig + Send + 'static), | ||
pod_builder: &mut PodBuilder, | ||
) -> Result<(), Error> { | ||
// The merge mechanism always sets this value because we provide a default, | ||
// so users cannot disable graceful shutdown. | ||
if let Some(graceful_shutdown_timeout) = merged_config.graceful_shutdown_timeout() { | ||
pod_builder | ||
.termination_grace_period(graceful_shutdown_timeout) | ||
.context(SetTerminationGracePeriodSnafu)?; | ||
} | ||
|
||
Ok(()) | ||
} |
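
To try out the pattern used by `add_graceful_shutdown_config` without pulling in `stackable_operator`, the following sketch mirrors it with stand-in types built on the standard library only. Everything suffixed `Sketch`, along with `FixedTimeout`, is hypothetical. The failure case (a timeout that does not fit into an `i64` number of seconds) is an assumption made for illustration, not a statement about when the real `PodBuilder` returns an error.

```rust
use std::convert::TryFrom;
use std::time::Duration;

/// Stand-in for the merged role configuration (hypothetical).
pub trait MergedConfigSketch {
    fn graceful_shutdown_timeout(&self) -> Option<Duration>;
}

/// Stand-in for the pod builder; it only records the grace period (hypothetical).
#[derive(Debug, Default)]
pub struct PodSketch {
    pub termination_grace_period_seconds: Option<i64>,
}

/// Assumed failure mode for this sketch: the timeout does not fit into an i64.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    TimeoutTooLarge,
}

/// Copies the configured timeout, if any, onto the pod as whole seconds.
pub fn add_graceful_shutdown_config_sketch(
    merged_config: &dyn MergedConfigSketch,
    pod: &mut PodSketch,
) -> Result<(), SketchError> {
    if let Some(timeout) = merged_config.graceful_shutdown_timeout() {
        let seconds =
            i64::try_from(timeout.as_secs()).map_err(|_| SketchError::TimeoutTooLarge)?;
        pod.termination_grace_period_seconds = Some(seconds);
    }
    Ok(())
}

/// Minimal config implementation that returns a fixed value (hypothetical).
pub struct FixedTimeout(pub Option<Duration>);

impl MergedConfigSketch for FixedTimeout {
    fn graceful_shutdown_timeout(&self) -> Option<Duration> {
        self.0
    }
}
```

With a 15 minute timeout the sketch records 900 seconds on the pod; with no timeout configured it leaves the field unset, which matches the `if let Some(...)` guard in the function above.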
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,2 @@ | ||
pub mod graceful_shutdown; | ||
pub mod pdb; |