-
Notifications
You must be signed in to change notification settings - Fork 9
Fix hostname parsing and improve decommission process #773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 nice work! maybe in the tests, it might be safer to call extractNodeName()
directly instead of re-implementing the parsing logic? then the tests will break if the function ever changes unexpectedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice! But I'm not that much into this topic, especially code wise. So my review is very high level. Added just a testing related suggestion, leaving the approval up to @tomach.
} | ||
} | ||
|
||
func TestSplitHostname(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this already tested by the test for extractNodeName
as the splitHostname function is used there already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. splitHostname()
is indeed called within extractNodeName()
, so it is covered indirectly. I'd still lean toward keeping it for clarity and explicitly testing the helper in isolation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the splitHostname()
NOT be called anymore as its supersede by extractHostName
? If so, testing it dedicated may not make any sense. Not sure about Go, but can this method then be even declared private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'd remove the splitHostName
method at all and inline the split
call. Any tests related to this method is testing the library split
method (which doesn't make sense, this is tested elsewhere already) rather than any application logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. it makes sense to inline the call and remove both the function and the dedicated test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work! 👍
} | ||
} | ||
|
||
func TestSplitHostname(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. splitHostname()
is indeed called within extractNodeName()
, so it is covered indirectly. I'd still lean toward keeping it for clarity and explicitly testing the helper in isolation.
d64c72e
to
79cd722
Compare
The tool can read configuration from StatefulSet labels, overriding CLI parameters. This is especially useful to dynamically adjust decommission settings, without restarting the POD | ||
to pick up settings changes in the statefulset - think "short-cut" when needed. The tool has sensible defaults and can be run without parameters (except the `--reset-routing`). | ||
As already mentioned `terminationGracePeriodSeconds` MUST be set larger then `--tiemout` otherwise kubelet will SIGKILL the container before decommission finished! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type on --timeout
Summary of changes
Added
Persistent Logging: Dual logging to both STDOUT and persistent file with automatic rotation
--log-file
CLI flag (default:/resource/heapdump/dc_util.log
)PostStart Hook Detection: Intelligent detection of StatefulSet PostStart hooks
dc_util --reset-routing
NEW_PRIMARIES
routing allocation could not be reliably reset-reset-routing
) and double dash (--reset-routing
) flag formatsSingle Node Cluster Detection: Automatic detection and handling of single node clusters
Configurable Lock File Path: New
--lock-file
CLI flag/resource/heapdump/dc_util.lock
Enhanced Flag Support: Improved command-line flag handling
-reset-routing
and--reset-routing
formats now supportedMulti-Architecture Support: Automatic CPU architecture detection in hook configurations
dc_util-linux-amd64
ordc_util-linux-arm64
)Changed
Routing Allocation Logic: Enhanced PreStop process with PostStart hook detection
Replica Count Handling: Improved logic for different cluster sizes
Function Signatures: Updated internal functions to support configurable paths
createLockFile()
now accepts lock file path parameterremoveLockFile()
now accepts lock file path parameterlockFileExists()
now accepts lock file path parameterhandleResetRouting()
now accepts lock file path parameterImproved
Logging Experience: Comprehensive logging improvements
Documentation: Extensively updated README.md
Testing: Comprehensive test coverage for all new features
TestHasPostStartHookWithResetRouting
: PostStart hook detection with various scenariosTestPostStopRoutingAllocationIntegration
: Integration tests for routing allocation logicTestLoggingIntegration
: Dual logging functionality verificationTestLogRotation
: File rotation behavior validationTestSingleNodeClusterBehavior
: Single node cluster detection testsTestReplicaCountBehavior
: Comprehensive replica count handling testsChecklist
CHANGES.rst