-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return a Unique Exit Code for Out of Memory Incidents #1339
base: master
Are you sure you want to change the base?
Conversation
@@ -25,6 +25,7 @@ public enum ExitCodeType { | |||
), | |||
|
|||
FAILURE_ACCURACY_NOT_MET(15, "Detect was unable to meet the required accuracy."), | |||
FAILURE_OUT_OF_MEMORY(16, "Detect encountered an Out of Memory error. Please review memory settings and system resources."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An idea for your consideration: we could modify the enum to have a new double priority
field.
private final int exitCode;
private final String description;
private final double priority;
ExitCodeType(int exitCode, String description) {
ExitCodeType(exitCode, description, (double) exitCode);
}
ExitCodeType(int exitCode, String description, double priority) {
this.exitCode = exitCode;
this.description = description;
this.priority = priority;
}
Then you could define the new exit code as FAILURE_OUT_OF_MEMORY(16, "...", 0.5)
, and start sorting exit codes not by their value but by their priority. This would also allow us to insert any priority we want anywhere going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an interesting insight. Surely it can be done. Let's have team's opinion on it whether we should go with this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the ExitCodeType enum as per your suggestion and it worked as expected in my end. As this is planned for 10.4.0 release, the team might have their opinions whether we should go with this approach.
logger.error("Detected an issue: EXECUTABLE_TERMINATED_LIKELY_OUT_OF_MEMORY"); | ||
exitCodePublisher.publishExitCode(ExitCodeType.FAILURE_OUT_OF_MEMORY, "Executable terminated likely due to out of memory."); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might just be me but I find this error checking a bit distracting in the overall flow of the performDetectors method. Perhaps consider refactoring these lines to a small method you can call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored this error checking into a function called checkAndHandleOutOfMemoryIssue
6cd05c2
to
435951a
Compare
return second; | ||
} else { | ||
return (first.getExitCode() < second.getExitCode()) ? first : second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For your consideration: we could ensure that all exit codes have unique priorities by adding a test that iterates through all enum values and checks that. Then, this last check would not be necessary as comparing priorities would be enough to ensure deterministic results. Additionally, this would ensure that someone does not accidentally add an exit code without considering what priorities are meant for. We could also perform a similar check to ensure that exit code values are also unique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great suggestion! I've implemented the proposed change and added tests to validate it.
One concern I have is how we determine the priority value for an exit code. Currently, we are assigning an arbitrary lower double value to ensure that a specific exit code takes precedence. For example:
FAILURE_OUT_OF_MEMORY(16, "Detect encountered an Out of Memory error. Please review memory settings and system resources.", 0.5)
I believe we should establish clear rules for setting priority values when deviating from the default (i.e., using the exit code itself as the priority). Assigning priorities arbitrarily doesn't seem ideal to me. What are your thoughts on defining a more structured approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we should try to get to that point eventually and it's on a TODO list. If you have some concrete ideas already, feel free to write them down and share with the team!
//Completed. | ||
logger.debug("Finished running detectors."); | ||
detectorEventPublisher.publishDetectorsComplete(toolResult); | ||
|
||
return toolResult; | ||
} | ||
|
||
private void checkAndHandleOutOfMemoryIssue(List<DetectorDirectoryReport> reports) { | ||
if (detectorIssuePublisher.hasOutOfMemoryIssue(reports)) { | ||
logger.error("Detected an issue: EXECUTABLE_TERMINATED_LIKELY_OUT_OF_MEMORY"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Instead of hard-coding EXECUTABLE_TERMINATED_LIKELY_OUT_OF_MEMORY
you can get it from the enum name (EXECUTABLE_TERMINATED_LIKELY_OUT_OF_MEMORY.toString()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It should be done this way rather than hard-coding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but added some comments for your consideration.
MR is still in draft? |
@gopannabd , I believe this is so that we do not merge it too soon accidentally. This is destined for Detect 10.4. |
Ticket
IDETECT-3872
Summary
This pull request implements a solution for handling "Out of Memory" (OOM) incidents in sub-processes by introducing a unique exit code. The changes ensure that OOM-related failures are appropriately identified and prioritized in the application's exit code logic.
Key Changes
Detection of OOM Incidents
A new method,
hasOutOfMemoryIssue
, was added to theDetectIssuePublisher
class.This method inspects
DetectorDirectoryReport
objects to identify if any sub-process has crashed due to an OOM incident (mapped toDetectorStatusCode.EXECUTABLE_TERMINATED_LIKELY_OUT_OF_MEMORY
).Publishing a New Exit Code
The
performDetectors
method in theDetectorTool
class was updated to use thehasOutOfMemoryIssue
method.If an OOM incident is detected, a new exit code (
FAILURE_OUT_OF_MEMORY
) is published via theExitCodePublisher
.Handling Exit Code Prioritization
The unique exit code for OOM incidents needed to take precedence in the application's exit code evaluation.
The
getWinningExitCode
method inExitCodeManager
was modified to handle this special case, ensuringFAILURE_OUT_OF_MEMORY
is correctly prioritized.Challenges and Resolutions
-1
) was tested to prioritize OOM errors, but it was deemed unsuitable for representing the failure state.getWinningExitCode
to allow for a meaningful and unique exit code without conflicting with existing exit codes.Testing
getWinningExitCode
logic correctly prioritizes the OOM exit code over other failure types.Impact
Notes
Please review the changes, especially the modifications in
getWinningExitCode
, to confirm alignment with our design principles.