Improve Java kernel crash detection and error reporting #66

@aion-kelvin

Description

If the Java kernel crashes, node_test_harness does not detect it. The test cases just fail with a vague TimeoutException because the transactions they expect to complete have not completed in the allotted time.

This problem is worse when running in the Jenkins CI of aionnetwork/aion because it deletes workspaces right after execution, so the heap dump log file is deleted too. Even if it weren't deleted, you'd have to log into the CI host and know to look for a heapdump file: since the crash message from the kernel isn't visible to node_test_harness, you'd basically have to check blindly whether a crash happened by looking for the heap dump log file.

Idea: when tailing the kernel log, look for the message that the process crashed and save the location of the heapdump log given in that message. Then tell the user a crash happened and print out the log file (or save it to some conveniently accessible place).
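
A rough sketch of the log-scanning part of this idea, in case it helps. Everything here is hypothetical: `CrashLogScanner` and `findCrashReportPath` are made-up names, not existing node_test_harness code. It also assumes the crash message looks like HotSpot's fatal-error banner, where one line says an error report file was saved and the next line holds its path. The marker text and the path pattern would need to be adjusted to whatever the kernel actually prints.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of node_test_harness. */
final class CrashLogScanner {
    // Assumption: the crash banner announces the report file on one line
    // and prints its path on the following line (HotSpot-style output).
    private static final String REPORT_MARKER =
            "An error report file with more information is saved as";

    // Assumption: the path line may start with '#' and the file ends in ".log".
    private static final Pattern REPORT_PATH =
            Pattern.compile("^#?\\s*(\\S.*\\.log)\\s*$");

    private CrashLogScanner() {}

    /**
     * Scans kernel log lines for the crash banner and returns the path of the
     * crash report file, or empty if no banner (or no usable path) is found.
     */
    static Optional<String> findCrashReportPath(List<String> logLines) {
        for (int i = 0; i < logLines.size() - 1; i++) {
            if (logLines.get(i).contains(REPORT_MARKER)) {
                Matcher m = REPORT_PATH.matcher(logLines.get(i + 1));
                if (m.matches()) {
                    return Optional.of(m.group(1));
                }
            }
        }
        return Optional.empty();
    }
}
```

One possible use: when a test is about to fail with TimeoutException, run the captured kernel log lines through `findCrashReportPath`. If a path comes back, report "kernel crashed" instead of the timeout and print or copy that file before the CI workspace is deleted.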
