-
Notifications
You must be signed in to change notification settings - Fork 84
Enhance Functional Testing: Daemon Startup Validation, Log Improvements, and Test Restart Refactoring #734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance Functional Testing: Daemon Startup Validation, Log Improvements, and Test Restart Refactoring #734
Conversation
Davidson-Souza
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain a little on each commit what's the logic behind those changes? I like the PR description, but remember that people may read each commit independently.
30058ec to
7797d1c
Compare
I improved the commit descriptions. |
jaoleal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went commit by commit and i got the feeling that they are backwards ?
I feel that the one changing the daemon startup should be the first one and the one simplifying the restart.py by last
Refactored the `restart.py` test to start a single node instead of two. The test now verifies that a node can stop and restart successfully using the same data directory. This simplifies the structure and ensures the test focuses on detecting issues like data corruption or initialization failures.
This commit resolves the timeout issue when detecting node ports. Previously, the logic read the log line by line, which caused errors if any delay occurred during node initialization, potentially skipping lines with RPC, P2P, or Electrum port information. The new implementation reads the entire log and accumulates a small cache until the ports are detected. This avoids repeatedly reading the same lines and ensures that all necessary ports are captured reliably.
Add a 1-second wait after starting the daemon to verify if it is running. If the daemon fails to start, terminate the process and log the error. This helps identify issues caused by invalid or conflicting flags passed during daemon startup, making debugging easier and more reliable.
7797d1c to
e87a4d2
Compare
Davidson-Souza
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR can be merged first without any problem. #731 tends to be the final approach because it will stop reading things from the log, and #731 already excludes the |
|
ACK e87a4d2 |
|
ACK e87a4d2 |
Description and Notes
A validation step was added after node startup to ensure the daemon actually started successfully. After executing the startup command, the workflow waits for 1 second to verify if the node is running. If the daemon fails to start(such as due to misconfigured flags or command errors) the process is stopped, a relevant error message is logged (including the output of the command and the affected node), and an error is returned. This provides much clearer feedback for developers writing or running tests and makes it easier to detect incorrect flags or startup commands.
The port detection logic was also improved. Previously, the test only read the last line of the log within a specified timeout, which could cause failures if the output timing was off, resulting in missed port detection and unstable CI runs. Now, all log lines are read and checked, significantly reducing the chance of missing the port assignment. Since port information is always printed at the beginning of the logs, this change does not impact performance. Note: This is not a definitive solution to the architectural issue behind port detection; a more comprehensive fix is planned for PR #733.
Additionally, the restart test was refactored for better clarity and relevance. Instead of starting two nodes, stopping them, and checking their data directories (which did not accurately reflect a restart scenario), the test now starts a node, stops it, and then starts it again, verifying that restart works correctly and that stopping the node does not corrupt files.
How to verify the changes you have done?