Skip to content

Fix flaky XACompletionTest by adding wait for async states#1862

Open
gurpartap3697 wants to merge 2 commits intoapache:mainfrom
gurpartap3697:fix/XACompletionTest-testStatsAndBrowseAfterAckPreparedRolledback-flaky-test
Open

Fix flaky XACompletionTest by adding wait for async states#1862
gurpartap3697 wants to merge 2 commits intoapache:mainfrom
gurpartap3697:fix/XACompletionTest-testStatsAndBrowseAfterAckPreparedRolledback-flaky-test

Conversation

@gurpartap3697
Copy link
Copy Markdown
Contributor

XACompletionTest.testStatsAndBrowseAfterAckPreparedRolledback was failing intermittently:

Error:    XACompletionTest.testStatsAndBrowseAfterAckPreparedRolledback:254 size 0 expected:<0> but was:<10>

ref: https://github.com/apache/activemq/actions/runs/23623154131/job/68806975829

The test relied on immediate JMX state, which is asynchronous and can lead to flakiness.

@jbonofre jbonofre self-requested a review March 30, 2026 18:47
jbonofre
jbonofre previously approved these changes Mar 30, 2026
Copy link
Copy Markdown
Member

@jbonofre jbonofre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I have just 2 suggestions for you.

resource.recover(XAResource.TMSTARTRSCAN);
resource.recover(XAResource.TMNOFLAGS);

Wait.waitFor(() -> proxy.getInFlightCount() == 0L && proxy.cursorSize() == 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to wrap Watit.waitFor() in a assertTrue for better diagnostics on failure:

Suggested change
Wait.waitFor(() -> proxy.getInFlightCount() == 0L && proxy.cursorSize() == 0);
assertTrue("Timed out waiting for inFlight and cursor to clear",
Wait.waitFor(() -> proxy.getInFlightCount() == 0L && proxy.cursorSize() == 0));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wait condition doesn't cover getQueueSize(). The wait checks getInFlightCount() == 0 && cursorSize() == 0, but the very next assertion also checks getQueueSize() == 10.

If getQueueSize() is also subject to async updates, it could still flake.

Worth considering whether to include it in the wait condition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — thanks for calling that out.

I initially assumed that inFlightCount == 0 would imply the queue state (including queueSize) had fully settled. I've updated the wait condition to include getQueueSize()

@gurpartap3697 gurpartap3697 force-pushed the fix/XACompletionTest-testStatsAndBrowseAfterAckPreparedRolledback-flaky-test branch from e87d689 to 6c7cab0 Compare March 30, 2026 19:16
@gurpartap3697
Copy link
Copy Markdown
Contributor Author

Thanks @jbonofre for feedback! I updated the pr with suggested changes.

@gurpartap3697 gurpartap3697 requested a review from jbonofre March 30, 2026 19:17
@jbonofre
Copy link
Copy Markdown
Member

@gurpartap3697 good job ! I will merge as soon as CI is green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants