Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix visibility timeout errors (#4812) #4831

Merged
merged 2 commits into from
Aug 23, 2024

Conversation

danhli
Copy link
Contributor

@danhli danhli commented Aug 13, 2024

Description

The change fixes the visibility timeout errors by creating acknowledgement sets for all the messages from a SQS ReceiveMessage call before additional processing.

Issues Resolved

Resolves #4812

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
  • New functionality has javadoc added
  • [X ] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Daniel Li <lhouqua@amazon.com>
kkondaka
kkondaka previously approved these changes Aug 19, 2024
@kkondaka
Copy link
Collaborator

@danhli Thanks for working on this.

Copy link
Member

@dlvenable dlvenable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this change makes sense after looking closely.

Basically, it seems that we started processing the messages in order, but without the acknowledgement set, we were not changing the visibility timeout. With this change, we will have acknowledgement sets that are going to trigger the change visibility timeout while processing the messages.

}

if (endToEndAcknowledgementsEnabled) {
LOG.info("Created acknowledgement sets for {} messages.", parsedMessagesToRead.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a debug level log. Otherwise, we'll have too many logs generated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I will update the log level to debug.

Signed-off-by: Daniel Li <lhouqua@amazon.com>
@danhli
Copy link
Contributor Author

danhli commented Aug 19, 2024

@kkondaka I ran the following tests to validate the changes.

  1. Deployed the OOTB Data Prepper to an EC2. Updated the pipeline configuration to use a S3 source and an OpenSearch sink. Set the visibility timeout value to 30s. Sent 60 log files (80MB each) within a minute to the S3 source bucket. Confirmed the Data Prepper logs on the EC2 had the visibility timeout errors.
  2. Deployed the updated DataPrepper to the same EC2. Used the same test configuration as those in step 1. Confirmed the Data Prepper logs didn't have the visibility timeout errors.

@dlvenable dlvenable merged commit 910533a into opensearch-project:main Aug 23, 2024
47 of 48 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 23, 2024
Fix visibility timeout errors (#4812)

Signed-off-by: Daniel Li <lhouqua@amazon.com>
(cherry picked from commit 910533a)
dlvenable pushed a commit that referenced this pull request Aug 23, 2024
Fix visibility timeout errors (#4812)

Signed-off-by: Daniel Li <lhouqua@amazon.com>
(cherry picked from commit 910533a)

Co-authored-by: Daniel Li <68623003+danhli@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants