Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add handle failed events option to parse json processors #4844

Merged
merged 1 commit into from
Aug 19, 2024

Conversation

graytaylor0
Copy link
Member

@graytaylor0 graytaylor0 commented Aug 16, 2024

Description

The parse processors (json, ion, xml) now have support for a

handle_failed_events configuration that can either be set to skip or skip_silently. In both cases, the Event is sent downstream, but for skip_silently, the processing error will not be logged.

parse_json:
  source: "message"
  handle_failed_events: "skip_silently"

Also adds two new metrics

  • "parseErrors"
  • "processingFailures"

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

import java.util.Map;
import java.util.stream.Collectors;

public enum HandleFailedEventsOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace the existing option in drop processor with this? The drop and drop_silently options from that processor seems a decent option to provide elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally wanted to do this, but if we don't support those two options in the parse processors, then it's a bit weird to have there isn't it? Although I could just add a validation that does not allow drop or drop_silently for these processors

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think a validation would be appropriate.

@@ -29,6 +31,7 @@

public abstract class AbstractParseProcessor extends AbstractProcessor<Record<Event>, Record<Event>> {
private static final Logger LOG = LoggerFactory.getLogger(AbstractParseProcessor.class);
private static final String PROCESSING_ERRORS = "processingErrors";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should differentiate our metrics between two situations. The first are expected errors. These are bad input that we know will happen. The second are unexpected errors. These are probably bugs.

I think we could have a metric for the first - parseErrors. If we want a metric for the second, this could be named something like processingFailures. Maybe we'd use this in other processors as well as time goes on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we can do that

@@ -100,7 +110,10 @@ public Collection<Record<Event>> doExecute(final Collection<Record<Event>> recor
event.delete(this.source);
}
} catch (Exception e) {
LOG.error(EVENT, "An exception occurred while using the {} processor on Event [{}]", getProcessorName(), record.getData(), e);
processingErrorsCounter.increment();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit broad. This metric could occur if we have a problem that is not necessarily a parsing error (e.g. event.delete().

Related to my comment above, this might work better if we split the exceptions into two kinds.

@graytaylor0 graytaylor0 force-pushed the JsonHandleFailures branch 3 times, most recently from 130ae42 to a7a4db8 Compare August 19, 2024 18:33
Signed-off-by: Taylor Gray <tylgry@amazon.com>
"which will send the Event downstream to the next processor without logging the error. " +
"Default is 'skip'.")
@NotNull
private HandleFailedEventsOption handleFailedEventsOption = HandleFailedEventsOption.SKIP;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means, default behavior is not changed, right? ie, the failures are logged and skipped. This also, means customers will see this issue, unless they explicitly change this. Is this what is agreed upon? Should we set it to SKIP_SILENTLY in OSI?

Copy link
Member Author

@graytaylor0 graytaylor0 Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right default is to log and skip still. OSI default decision will be made separately

@graytaylor0 graytaylor0 merged commit ff2de26 into opensearch-project:main Aug 19, 2024
45 of 47 checks passed
kkondaka pushed a commit to kkondaka/kk-data-prepper-f2 that referenced this pull request Aug 20, 2024
…project#4844)

Signed-off-by: Taylor Gray <tylgry@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@dev-dsk-krishkdk-2c-bd29c437.us-west-2.amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants