New OpenSearch API source implementation #4603
base: main
List<Map<String, Object>> jsonListData = new ArrayList<>();

String requestBody = new String(httpData.toInputStream().readAllBytes(), StandardCharsets.UTF_8);
List<String> jsonLines = Arrays.asList(requestBody.split(REGEX));
It may be better to create the Pattern in the constructor. Then you can call splitPattern.split(requestBody). This should avoid compilation each time.
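A minimal sketch of what this suggestion could look like. The class name BulkLineSplitter and the line-break regex are assumptions for illustration only, since the actual REGEX constant is not shown in the diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a class from the pull request.
final class BulkLineSplitter {
    // Assumption: the real REGEX value is not visible in the diff; a line break is used here.
    private static final String LINE_BREAK_REGEX = "\\r?\\n";

    // Compiled once per instance instead of on every String.split(regex) call.
    private final Pattern splitPattern;

    BulkLineSplitter() {
        this.splitPattern = Pattern.compile(LINE_BREAK_REGEX);
    }

    List<String> split(final String requestBody) {
        // Pattern.split drops trailing empty strings, like String.split does.
        return Arrays.asList(splitPattern.split(requestBody));
    }
}
```

The pattern could also be held in a static final field if it never depends on configuration.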
Thanks @dlvenable . I am going to address this.
violationRules {
    rule { //in addition to core projects rule
        limit {
            minimum = 0.90
This is a new project. Can we aim for 100% coverage?
Thanks @dlvenable . I am going to address this.
    Thread.currentThread().interrupt();
    throw new RuntimeException(ex);
}
LOG.info("Started OpenSearch API source on port " + sourceConfig.getPort() + "...");
- LOG.info("Started OpenSearch API source on port " + sourceConfig.getPort() + "...");
+ LOG.info("Started OpenSearch API source on port {}.", sourceConfig.getPort());
Let's avoid these ellipses at the end. They are unclear and make it seem that more is coming.
Also, please use SLF4J interpolation over string concatenation.
Thanks @dlvenable . I am going to address this.
public class OpenSearchAPISourceConfig extends BaseHttpServerConfig {

    static final String DEFAULT_ENDPOINT_URI = "/opensearch";
I think the default should just be / since this matches existing OpenSearch domains.
Thanks @dlvenable . I am going to address this.
import java.util.Arrays;
import java.util.ArrayList;

/*
Please run a code formatter over your changes. There is a lot of whitespace that is off.
Thanks @dlvenable . I am going to address this.
if (buffer == null) {
    throw new IllegalStateException("Buffer provided is null");
}
if (server == null) {
There is a lot here that is similar to the http source. Can we extend your work from #4570 to have more of this shared in common?
Thanks @dlvenable . I am going to address this.
@Post("/_bulk")
public HttpResponse doPostBulk(final ServiceRequestContext serviceRequestContext, final AggregatedHttpRequest aggregatedHttpRequest,
        @Param("pipeline") Optional<String> pipeline, @Param("routing") Optional<String> routing) throws Exception {
Passing Optional in as a parameter is an anti-pattern. You don't actually know if it is null or not, so you have two checks you need to make: pipeline != null && pipeline.isPresent(). Just take in a String and expect it to possibly be null.
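To illustrate the point, here is a small sketch contrasting the two signatures. The class and method names are hypothetical and do not come from the pull request; only the parameter name pipeline mirrors the diff.

```java
import java.util.Optional;

// Hypothetical example class, not part of the pull request.
final class BulkParamExample {
    // Optional as a parameter: a caller can still pass null,
    // so a defensive implementation needs two checks.
    static String describeWithOptional(final Optional<String> pipeline) {
        if (pipeline != null && pipeline.isPresent()) {
            return "pipeline=" + pipeline.get();
        }
        return "no pipeline";
    }

    // Plain, possibly-null String: a single null check is enough.
    static String describeWithNullable(final String pipeline) {
        if (pipeline != null) {
            return "pipeline=" + pipeline;
        }
        return "no pipeline";
    }
}
```

Both methods return the same result for equivalent inputs; the second simply has one fewer state to reason about.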
Thanks @dlvenable . I am going to address this.
@Post("/{index}/_bulk")
public HttpResponse doPostBulkIndex(final ServiceRequestContext serviceRequestContext, final AggregatedHttpRequest aggregatedHttpRequest, @Param("index") Optional<String> index,
        @Param("pipeline") Optional<String> pipeline, @Param("routing") Optional<String> routing) throws Exception {
    requestsReceivedCounter.increment();
Let's consolidate all of this logic in processBulkRequest. The only thing that each of these methods should do is create the BulkAPIRequestParams and then call processBulkRequest.
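One possible shape for that consolidation is sketched below in plain Java, without the Armeria annotations or types. The BulkParams class is a hypothetical stand-in for BulkAPIRequestParams (whose fields are not shown in this conversation), and the String return value and counter are simplifications for illustration.

```java
// Hypothetical sketch, not the code from the pull request.
final class BulkEndpointsSketch {
    // Stand-in for BulkAPIRequestParams; the fields are assumptions.
    static final class BulkParams {
        final String index;     // null for the plain /_bulk endpoint
        final String pipeline;  // may be null
        final String routing;   // may be null

        BulkParams(final String index, final String pipeline, final String routing) {
            this.index = index;
            this.pipeline = pipeline;
            this.routing = routing;
        }
    }

    private int requestsReceived = 0;

    // Each endpoint method only builds the params object and delegates.
    String doPostBulk(final String body, final String pipeline, final String routing) {
        return processBulkRequest(body, new BulkParams(null, pipeline, routing));
    }

    String doPostBulkIndex(final String body, final String index, final String pipeline, final String routing) {
        return processBulkRequest(body, new BulkParams(index, pipeline, routing));
    }

    // All shared logic (metrics, parsing, buffering) lives in one place.
    private String processBulkRequest(final String body, final BulkParams params) {
        requestsReceived++;
        final String target = params.index != null ? params.index : "(no default index)";
        return "index=" + target + ", bytes=" + body.length();
    }

    int getRequestsReceived() {
        return requestsReceived;
    }
}
```

With this layout the request counter is incremented in exactly one place, so the two endpoints cannot drift apart.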
Thanks @dlvenable . I am going to address this.
f052e71 to 1ad4250
Thanks for the refactoring work!
private static final Logger LOG = LoggerFactory.getLogger(OpenSearchAPIService.class);

// TODO: support other data-types as request body, e.g. json_lines, msgpack
Is this TODO really true? I think this is the only codec we need for _bulk.
fe8659f to 77930fc
Signed-off-by: Souvik Bose <souvik04in@gmail.com>
918dfd9 to 0e7bf17
This reverts commit c52f584.
Signed-off-by: Souvik Bose <souvbose@amazon.com>
Description
In order for DataPrepper to support all OpenSearch Document API(s), we need to build a new source similar to the existing http source. This pull request is intended to implement a new OpenSearch API source, opensearch_api, similar to the http source. This source should support the Document API Bulk.
This pull request includes the following:
_bulk
<index>/_bulk
pipeline and routing. (TODO: pipeline parameter handling on Sink side)
Example pipeline configuration with opensearch_api source looks like:
Issues Resolved
Contributes to #248
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.