Set pipeline for bulk request in OpenSearch sink #4965
+9 −3
We want to use the same index and the same data source to compute multiple text embedding fields, each one using a different model. A document in the index would look like this:
To do this, we would need multiple pipelines that share the same data source and the same target index.
The index would be created without a default ingest pipeline; instead, we would create one ingest pipeline per model, each targeting a different field:
but the OpenSearch sink for each data-prepper pipeline should:
The bulk endpoint has a `pipeline` parameter, which I think can be used for this, but I don't think the OpenSearch sink receives a `pipeline` parameter. This PR uses the `pipeline` value when doing that request, but I'm not sure what other changes would be required to support this.
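To illustrate the idea outside of Data Prepper, here is a minimal Python sketch of how a `pipeline` query parameter can be attached to a `_bulk` request. It only builds the request path and the NDJSON body and sends nothing over the network. The helper `build_bulk_request`, the index name `my-index`, and the pipeline names `embed-model-a` and `embed-model-b` are hypothetical, chosen for the example; this is not the sink's actual code.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(index, docs, pipeline=None):
    """Return (path, body) for an OpenSearch _bulk call.

    Hypothetical helper: when `pipeline` is given, it is added as a query
    parameter so that the named ingest pipeline applies to the request.
    """
    path = f"/{index}/_bulk"
    if pipeline:
        path += "?" + urlencode({"pipeline": pipeline})

    # The bulk body is NDJSON: one action line, then one source line per doc.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"  # the bulk body must end with a newline
    return path, body


# One request per model-specific ingest pipeline (names are assumptions).
docs = [{"text": "hello world"}]
for name in ("embed-model-a", "embed-model-b"):
    path, body = build_bulk_request("my-index", docs, pipeline=name)
    print(path)
```

Each Data Prepper pipeline would then pass its own ingest pipeline name, so the same source data is processed once per model. How documents from the separate requests end up merged into a single document (for example, shared document IDs) is outside this sketch.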