Add convenience API key param to remote reindex #135949
Changes from all commits
5af6d91
2b67821
9fcada6
cca8a15
0c7fe47
b89508c
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 135949 | ||
summary: Add convenience API key param to remote reindex | ||
area: Indices APIs | ||
type: enhancement | ||
issues: [] |
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -597,7 +597,7 @@ POST _reindex | |||||||||
{ | ||||||||||
"source": { | ||||||||||
"remote": { | ||||||||||
"host": "<OTHER_HOST_URL>:9200", | ||||||||||
"host": "<OTHER_HOST_URL>", | ||||||||||
"username": "user", | ||||||||||
"password": "pass" | ||||||||||
}, | ||||||||||
|
@@ -619,20 +619,55 @@ POST _reindex | |||||||||
% TEST[s/"username": "user",/"username": "test_admin",/] | ||||||||||
% TEST[s/"password": "pass"/"password": "x-pack-test-password"/] | ||||||||||
|
||||||||||
The `host` parameter must contain a scheme, host, port (for example, `https://otherhost:9200`), and optional path (for example, `https://otherhost:9200/proxy`). | ||||||||||
The `username` and `password` parameters are optional, and when they are present the reindex API will connect to the remote {{es}} node using basic auth. | ||||||||||
Be sure to use `https` when using basic auth or the password will be sent in plain text. There are a range of settings available to configure the behaviour of the `https` connection. | ||||||||||
The `host` parameter must contain a scheme, host, port (for example, `https://<OTHER_HOST_URL>:9200`), and optional path (for example, `https://<OTHER_HOST_URL>:9200/proxy`). | ||||||||||
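For readers of this diff, the requirement above can be illustrated with a minimal sketch. It is not the validation that {{es}} performs; the class and method names are hypothetical, and it only assumes the standard `java.net.URI` parser to check that a scheme, host, and port are all present.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of the Elasticsearch code base.
final class RemoteHostCheckSketch {

    // Returns true when the value parses as a URI with a scheme, a host, and an explicit port.
    // An optional path (for example "/proxy") is accepted but not required.
    static boolean hasSchemeHostAndPort(String host) {
        try {
            URI uri = new URI(host);
            return uri.getScheme() != null && uri.getHost() != null && uri.getPort() != -1;
        } catch (URISyntaxException e) {
            // Values such as an unreplaced placeholder are not valid URIs.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSchemeHostAndPort("https://otherhost:9200"));
        System.out.println(hasSchemeHostAndPort("https://otherhost:9200/proxy"));
        System.out.println(hasSchemeHostAndPort("https://otherhost"));
    }
}
```

Under these assumptions, the first two values pass the check and the third fails because it has no explicit port.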
|
||||||||||
When using {{ecloud}}, it is also possible to authenticate against the remote cluster through the use of a valid API key: | ||||||||||
### Using basic auth [reindex-basic-auth] | ||||||||||
|
||||||||||
To authenticate with the remote cluster using basic auth, set the `username` and `password` parameters, as in the example above. | ||||||||||
Be sure to use `https` when using basic auth, or the password will be sent in plain text. There are a [range of settings](#reindex-ssl) available to configure the behaviour of the `https` connection. | ||||||||||
|
||||||||||
### Using an API key [reindex-api-key] | ||||||||||
|
||||||||||
It is also possible (and encouraged) to authenticate with the remote cluster through the use of a valid API key: | ||||||||||
|
||||||||||
::::{applies-switch} | ||||||||||
|
||||||||||
:::{applies-item} { "stack": "ga 9.3", "serverless": } | ||||||||||
```console | ||||||||||
POST _reindex | ||||||||||
{ | ||||||||||
"source": { | ||||||||||
"remote": { | ||||||||||
"host": "<OTHER_HOST_URL>:9200", | ||||||||||
"host": "<OTHER_HOST_URL>", | ||||||||||
"api_key": "<API_KEY_VALUE>" | ||||||||||
Comment on lines +641 to +642

The HTML entity encoding

Not 100% sure what's right here. I can foresee the docs getting hung up thinking this is an HTML tag if using an actual I'm guessing you have tested it out / copied previous examples?

Yeah, I tested this locally. I recommend doing that for docs changes where you're not completely confident, by the way. Running the v3 tool is fairly easy once you've got it installed, and it does some kind of hot reloading so the feedback loop is very fast if you want to experiment. You can also see the preview here: https://docs-v3-preview.elastic.dev/elastic/elasticsearch/pull/135949/reference/elasticsearch/rest-apis/reindex-indices#reindex-from-remote . Interestingly, it doesn't like it if you replace the I happened to be discussing this in a thread with the docs team (because I needed advice on the version thing) so I asked there whether there's some other option that's preferred. I'll give them a little while to respond, and if they don't then I'll merge it like this, since it does work.
||||||||||
}, | ||||||||||
"index": "my-index-000001", | ||||||||||
"query": { | ||||||||||
"match": { | ||||||||||
"test": "data" | ||||||||||
} | ||||||||||
} | ||||||||||
}, | ||||||||||
"dest": { | ||||||||||
"index": "my-new-index-000001" | ||||||||||
} | ||||||||||
} | ||||||||||
``` | ||||||||||
% TEST[setup:host] | ||||||||||
% TEST[s/^/PUT my-index-000001\n/] | ||||||||||
% TEST[s/otherhost:9200",/\${host}",/] | ||||||||||
% TEST[s/"headers": \{[^}]*\}/"username": "test_admin", "password": "x-pack-test-password"/] | ||||||||||
::: | ||||||||||
|
||||||||||
:::{applies-item} { "stack": "ga 9.0" } | ||||||||||
```console | ||||||||||
POST _reindex | ||||||||||
{ | ||||||||||
"source": { | ||||||||||
"remote": { | ||||||||||
"host": "<OTHER_HOST_URL>", | ||||||||||
"headers": { | ||||||||||
"Authorization": "ApiKey API_KEY_VALUE" | ||||||||||
"Authorization": "ApiKey <API_KEY_VALUE>" | ||||||||||
} | ||||||||||
}, | ||||||||||
"index": "my-index-000001", | ||||||||||
|
@@ -651,15 +686,26 @@ POST _reindex | |||||||||
% TEST[s/^/PUT my-index-000001\n/] | ||||||||||
% TEST[s/otherhost:9200",/\${host}",/] | ||||||||||
% TEST[s/"headers": \{[^}]*\}/"username": "test_admin", "password": "x-pack-test-password"/] | ||||||||||
::: | ||||||||||
|
||||||||||
:::: | ||||||||||
|
||||||||||
|
||||||||||
Be sure to use `https` when using an API key, or it will be sent in plain text. There are a [range of settings](#reindex-ssl) available to configure the behaviour of the `https` connection. | ||||||||||
👍
||||||||||
|
||||||||||
### Whitelisting remote hosts [reindex-remote-whitelist] | ||||||||||
|
||||||||||
Remote hosts have to be explicitly allowed in `elasticsearch.yml` using the `reindex.remote.whitelist` property. | ||||||||||
It can be set to a comma delimited list of allowed remote `host` and `port` combinations. | ||||||||||
It can be set to a comma-delimited list of allowed remote `host` and `port` combinations. | ||||||||||
Scheme is ignored, only the host and port are used. For example: | ||||||||||
|
||||||||||
```yaml | ||||||||||
reindex.remote.whitelist: [otherhost:9200, another:9200, 127.0.10.*:9200, localhost:*] | ||||||||||
``` | ||||||||||
The list of allowed hosts must be configured on any nodes that will coordinate the reindex. | ||||||||||
The list of allowed hosts must be configured on any node that will coordinate the reindex. | ||||||||||
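The matching rules described above (scheme ignored, host and port compared, `*` as a wildcard) can be sketched as follows. This is an illustration under stated assumptions, not the matcher {{es}} uses: the class and method names are hypothetical, and each entry is simply translated into a regular expression in which `*` matches any run of characters.

```java
import java.net.URI;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of whitelist matching; not the Elasticsearch implementation.
final class RemoteWhitelistSketch {

    // Checks the host:port of the remote URL against each entry; the scheme and path are ignored.
    static boolean isAllowed(List<String> whitelist, String remoteUrl) {
        URI uri = URI.create(remoteUrl);
        String hostAndPort = uri.getHost() + ":" + uri.getPort();
        for (String entry : whitelist) {
            if (toPattern(entry).matcher(hostAndPort).matches()) {
                return true;
            }
        }
        return false;
    }

    // Quotes the literal parts of an entry and turns each '*' into ".*".
    static Pattern toPattern(String entry) {
        String[] literals = entry.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        List<String> whitelist = List.of("otherhost:9200", "another:9200", "127.0.10.*:9200", "localhost:*");
        System.out.println(isAllowed(whitelist, "https://otherhost:9200/proxy"));
        System.out.println(isAllowed(whitelist, "http://127.0.10.7:9200"));
        System.out.println(isAllowed(whitelist, "https://otherhost:9300"));
    }
}
```

With the example list, the first two URLs are allowed and the third is rejected because port 9300 does not match any entry.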
|
||||||||||
### Compatibility [reindex-remote-compatibility] | ||||||||||
|
||||||||||
This feature should work with remote clusters of any version of {{es}} you are likely to find. This should allow you to upgrade from any version of {{es}} to the current version by reindexing from a cluster of the old version. | ||||||||||
::::{warning} | ||||||||||
{{es}} does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster. | ||||||||||
|
@@ -670,16 +716,18 @@ To enable queries sent to older versions of {{es}} the `query` parameter is sent | |||||||||
Reindexing from remote clusters does not support manual or automatic slicing. | ||||||||||
:::: | ||||||||||
|
||||||||||
### Tuning parameters [reindex-remote-tuning] | ||||||||||
|
||||||||||
Reindexing from a remote server uses an on-heap buffer that defaults to a maximum size of 100mb. | ||||||||||
If the remote index includes very large documents you'll need to use a smaller batch size. | ||||||||||
If the remote index includes very large documents you'll need to use a smaller batch size. | ||||||||||
The example below sets the batch size to `10` which is very, very small. | ||||||||||
|
||||||||||
```console | ||||||||||
POST _reindex | ||||||||||
{ | ||||||||||
"source": { | ||||||||||
"remote": { | ||||||||||
"host": "<OTHER_HOST_URL>:9200", | ||||||||||
"host": "<OTHER_HOST_URL>", | ||||||||||
... | ||||||||||
}, | ||||||||||
"index": "source", | ||||||||||
|
@@ -709,7 +757,7 @@ POST _reindex | |||||||||
{ | ||||||||||
"source": { | ||||||||||
"remote": { | ||||||||||
"host": "<OTHER_HOST_URL>:9200", | ||||||||||
"host": "<OTHER_HOST_URL>", | ||||||||||
..., | ||||||||||
"socket_timeout": "1m", | ||||||||||
"connect_timeout": "10s" | ||||||||||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -38,6 +38,7 @@ | |||||
import java.io.IOException; | ||||||
import java.net.URI; | ||||||
import java.net.URISyntaxException; | ||||||
import java.util.HashMap; | ||||||
import java.util.List; | ||||||
import java.util.Map; | ||||||
import java.util.function.Predicate; | ||||||
|
@@ -417,6 +418,11 @@ static RemoteInfo buildRemoteInfo(Map<String, Object> source) throws IOException | |||||
} | ||||||
|
||||||
Map<String, String> headers = extractStringStringMap(remote, "headers"); | ||||||
String apiKey = extractString(remote, "api_key"); | ||||||
if (apiKey != null) { | ||||||
headers = headersWithApiKey(headers, apiKey); | ||||||
} | ||||||
|
||||||
TimeValue socketTimeout = extractTimeValue(remote, "socket_timeout", RemoteInfo.DEFAULT_SOCKET_TIMEOUT); | ||||||
TimeValue connectTimeout = extractTimeValue(remote, "connect_timeout", RemoteInfo.DEFAULT_CONNECT_TIMEOUT); | ||||||
if (false == remote.isEmpty()) { | ||||||
|
@@ -493,4 +499,18 @@ static void setMaxDocsValidateIdentical(AbstractBulkByScrollRequest<?> request, | |||||
request.setMaxDocs(maxDocs); | ||||||
} | ||||||
} | ||||||
|
||||||
/** | ||||||
* Returns a headers map with the {@code Authorization} key set to the value {@code "ApiKey <apiKey>"}. If the original map is a | ||||||
* {@link HashMap}, it is mutated; if not (e.g. it is {@link java.util.Collections#EMPTY_MAP}), it is copied. If the headers already | ||||||
* include an {@code Authorization} key, an {@link IllegalArgumentException} is thrown. | ||||||
*/ | ||||||
private static Map<String, String> headersWithApiKey(Map<String, String> original, String apiKey) { | ||||||
if (original.keySet().stream().anyMatch(key -> key.equalsIgnoreCase("Authorization"))) { | ||||||
throw new IllegalArgumentException("Cannot specify both [api_key] and [headers] including [Authorization] key"); | ||||||
} | ||||||
Map<String, String> updated = (original instanceof HashMap) ? original : new HashMap<>(original); | ||||||
The logic mutates the original HashMap which could lead to unexpected side effects. Consider always creating a new HashMap to maintain immutability and avoid modifying the caller's data structure.

I can see why you have done this here to conserve memory when the update isn't required. Apparently CoPilot didn't pick up on that nuance...
||||||
updated.put("Authorization", "ApiKey " + apiKey); | ||||||
return updated; | ||||||
} | ||||||
} |
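To make the behaviour of `headersWithApiKey` easier to follow outside the diff, here is a minimal standalone sketch. The helper body mirrors the lines shown above; the wrapper class name and the `main` method are hypothetical and exist only so the snippet compiles on its own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Standalone stand-in for the private helper in the diff; the class name is hypothetical.
final class ApiKeyHeadersSketch {

    static Map<String, String> headersWithApiKey(Map<String, String> original, String apiKey) {
        // Header names are compared case-insensitively, so any casing of "Authorization" is rejected.
        if (original.keySet().stream().anyMatch(key -> key.equalsIgnoreCase("Authorization"))) {
            throw new IllegalArgumentException("Cannot specify both [api_key] and [headers] including [Authorization] key");
        }
        // Reuse a HashMap as-is; copy anything else (for example an immutable empty map).
        Map<String, String> updated = (original instanceof HashMap) ? original : new HashMap<>(original);
        updated.put("Authorization", "ApiKey " + apiKey);
        return updated;
    }

    public static void main(String[] args) {
        // An immutable input is copied rather than mutated.
        System.out.println(headersWithApiKey(Collections.emptyMap(), "my-key"));

        // A HashMap input is mutated in place and the same instance is returned.
        Map<String, String> mutable = new HashMap<>();
        System.out.println(headersWithApiKey(mutable, "my-key") == mutable);

        // A conflicting header, in any casing, is rejected.
        try {
            headersWithApiKey(Map.of("authorization", "Basic abc"), "my-key");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The in-place mutation is the trade-off discussed in the review thread: it avoids a copy when the parsed headers are already a `HashMap`, at the cost of modifying the map that was passed in.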
I didn't know about this stack version annotation, I guess it runs the below command against the specified stack and version?

Since we don't have version-specific docs as of 9.x, this makes the page show a selector where the reader can pick the appropriate tab and it shows the relevant snippet. You can see that in the preview.