From e1eb493bb8fec20b1000e20f816970433f6ea994 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Tue, 7 Nov 2023 10:24:25 +0400 Subject: [PATCH 1/8] Add a guide for manually interrupting indexing --- .../ingestion_server/guides/index.md | 1 + .../ingestion_server/guides/troubleshoot.md | 59 +++++++++++++++++++ 2 files changed, 60 insertions(+) create mode 100644 documentation/ingestion_server/guides/troubleshoot.md diff --git a/documentation/ingestion_server/guides/index.md b/documentation/ingestion_server/guides/index.md index 43e211488e3..c497ece023e 100644 --- a/documentation/ingestion_server/guides/index.md +++ b/documentation/ingestion_server/guides/index.md @@ -8,4 +8,5 @@ config mapping test deploy +troubleshoot ``` diff --git a/documentation/ingestion_server/guides/troubleshoot.md b/documentation/ingestion_server/guides/troubleshoot.md new file mode 100644 index 00000000000..d38837b2e20 --- /dev/null +++ b/documentation/ingestion_server/guides/troubleshoot.md @@ -0,0 +1,59 @@ +# Troubleshooting + +This guide describes various manual steps to troubleshoot issues with the +ingestion server's processes like database transfer, ES indexing. + +## Interrupt indexing + +The ingestion server performs indexing using indexer workers, whose primary +purpose it is to create documents from the API database and index them in +Elasticsearch. + +They are EC2 instances that are stopped by default when indexing is not taking +place. The indexer server raises them up, provides them with the necessary +information to perform the indexing and once they report back to the ingestion +server with a completion message, they are shut down again. + +Sometimes it is necessary to manually interrupt indexing, for example to limit +the size of a test/staging index. To do so, follow these steps. + +1. Determine the active ingestion worker machines from the AWS EC2 dashboard. + They will be named `indexer-worker-(dev|prod)` and will be in the "running" + state. + +2. 
SSH into the machine using its public IP. + + ```console + $ ssh ec2-user@ + ``` + +3. Determine the name of the active `indexer_worker` container and pause it. + + ```console + $ docker ps + $ docker pause + ``` + +4. Repeat steps 2 and 3 for each active ingestion worker machine. Leave the SSH + sessions open. + +5. Wait for a few minutes and keep an eye on the document count in the + Elasticsearch index that is currently being created. It may increase a + little because of timing effects but should stop after a few minutes. + +6. From each of the open SSH sessions, send a completion notification to the + ingestion server's internal IP address. + + ```console + $ curl \ + -X POST \ + -H "Content-Type: application/json" \ + -d '{"error":false}' \ + http://:8001/worker_finished + ``` + +7. Terminate the SSH sessions and stop the indexer worker EC2 machines from the + AWS EC2 dashboard. + +8. The ingestion server will then instruct ES to start the next step of indexing, + i.e. replication. 
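The wait in step 5 can be sketched as a small polling loop. This is only an illustration, not part of the ingestion server: `extract_count` and `watch_count` are hypothetical helpers, `ES_URL` and `INDEX` are placeholders you must supply, and the sketch assumes direct access to the Elasticsearch `_count` API. Treating two equal readings one minute apart as "stopped" is a heuristic, not a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric "count" field from the JSON body
# returned by Elasticsearch's `GET /<index>/_count`, read from stdin.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: poll the count once a minute and stop once two
# consecutive non-empty readings are equal. ES_URL and INDEX are placeholders
# set by the caller; neither is defined anywhere in this guide.
watch_count() {
  prev=""
  while :; do
    cur="$(curl -s "$ES_URL/$INDEX/_count" | extract_count)"
    echo "$(date +%T) $INDEX: ${cur:-unknown} documents"
    if [ -n "$cur" ] && [ "$cur" = "$prev" ]; then
      break
    fi
    prev="$cur"
    sleep 60
  done
}

# Example (not run here):
#   ES_URL=<elasticsearch-url> INDEX=<index-name> watch_count
```

Once the loop exits, continue with step 6 from each open SSH session.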
From ce555ffbb8a2cf45417716a50f770c0c15ff0c7f Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Tue, 7 Nov 2023 10:24:59 +0400 Subject: [PATCH 2/8] Partly document the ingestion server task API --- .../ingestion_server/reference/index.md | 1 + .../ingestion_server/reference/task_api.md | 95 +++++++++++++++++++ 2 files changed, 96 insertions(+) create mode 100644 documentation/ingestion_server/reference/task_api.md diff --git a/documentation/ingestion_server/reference/index.md b/documentation/ingestion_server/reference/index.md index 8b5749210fe..d4d53c4b025 100644 --- a/documentation/ingestion_server/reference/index.md +++ b/documentation/ingestion_server/reference/index.md @@ -7,4 +7,5 @@ elasticsearch safety notifications data_refresh +task_api ``` diff --git a/documentation/ingestion_server/reference/task_api.md b/documentation/ingestion_server/reference/task_api.md new file mode 100644 index 00000000000..f69d53bdab4 --- /dev/null +++ b/documentation/ingestion_server/reference/task_api.md @@ -0,0 +1,95 @@ +# Ingestion server API + +The ingestion server exposes an API at the `/task` endpoint to schedule various +tasks and get updates about their status and progress. + +New tasks can be created using the `POST` method and a payload as described +below. The response for this request provides an endpoint (containing a task's +unique ID) that can be used to retrieve the task information using the `GET` +method. + +## REINDEX + +If a complete data-refresh is not required, a new index can be created using the +`REINDEX` action. This action will create a new index for the given media type +using the data from the API database. A suffix can be provided for the index; +otherwise, a random UUID will be used. 
+ +### Body + +```typescript +{ + model: "image" | "audio" + action: "REINDEX" + index_suffix: string +} +``` + +### Example + +```console +$ curl \ + -X POST \ + -H 'Content-Type: application/json' \ + -d '{"model": "image", "action": "REINDEX", "index_suffix": "20231106"}' \ + http://localhost:8001/task +``` + +## CREATE_AND_POPULATE_FILTERED_INDEX + +This endpoint creates a filtered index for a media type out of an existing +index. A `REINDEX` job must be followed by this job to ensure that the new index +has an associated filtered index as well before we promote it. + +### Body + +```typescript +{ + model: "image" | "audio" + action: "CREATE_AND_POPULATE_FILTERED_INDEX" + destination_index_suffix: string +} +``` + +```{caution} +Destination suffix here implies the suffix of the existing unfiltered index. +The filtered index will be created with "-filtered" appended to the destination +suffix. +``` + +### Example + +```console +$ curl \ + -X POST \ + -H 'Content-Type: application/json' \ + -d '{"model": "image", "action": "CREATE_AND_POPULATE_FILTERED_INDEX", "destination_index_suffix": "kw"}' \ + http://localhost:8001/task +``` + +## POINT_ALIAS + +This endpoint maps the index to a given alias. When an index is aliased to the +name of the media type (`image` or `audio`) or name + "-filtered", it becomes +the default or filtered index for that media type respectively. 
+ +### Body + +```typescript +{ + model: "image" | "audio" + action: "POINT_ALIAS" + index_suffix: string + alias: string // should be model or model + "-filtered" +} +``` + +### Example + +```console +$ curl \ + -X POST \ + -H 'Content-Type: application/json' \ + -d '{"model": "image", "action": "POINT_ALIAS", "index_suffix": "20231106", "alias": "image-filtered"}' \ + http://localhost:8001/task +``` From d1c42aec9f73090751db57f6fb4fcc48a3a12a04 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Tue, 7 Nov 2023 15:22:27 +0400 Subject: [PATCH 3/8] Document the index upgrade and migration process --- .../ingestion_server/guides/index.md | 2 + .../ingestion_server/guides/migrate.md | 245 ++++++++++++++++++ .../ingestion_server/guides/upgrade.md | 114 ++++++++ 3 files changed, 361 insertions(+) create mode 100644 documentation/ingestion_server/guides/migrate.md create mode 100644 documentation/ingestion_server/guides/upgrade.md diff --git a/documentation/ingestion_server/guides/index.md b/documentation/ingestion_server/guides/index.md index c497ece023e..77646be9728 100644 --- a/documentation/ingestion_server/guides/index.md +++ b/documentation/ingestion_server/guides/index.md @@ -8,5 +8,7 @@ config mapping test deploy +migrate +upgrade troubleshoot ``` diff --git a/documentation/ingestion_server/guides/migrate.md b/documentation/ingestion_server/guides/migrate.md new file mode 100644 index 00000000000..760ae1056f3 --- /dev/null +++ b/documentation/ingestion_server/guides/migrate.md @@ -0,0 +1,245 @@ +# Index migration runbook + +From time to time, we will need to update our Elasticsearch indices. These +modifications can be classified into two broad-strokes categories, depending on +whether the changes affect the main consumer of the indices, the API. + +## Migration types + +### API-free + +These changes are safe modifications to the ES schema that do not affect the +API. As such they do not need any migration process. 
Examples: + +- addition of new fields or subfields +- removal of fields that are not referenced or used by the API +- changing the type to another compatible type (like `text` ↔ `keyword` ) + +For API-free changes, we deploy the ingestion server and let a data-refresh to +occur. The indexes will be updated to the new schema without manual intervention +and will be made available to the API. + +### API-involved + +These changes are modifications to fields that already are in use by the API and +involve code changes in both the ingestion server and the API. Examples: + +- removal of a field +- changing the type to an incompatible type +- renaming of a field + +Such kinds of changes need us to precisely deploy the API when the new index is +promoted because of these reasons: + +- If we deploy a little late, the old field the API wants will disappear. +- If we deploy a little early, the new field the API wants will not be present. + +This runbook documents guidelines and processes for API-involved migrations. + +Our goal is to break down an API-involved change into multiple small, atomic +changes with each step affecting at most one of the ingestion server or the API +and ensuring that the API and ES remain compatible throughout the process. + +## Pull request guidelines + +A change that involves modification to the ES index as well as its usage in the +API requires at least three steps, each associated with exactly one PR that +modifies exactly one of the ingestion server or the API to allow them to be +deployed independently. + +1. Change the ES index mapping in the ingestion server. Ensure that the change + is purely additive, keeping the old fields unchanged and creating new fields + that contain the data the API will need. + + This PR should only make changes within the `ingestion_server/` directory. + +2. Change the ES fields referenced by the API to use the new fields added in the + previous step. Ensure that the old fields become unreferenced. 
+ + The PR should only make changes within the `api/` directory. + +3. Change the ES index mapping in the ingestion server to remove the old, + now-unreferenced fields. + + This PR should only make changes within the `ingestion_server/` directory. + +```{tip} +Get the PRs reviewed in advance so that the entire process has been vetted by +the team and there are no surprises or delays when the plans have been set into +motion. +``` + +```{caution} +Each PR in the chain should branch from, and point to, its predecessor in the +chain so that CI continues to pass for each PR. +``` + +### Example + +Assume we have a field `foo` with type `text` in the index. It has a subfield +`keyword` with type `keyword`. The API uses `foo.keyword` for all purposes. + +One PR that does these 3 things would be an API-involved change. So we split +them into 3 PRs. + +1. Changing `foo` to type `keyword` would be an API-free change because it is a + type change between two compatible types and does not affect the nested field + `foo.keyword` that is in use by the API. Technically the outer field can be + assumed to be "new" because it was not being used at all. + +2. Then we make an API change to use `foo` directly instead of `foo.keyword`. + Any other accommodations to make use of `foo` can be made in this step. In + this case `foo` will be the same as `foo.keyword` so no other changes will be + needed. + +3. Removal of the `foo.keyword` field would now also be an API-free change + because the field would no longer be in use. + +## Migration process + +The entire migration process can be classified into 3 phases. + +```{mermaid} +flowchart TD + subgraph api[API] + API + end + + subgraph elasticsearch[Elasticsearch] + image --> image-old + image-filtered --> image-old-filtered + audio --> audio-old + audio-filtered --> audio-old-filtered + end + + API --> image + API --> image-filtered + API --> audio + API --> audio-filtered +``` + +### Create the new fields + +1. 
Merge [PR number 1](#pull-request-guidelines). +2. Perform a [manual index upgrade](/ingestion_server/guides/upgrade.md). + +At the close of this phase we have all the new information for the API to use. + +```{mermaid} +flowchart TD + subgraph api[API] + API + end + + subgraph elasticsearch[Elasticsearch] + image -.-> image-old + image-filtered -.-> image-old-filtered + audio -.-> audio-old + audio-filtered -.-> audio-old-filtered + image --> image-mid + image-filtered --> image-mid-filtered + audio --> audio-mid + audio-filtered --> audio-mid-filtered + end + + API --> image + API --> image-filtered + API --> audio + API --> audio-filtered + + style image-old opacity:0.3 + style image-old-filtered opacity:0.3 + style audio-old opacity:0.3 + style audio-old-filtered opacity:0.3 +``` + +### Use the new fields instead of the old + +1. Merge [PR number 2](#pull-request-guidelines). This will automatically deploy + the API to staging. +2. Verify that the staging API continues to work. +3. [Deploy the API](/api/guides/deploy.md) to production. +4. Verify that the production API continues to work. + +At the close of this phase the API is exclusively using the new fields and the +old ones have become unreferenced. + +```{mermaid} +flowchart TD + subgraph api[API] + old[API] + new[New API] + end + + subgraph elasticsearch[Elasticsearch] + image --> image-mid + image-filtered --> image-mid-filtered + audio --> audio-mid + audio-filtered --> audio-mid-filtered + end + + old -.-> image + old -.-> image-filtered + old -.-> audio + old -.-> audio-filtered + new --> image + new --> image-filtered + new --> audio + new --> audio-filtered + + style old opacity:0.3 +``` + +### Remove the old fields + +1. Merge [PR number 3](#pull-request-guidelines). +2. Perform a [manual index upgrade](/ingestion_server/guides/upgrade.md). 
+ +```{mermaid} +flowchart TD + subgraph api[API] + new[New API] + end + + subgraph elasticsearch[Elasticsearch] + image -.-> image-mid + image-filtered -.-> image-mid-filtered + audio -.-> audio-mid + audio-filtered -.-> audio-mid-filtered + image --> image-final + image-filtered --> image-final-filtered + audio --> audio-final + audio-filtered --> audio-final-filtered + end + + new --> image + new --> image-filtered + new --> audio + new --> audio-filtered + + style image-mid opacity:0.3 + style image-mid-filtered opacity:0.3 + style audio-mid opacity:0.3 + style audio-mid-filtered opacity:0.3 +``` + +You're done! + +```{mermaid} +flowchart TD + subgraph api[API] + new[New API] + end + + subgraph elasticsearch[Elasticsearch] + image --> image-final + image-filtered --> image-final-filtered + audio --> audio-final + audio-filtered --> audio-final-filtered + end + + new --> image + new --> image-filtered + new --> audio + new --> audio-filtered +``` diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md new file mode 100644 index 00000000000..e4fb3b5f925 --- /dev/null +++ b/documentation/ingestion_server/guides/upgrade.md @@ -0,0 +1,114 @@ +# Manual index upgrade runbook + +A manual index upgrade is similar to a data-refresh except for two key +differences. + +- Each step of the process, from index creation, filtered index creation and + then promotion is done manually by SSH-ing into the ingestion server and + indexer workers. +- It is faster than a complete data-refresh as there is no transfer of data from + the catalog database to the API database. + +## Steps + +### Staging deployment + +1. [Deploy the ingestion server](/ingestion_server/guides/deploy.md) to staging. + This step ensures that the latest schema will be used for the new indices. + +2. Determine the real names of the indexes behind the following aliases. 
+ + - `image` + - `image-filtered` + - `audio` + - `audio-filtered` + + This information is useful, to know what index to use if a rollback is needed + and to know what index to delete once the upgrade is complete. You can use + [Elasticvue](https://elasticvue.com) for this. + + ```{tip} + In staging, the filtered indices most likely also point to the default index. + ``` + +3. Perform [reindexing](/ingestion_server/reference/task_api.md#reindex) of all + media types. Let's say you use the suffix `abcd` for these indices. New + indices for each media type, like `image-abcd` and `audio-abcd`, will be + created. + + ```{caution} + Staging indices are supposed to be smaller and should not the full size of + the production dataset. You can + [interrupt the indexing process](/ingestion_server/guides/troubleshoot.md#interrupt-indexing) + once a satisfactory fraction (like ~50%) has been indexed. + ``` + + Wait for the indices to be replicated (and status green) before proceeding. + +4. [Point aliases](/ingestion_server/reference/task_api.md#point-alias) (both + default and filtered) for each media type to the new index. + + - `image` → `image-abcd` + - `image-filtered` → `image-abcd` + - `audio` → `audio-abcd` + - `audio-filtered` → `audio-abcd` + +5. Verify that the staging API continues to work. + + - If the staging API reports errors, immediately switch back the aliases to + the old indices. + - If the staging API works, delete the old indices to recover the free space. + +### Production deployment + +1. [Deploy the ingestion server](/ingestion_server/guides/deploy.md) to + production. This step ensures that the latest schema will be used for the new + indices. + +2. Determine the real names of the indexes behind the following aliases. + + - `image` + - `image-filtered` + - `audio` + - `audio-filtered` + + This information is useful, to know what index to use if a rollback is needed + and to know what index to delete once the upgrade is complete. 
You can use + [Elasticvue](https://elasticvue.com) for this. + +3. Perform [reindexing](/ingestion_server/reference/task_api.md#reindex) of both + media types. Let's say you use the suffix `abcd` for these indices. New + indices for each media type, like `image-abcd` and `audio-abcd`, will be + created. + + Wait for the indices to be replicated (and status green) before proceeding. + +4. Perform + [creation of filtered indices](/ingestion_server/reference/task_api.md#create-and-populate-filtered-index) + for all media types. New filtered indices for each media type, like + `image-abcd-filtered` and `audio-abcd-filtered`, will be created. + + Wait for the indices to be replicated (and status green) before proceeding. + +5. [Point aliases](/ingestion_server/reference/task_api.md#point-alias) for each + media type to the new default and filtered indices. + + - `image` → `image-abcd` + - `image-filtered` → `image-abcd-filtered` + - `audio` → `audio-abcd` + - `audio-filtered` → `audio-abcd-filtered` + +6. Verify that the production API continues to work. + + - If the production API reports errors, immediately switch back the aliases + to the old indices. + - If the production API works, delete the old indices to recover the free + space. + +## Rollback + +In this process, we are creating the new indices first, remapping the aliases, +and then removing the old indices. So if there is an issue with the new indices, +we can immediately switch back the aliases to the old ones and restore +functionality. Then we can investigate the new indices as they will still +be present, just unused. 
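The rollback described above amounts to re-pointing every alias at the index it referenced before the upgrade. The following is a minimal dry-run sketch that only prints the `POINT_ALIAS` request bodies. `rollback_payloads` is a hypothetical helper, and `old` stands in for whatever suffix you recorded in step 2. It also assumes the old filtered index is named `<model>-<suffix>-filtered`, so that its `index_suffix` is `<suffix>-filtered`. This guide does not confirm that naming, and in staging the filtered alias may point at the default index instead.

```shell
#!/bin/sh
# Hypothetical helper: print the two POINT_ALIAS request bodies that restore
# the default and filtered aliases of one media type.
#   $1 = model ("image" or "audio")
#   $2 = suffix of the old index, as recorded before the upgrade
rollback_payloads() {
  model="$1"
  old_suffix="$2"
  # Default alias, e.g. image -> image-<old_suffix>.
  printf '{"model": "%s", "action": "POINT_ALIAS", "index_suffix": "%s", "alias": "%s"}\n' \
    "$model" "$old_suffix" "$model"
  # Filtered alias. ASSUMPTION: the old filtered index is named
  # <model>-<old_suffix>-filtered; verify the real name before sending.
  printf '{"model": "%s", "action": "POINT_ALIAS", "index_suffix": "%s-filtered", "alias": "%s-filtered"}\n' \
    "$model" "$old_suffix" "$model"
}

# Dry run: print the bodies for both media types so they can be reviewed.
for model in image audio; do
  rollback_payloads "$model" old
done
```

After reviewing the output, each printed body can be sent to the `/task` endpoint with the same `curl -X POST` invocation shown in the task API reference.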
From 128da0b7adf1dfa68e59731ad1783b81f67bc2ee Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Tue, 7 Nov 2023 20:45:40 +0400 Subject: [PATCH 4/8] Implement suggestion from code-review Co-authored-by: Olga Bulat --- .../ingestion_server/guides/migrate.md | 24 ++++++++++++------- .../ingestion_server/guides/troubleshoot.md | 8 +++---- .../ingestion_server/guides/upgrade.md | 4 ++-- 3 files changed, 22 insertions(+), 14 deletions(-) diff --git a/documentation/ingestion_server/guides/migrate.md b/documentation/ingestion_server/guides/migrate.md index 760ae1056f3..f4654d87b3d 100644 --- a/documentation/ingestion_server/guides/migrate.md +++ b/documentation/ingestion_server/guides/migrate.md @@ -28,11 +28,13 @@ involve code changes in both the ingestion server and the API. Examples: - changing the type to an incompatible type - renaming of a field -Such kinds of changes need us to precisely deploy the API when the new index is -promoted because of these reasons: +Such kinds of changes need us to precisely deploy the API simultaneously with +the promotion of new index because of these reasons: -- If we deploy a little late, the old field the API wants will disappear. -- If we deploy a little early, the new field the API wants will not be present. +- If the API deployment lags behind index promotion, the old field that the API + uses will disappear. +- If the API deployment leads ahead of index promotion, the new field the API + uses will not be present. This runbook documents guidelines and processes for API-involved migrations. @@ -51,17 +53,23 @@ deployed independently. is purely additive, keeping the old fields unchanged and creating new fields that contain the data the API will need. - This PR should only make changes within the `ingestion_server/` directory. + This PR should make changes only within the `ingestion_server/` directory, + more specifically the following two files concerned with ES mappings and + document schemas: -2. 
Change the ES fields referenced by the API to use the new fields added in the + - [`es_mapping.py`](https://github.com/WordPress/openverse/tree/main/ingestion_server/ingestion_server/es_mapping.py) + - [`elasticsearch_models.py`](https://github.com/WordPress/openverse/tree/main/ingestion_server/ingestion_server/elasticsearch_models.py) + +2. Update the API code to reference and use the new ES fields added in the previous step. Ensure that the old fields become unreferenced. - The PR should only make changes within the `api/` directory. + The PR should make changes only within the `api/` directory. 3. Change the ES index mapping in the ingestion server to remove the old, now-unreferenced fields. - This PR should only make changes within the `ingestion_server/` directory. + Like PR number 1, this PR should also make changes only within the + `ingestion_server/` directory. ```{tip} Get the PRs reviewed in advance so that the entire process has been vetted by diff --git a/documentation/ingestion_server/guides/troubleshoot.md b/documentation/ingestion_server/guides/troubleshoot.md index d38837b2e20..334119a4439 100644 --- a/documentation/ingestion_server/guides/troubleshoot.md +++ b/documentation/ingestion_server/guides/troubleshoot.md @@ -9,10 +9,10 @@ The ingestion server performs indexing using indexer workers, whose primary purpose it is to create documents from the API database and index them in Elasticsearch. -They are EC2 instances that are stopped by default when indexing is not taking -place. The indexer server raises them up, provides them with the necessary -information to perform the indexing and once they report back to the ingestion -server with a completion message, they are shut down again. +Indexer workers are EC2 instances that are stopped by default when indexing is +not taking place. 
The indexer server raises them up, provides them with the +necessary information to perform the indexing and once they report back to the +ingestion server with a completion message, they are shut down again. Sometimes it is necessary to manually interrupt indexing, for example to limit the size of a test/staging index. To do so, follow these steps. diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md index e4fb3b5f925..d28fb711dc8 100644 --- a/documentation/ingestion_server/guides/upgrade.md +++ b/documentation/ingestion_server/guides/upgrade.md @@ -23,7 +23,7 @@ differences. - `audio` - `audio-filtered` - This information is useful, to know what index to use if a rollback is needed + This information is useful to know what index to use if a rollback is needed and to know what index to delete once the upgrade is complete. You can use [Elasticvue](https://elasticvue.com) for this. @@ -72,7 +72,7 @@ differences. - `audio` - `audio-filtered` - This information is useful, to know what index to use if a rollback is needed + This information is useful to know what index to use if a rollback is needed and to know what index to delete once the upgrade is complete. You can use [Elasticvue](https://elasticvue.com) for this. From 36caecd858a7ca18f64f7d260e74fbe2fb63ca0c Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Tue, 7 Nov 2023 21:01:45 +0400 Subject: [PATCH 5/8] Fix anchors and resolve build warnings --- documentation/ingestion_server/guides/upgrade.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md index d28fb711dc8..83b5e5ad3ae 100644 --- a/documentation/ingestion_server/guides/upgrade.md +++ b/documentation/ingestion_server/guides/upgrade.md @@ -45,7 +45,7 @@ differences. Wait for the indices to be replicated (and status green) before proceeding. -4. 
[Point aliases](/ingestion_server/reference/task_api.md#point-alias) (both +4. [Point aliases](/ingestion_server/reference/task_api.md#point_alias) (both default and filtered) for each media type to the new index. - `image` → `image-abcd` @@ -84,13 +84,13 @@ differences. Wait for the indices to be replicated (and status green) before proceeding. 4. Perform - [creation of filtered indices](/ingestion_server/reference/task_api.md#create-and-populate-filtered-index) + [creation of filtered indices](/ingestion_server/reference/task_api.md#create_and_populate_filtered_index) for all media types. New filtered indices for each media type, like `image-abcd-filtered` and `audio-abcd-filtered`, will be created. Wait for the indices to be replicated (and status green) before proceeding. -5. [Point aliases](/ingestion_server/reference/task_api.md#point-alias) for each +5. [Point aliases](/ingestion_server/reference/task_api.md#point_alias) for each media type to the new default and filtered indices. - `image` → `image-abcd` From d4df318bcab1b8fdd6b54c5e81a7aa10ac3ddeb1 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 9 Nov 2023 16:10:25 +0400 Subject: [PATCH 6/8] Use the correct plural of 'index' --- documentation/ingestion_server/guides/upgrade.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md index 83b5e5ad3ae..c17f12ab114 100644 --- a/documentation/ingestion_server/guides/upgrade.md +++ b/documentation/ingestion_server/guides/upgrade.md @@ -16,7 +16,7 @@ differences. 1. [Deploy the ingestion server](/ingestion_server/guides/deploy.md) to staging. This step ensures that the latest schema will be used for the new indices. -2. Determine the real names of the indexes behind the following aliases. +2. Determine the real names of the indices behind the following aliases. - `image` - `image-filtered` @@ -65,7 +65,7 @@ differences. production. 
This step ensures that the latest schema will be used for the new indices. -2. Determine the real names of the indexes behind the following aliases. +2. Determine the real names of the indices behind the following aliases. - `image` - `image-filtered` From 66134fa8eac3d83b51f6e78f3df952437c07ddd2 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 9 Nov 2023 16:14:27 +0400 Subject: [PATCH 7/8] Implement suggestion from code-review Co-authored-by: Madison Swain-Bowden --- .../ingestion_server/guides/migrate.md | 19 +++++++++++-------- .../ingestion_server/guides/troubleshoot.md | 2 +- .../ingestion_server/guides/upgrade.md | 6 +++--- 3 files changed, 15 insertions(+), 12 deletions(-) diff --git a/documentation/ingestion_server/guides/migrate.md b/documentation/ingestion_server/guides/migrate.md index f4654d87b3d..9de9fa58b62 100644 --- a/documentation/ingestion_server/guides/migrate.md +++ b/documentation/ingestion_server/guides/migrate.md @@ -15,9 +15,13 @@ API. As such they do not need any migration process. Examples: - removal of fields that are not referenced or used by the API - changing the type to another compatible type (like `text` ↔ `keyword` ) -For API-free changes, we deploy the ingestion server and let a data-refresh to -occur. The indexes will be updated to the new schema without manual intervention -and will be made available to the API. +For API-free changes, we deploy the ingestion server and perform one of the two: + +- standard data-refresh (either triggered manually or as scheduled) +- [manual index upgrade](/ingestion_server/guides/upgrade.md) + +The indices will be updated to the new schema and will be made available to the +API. ### API-involved @@ -28,7 +32,7 @@ involve code changes in both the ingestion server and the API. 
Examples: - changing the type to an incompatible type - renaming of a field -Such kinds of changes need us to precisely deploy the API simultaneously with +Such kinds of changes need us to precisely deploy the API in coordination with the promotion of new index because of these reasons: - If the API deployment lags behind index promotion, the old field that the API @@ -85,10 +89,9 @@ chain so that CI continues to pass for each PR. ### Example Assume we have a field `foo` with type `text` in the index. It has a subfield -`keyword` with type `keyword`. The API uses `foo.keyword` for all purposes. - -One PR that does these 3 things would be an API-involved change. So we split -them into 3 PRs. +`keyword` with type `keyword`. The API uses `foo.keyword` for all purposes. We +want the `foo` field to have type `keyword` and for the API to use `foo` instead +of `foo.keyword`. To accomplish this without downtime, we need three PRs: 1. Changing `foo` to type `keyword` would be an API-free change because it is a type change between two compatible types and does not affect the nested field diff --git a/documentation/ingestion_server/guides/troubleshoot.md b/documentation/ingestion_server/guides/troubleshoot.md index 334119a4439..7a50af95bfa 100644 --- a/documentation/ingestion_server/guides/troubleshoot.md +++ b/documentation/ingestion_server/guides/troubleshoot.md @@ -10,7 +10,7 @@ purpose it is to create documents from the API database and index them in Elasticsearch. Indexer workers are EC2 instances that are stopped by default when indexing is -not taking place. The indexer server raises them up, provides them with the +not taking place. The ingestion server raises them up, provides them with the necessary information to perform the indexing and once they report back to the ingestion server with a completion message, they are shut down again. 
diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md index c17f12ab114..467019a1c01 100644 --- a/documentation/ingestion_server/guides/upgrade.md +++ b/documentation/ingestion_server/guides/upgrade.md @@ -28,7 +28,7 @@ differences. [Elasticvue](https://elasticvue.com) for this. ```{tip} - In staging, the filtered indices most likely also point to the default index. + In staging, the filtered indices may also point to the default index. ``` 3. Perform [reindexing](/ingestion_server/reference/task_api.md#reindex) of all @@ -37,8 +37,8 @@ differences. created. ```{caution} - Staging indices are supposed to be smaller and should not the full size of - the production dataset. You can + Staging indices are supposed to be smaller and should not have the same + number of documents as the production dataset. You can [interrupt the indexing process](/ingestion_server/guides/troubleshoot.md#interrupt-indexing) once a satisfactory fraction (like ~50%) has been indexed. ``` From 5cf2dd36f1c2280ef48e8850ad7f14c6deabcc2c Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 9 Nov 2023 16:21:55 +0400 Subject: [PATCH 8/8] Document the `DELETE_INDEX` task in the ingestion server API --- .../ingestion_server/guides/upgrade.md | 9 ++++-- .../ingestion_server/reference/task_api.md | 32 ++++++++++++++++++- 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/documentation/ingestion_server/guides/upgrade.md b/documentation/ingestion_server/guides/upgrade.md index 467019a1c01..08e7711ed0a 100644 --- a/documentation/ingestion_server/guides/upgrade.md +++ b/documentation/ingestion_server/guides/upgrade.md @@ -57,7 +57,9 @@ differences. - If the staging API reports errors, immediately switch back the aliases to the old indices. - - If the staging API works, delete the old indices to recover the free space. 
+ - If the staging API works, + [delete the old indices](/ingestion_server/reference/task_api.md#delete_index) + to recover the free space. ### Production deployment @@ -102,8 +104,9 @@ differences. - If the production API reports errors, immediately switch back the aliases to the old indices. - - If the production API works, delete the old indices to recover the free - space. + - If the production API works, + [delete the old indices](/ingestion_server/reference/task_api.md#delete_index) + to recover the free space. ## Rollback diff --git a/documentation/ingestion_server/reference/task_api.md b/documentation/ingestion_server/reference/task_api.md index f69d53bdab4..b2e23b3526f 100644 --- a/documentation/ingestion_server/reference/task_api.md +++ b/documentation/ingestion_server/reference/task_api.md @@ -63,7 +63,7 @@ suffix. $ curl \ -X POST \ -H 'Content-Type: application/json' \ - -d '{"model": "image", "action": "CREATE_AND_POPULATE_FILTERED_INDEX", "destination_index_suffix": "kw"}' \ + -d '{"model": "image", "action": "CREATE_AND_POPULATE_FILTERED_INDEX", "destination_index_suffix": "20231106"}' \ http://localhost:8001/task ``` @@ -93,3 +93,33 @@ $ curl \ -d '{"model": "image", "action": "POINT_ALIAS", "index_suffix": "20231106", "alias": "image-filtered"}' \ http://localhost:8001/task ``` + +## DELETE_INDEX + +This endpoint deletes the given index for a given media type. + +```{danger} +Index deletion is an irreversible destructive operation. Please ensure that you +do not delete an index that is currently in use as the default or filtered index +for a media type. +``` + +### Body + +```typescript +{ + model: "image" | "audio" + action: "DELETE_INDEX" + index_suffix: string +} +``` + +### Example + +```console +$ curl \ + -X POST \ + -H 'Content-Type: application/json' \ + -d '{"model": "image", "action": "DELETE_INDEX", "index_suffix": "20231106"}' \ + http://localhost:8001/task +```
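Given the warning above about deleting an index that is still in use, a pre-deletion check could look like the following. This is a minimal sketch, not part of the ingestion server. `index_in_use` is a hypothetical helper, and it assumes you can query Elasticsearch's `_cat/aliases` API directly and request the `alias` and `index` columns. The sample alias table in the block is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `_cat/aliases?h=alias,index` output on stdin
# (two whitespace-separated columns: alias, index) and exit 0 if the index
# named in $1 is the target of any alias, 1 otherwise.
index_in_use() {
  awk -v idx="$1" '$2 == idx { found = 1 } END { exit !found }'
}

# Illustrative data only. Real data would come from something like:
#   curl -s "<elasticsearch-url>/_cat/aliases?h=alias,index"
sample='image image-abcd
image-filtered image-abcd-filtered'

for candidate in image-abcd image-20231106; do
  if printf '%s\n' "$sample" | index_in_use "$candidate"; then
    echo "$candidate is aliased; do not delete it"
  else
    echo "$candidate has no alias pointing at it"
  fi
done
```

Only send a `DELETE_INDEX` task for an index that has no alias pointing at it.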