Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate indexer #841

Merged
merged 14 commits into from
May 16, 2024
Merged

Separate indexer #841

merged 14 commits into from
May 16, 2024

Conversation

joverlee521
Copy link
Contributor

@joverlee521 joverlee521 commented Apr 26, 2024

Description of proposed changes

This PR starts the use of versioned resource index JSONs so that we no longer run into out-of-sync issues between the canary and production servers.

Major changes:

  • the scheduled index resources workflow updates both canary and production resource index JSONs if they are different versions
  • the CI workflow rebuilds the resource index before deployment of canary
  • the CI workflow builds a /tmp/<branch> prefixed resource index for review apps

Related issue(s)

Related to #829

Checklist

@nextstrain-bot nextstrain-bot temporarily deployed to nextstrain-s-separate-i-yb1c3h April 26, 2024 00:29 Inactive
@joverlee521 joverlee521 temporarily deployed to nextstrain-s-separate-i-yb1c3h April 26, 2024 00:32 Inactive
@joverlee521

This comment was marked as outdated.

@joverlee521

This comment was marked as outdated.

@joverlee521

This comment was marked as outdated.

get-resource-index Outdated Show resolved Hide resolved
@joverlee521 joverlee521 temporarily deployed to nextstrain-s-separate-i-yb1c3h May 1, 2024 20:00 Inactive
@joverlee521 joverlee521 marked this pull request as ready for review May 1, 2024 20:53
@joverlee521 joverlee521 requested a review from a team May 1, 2024 20:53
@tsibley tsibley self-requested a review May 1, 2024 21:02
@jameshadfield jameshadfield self-requested a review May 1, 2024 22:06
Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour will be good to have! Checkpointing my review commentary for now. Will finish up tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use terraform providers lock to pre-populate hashes for different platforms.¹

Looks like the added hashes are for darwin_arm64 (and existing ones are for linux_amd64), so we're just missing darwin_amd64 if any devs are running that combo. I think fine to leave it as-is for now, but good to know about terraform providers lock!

.github/workflows/index-resources.yml Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/index-resources.yml Outdated Show resolved Hide resolved
@joverlee521 joverlee521 temporarily deployed to nextstrain-s-separate-i-yb1c3h May 2, 2024 00:37 Inactive
Copy link
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be a nice improvement. I got a bit lost in the second half of the commits as the interconnectedness ramps up, so I'll try to re-read them tomorrow if I get a change.

One missing piece seems to be documentation on incrementing the resource id¹ (e.g. v1.json to v2.json), and the relationship between the test/production environments vs local indexes once you do increment it.

¹ I still think it shouldn't be called v* as it's overloading the word "version". But I won't keep pressing on this.

@@ -52,7 +52,8 @@
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::nextstrain-inventories/resources.json.gz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to this change, there's a not-in-terraform lifecycle delete old versions of the index. The prefix on this lifecycle needs to be updated. I'm happy to help here, and it's not blocking the merge of this PR (in fact it's easier to do it after merge).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH good to know! I added this to the PR checklist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to have that sort of thing in Terraform for exactly this reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, the lifecycle only accepts a prefix and not regex/pattern matching. So I've updated the lifecycle delete old versions of the index to the prefix resources to capture both the old resources.json.gz and the new resources/*.json.gz files.

.github/workflows/get-heroku-app-commit-index.yml Outdated Show resolved Hide resolved
jq_filter='[ env.PRODUCTION_COMMIT, env.CANARY_COMMIT ]'
fi

ref_matrix=$(jq -c --null-input "$jq_filter")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm admittedly not great at jq so could you explain what's hapenning here? My reading/testing is that we're just setting a string with identical contents to that of $jq_filter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, bad naming on my part! Like you said, this is just my way of ensuring I am building a properly formatted JSON array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case it's clarifying, the difference between:

ref_matrix=$(jq -c --null-input "[ env.PRODUCTION_COMMIT, env.CANARY_COMMIT ]")

and something like:

ref_matrix="[ \"$PRODUCTION_COMMIT\", \"$CANARY_COMMIT\" ]"

is that the former ensures a valid JSON value and the latter does not.

@@ -16,13 +16,25 @@ on:
If not provided, will use commits of the latest slug of nextstrain-server and nextstrain-canary Heroku apps.
type: string
required: false
resource_index_prefix:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for running this manually? (I can't think of one, but maybe there is?) If we are going to call this manually I think we need to add documentation explaining why we would do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for running this manually?

The one use case I can think of is if we create a review app directly within Heroku without a PR that triggers the automated CI workflow rebuild of the index. We would manually run the index-resources.yml to upload the tmp index.


workflow_call:
inputs:
ref:
description: Ref of the repo to use to generate the resource index.
type: string
required: true
resource_index_prefix:
description: |
A prefix to add to the uploaded resource index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the language here needs to be a bit more precise. It's not a prefix in the S3 sense of the word, it's a string that'll be added within the final s3 key, akin to a directory name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer relevant for this PR since I'm going to do the review-app stuff separately.

permissions:
id-token: write # needed to interact with GitHub's OIDC Token endpoint
contents: read
uses: ./.github/workflows/index-resources.yml
with:
ref: ${{ github.sha }}
resource_index_prefix: ${{ needs.get_branch.outputs.github_branch }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this line quite confusing and (for me) it would be easier to understand if you renamed the get_branch job toget_resource_index_prefix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer relevant for this PR since I'm going to do the review-app stuff separately.

Procfile Outdated Show resolved Hide resolved
Procfile Outdated
@@ -1 +1 @@
web: RESOURCE_INDEX=$(RESOURCE_INDEX_PREFIX="$HEROKU_BRANCH" ./get-resource-index) node server.js --verbose
web: RESOURCE_INDEX=$(RESOURCE_INDEX_PREFIX="$HEROKU_BRANCH" node ./scripts/get-resource-index.js) node server.js --verbose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have some docs on HEROKU_BRANCH? I can only find this page which implies it's only for review apps, but also implies that it's not stable and is likely to change in the future.

It'd be clearer to prepend the server call with a env-setting function, like set-review-app-variables which could self-document the steps that are going on, and perhaps query other env variables to ensure we are indeed running on a review app.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we already use one of those variables, HEROKU_PR_NUMBER, to detect review app-ness:

/**
* Running as a Heroku Review app?
*
* These run in production mode but do not have access to sensitive information
* (private S3 buckets, production Redis, production encryption keys, etc).
*
* See {@link https://devcenter.heroku.com/articles/github-integration-review-apps#injected-environment-variables}.
*
* @type {boolean}
*/
export const REVIEW_APP = !!process.env.HEROKU_PR_NUMBER;

and adjust config:

/**
* Flag indicating if Redis (via REDIS_URL) is required.
*
* Defaults to true if {@link PRODUCTION} and not {@link REVIEW_APP}, otherwise
* false.
*
* @type {boolean}
*/
/* XXX TODO: Provision lowest-tier Heroku Redis addon for review apps and
* remove the !REVIEW_APP condition.
*
* Heroku has good support for this, but to enable it I think we need to switch
* how we configure review apps. Currently they're configured via the Heroku
* Dashboard¹. I think we need to switch to the app.json file² instead, as
* described in the review app documentation.³
* -trs, 12 Oct 2023
*
* ¹ <https://dashboard.heroku.com/pipelines/38f67fc7-d93c-40c6-a182-501da2f89d9d/settings>
* ² <https://devcenter.heroku.com/articles/app-json-schema>
* ³ <https://devcenter.heroku.com/articles/github-integration-review-apps>
*/
export const REDIS_REQUIRED = fromEnvOrConfig("REDIS_REQUIRED", PRODUCTION && !REVIEW_APP);

Copy link
Member

@jameshadfield jameshadfield May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we already have the concept of heroku review apps in the config.js, and that's also where the index location is defined (as in, the location definition the server's actually using), let's modify the logic of the following section (and related docs) to account for running on heroku review apps:

/**
* Location of the JSON file to be used as the resource collection index. Can be
* a S3 address or a local filepath, if S3 then the file must be gzipped.
*
* If sourced from the config file, relative paths are resolved relative to the
* repo root directory ("nextstrain.org").
*
* Falsey values result in the resource collection functionality not being used.
*/
export const RESOURCE_INDEX = fromEnvOrConfig("RESOURCE_INDEX", null);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hesitant to suggest that because it means that the value passed in as RESOURCE_INDEX won't be what's actually used depending on unseen other values (i.e. HEROKU_PR_NUMBER), and that seemed undesirable/confusing and a first for the config. I think RESOURCE_INDEX for review apps should still be set outside of the app, like it is for production.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer relevant for this PR since I'm going to do the review-app stuff separately.

Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be helpful to have separate indexes!

I wonder though about approaching the machinery for setting them up a bit differently, as noted in my comments.

.github/workflows/get-heroku-app-commit-index.yml Outdated Show resolved Hide resolved
.github/workflows/get-heroku-app-commit-index.yml Outdated Show resolved Hide resolved
.github/workflows/index-resources.yml Outdated Show resolved Hide resolved
Procfile Outdated Show resolved Hide resolved
scripts/get-resource-index.js Outdated Show resolved Hide resolved
Procfile Outdated
@@ -1 +1 @@
web: RESOURCE_INDEX=$(RESOURCE_INDEX_PREFIX="$HEROKU_BRANCH" ./get-resource-index) node server.js --verbose
web: RESOURCE_INDEX=$(RESOURCE_INDEX_PREFIX="$HEROKU_BRANCH" node ./scripts/get-resource-index.js) node server.js --verbose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we already use one of those variables, HEROKU_PR_NUMBER, to detect review app-ness:

/**
* Running as a Heroku Review app?
*
* These run in production mode but do not have access to sensitive information
* (private S3 buckets, production Redis, production encryption keys, etc).
*
* See {@link https://devcenter.heroku.com/articles/github-integration-review-apps#injected-environment-variables}.
*
* @type {boolean}
*/
export const REVIEW_APP = !!process.env.HEROKU_PR_NUMBER;

and adjust config:

/**
* Flag indicating if Redis (via REDIS_URL) is required.
*
* Defaults to true if {@link PRODUCTION} and not {@link REVIEW_APP}, otherwise
* false.
*
* @type {boolean}
*/
/* XXX TODO: Provision lowest-tier Heroku Redis addon for review apps and
* remove the !REVIEW_APP condition.
*
* Heroku has good support for this, but to enable it I think we need to switch
* how we configure review apps. Currently they're configured via the Heroku
* Dashboard¹. I think we need to switch to the app.json file² instead, as
* described in the review app documentation.³
* -trs, 12 Oct 2023
*
* ¹ <https://dashboard.heroku.com/pipelines/38f67fc7-d93c-40c6-a182-501da2f89d9d/settings>
* ² <https://devcenter.heroku.com/articles/app-json-schema>
* ³ <https://devcenter.heroku.com/articles/github-integration-review-apps>
*/
export const REDIS_REQUIRED = fromEnvOrConfig("REDIS_REQUIRED", PRODUCTION && !REVIEW_APP);

joverlee521 added a commit that referenced this pull request May 7, 2024
Prevent our scheduled rebuild of the index resources from uploading
to the incorrect S3 destination by parsing the `RESOURCE_INDEX` from
the config module.

This does not address the edge case of the config.RESOURCE_INDEX in the
live server being different than the config.RESOURCE_INDEX in GitHub
Actions, which can happen if we set the `RESOURCE_INDEX` envvar on
in Heroku.

We can improve on this in the future by surfacing the deployment metadata
from our servers via our own API endpoint as suggested by @tsibley in
<#841 (comment)>
@joverlee521 joverlee521 force-pushed the separate-indexer branch 5 times, most recently from 0c7d1cd to a523dfb Compare May 7, 2024 22:20
joverlee521 added a commit that referenced this pull request May 7, 2024
Prevent our scheduled rebuild of the index resources from uploading
to the incorrect S3 destination by parsing the `RESOURCE_INDEX` from
the config module.

This does not address the edge case of the config.RESOURCE_INDEX in the
live server being different than the config.RESOURCE_INDEX in GitHub
Actions, which can happen if we set the `RESOURCE_INDEX` envvar on
in Heroku.

We can improve on this in the future by surfacing the deployment metadata
from our servers via our own API endpoint as suggested by @tsibley in
<#841 (comment)>
.github/workflows/ci.yml Outdated Show resolved Hide resolved
scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
Comment on lines 2 to 6
# Checks the Nextstrain production and canary Heroku apps to determine
# if they are using the same `config.RESOURCE_INDEX` and returns a JSON array
# of the the commit hashes for the different RESOURCE_INDEX.
#
# Intended to be used within the `index-resources.yml` GH Action workflow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking

This whole thing would be nice to simplify away with #853.

scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
.github/workflows/index-resources.yml Outdated Show resolved Hide resolved
Comment on lines +28 to +35
These updates include changes to:

* any scripts within ``resourceIndexer/*``
* ``data/manifest_core.json``
* ``convertManifestJsonToAvailableDatasetList`` in ``src/endpoints/charon/parseManifest.js``
* ``datasetRedirectPatterns`` in ``src/redirects.js``

If you are ever unsure, it's better to just bump the revision number!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking

How much simpler would it be if we used the commit id as the "index version" instead of a monotonically-increasing number? (I know we floated both ideas, but I don't recall discussing which to use…)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, if we use commit id, we wouldn't run into potential version number clashes between branches.

I thought using commit id comes with it's own weirdness if we continue to store the RESOURCE_INDEX in the config.json. E.g. the commit id in the RESOURCE_INDEX would be pointing to the parent commit id. Also, instead of updating it manually, we would probably want to set up a commit hook to automatically update the RESOURCE_INDEX on every commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted more about this with @tsibley in our 1:1 today.

If we do use the commit id as the version, then we would probably stop hard-coding the RESOURCE_INDEX within the config.json and just set it within the Heroku slug. It would be less obvious of what RESOURCE_INDEX the deployed site is using unless you download the slug or we implement #853.


Leaving this as a future decision as we can make do with the number versions for now.

Copy link
Member

@tsibley tsibley May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. Let's not rewrite files on every single commit! We could instead do something similar as part of the build process (e.g. bake in an updated RESOURCE_INDEX, or bake in a commit id in a file that's used at runtime to construct the right path, etc.). This could even be done via Git's ident munging (à la RCS). [Update: Ah, not quite. That's the blob id, not the commit id. A custom filter clean/smudge would be needed.] But let's leave that for the future, if ever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can revisit this idea in #867

joverlee521 added a commit that referenced this pull request May 14, 2024
joverlee521 and others added 6 commits May 14, 2024 11:23
This change rebuilds separate indexes for the nextstrain-server and
nextstrain-canary Heroku apps if they are using different paths for
`RESOURCE_INDEX` in their config module. If they share the same path,
then the index will only get rebuilt and uploaded once.

The workflow_call's `GITHUB_SHA` takes precedence over the
canary/production commits. Therefore, the CI workflow's call of the
index-resources.yml should only trigger one rebuild/upload using the
commit that the CI was run on.
Attempt to prevent multiple runs of the workflow to not step on each
others' toes by adding a concurrency restriction based on the triggering
branch.

This still doesn't fully guarantee that concurrent runs don't update
the same index because branches can still be using the same
RESOURCE_INDEX.
Include some example changes that would warrant bumping the revision
number for the resources JSON.
Simplify the fetch of the slug id by using a different Heroku API
endpoint that only requires the Heroku app name.

Co-authored-by: Thomas Sibley <tsibley@fredhutch.org>
Instead of parsing the `RESOURCE_INDEX` matrix again within the
GH Action workflow, just return the value as part of the matrix
created by `./scripts/get-resource-index-ref-matrix`.

This also makes it easier to limit concurrency of the workflow
job based on the resource index.
@joverlee521 joverlee521 force-pushed the separate-indexer branch 2 times, most recently from 784b56c to 159ccf7 Compare May 14, 2024 18:56
Allow the `deploy` job to run even if the `index-resource` job was
skipped, e.g. when running via `workflow_dispatch`.

These additional conditions are required because according to the
`jobs.<job_id>.needs` docs¹

> If a job fails or is skipped, all jobs that need it are skipped unless
the jobs use a conditional expression that causes the job to continue.

¹ <https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idneeds>
scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
scripts/get-resource-index-ref-matrix Outdated Show resolved Hide resolved
joverlee521 and others added 2 commits May 15, 2024 15:52
Remove unused variable `resource_index`.

Co-authored-by: Thomas Sibley <tsibley@fredhutch.org>
@joverlee521
Copy link
Contributor Author

I'm planning to merge this tomorrow ~morning, so I can monitor the first CI deployment.

@joverlee521 joverlee521 merged commit 359f0db into master May 16, 2024
7 checks passed
@joverlee521 joverlee521 deleted the separate-indexer branch May 16, 2024 17:33
@joverlee521
Copy link
Contributor Author

joverlee521 commented May 16, 2024

The first merge/deploy failed because I missed the secrets weren't being passed from CI to index-resources. Fixed in #864.

However still running into some issues:

  1. The second merge/deploy succeeded, however, the build-index-matrix job pointed to the nextstrain-server ref -> This is more complicated because the github context is the caller workflows context! (Looking into workarounds in How do you identify the trigger action that's coming from Workflow Call? actions/runner#1884). --> Fixing in index-resources.yml: Add is-workflow-call input #865

  2. I manually triggered the index-resources workflow, but the created matrix hash for the nextstrain-canary is the shortened version instead of the full hash. -> Fixing with a77b0f2.

joverlee521 added a commit that referenced this pull request May 16, 2024
Use the full commit hash so that downstream interactions with the slug
can easily get the full hash.

Follow up to #841 (comment)
joverlee521 added a commit that referenced this pull request May 16, 2024
Ensure the error exit code from `./scripts/get-resource-index-ref-matrix`
is reflected in the GH Action workflow job.

Noted by @tsibley in post-merge review¹

¹ <#841 (comment)>
@joverlee521
Copy link
Contributor Author

Last night's scheduled index resources workflow ran as expected, updating both the nextstrain-server resources.json.gz and the nextstrain-canary resources/v1.json.gz 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants