Allstar operations overview follow-ups #528

Open
2 of 14 tasks
justaugustus opened this issue Jul 1, 2024 · 1 comment
Labels
infrastructure Deployment, logging, or monitoring Allstar instances, public or self-hosted

Comments

@justaugustus
Member

@jeffmendoza ran a quick Allstar operations overview for the other @ossf/scorecard-admins (Steering) members and I want to make sure we capture some of the content and potential follow-ups as an issue.

GCP access

  • Stephen, Spencer, and Raghav now have Owner access to the GCP instance
  • Configure access for additional Scorecard maintainers

Deployment

Pushes to the main branch are deployed to the staging instance via Google Cloud Build (GCB).
Container images are built with ko and then pushed to Google Container Registry (GCR).

Allstar runs on the App Engine flexible environment.

Production deployments are manual runs (app-prod.yml) within the GCP console.


Actions

  • Migrate from GCR to Google Artifact Registry (GAR) (needed for scorecard as well)
    • Dual-publish to GitHub Container Registry (GHCR), or use it instead?
  • Allstar has been replatformed to use GKE internally at Google. Raghav to share Terraform examples for bootstrapping a GKE instance
  • Share Jeff's custom log queries across the GCP project
  • Secrets: not using KMS (potentially use Chainguard Octo STS config workflow as an example)

What would Jeff fix?

  • Shard over installation IDs (need GKE + StatefulSets)
  • Multiple public instances to allow for Branch Protection usage
  • Make it easier for people to run Allstar e.g.,
    • better operator.md
    • oneshot via GitHub Actions (or another mechanism?; needs a PAT)
  • Mechanism for surfacing operations status e.g., status page, badge, etc. (same for scorecard)
  • Logging and monitoring work
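
To make the "shard over installation IDs" item a bit more concrete, here is a rough sketch of one way it could work, assuming each replica runs as a StatefulSet pod whose hostname ends in its ordinal (e.g. `allstar-2`) and that work is split by `installation_id % replicas`. It is written in Python only to keep the logic language-neutral (Allstar itself is Go); the function names, the `allstar-N` pod name, and the modulo scheme are all hypothetical, not anything Allstar does today.

```python
import re


def pod_ordinal(hostname: str) -> int:
    """Extract the StatefulSet ordinal from a pod hostname like 'allstar-2'."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"hostname {hostname!r} does not end in an ordinal")
    return int(match.group(1))


def owns_installation(installation_id: int, ordinal: int, replicas: int) -> bool:
    """Return True if the replica with this ordinal should handle the installation.

    Assumption: simple modulo sharding, so every installation ID maps to
    exactly one replica as long as all replicas agree on `replicas`.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 <= ordinal < replicas:
        raise ValueError("ordinal must be in the range [0, replicas)")
    return installation_id % replicas == ordinal


def installations_for_shard(installation_ids, ordinal, replicas):
    """Filter the full installation list down to the IDs this replica owns."""
    return [i for i in installation_ids if owns_installation(i, ordinal, replicas)]
```

A replica would call `pod_ordinal` on its own hostname once at startup and then skip any installation it does not own. One caveat with plain modulo: changing the replica count reassigns most installations, which may or may not matter here.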

@justaugustus added the infrastructure label on Jul 1, 2024
@justaugustus
Member Author

Some notes that I took during our [attempted] deployment earlier this week...
(These should get rolled into the issue description task list, but for now, I just want to make sure they're out of my head/notepad):

Is the staging deployment useful / how are we getting feedback from staging before prod deploys / who's running staging?

Currently, just @jeffmendoza in a test organization.
We should encourage others to do so and create a path for providing feedback on this deployment ahead of prod rollouts.

Only one instance of staging should serve at a time

Rarely, staging deploys can hiccup. Will this cause multiple instances of staging to be run simultaneously?
Is there a programmatic way to prevent that behavior?

Error-handling improvements e.g., for rate limits

We should ensure we gracefully handle known error codes e.g., #36
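
As a starting point for the rate-limit case, something like the sketch below could decide how long to back off. It relies on the response headers GitHub documents for REST API rate limits (`retry-after`, `x-ratelimit-remaining`, `x-ratelimit-reset`) on 403/429 responses; the function name and return convention are hypothetical, it is in Python purely for illustration (Allstar is Go), and it makes no assumption about what #36 specifically covers.

```python
import time


def retry_delay(status, headers, now=None):
    """Return seconds to wait before retrying, or None if not rate limited.

    `headers` is a plain mapping of response header names to string values.
    `now` is the current Unix time; it is injectable to keep this testable.
    """
    if status not in (403, 429):
        return None

    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Secondary rate limits: wait the number of seconds given by retry-after.
    if "retry-after" in lowered:
        return max(0.0, float(lowered["retry-after"]))

    # Primary rate limit: remaining is 0 and reset holds a Unix timestamp.
    if lowered.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in lowered:
        current = time.time() if now is None else now
        return max(0.0, float(lowered["x-ratelimit-reset"]) - current)

    # A 403 without rate-limit headers is some other failure; do not retry.
    return None
```

Returning `None` for a 403 with no rate-limit headers keeps real permission errors from being retried forever.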

We are building two images; we should publish one and pull it during the deploy

Currently, tags will trigger this image build workflow:

    - run: ko publish -B ./cmd/allstar --tags ${{ github.ref_name }} --image-refs allstar.ref
      env:
        KO_DOCKER_REPO: ghcr.io/${{ github.repository_owner }}
    - run: ko publish -B ./cmd/allstar --tags ${{ github.ref_name }}-busybox --image-refs allstar-busybox.ref
      env:
        KO_DOCKER_REPO: ghcr.io/${{ github.repository_owner }}
        KO_DEFAULTBASEIMAGE: cgr.dev/chainguard/busybox

Google Cloud Build runs trigger a different workflow:

allstar/cloudbuild.yaml

Lines 5 to 10 in e1316aa

    - name: golang:1.21
      entrypoint: bash
      args: ['-c', 'KO_DOCKER_REPO="gcr.io/allstar-ossf" /go/bin/ko publish ./cmd/allstar > container']
    - name: gcr.io/google.com/cloudsdktool/cloud-sdk
      entrypoint: bash
      args: ['-c', 'gcloud app deploy --appyaml=app-staging.yaml --project=allstar-ossf --image-url $(cat container)']

I'll work on fixing this, with a few premises in mind:

  • we should build images on every new commit to main
  • the way we publish images needs to be consistent, whether the image build is triggered from a push to main or because a new tag is cut
  • we should publish images to the same place (GHCR)
  • deployment configs should only be concerned with running, not building images

Signed-Releases check panicked and went unnoticed on the Allstar staging deployment

@jeffmendoza is investigating this one.


2 participants