Update config, remove deployment, add revoke cronjob #2473

Merged
anna-parker merged 19 commits into main from ingest_deployment
Aug 22, 2024

Conversation

anna-parker
Contributor

@anna-parker anna-parker commented Aug 20, 2024

resolves #1777

preview URL: https://ingest-deployment.loculus.org/

Summary

  • We currently sometimes have dual ingest because the scheduled cronjobs launch at the same time as the post-launch deployment. Fix this by removing the deployment and keeping only cronjobs that try to start every 2 minutes (a run is deleted if it cannot start up within 1 minute), are not allowed to run concurrently, and run for at most 30 minutes.
  • Additionally, add a cronjob for revocation (revoke and regroup). Triggered by f6bc4a4#r145569186: getting a rule to run in a container post hoc is hard, so instead just create a cronjob that never runs on its own and launch it manually when we are OK with the suggested revocations.
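
The scheduling behaviour described in the summary can be sketched as a Kubernetes CronJob manifest, written here as a Python dict. This is only an illustration, not the chart from this PR. The mapping of the description onto fields (startingDeadlineSeconds for the 1 minute start window, concurrencyPolicy Forbid for no concurrency, activeDeadlineSeconds for the 30 minute cap), the example name suffix, the image, and the use of suspend to get a cronjob that never runs on its own are all assumptions.

```python
def cronjob_manifest(name, image, schedule, suspend=False):
    """Return a batch/v1 CronJob manifest as a plain dict (illustrative only)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            # Assumption: a suspended cronjob is one way to "never run" by itself.
            "suspend": suspend,
            # Skip a run that could not be started within 60 s of its slot.
            "startingDeadlineSeconds": 60,
            # Do not start a new job while the previous one is still active.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    # Kubernetes terminates the job after 30 minutes.
                    "activeDeadlineSeconds": 30 * 60,
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [{"name": "ingest", "image": image}],
                        }
                    },
                }
            },
        },
    }


# Hypothetical names and image, chosen only for this sketch.
ingest = cronjob_manifest(
    "loculus-ingest-cronjob-example", "example/ingest:latest", "*/2 * * * *"
)
revoke = cronjob_manifest(
    "loculus-revoke-and-regroup-cronjob-example",
    "example/ingest:latest",
    "0 0 1 1 *",  # a schedule is still required; suspend keeps it from firing
    suspend=True,
)
```

Dumping either dict with json.dumps or a YAML library gives a manifest that can be compared against the real templates.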

Screenshot

PR Checklist

  • Make sure concurrency is working correctly on argocd: this works perfectly; it just means that the jobs are always deleted and we see this:
    image - hope it is not an issue
  • Make sure revoke cronjob can be started manually, using kubectl create job --from=cronjob/loculus-revoke-and-regroup-cronjob-{config.organism} <manual-job-name>
    Start works:
    image, then it also shows up in argocd. And it is deleted after 30min as desired:
image
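
For scripting the manual launch from the checklist, a minimal sketch of a wrapper around the same kubectl command could look like this. The function names, the organism value, and the job name in the usage note are hypothetical, and the sketch assumes kubectl is already pointed at the right cluster and namespace.

```python
import subprocess


def create_job_command(organism, job_name):
    """Build the kubectl argv that creates a one-off job from the revoke cronjob."""
    cronjob = f"loculus-revoke-and-regroup-cronjob-{organism}"
    return ["kubectl", "create", "job", f"--from=cronjob/{cronjob}", job_name]


def launch_revoke_job(organism, job_name):
    """Run the command; raises CalledProcessError if kubectl exits non-zero."""
    subprocess.run(create_job_command(organism, job_name), check=True)
```

Calling launch_revoke_job("example-organism", "manual-revoke-1") would then run the command shown in the checklist with those two values filled in.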

@anna-parker anna-parker added the preview Triggers a deployment to argocd label Aug 21, 2024
@anna-parker
Contributor Author

anna-parker commented Aug 21, 2024

I had to add the approve step; otherwise jobs were starting very quickly after submission and I could still see concurrency. Adding the approve step ensures jobs will run for the full 30 minutes.

@anna-parker anna-parker marked this pull request as ready for review August 21, 2024 08:29
@anna-parker anna-parker requested review from theosanderson and corneliusroemer and removed request for theosanderson August 21, 2024 08:29
@corneliusroemer
Contributor

Nice! What about making approve quit by itself after 25 minutes? Then we get a happy ending and not the kill from Kubernetes.
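
A minimal sketch of that idea, assuming the approve step is a loop that can be given its own deadline: it keeps calling an approve function and returns normally once the time budget is used up, so the process exits cleanly before the 30 minute limit. The function name, the 30 second interval, and the injectable clock and sleep parameters are hypothetical; the actual approve step in ingest/Snakefile is not shown in this thread.

```python
import time


def approve_until_deadline(approve_once, max_seconds=25 * 60, interval=30,
                           clock=time.monotonic, sleep=time.sleep):
    """Call approve_once() repeatedly and return once max_seconds have elapsed.

    Returns the number of calls made. clock and sleep are injectable so the
    loop can be exercised without waiting in real time.
    """
    deadline = clock() + max_seconds
    calls = 0
    while clock() < deadline:
        approve_once()
        calls += 1
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
    return calls
```

With the defaults this leaves roughly five minutes of slack before a 30 minute activeDeadlineSeconds would terminate the job.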

@anna-parker
Contributor Author

anna-parker commented Aug 21, 2024

Sadly, even with argocd.argoproj.io/sync-options: Force=true,Replace=true, the jobs are not being replaced on sync. I am not sure what is going on here.

Update: extensive testing in #2478 (comment) has shown that argocd is actually restarting jobs on resync; this just isn't displayed that well.

@anna-parker anna-parker removed the preview Triggers a deployment to argocd label Aug 21, 2024
@anna-parker anna-parker added the preview Triggers a deployment to argocd label Aug 22, 2024
@anna-parker anna-parker merged commit e1cabe0 into main Aug 22, 2024
12 checks passed
@anna-parker anna-parker deleted the ingest_deployment branch August 22, 2024 10:13

Successfully merging this pull request may close these issues.

Allow removal of fast (duplicate) ingest for production environments