forked from uc-cdis/gen3-helm
Open
Labels
enhancement (New feature or request)
Description
Goals
- K8s-native orchestration via Argo Workflows running Nextflow (k8s executor).
- Multi-tenant isolation by namespace, RBAC, quotas, and network policies.
- Triggers from internal Git servers and/or GitHub Actions → Argo Events.
- Results and cache on Amazon S3 (or S3-compatible on-prem), surfaced in Argo UI and CI artifacts.
High-Level Architecture (Multi-Tenant)
flowchart TB
subgraph Git[Tenant Git Servers & GitHub Actions]
GH[GitHub Actions per-tenant workflows]
GS[Internal Git servers GHES/GitLab/Gitea]
end
GH -->|Webhook / API| AE[Argo Events Gateway per-tenant or shared]
GS -->|Webhook / Polling| AE
subgraph ARGO[Cluster: Argo Control Plane namespace: argo]
AE --> SENS[Argo Sensors]
SENS -->|submit| WFCTRL[Argo Workflows Controller]
UI[Argo Server SSO + RBAC]
end
subgraph TENANT_A[Namespace: wf-teamA]
WFT_A[WorkflowTemplate: nextflow-runner]
SA_A[(ServiceAccount: nextflow-launcher)]
CFG_A[ConfigMap: nextflow.config k8s profile]
SEC_A[(Secret: S3 creds / WI)]
NF_A[Nextflow Executor Pods]
end
subgraph TENANT_B[Namespace: wf-teamB]
WFT_B[WorkflowTemplate: nextflow-runner]
SA_B[(ServiceAccount: nextflow-launcher)]
CFG_B[ConfigMap: nextflow.config]
SEC_B[(Secret: S3 creds / WI)]
NF_B[Nextflow Executor Pods]
end
WFCTRL -->|launch| WFT_A
WFCTRL -->|launch| WFT_B
WFT_A --> NF_A
WFT_B --> NF_B
subgraph S3[Amazon S3 Multi-tenant buckets/prefixes]
CACHE[Nextflow workDir cache]
ART[Reports & results: report.html, timeline.html, trace.txt]
end
NF_A -->|read/write| S3
NF_B -->|read/write| S3
UI -->|view| WFCTRL
Key multi-tenant controls
- One namespace per team: wf-<team>, each with its own SA/RBAC/Quotas/NetworkPolicies.
- Argo Server SSO + RBAC: users only see their namespace.
- S3 buckets or strict prefixes per tenant + lifecycle rules.
- GitHub Actions can trigger Argo (webhook → Argo Events) or call Argo API directly with a short-lived token.
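The "users only see their namespace" control can be sketched with Argo Server's SSO RBAC annotations on a ServiceAccount. This is a minimal sketch, not a confirmed part of the plan: the IdP group name `team-a`, the ServiceAccount and Role names, and the lowercase namespace `wf-team-a` (standing in for the `wf-teamA` placeholder, since Kubernetes namespace names must be lowercase) are all assumptions. It also assumes `sso.rbac.enabled: true` in the workflow controller ConfigMap; depending on the Argo version, the ServiceAccount may have to live in the Argo Server's namespace unless namespace delegation is enabled.

```yaml
# Users whose OIDC "groups" claim contains 'team-a' act as this ServiceAccount.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-users                # hypothetical name
  namespace: wf-team-a              # hypothetical lowercase tenant namespace
  annotations:
    workflows.argoproj.io/rbac-rule: "'team-a' in groups"
    workflows.argoproj.io/rbac-rule-precedence: "1"
---
# What those users may do, limited to their own namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-workflow-user        # hypothetical name
  namespace: wf-team-a
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["workflows", "workflowtemplates", "cronworkflows"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-workflow-user-binding
  namespace: wf-team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: team-a-workflow-user
subjects:
  - kind: ServiceAccount
    name: team-a-users
    namespace: wf-team-a
```

Because the binding is a namespaced RoleBinding rather than a ClusterRoleBinding, a user mapped to this ServiceAccount gets no access to other tenants' workflows.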
Milestones
M1 — Control Plane & Guardrails
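- Install Argo Workflows + Argo Events into the argo namespace.
- Configure Argo Server SSO (OIDC) and Argo RBAC (map IdP groups → namespaces).
- Baseline Pod Security and a default-deny NetworkPolicy in every namespace (NetworkPolicy is namespace-scoped, so cluster-wide coverage means one policy per namespace).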
- Prepare S3: per-tenant bucket (or prefix), IAM policies, lifecycle (cache expiry).
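The Pod Security and default-deny guardrails from M1 could look roughly like the following. It is a sketch under assumptions: the namespace name `wf-team-a` is a hypothetical lowercase form of the `wf-teamA` placeholder, and the `baseline` level is taken from the bullet above rather than from any confirmed cluster policy.

```yaml
# Namespace labelled for Pod Security Admission at the "baseline" level.
apiVersion: v1
kind: Namespace
metadata:
  name: wf-team-a                   # hypothetical lowercase tenant namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: baseline
---
# Default deny: selects every pod in the namespace and allows no traffic.
# Allow rules are added by separate, narrower NetworkPolicies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: wf-team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

The NetworkPolicy only takes effect if the cluster's CNI plugin enforces NetworkPolicy resources.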
M2 — Tenant Blueprint (repeatable)
Create per tenant:
- Namespace wf-<team> (lowercase in practice, since Kubernetes namespace names must be valid DNS labels).
- ServiceAccount nextflow-launcher + namespace-scoped Role/RoleBinding (pods/jobs + argo resources).
- ResourceQuota/LimitRange, NetworkPolicies (egress only to Git, registry, S3, DNS).
- ConfigMap nextflow-config (k8s profile).
- Secret for S3 (or workload identity).
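The quota, limit, and egress items in the blueprint above are not spelled out in the issue, so here is one possible shape. All numbers are placeholders, `wf-team-a` is a hypothetical lowercase tenant namespace, and the egress rule is deliberately coarse: plain NetworkPolicy matches IPs and ports, not hostnames, so restricting egress to specific Git, registry, and S3 endpoints would need their CIDR ranges or a CNI with FQDN policies.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: wf-team-a
spec:
  hard:                             # placeholder caps, tune per tenant
    requests.cpu: "32"
    requests.memory: 64Gi
    limits.cpu: "64"
    limits.memory: 128Gi
    pods: "100"
---
# Defaults applied to containers that do not declare their own resources,
# so every pod is counted against the quota above.
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-defaults
  namespace: wf-team-a
spec:
  limits:
    - type: Container
      defaultRequest: { cpu: 500m, memory: 1Gi }
      default: { cpu: "1", memory: 2Gi }
---
# Egress allowlist: cluster DNS plus outbound HTTPS.
# Pods that talk to the Kubernetes API may need an extra rule for its address.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-dns-https
  namespace: wf-team-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - { protocol: TCP, port: 443 }
```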
RBAC (namespace-scoped)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nextflow-launcher
  namespace: wf-teamA
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nextflow-executor
  namespace: wf-teamA
rules:
  - apiGroups: ["", "batch"]
    resources: ["pods", "pods/log", "pods/status", "jobs"]
    verbs: ["create", "get", "list", "watch", "delete", "patch"]
  - apiGroups: ["argoproj.io"]
    resources: ["workflows", "workflowtemplates", "cronworkflows"]
    verbs: ["create", "get", "list", "watch", "delete", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nextflow-executor-binding
  namespace: wf-teamA
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nextflow-executor
subjects:
  - kind: ServiceAccount
    name: nextflow-launcher
    namespace: wf-teamA
Nextflow k8s profile (uses S3)
apiVersion: v1
kind: ConfigMap
metadata:
  name: nextflow-config
  namespace: wf-teamA
data:
  nextflow.config: |
    profiles {
      k8s {
        process.executor = 'k8s'
        workDir = System.getenv('NF_WORKDIR') ?: 's3://teamA-nf-workdir'
        k8s {
          namespace = 'wf-teamA'
          serviceAccount = 'nextflow-launcher'
          pod = [
            imagePullPolicy: 'IfNotPresent',
            resources: [ limits: [ cpu: '4', memory: '8Gi' ], requests: [ cpu: '1', memory: '2Gi' ] ]
          ]
        }
        aws.client.bucketRegion = System.getenv('AWS_REGION') ?: 'us-west-2'
        // For S3-compatible endpoints, also set aws.client.endpoint & aws.client.s3PathStyleAccess
      }
    }
M3 — Artifact Surfacing (Argo ↔ S3)
- Configure ArtifactRepositoryRef so Argo links HTML reports while Nextflow uses S3 as workDir.
- Option A: one global default in the argo namespace (simple).
- Option B: per-tenant artifact repo (strict separation).
apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
  namespace: argo
data:
  default-v1: |
    s3:
      bucket: teamA-argo-artifacts
      endpoint: s3.amazonaws.com
      region: us-west-2
      accessKeySecret: { name: teamA-s3, key: accessKey }
      secretKeySecret: { name: teamA-s3, key: secretKey }
    archiveLogs: true
M4 — WorkflowTemplate (Nextflow runner, reusable)
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: nextflow-runner
  namespace: wf-teamA
spec:
  entrypoint: run
  arguments:
    parameters:
      - { name: pipeline, value: "./" }
      - { name: params, value: "" }
  templates:
    - name: run
      container:
        image: ghcr.io/yourorg/nf-runtime:jdk17
        # Run from /workspace so the output artifact paths below resolve.
        workingDir: /workspace
        command: ["/bin/bash", "-lc"]
        args:
          - |
            set -euo pipefail
            export NXF_HOME="/workspace/.nxf"
            cp /config/nextflow.config ./nextflow.config || true
            # "./" runs the pipeline in the working directory; any other value is
            # passed to nextflow as a path or repository name (not as a -r revision).
            PIPE="{{workflow.parameters.pipeline}}"
            # params is substituted unquoted so it splits into separate flags.
            nextflow run "$PIPE" -profile k8s {{workflow.parameters.params}} \
              -with-report report.html -with-timeline timeline.html -with-trace trace.txt
        env:
          - name: AWS_REGION
            valueFrom: { secretKeyRef: { name: s3-credentials, key: AWS_REGION } }
          - name: AWS_ACCESS_KEY_ID
            valueFrom: { secretKeyRef: { name: s3-credentials, key: AWS_ACCESS_KEY_ID } }
          - name: AWS_SECRET_ACCESS_KEY
            valueFrom: { secretKeyRef: { name: s3-credentials, key: AWS_SECRET_ACCESS_KEY } }
          - name: NF_WORKDIR
            valueFrom: { secretKeyRef: { name: s3-credentials, key: NF_WORKDIR } }
          # For S3-compatible endpoints, add S3_ENDPOINT + pathStyleAccess in config
        volumeMounts:
          - name: cfg
            mountPath: /config
      volumes:
        - name: cfg
          configMap: { name: nextflow-config }
      outputs:
        artifacts:
          - { name: report, path: /workspace/report.html }
          - { name: timeline, path: /workspace/timeline.html }
          - { name: trace, path: /workspace/trace.txt }
  serviceAccountName: nextflow-launcher
M5 — Triggers
A) GitHub Actions → Argo (Nextflow-based Actions emphasized)
- Preferred: GHA workflow sends a webhook (signed secret) → Argo Events.
- Alternative: GHA calls Argo Server API (OIDC token or robot user) to submit a Workflow from WorkflowTemplate.
GHA example (dispatch → Argo API)
name: trigger-argo-nextflow
on:
  workflow_dispatch:
    inputs:
      pipeline: { description: "Path or repo", default: "./", required: true }
      params: { description: "Nextflow params", required: false, default: "" }
jobs:
  submit:
    runs-on: ubuntu-latest
    steps:
      - name: Submit to Argo (tenant A)
        env:
          ARGO_ADDR: https://argo.example.org
          ARGO_TOKEN: ${{ secrets.ARGO_TEAM_A_BEARER }}
        run: |
          # The /submit endpoint creates a Workflow from an existing template;
          # parameters are passed as "name=value" strings.
          cat > wf.json <<'JSON'
          {
            "namespace": "wf-teamA",
            "resourceKind": "WorkflowTemplate",
            "resourceName": "nextflow-runner",
            "submitOptions": {
              "generateName": "nf-",
              "parameters": [
                "pipeline=${{ github.event.inputs.pipeline }}",
                "params=${{ github.event.inputs.params }}"
              ]
            }
          }
          JSON
          # --fail makes the step fail on an HTTP error response.
          curl -sS --fail -X POST "$ARGO_ADDR/api/v1/workflows/wf-teamA/submit" \
            -H "Authorization: Bearer $ARGO_TOKEN" \
            -H "Content-Type: application/json" \
            --data-binary @wf.json
B) Argo Events (webhook) from Git servers
- EventSource exposes /webhook/<repo>; Sensor parameterizes pipeline & params and submits from WorkflowTemplate.
- Polling fallback: CronJob that runs git ls-remote and fires a local event when refs change.
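The webhook path described above could be wired up along these lines. Treat it as a sketch: the resource names, port, the `team-a-repo` event name, the `operate-workflow-sa` ServiceAccount, and the lowercase namespace `wf-team-a` are hypothetical, and the `body.repository.clone_url` key assumes a GitHub-style push payload (other Git servers use different fields). An EventBus must also exist in the namespace where these are deployed.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: git-webhook
  namespace: argo
spec:
  service:
    ports:
      - port: 12000
        targetPort: 12000
  webhook:
    team-a-repo:                    # hypothetical event name, one per repo
      port: "12000"
      endpoint: /webhook/team-a-repo
      method: POST
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: nextflow-on-push
  namespace: argo
spec:
  template:
    # Hypothetical ServiceAccount; it needs permission to create
    # Workflows in the tenant namespace.
    serviceAccountName: operate-workflow-sa
  dependencies:
    - name: push
      eventSourceName: git-webhook
      eventName: team-a-repo
  triggers:
    - template:
        name: submit-nextflow
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: nf-
                namespace: wf-team-a
              spec:
                workflowTemplateRef:
                  name: nextflow-runner
                arguments:
                  parameters:
                    - name: pipeline
                      value: "./"
                    - name: params
                      value: ""
          parameters:
            # Overwrite the first parameter (pipeline) with the repo URL
            # taken from the webhook payload.
            - src:
                dependencyName: push
                dataKey: body.repository.clone_url
              dest: spec.arguments.parameters.0.value
```

A shared EventSource in the argo namespace keeps the sketch short; a per-tenant EventSource and Sensor in each tenant namespace would give stricter separation at the cost of more objects to manage.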
M6 — Ad-Hoc & Scheduled
- Ad-hoc: Argo UI “Submit” (RBAC isolates per namespace), or argo submit --from workflowtemplate/… -p ….
- Scheduled: CronWorkflow per tenant:
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: nightly-nf
  namespace: wf-teamA
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: "Forbid"
  workflowSpec:
    workflowTemplateRef: { name: nextflow-runner }
    arguments:
      parameters:
        - { name: pipeline, value: "./pipelines/rnaseq" }
        - { name: params, value: "--reads 's3://teamA/raw/*.fastq.gz'" }
M7 — Results & Observability
- S3 is the source of truth: s3://teamA-nf-workdir/<pipeline>/<run-id>/…
- Argo UI shows the Nextflow reports (report.html, timeline.html) and trace.txt via artifacts.
- Centralize logs to Loki/ELK with labels: tenant, pipeline, run_id, commit, trigger=gha|webhook.
- S3 lifecycle: expire cache/work dirs after N days; retain final outputs per policy.
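The lifecycle bullet can be illustrated with a CloudFormation fragment. This is only one way to express it and rests on several assumptions not stated in the issue: that the bucket is managed with CloudFormation, that work directories live under a `work/` prefix, that final outputs live under a different prefix, and that N is 14 days. The bucket name is a hypothetical lowercase variant of the `teamA-nf-workdir` placeholder, since S3 bucket names cannot contain uppercase letters.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  TeamAWorkdirBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: team-a-nf-workdir        # hypothetical bucket name
      LifecycleConfiguration:
        Rules:
          # Expire Nextflow work/cache objects; objects outside work/
          # (for example final results) are not matched by this rule.
          - Id: expire-nextflow-work
            Status: Enabled
            Prefix: work/
            ExpirationInDays: 14           # placeholder for "N days"
          # Clean up multipart uploads left behind by failed tasks.
          - Id: abort-stale-multipart-uploads
            Status: Enabled
            AbortIncompleteMultipartUpload:
              DaysAfterInitiation: 7
```

Expiring the work prefix also invalidates Nextflow's resume cache for older runs, so N should be at least as long as the window in which reruns are expected.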
Multi-Tenant Checklist
- Namespaces: one per team; no shared runtime.
- RBAC: namespace-scoped Roles; Argo RBAC maps IdP groups → only their namespace.
- NetworkPolicies: default deny; egress allowlist (Git, registry, S3, DNS).
- Quotas: per namespace CPU/MEM/POD caps; optional PriorityClasses.
- S3 segmentation: buckets per tenant or prefixes with IAM boundaries.
- Supply chain: signed/pinned images; OPA/Gatekeeper allow-list.
- Secrets: External Secrets / workload identity; never mount cluster-wide creds.
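For the Secrets item, an External Secrets Operator resource could populate the `s3-credentials` Secret that the WorkflowTemplate reads. This sketch assumes the operator is installed and that a ClusterSecretStore named `tenant-secrets` exists; that store name, the remote key `tenants/team-a/s3`, its property names, and the lowercase namespace `wf-team-a` are all hypothetical. Newer operator releases may serve this kind under a different API version.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: s3-credentials
  namespace: wf-team-a
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: tenant-secrets              # hypothetical store
  target:
    name: s3-credentials              # Secret consumed by nextflow-runner
  data:
    - secretKey: AWS_ACCESS_KEY_ID
      remoteRef: { key: tenants/team-a/s3, property: access_key_id }
    - secretKey: AWS_SECRET_ACCESS_KEY
      remoteRef: { key: tenants/team-a/s3, property: secret_access_key }
    - secretKey: AWS_REGION
      remoteRef: { key: tenants/team-a/s3, property: region }
    - secretKey: NF_WORKDIR
      remoteRef: { key: tenants/team-a/s3, property: workdir }
```

With workload identity instead of static keys, the two AWS key entries would be dropped and the ServiceAccount annotated for the cloud provider's identity mechanism.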
Acceptance
- Ad-hoc run (UI + CLI) and GitHub Actions-triggered run both succeed.
- Nextflow executor pods stay within tenant namespace; no cross-tenant access.
- Artifacts visible in Argo UI; S3 paths follow tenant/run-ID convention.
- Quota breach is contained to the tenant; other tenants unaffected.
- Lifecycle rules reclaim cache without deleting retained outputs.