diff --git a/docs/modules/ROOT/pages/adr/0049-managed-openbao.adoc b/docs/modules/ROOT/pages/adr/0049-managed-openbao.adoc new file mode 100644 index 00000000..cbdd3e60 --- /dev/null +++ b/docs/modules/ROOT/pages/adr/0049-managed-openbao.adoc @@ -0,0 +1,331 @@ += ADR 0049 - Managed OpenBao Service Implementation +:adr_author: Yannik Dällenbach +:adr_owner: Schedar/bespinian +:adr_reviewers: Schedar +:adr_date: 2025-01-13 +:adr_upd_date: 2025-01-13 +:adr_status: draft +:adr_tags: service,openbao,secret-management + +include::partial$adr-meta.adoc[] + +[NOTE] +.Summary +==== +This ADR outlines the implementation of a managed OpenBao service on the AppCat platform to provide secret management capabilities to customers. +This builds upon the suggestion of xref:adr/0024-product-choice-for-secret-management.adoc[] to use OpenBao as the secret and PKI management solution. +==== + +== Context + +Following the suggestion in xref:adr/0024-product-choice-for-secret-management.adoc[] to use OpenBao for secret management, we need to implement it as a managed service within the AppCat ecosystem. 
OpenBao provides: + +- Secret storage with REST API +- Vault API compatibility +- Open-source license with Linux Foundation backing +- Self-hostable deployment model + +The service must integrate with the existing AppCat patterns including: + +- Crossplane-based provisioning +- Managed namespace deployment model +- User-workload monitoring integration +- Backup and maintenance automation +- SLA monitoring and reporting + +== Requirements + +=== Functional Requirements + +* **Secret Management**: Store, retrieve, and manage secrets via REST API +* **API Compatibility**: Maintain Vault API compatibility for existing tooling +* **High Availability**: Support clustered deployment for production workloads +* **Authentication**: Integration with OIDC + +=== Operational Requirements + +* **Backup & Recovery**: Automated backup of secret data +* **Monitoring**: SLA metrics, capacity alerts, and operational dashboards +* **Maintenance**: Automated security updates and version upgrades +* **Scaling**: Horizontal scaling capabilities for high-throughput scenarios +* **Security**: Encryption at rest, TLS in transit, audit logging + +== Proposals + +=== Proposal 1: Helm Chart with External Storage + +Deploy OpenBao using the official Helm chart with external storage backends (PostgreSQL). 
+ +Implementation:: + +- Use `provider-helm` to deploy OpenBao Helm chart +- PostgreSQL backend for secret storage (leveraging existing `VSHNPostgreSQL`) +- Initialization through composition functions + +Advantages:: + +- Leverages existing PostgreSQL infrastructure +- Official Helm chart provides production-ready deployment +- Separation of compute and storage for better scaling +- Familiar AppCat deployment patterns + +Disadvantages:: + +- Additional complexity in managing external dependencies +- Potential performance overhead with external storage + +=== Proposal 2: Helm Chart with Internal Storage + +Deploy OpenBao using the official Helm chart with integrated storage using Raft consensus. + +Implementation:: + +- Use `provider-helm` to deploy OpenBao Helm chart +- Raft storage backend for simplicity +- Built-in clustering for high availability + +Advantages:: + +- Simplified deployment with fewer external dependencies +- Built-in consensus and replication + +Disadvantages:: + +- Raft cluster management overhead + +=== Proposal 3: Operator-Based Deployment + +Develop or adopt an OpenBao operator for Kubernetes-native management. + +Implementation:: + +- Custom operator following AppCat patterns +- CRDs for vault configuration and policies +- Automated lifecycle management +- Native Kubernetes integration + +Advantages:: + +- Full Kubernetes-native experience +- Automated day-2 operations +- Extensible for future features + +Disadvantages:: + +- High development effort +- Additional operational complexity +- Maintenance burden for custom operator + +== Decision + +**Proposal 2: Helm Chart with Internal Storage** + +We choose to implement OpenBao using the official Helm chart with integrated Raft storage. 
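
The decision above could translate into a `provider-helm` `Release` roughly as follows. This is a minimal sketch, not the final composition output: it assumes the OpenBao chart keeps the `server.ha.raft` value layout it inherited from the Vault chart, and the release name, namespace, `ProviderConfig` name, and volume size are hypothetical. The chart version is intentionally left unpinned.

```yaml
apiVersion: helm.crossplane.io/v1beta1
kind: Release
metadata:
  name: my-openbao-release            # hypothetical name
spec:
  forProvider:
    chart:
      name: openbao
      repository: https://openbao.github.io/openbao-helm
      # version: pin to a tested chart version in a real composition
    namespace: vshn-openbao-my-openbao  # hypothetical instance namespace
    values:
      server:
        ha:
          enabled: true
          replicas: 3       # three-node HA deployment
          raft:
            enabled: true   # integrated Raft storage (Proposal 2)
        dataStorage:
          enabled: true
          size: 20Gi        # per-replica Raft volume (illustrative)
  providerConfigRef:
    name: helm              # hypothetical ProviderConfig
```

Keeping Raft and replica settings in chart values means the composition only has to map claim parameters onto `values`, without managing an external storage backend.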
+ +=== Implementation Details + +Storage Backend:: + +- Primary: Raft consensus storage for built-in clustering +- Leverage existing AppCat backup mechanisms (K8up) +- Self-contained storage eliminates external dependencies + +**API Specification:** + +The `VSHNOpenBao` CRD follows AppCat conventions (xref:adr/0016-service-api-design.adoc[]) with parameter groups for service configuration, sizing, backup, monitoring, and maintenance. + +```yaml +apiVersion: vshn.appcat.vshn.io/v1 +kind: VSHNOpenBao +metadata: + name: my-openbao + namespace: my-namespace +spec: + parameters: + # Service configuration + service: + version: "2.1.0" # OpenBao version (enum of supported versions) + fqdn: "openbao.example.com" # Fully qualified domain name + serviceLevel: guaranteed # besteffort or guaranteed + openBaoSettings: + # Auto-unseal configuration (optional) + # Enables automatic unsealing using external key management systems + # Only one provider should be configured at a time + autoUnseal: + awsKmsSecretRef: "" # Reference to secret containing AWS KMS credentials and configuration + azureKeyVaultSecretRef: "" # Reference to secret containing Azure Key Vault credentials and configuration + gcpKmsSecretRef: "" # Reference to secret containing GCP Cloud KMS credentials and configuration + transitSecretRef: "" # Reference to secret containing connection details to another Vault/OpenBao instance + + # Number of OpenBao instances + # For guaranteed serviceLevel: must be 3 + # For besteffort serviceLevel: can be 1 or 3 + instances: 3 + + # Sizing + size: + plan: standard-4 # Resource plan: standard-2, standard-4, standard-8 + requests: + cpu: "2" + memory: "4Gi" + disk: 20Gi # Raft storage volume size per replica + storageClass: "" # Optional storage class override + + # Backup and restore configuration (using K8up) + backup: + enabled: true + schedule: "0 2 * * *" # Cron schedule for Raft snapshots + retention: + keepLast: 2 + keepHourly: 2 + keepDaily: 7 + keepWeekly: 4 + keepMonthly: 3 + 
restore: + claimName: "" + backupName: "" + + # Maintenance window + maintenance: + dayOfWeek: Tuesday # enum: Monday-Sunday + timeOfDay: "22:00" # HH:MM format in UTC + + # Monitoring + monitoring: + alertmanagerConfigRef: "" + alertmanagerConfigSecretRef: {} + alertmanagerConfigTemplate: {} + email: "" + + # Unseal keys and root token secret reference + # This secret will contain the unseal keys and root token generated during initialization + writeConnectionSecretToRef: + name: openbao-unseal-keys +``` + +**Unseal Keys Secret:** + +The `writeConnectionSecretToRef` secret contains the unseal keys and root token: + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: openbao-unseal-keys +data: + UNSEAL_KEY_1: + UNSEAL_KEY_2: + UNSEAL_KEY_3: + UNSEAL_KEY_4: + UNSEAL_KEY_5: + ROOT_TOKEN: +``` + +**Auto-unseal:** + +Auto-unseal lets OpenBao unseal itself without manual intervention by using an external key management system. This is crucial for automated recovery and reduces the operational burden. + +By default, OpenBao instances are configured to auto-unseal against a central, internal, VSHN-managed Vault or OpenBao instance. + +WARNING: If a customer configures their own auto-unseal provider, only the `besteffort` service level can be offered. + +Supported auto-unseal providers: + +AWS KMS::: Configure using `awsKmsSecretRef` pointing to a secret containing AWS credentials and KMS key configuration +Azure Key Vault::: Configure using `azureKeyVaultSecretRef` pointing to a secret containing Azure credentials and Key Vault details +GCP Cloud KMS::: Configure using `gcpKmsSecretRef` pointing to a secret containing GCP credentials and Cloud KMS configuration +Transit (Vault/OpenBao)::: Configure using `transitSecretRef` pointing to a secret containing connection details to another Vault/OpenBao instance + +Each secret reference should contain the necessary credentials and configuration for the respective provider. 
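
As an illustration of the transit provider listed above, a secret referenced by `transitSecretRef` might look like the following. This is a sketch under assumptions: the key names mirror the parameters of OpenBao's `seal "transit"` stanza, but the exact keys the service will read are not defined in this ADR, and the secret name, address, key name, and token placeholder are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openbao-transit-config   # hypothetical name
type: Opaque
stringData:
  address: "https://openbao-central.example.com:8200"  # hypothetical transit endpoint
  token: "<transit-token>"       # placeholder, never commit a real token
  key_name: "autounseal"         # hypothetical transit key used for unsealing
  mount_path: "transit/"         # mount path of the transit secrets engine
```

Using the same lowercase, seal-parameter-style keys across all provider secrets would keep the mapping from secret to seal configuration mechanical.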
When auto-unseal is configured, OpenBao will automatically unseal after restarts without requiring the unseal keys from `writeConnectionSecretToRef`. + +If no auto-unseal provider is configured, manual unsealing using the unseal keys is required after each pod restart. + +Example AWS KMS auto-unseal secret: + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: openbao-awskms-config +type: Opaque +stringData: + region: "us-east-1" + access_key: "AKIAIOSFODNN7EXAMPLE" + secret_key: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" + kms_key_id: "19ec80b0-dfdd-4d97-8164-c6examplekey" + endpoint: "https://vpce-0e1bb1852241f8cc6-pzi0do8n.kms.us-east-1.vpce.amazonaws.com" +``` + +**Service Levels:** + +besteffort:: +- 1 or 3 instances +- Standard resource guarantees +- Best-effort availability + +guaranteed:: +- Requires 3 instances (HA deployment) +- Resource guarantees with pod anti-affinity +- Higher availability SLA + +**Plans:** + +By default, the following plans are available on every cluster: + +[cols="25a,15,15,15", options="header"] +|=== +| Plan | CPU | Memory | Disk +| standard-2 | 500m | 2Gi | 16Gi +| standard-4 | 1 | 4Gi | 16Gi +| standard-8 | 2 | 8Gi | 16Gi +|=== + +Key Components:: + +1. **OpenBao Cluster**: 3-node HA deployment with Raft consensus +2. **Raft Storage**: Built-in distributed storage backend +3. **Backup Storage**: `ObjectBucket` for Raft snapshots using K8up +4. 
**Monitoring**: Custom SLI exporter and Prometheus integration + +Security Model:: + +- TLS encryption for all communications +- RBAC policies managed through OpenBao +- Audit logging to persistent storage +- Auto-unseal configuration for OpenBao bootstrap + +== Consequences + +Positive:: + +- Simplified deployment with fewer external dependencies +- Built-in consensus and replication reduces operational complexity +- Self-contained backup mechanisms using Raft snapshots +- Leverages official OpenBao Helm chart for production readiness +- Eliminates external storage dependency management + +Negative:: + +- Raft cluster management requires specialized knowledge +- Limited to OpenBao's built-in storage capabilities +- Potential storage scaling limitations compared to external databases +- No feature parity with HashiCorp Vault Enterprise + +Operational Impact:: + +- Simplified service deployment with reduced external dependencies +- Raft snapshot management and restoration procedures +- Need for OpenBao and Raft consensus expertise in operations team +- Integration testing with existing AppCat services +- TLS certificate lifecycle management (renewal, rotation) +- Auto-unseal configuration and cluster bootstrap management +- Raft cluster health monitoring and node management +- Audit log management and compliance reporting +- ServiceMonitor configuration for Prometheus integration +- Snapshot-based backup validation and testing + +Customer Benefits:: + +- Self-hosted alternative to cloud secret management services +- Vault API compatibility for existing applications and tooling +- Compliance with data sovereignty requirements diff --git a/docs/modules/ROOT/partials/nav-adrs.adoc b/docs/modules/ROOT/partials/nav-adrs.adoc index cd51872e..2f7cffb3 100644 --- a/docs/modules/ROOT/partials/nav-adrs.adoc +++ b/docs/modules/ROOT/partials/nav-adrs.adoc @@ -45,4 +45,5 @@ ** xref:adr/0045-service-orchestration-crossplane-2-0.adoc[] ** 
xref:adr/0046-secret-management-framework-2-0.adoc[] ** xref:adr/0047-service-maintenance-and-upgrades-framework-2-0.adoc[] -** xref:adr/0048-evaluating-vector-databases-as-appcat-services.adoc[] \ No newline at end of file +** xref:adr/0048-evaluating-vector-databases-as-appcat-services.adoc[] +** xref:adr/0049-managed-openbao.adoc[] diff --git a/templates/adr/cookiecutter.json b/templates/adr/cookiecutter.json index 0241b844..c5b5b833 100644 --- a/templates/adr/cookiecutter.json +++ b/templates/adr/cookiecutter.json @@ -1,5 +1,5 @@ { - "adr_number": "0049", + "adr_number": "0050", "full_name": "VSHNeer Name", "adr_title": "Title", "adr_reviewers": "", @@ -37,4 +37,4 @@ }, "adr_tags": "Comma separated list of tags - all lowercase" } -} \ No newline at end of file +}