331 changes: 331 additions & 0 deletions docs/modules/ROOT/pages/adr/0049-managed-openbao.adoc
@@ -0,0 +1,331 @@
= ADR 0049 - Managed OpenBao Service Implementation
:adr_author: Yannik Dällenbach
:adr_owner: Schedar/bespinian
:adr_reviewers: Schedar
:adr_date: 2025-01-13
:adr_upd_date: 2025-01-13
:adr_status: draft
:adr_tags: service,openbao,secret-management

include::partial$adr-meta.adoc[]

[NOTE]
.Summary
====
This ADR outlines the implementation of a managed OpenBao service on the AppCat platform to provide secret management capabilities to customers.
This builds upon the suggestion of xref:adr/0024-product-choice-for-secret-management.adoc[] to use OpenBao as the secret and PKI management solution.
====

== Context

Following the suggestion in xref:adr/0024-product-choice-for-secret-management.adoc[] to use OpenBao for secret management, we need to implement it as a managed service within the AppCat ecosystem. OpenBao provides:

- Secret storage with REST API
- Vault API compatibility
- Open-source license with Linux Foundation backing
- Self-hostable deployment model

The service must integrate with the existing AppCat patterns including:

- Crossplane-based provisioning
- Managed namespace deployment model
- User-workload monitoring integration
- Backup and maintenance automation
- SLA monitoring and reporting

== Requirements

=== Functional Requirements

* **Secret Management**: Store, retrieve, and manage secrets via REST API
* **API Compatibility**: Maintain Vault API compatibility for existing tooling
* **High Availability**: Support clustered deployment for production workloads
* **Authentication**: Integration with OIDC

=== Operational Requirements

* **Backup & Recovery**: Automated backup of secret data
* **Monitoring**: SLA metrics, capacity alerts, and operational dashboards
* **Maintenance**: Automated security updates and version upgrades
* **Scaling**: Horizontal scaling capabilities for high-throughput scenarios
* **Security**: Encryption at rest, TLS in transit, audit logging

== Proposals

=== Proposal 1: Helm Chart with External Storage

Deploy OpenBao using the official Helm chart with external storage backends (PostgreSQL).

Implementation::

- Use `provider-helm` to deploy OpenBao Helm chart
- PostgreSQL backend for secret storage (leveraging existing `VSHNPostgreSQL`)
- Initialization through composition functions

Advantages::

- Leverages existing PostgreSQL infrastructure
- Official Helm chart provides production-ready deployment
- Separation of compute and storage for better scaling
- Familiar AppCat deployment patterns

Disadvantages::

- Additional complexity in managing external dependencies
- Potential performance overhead with external storage

=== Proposal 2: Helm Chart with Internal Storage

Deploy OpenBao using the official Helm chart with integrated storage using Raft consensus.

Implementation::

- Use `provider-helm` to deploy OpenBao Helm chart
- Raft storage backend for simplicity
- Built-in clustering for high availability

Advantages::

- Simplified deployment with fewer external dependencies
- Built-in consensus and replication

Disadvantages::

- Raft cluster management overhead
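
As an illustration of this proposal, a `provider-helm` release for a three-node Raft cluster might look roughly like the sketch below. The chart repository URL, the `providerConfigRef` name, the target namespace, and the value keys (`server.ha.*`, `server.dataStorage.*`) are assumptions modelled on the upstream chart layout and should be checked against the chart's `values.yaml` before use.

```yaml
# Sketch only: names, namespace, repository URL and value keys are assumptions.
apiVersion: helm.crossplane.io/v1beta1
kind: Release
metadata:
  name: my-openbao-release            # hypothetical release name
spec:
  providerConfigRef:
    name: helm                        # hypothetical ProviderConfig
  forProvider:
    namespace: vshn-openbao-my-openbao   # hypothetical managed namespace
    chart:
      name: openbao
      repository: https://openbao.github.io/openbao-helm   # assumed chart repository
    values:
      server:
        ha:
          enabled: true
          replicas: 3                 # matches `instances: 3`
          raft:
            enabled: true             # integrated Raft storage
        dataStorage:
          enabled: true
          size: 20Gi                  # Raft volume size per replica
```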

=== Proposal 3: Operator-Based Deployment

Develop or adopt an OpenBao operator for Kubernetes-native management.

Implementation::

- Custom operator following AppCat patterns
- CRDs for vault configuration and policies
- Automated lifecycle management
- Native Kubernetes integration

Advantages::

- Full Kubernetes-native experience
- Automated day-2 operations
- Extensible for future features

Disadvantages::

- High development effort
- Additional operational complexity
- Maintenance burden for custom operator

== Decision

**Proposal 2: Helm Chart with Internal Storage**

We choose to implement OpenBao using the official Helm chart with integrated Raft storage.

=== Implementation Details

Storage Backend::

- Primary: Raft consensus storage for built-in clustering
- Leverage existing AppCat Backup mechanisms (K8up)
- Self-contained storage eliminates external dependencies

**API Specification:**

The VSHNOpenBao CRD follows AppCat conventions (xref:adr/0016-service-api-design.adoc[]) with parameter groups for service configuration, sizing, backup, monitoring, and maintenance.

```yaml
apiVersion: vshn.appcat.vshn.io/v1
kind: VSHNOpenBao
metadata:
  name: my-openbao
  namespace: my-namespace
spec:
  parameters:
    # Service configuration
    service:
      version: "2.1.0" # OpenBao version (enum of supported versions)
      fqdn: "openbao.example.com" # Fully qualified domain name
      serviceLevel: guaranteed # besteffort or guaranteed
      openBaoSettings:
        # Auto-unseal configuration (optional)
        # Enables automatic unsealing using external key management systems
        # Only one provider should be configured at a time
        autoUnseal:
          awsKmsSecretRef: "" # Reference to secret containing AWS KMS credentials and configuration
          azureKeyVaultSecretRef: "" # Reference to secret containing Azure Key Vault credentials and configuration
          gcpKmsSecretRef: "" # Reference to secret containing GCP Cloud KMS credentials and configuration
          transitSecretRef: "" # Reference to secret containing connection details to another Vault/OpenBao instance

    # Number of OpenBao instances
    # For guaranteed serviceLevel: must be 3
    # For besteffort serviceLevel: can be 1 or 3
    instances: 3

    # Sizing
    size:
      plan: standard # Resource plan: small, standard, large
      requests:
        cpu: "2"
        memory: "4Gi"
      disk: 20Gi # Raft storage volume size per replica
      storageClass: "" # Optional storage class override

    # Backup and restore configuration (using K8up)
    backup:
      enabled: true
      schedule: "0 2 * * *" # Cron schedule for Raft snapshots
      retention:
        keepLast: 2
        keepHourly: 2
        keepDaily: 7
        keepWeekly: 4
        keepMonthly: 3
    restore:
      claimName: ""
      backupName: ""

    # Maintenance window
    maintenance:
      dayOfWeek: Tuesday # enum: Monday-Sunday
      timeOfDay: "22:00" # HH:MM format in UTC

    # Monitoring
    monitoring:
      alertmanagerConfigRef: ""
      alertmanagerConfigSecretRef: {}
      alertmanagerConfigTemplate: {}
      email: ""

  # Unseal keys and root token secret reference
  # This secret will contain the unseal keys and root token generated during initialization
  writeConnectionSecretToRef:
    name: openbao-unseal-keys
```

**Unseal Keys Secret:**

The `writeConnectionSecretToRef` secret contains the unseal keys and root token:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openbao-unseal-keys
data:
  UNSEAL_KEY_1: <base64-encoded-key>
  UNSEAL_KEY_2: <base64-encoded-key>
  UNSEAL_KEY_3: <base64-encoded-key>
  UNSEAL_KEY_4: <base64-encoded-key>
  UNSEAL_KEY_5: <base64-encoded-key>
  ROOT_TOKEN: <base64-encoded-root-token>
```

**Auto-unseal**

Auto-unseal allows OpenBao to unseal automatically, without manual intervention, by using an external key management system. This is crucial for automated recovery and reduces operational burden.

By default, OpenBao instances are configured to auto-unseal against a central, internal, VSHN-managed Vault or OpenBao instance.

WARNING: If a customer configures their own auto-unseal provider, only the `besteffort` service level can be offered.

Supported auto-unseal providers:

AWS KMS::: Configure using `awsKmsSecretRef` pointing to a secret containing AWS credentials and KMS key configuration
Azure Key Vault::: Configure using `azureKeyVaultSecretRef` pointing to a secret containing Azure credentials and Key Vault details
GCP Cloud KMS::: Configure using `gcpKmsSecretRef` pointing to a secret containing GCP credentials and Cloud KMS configuration
Transit (Vault/OpenBao)::: Configure using `transitSecretRef` pointing to a secret containing connection details to another Vault/OpenBao instance

Each secret reference should contain the necessary credentials and configuration for the respective provider. When auto-unseal is configured, OpenBao unseals automatically after restarts without requiring the unseal keys from the `writeConnectionSecretToRef` secret.

If no auto-unseal provider is configured, manual unsealing using the unseal keys is required after each pod restart.

Example AWS KMS auto-unseal secret:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openbao-awskms-config
type: Opaque
stringData:
  region: "us-east-1"
  access_key: "AKIAIOSFODNN7EXAMPLE"
  secret_key: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
  kms_key_id: "19ec80b0-dfdd-4d97-8164-c6examplekey"
  endpoint: "https://vpce-0e1bb1852241f8cc6-pzi0do8n.kms.us-east-1.vpce.amazonaws.com"
```
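
A claim could then point at such a secret through `awsKmsSecretRef`. The fragment below is a minimal sketch that assumes the secret lives in the same namespace as the claim and that the reference is a plain secret name, as the empty-string defaults in the API specification suggest. The service level is set to `besteffort` in line with the warning above.

```yaml
# Sketch only: assumes the reference is the name of a secret in the claim namespace.
apiVersion: vshn.appcat.vshn.io/v1
kind: VSHNOpenBao
metadata:
  name: my-openbao
  namespace: my-namespace
spec:
  parameters:
    service:
      serviceLevel: besteffort        # customer-provided auto-unseal provider
      openBaoSettings:
        autoUnseal:
          awsKmsSecretRef: openbao-awskms-config   # exactly one provider is set
  writeConnectionSecretToRef:
    name: openbao-unseal-keys
```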

**Service Levels:**

besteffort::
- 1 or 3 instances
- Standard resource guarantees
- Best-effort availability

guaranteed::
- Requires 3 instances (HA deployment)
- Resource guarantees with pod anti-affinity
- Higher availability SLA
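
The relationship between service level and instance count can be seen by putting two claims side by side. The sketch below only uses fields from the API specification above; the claim names are hypothetical.

```yaml
# Sketch only: claim names are hypothetical.
# Single-instance deployment, allowed only with besteffort.
apiVersion: vshn.appcat.vshn.io/v1
kind: VSHNOpenBao
metadata:
  name: openbao-dev
  namespace: my-namespace
spec:
  parameters:
    service:
      serviceLevel: besteffort
    instances: 1                      # 1 or 3 are valid for besteffort
  writeConnectionSecretToRef:
    name: openbao-dev-unseal-keys
---
# HA deployment, required for guaranteed.
apiVersion: vshn.appcat.vshn.io/v1
kind: VSHNOpenBao
metadata:
  name: openbao-prod
  namespace: my-namespace
spec:
  parameters:
    service:
      serviceLevel: guaranteed
    instances: 3                      # must be 3 for guaranteed
  writeConnectionSecretToRef:
    name: openbao-prod-unseal-keys
```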

**Plans:**

By default, the following plans are available on every cluster:

[cols="25a,15,15,15", options="header"]
|===
| Plan | CPU | Memory | Disk
| standard-2 | 500m | 2Gi | 16Gi
| standard-4 | 1 | 4Gi | 16Gi
| standard-8 | 2 | 8Gi | 16Gi
|===

Key Components::

1. **OpenBao Cluster**: 3-node HA deployment with Raft consensus
2. **Raft Storage**: Built-in distributed storage backend
3. **Backup Storage**: `ObjectBucket` for Raft snapshots using K8up
4. **Monitoring**: Custom SLI exporter and Prometheus integration
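
For the backup component, the `backup` parameters of the claim could be translated into a K8up `Schedule` in the instance namespace. The following is a sketch under assumptions: the bucket name, S3 endpoint, secret names, and the prune schedule are hypothetical, and how the Raft snapshot itself is produced (for example through a pre-backup step) is left open here.

```yaml
# Sketch only: bucket, endpoint, secret names and prune schedule are hypothetical.
apiVersion: k8up.io/v1
kind: Schedule
metadata:
  name: openbao-backup
spec:
  backend:
    repoPasswordSecretRef:
      name: openbao-backup-password
      key: password
    s3:
      endpoint: https://objects.example.com
      bucket: my-openbao-backup
      accessKeyIDSecretRef:
        name: openbao-backup-bucket
        key: AWS_ACCESS_KEY_ID
      secretAccessKeySecretRef:
        name: openbao-backup-bucket
        key: AWS_SECRET_ACCESS_KEY
  backup:
    schedule: "0 2 * * *"             # taken from spec.parameters.backup.schedule
  prune:
    schedule: "0 4 * * *"             # hypothetical, runs after the backup
    retention:                        # taken from spec.parameters.backup.retention
      keepLast: 2
      keepHourly: 2
      keepDaily: 7
      keepWeekly: 4
      keepMonthly: 3
```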

Security Model::

- TLS encryption for all communications
- RBAC policies managed through OpenBao
- Audit logging to persistent storage
- Auto-unseal configuration for OpenBao bootstrap

== Consequences

Positive::

- Simplified deployment with fewer external dependencies
- Built-in consensus and replication reduces operational complexity
- Self-contained backup mechanisms using Raft snapshots
- Leverages official OpenBao Helm chart for production readiness
- Eliminates external storage dependency management

Negative::

- Raft cluster management requires specialized knowledge
- Limited to OpenBao's built-in storage capabilities
- Potential storage scaling limitations compared to external databases
- No feature parity with HashiCorp Vault Enterprise

Operational Impact::

- Simplified service deployment with reduced external dependencies
- Raft snapshot management and restoration procedures
- Need for OpenBao and Raft consensus expertise in operations team
- Integration testing with existing AppCat services
- TLS certificate lifecycle management (renewal, rotation)
- Auto-unseal configuration and cluster bootstrap management
- Raft cluster health monitoring and node management
- Audit log management and compliance reporting
- ServiceMonitor configuration for Prometheus integration
- Snapshot-based backup validation and testing
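
As a rough idea of the Prometheus integration mentioned above, a `ServiceMonitor` could scrape the metrics endpoint of the OpenBao service. This sketch assumes that the Vault-compatible `/v1/sys/metrics` endpoint is used, that the listener permits scraping without a token, and that the label, port name, and certificate settings match what the Helm chart actually renders; all of these are assumptions to verify.

```yaml
# Sketch only: labels, port name and TLS settings are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openbao
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: openbao
  endpoints:
    - port: https
      scheme: https
      path: /v1/sys/metrics
      params:
        format:
          - prometheus                # request Prometheus exposition format
      interval: 30s
      tlsConfig:
        serverName: openbao.example.com   # FQDN from the claim
```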

Customer Benefits::

- Self-hosted alternative to cloud secret management services
- Vault API compatibility for existing applications and tooling
- Compliance with data sovereignty requirements
3 changes: 2 additions & 1 deletion docs/modules/ROOT/partials/nav-adrs.adoc
@@ -45,4 +45,5 @@
** xref:adr/0045-service-orchestration-crossplane-2-0.adoc[]
** xref:adr/0046-secret-management-framework-2-0.adoc[]
** xref:adr/0047-service-maintenance-and-upgrades-framework-2-0.adoc[]
-** xref:adr/0048-evaluating-vector-databases-as-appcat-services.adoc[]
+** xref:adr/0048-evaluating-vector-databases-as-appcat-services.adoc[]
+** xref:adr/0049-managed-openbao.adoc[]
4 changes: 2 additions & 2 deletions templates/adr/cookiecutter.json
@@ -1,5 +1,5 @@
{
-"adr_number": "0049",
+"adr_number": "0050",
"full_name": "VSHNeer Name",
"adr_title": "Title",
"adr_reviewers": "",
@@ -37,4 +37,4 @@
},
"adr_tags": "Comma separated list of tags - all lowercase"
}
-}
+}