diff --git a/docs/src/dev-docs/design-deployment-orchestration.md b/docs/src/dev-docs/design-deployment-orchestration.md index 80bbe0c675..51efa05788 100755 --- a/docs/src/dev-docs/design-deployment-orchestration.md +++ b/docs/src/dev-docs/design-deployment-orchestration.md @@ -1,8 +1,9 @@ # Deployment orchestration -The CLP package is composed of several components that are currently designed to be deployed in a -set of containers that are orchestrated using a framework like [Docker Compose][docker-compose]. -This document explains the architecture of that orchestration and any associated nuances. +The CLP package is composed of several components that are designed to be deployed in a set of +containers that are orchestrated using a framework like [Docker Compose][docker-compose] or +[Kubernetes][kubernetes] (via [Helm][helm]). This document explains the architecture of that +orchestration and any associated nuances. ## Architecture @@ -188,102 +189,96 @@ graph LR **Table 2**: One-time initialization jobs in the CLP package. :::: -## Code structure +## Orchestration methods -The orchestration code is split up into: +CLP supports two orchestration methods: Docker Compose for single-host or manual multi-host +deployments, and Helm for Kubernetes deployments. Both methods share the same configuration +interface (`clp-config.yaml` and `credentials.yaml`) and support the same deployment types. -* `BaseController` that defines: - * common logic for preparing the environment variables, configuration files, and directories - necessary for each service. - * abstract methods that orchestrator-specific derived classes must implement in order to - orchestrate a deployment. -* `<Orchestrator>Controller` that implements (and/or overrides) any of the methods in - `BaseController` (`<Orchestrator>` is a placeholder for the specific orchestrator for which the - class is being implemented). 
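The controller hierarchy described above can be sketched as follows; aside from `start` and `stop`, the class and method names here are illustrative assumptions rather than the actual CLP API:

```python
from abc import ABC, abstractmethod

class BaseController(ABC):
    """Common setup shared by all orchestrators (a simplified sketch)."""

    def prepare_environment(self) -> dict:
        # Stands in for generating config files, environment variables, and
        # directories for each service.
        return {"CLP_INSTANCE_ID": "example-instance"}

    @abstractmethod
    def start(self) -> None:
        """Bring up the deployment with the specific orchestrator."""

    @abstractmethod
    def stop(self) -> None:
        """Tear the deployment down."""

class DockerComposeController(BaseController):
    def start(self) -> None:
        self.env = self.prepare_environment()
        # A real implementation would now invoke `docker compose up`.

    def stop(self) -> None:
        pass  # A real implementation would invoke `docker compose down`.
```

Only the concrete subclass can be instantiated; the abstract base enforces that every orchestrator provides its own `start`/`stop`.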
+### Configuration -## Docker Compose orchestration +Each service requires configuration values passed through config files, environment variables, +and/or command line arguments. Since services run in containers, some values must be adapted for the +orchestration environment—specifically, host paths must be converted to container paths, and +hostnames/ports must use service discovery mechanisms. -This section explains how we use Docker Compose to orchestrate the CLP package and is broken into -the following subsections: +The orchestration controller (e.g., `DockerComposeController`) reads `etc/clp-config.yaml` and +`etc/credentials.yaml`, then generates: +* A container-specific CLP config file with adapted paths and service names +* Runtime configuration (environment variables or ConfigMaps) +* Required directories (e.g., data output directories) -* [Setting up the Docker Compose project's environment](#setting-up-the-environment) -* [Starting and stoping the Docker Compose project](#starting-and-stopping-the-project) -* [Deployment types](#deployment-types) -* [Implementation details](#implementation-details) -* [Troubleshooting](#troubleshooting) +For Docker Compose, this generates `var/log/.clp-config.yaml` and `.env`. For Kubernetes, the Helm +chart generates a ConfigMap and Secrets from `values.yaml`. -### Setting up the environment - -Several services require configuration values to be passed in through the CLP package's config file, -environment variables, and/or command line arguments. Since the services are running in containers, -some of these configuration values need to be modified for the orchestration environment. -Specifically: +:::{note} +A `KubernetesController` is also planned that will read `clp-config.yaml` and `credentials.yaml` +like `DockerComposeController`, then set up the Helm release accordingly. This will unify the +configuration experience across both orchestration methods. +::: -1. 
Paths on the host must be converted to appropriate paths in the container. -2. Component hostnames must be converted to service names, and component ports must be converted to the component's default ports. - * This ensures that in the Docker Compose configuration, services can communicate over fixed, predictable hostnames and ports rather than relying on configurable variables. +### Secrets -To achieve this, before starting the deployment, `DockerComposeController.start` generates: +Sensitive credentials (database passwords, API keys) are stored in `etc/credentials.yaml` and +require special handling to avoid exposure. -* a CLP configuration file (`/var/log/.clp-config.yaml` on the host) specific to the - Docker Compose project environment. -* an environment variable file (`/.env`) for any other configuration values. -* any necessary directories (e.g., data output directories). +* **Docker Compose**: Credentials are written to `.env` and passed as environment variables +* **Kubernetes**: Credentials are stored in Kubernetes Secrets -The Docker Compose project then passes those environment variables to the relevant services, either -as environment variables or command line arguments, as necessary. +### Dependencies -### Starting and stopping the project +As shown in [Figure 1](#figure-1), services have complex interdependencies. Both orchestrators +ensure services start only after their dependencies are healthy. -To start and stop the project, `DockerComposeController` simply invokes `docker compose up` or -`docker compose down` as appropriate. However, to allow multiple CLP packages to be run on the same -host, we explicitly specify a project name for the project, where the name is based on the package's -instance ID. 
+* **Docker Compose**: Uses `depends_on` with `condition: service_healthy` and container healthchecks +* **Kubernetes**: Uses init containers (via the `clp.waitFor` helper) and readiness/liveness probes -### Deployment Types +### Storage -CLP supports four deployment types determined by the `compression_scheduler.type` and -`package.query_engine` configuration setting. +Services require persistent storage for logs, data, archives, and streams. -| Deployment Type | Compression Scheduler | Query Engine | Docker Compose File | -|-----------------|-----------------------|------------------------------|------------------------------------| -| Base | Celery | [Presto][presto-integration] | `docker-compose-base.yaml` | -| Full | Celery | Native | `docker-compose.yaml` | -| Spider Base | Spider | [Presto][presto-integration] | `docker-compose-spider-base.yaml` | -| Spider Full | Spider | Native | `docker-compose-spider.yaml` | +* **Docker Compose**: Uses bind mounts for host directories and named volumes for database data. + Conditional mounts use variable interpolation to mount empty tmpfs when not needed. +* **Kubernetes**: Uses PersistentVolumeClaims per component, with shared PVCs (`ReadWriteMany`) for + archives and streams. Uses `local-storage` StorageClass by default. -### Implementation details +### Deployment types -One notable implementation detail is in how we handle mounts that are only necessary under certain -configurations. For instance, the input logs mount is only necessary when the `logs_input.type` is -`fs`. If `logs_input.type` is `s3`, we shouldn't mount some random directory from the user's -host filesystem into the container. However, Docker doesn't provide a mechanism to perform -conditional mounts. Instead, we use Docker's variable interpolation to conditionally mount an empty -tmpfs mount into the container. This strategy is used wherever we need a conditional mount. 
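As a sketch, the conditional-mount trick can look like the following Compose fragment; the service, variable, and volume names here are illustrative, not CLP's actual ones:

```yaml
services:
  compression-worker:
    volumes:
      # When `logs_input.type` is "fs", CLP_LOGS_INPUT_DIR is set to the host
      # logs directory, producing a bind mount. When it's unset (e.g., for
      # "s3"), the fallback names the tmpfs-backed volume declared below.
      - "${CLP_LOGS_INPUT_DIR:-logs-input-unused}:/mnt/logs-input:ro"

volumes:
  logs-input-unused:
    driver: "local"
    driver_opts:
      type: "tmpfs"
      device: "tmpfs"
```

Because the fallback value contains no path separator, Compose resolves it as the named volume (backed by an empty tmpfs) rather than a bind mount.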
+CLP supports multiple deployment configurations based on the compression scheduler and query engine. -### Troubleshooting +| Deployment Type | Compression Scheduler | Query Engine | +|-----------------|-----------------------|------------------------------| +| Base | Celery | [Presto][presto-integration] | +| Full | Celery | Native | +| Spider Base | Spider | [Presto][presto-integration] | +| Spider Full | Spider | Native | -If you encounter issues with the Docker Compose deployment, first determine the instance ID for your -deployment by checking the content of `/var/log/instance-id`. Then run one of the -commands below as necessary. +:::{note} +Spider support is not yet available for Helm. +::: -1. Check service status: +Docker Compose selects the appropriate compose file (e.g., `docker-compose.yaml` for Full, +`docker-compose-spider.yaml` for Spider Full) and uses `deploy.replicas` with environment +variables (e.g., `CLP_MCP_SERVER_ENABLED`) to toggle optional services. Helm uses conditional +templating to include/exclude resources. - ```bash - docker compose --project-name clp-package- ps - ``` +## Troubleshooting -2. View service logs: +When issues arise, use the appropriate commands for your orchestration method: - ```bash - docker compose --project-name clp-package- logs - ``` +* [Docker Compose debugging][docker-compose-debugging] +* [Kubernetes Helm debugging][kubernetes-debugging] -3. 
Validate configuration: +## User guides - ```bash - docker compose config - ``` +* [Kubernetes deployment][kubernetes-guide]: Deploying CLP with Helm +* [Multi-host deployment][docker-compose-multi-host]: Manual Docker Compose across multiple hosts [docker-compose]: https://docs.docker.com/compose/ +[docker-compose-debugging]: ../user-docs/guides-docker-compose-deployment.md#monitoring-and-debugging +[helm]: https://helm.sh/ +[kubernetes]: https://kubernetes.io/ +[kubernetes-debugging]: ../user-docs/guides-k8s-deployment.md#monitoring-and-debugging +[kubernetes-guide]: ../user-docs/guides-k8s-deployment.md +[docker-compose-multi-host]: ../user-docs/guides-docker-compose-deployment.md#multi-host-deployment [presto-integration]: ../user-docs/guides-using-presto.md diff --git a/docs/src/user-docs/guides-multi-host.md b/docs/src/user-docs/guides-docker-compose-deployment.md similarity index 90% rename from docs/src/user-docs/guides-multi-host.md rename to docs/src/user-docs/guides-docker-compose-deployment.md index e9e5ecd436..3f8be96752 100755 --- a/docs/src/user-docs/guides-multi-host.md +++ b/docs/src/user-docs/guides-docker-compose-deployment.md @@ -1,12 +1,28 @@ -# Multi-host deployment +# Docker Compose deployment + +This guide explains how to deploy CLP using Docker Compose. Docker Compose provides a +straightforward way to orchestrate CLP's services, suitable for both development and production +environments. + +## Deployment options + +Docker Compose can be used for: + +* **Single-host deployment**: Run all CLP services on a single machine. This is the simplest setup, + covered in the [quick-start guides](quick-start/index.md). +* **Multi-host deployment**: Distribute CLP services across multiple machines for higher throughput + and scalability. This is covered in detail below. + +--- + +## Multi-host deployment A multi-host deployment allows you to run CLP across a distributed set of hosts. 
-:::{warning} -The instructions below provide a temporary solution for multi-host deployment and may change as we -actively work to improve ease of deployment. The present solution uses *manual* Docker Compose -orchestration; however, Kubernetes Helm support will be available in a future release, which will -simplify multi-host deployments significantly. +:::{note} +The instructions below use *manual* Docker Compose orchestration, which is more lightweight and +provides fine-grained control over service placement, but requires more configuration than +Helm-based deployments. ::: ## Requirements @@ -305,9 +321,11 @@ sbin/stop-clp.sh This will stop all CLP services managed by Docker Compose on the current host. -## Monitoring services +## Monitoring and debugging -To check the status of services on a host: +First, determine your instance ID from `<package-dir>/var/log/instance-id`. + +To check the status of services: ```bash docker compose --project-name clp-package-<instance-id> ps ``` @@ -319,6 +337,20 @@ To view logs for a specific service: docker compose --project-name clp-package-<instance-id> logs -f <service> ``` +To execute commands in a running container: + +```bash +docker compose --project-name clp-package-<instance-id> exec <service> /bin/bash +``` + +To validate your Docker Compose configuration: + +```bash +docker compose config +``` + +--- + ## Setting up SeaweedFS The instructions below are for running a simple SeaweedFS cluster on a set of hosts. For other use diff --git a/docs/src/user-docs/guides-external-database.md b/docs/src/user-docs/guides-external-database.md index e4d5a8b670..dcfb27b896 100644 --- a/docs/src/user-docs/guides-external-database.md +++ b/docs/src/user-docs/guides-external-database.md @@ -205,7 +205,7 @@ initialization jobs (`db-table-creator` and `results-cache-indices-creator`). 
[aws-rds]: https://aws.amazon.com/rds/ [azure-databases]: https://azure.microsoft.com/en-us/products/category/databases -[docker-compose-orchestration]: ../dev-docs/design-deployment-orchestration.md#docker-compose-orchestration +[docker-compose-orchestration]: ../user-docs/guides-docker-compose-deployment.md [mongodb-install]: https://www.mongodb.com/docs/manual/installation/ [mongodb-security]: https://docs.mongodb.com/manual/security/ -[multi-host-guide]: guides-multi-host.md#starting-clp +[multi-host-guide]: guides-docker-compose-deployment.md#starting-clp diff --git a/docs/src/user-docs/guides-k8s-deployment.md b/docs/src/user-docs/guides-k8s-deployment.md new file mode 100644 index 0000000000..6353211bdb --- /dev/null +++ b/docs/src/user-docs/guides-k8s-deployment.md @@ -0,0 +1,615 @@ +# Kubernetes deployment + +This guide explains how to deploy CLP on Kubernetes using [Helm]. This provides an alternative to +Docker Compose and enables deployment on Kubernetes clusters ranging from local development setups +to production environments. + +:::{note} +For a detailed overview of CLP's services and their dependencies, see the [deployment orchestration +design doc][design-orchestration]. +::: + +--- + +## Requirements + +The following tools are required to deploy CLP on Kubernetes: + +* [kubectl] >= 1.30 +* [Helm] >= 4.0 +* A Kubernetes cluster (see [Setting up a cluster](#setting-up-a-cluster) below) +* When not using S3 storage, a shared filesystem accessible by all worker pods (e.g., NFS, + [SeaweedFS]) or local storage for single-node deployments + +--- + +## Setting up a cluster + +You can deploy CLP on either a local development cluster or a production Kubernetes cluster. + +### Option 1: Local development with kind + +[kind] (Kubernetes in Docker) is ideal for testing and development. It runs a Kubernetes cluster +inside Docker containers on your local machine. 
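For reference, a minimal multi-node kind cluster can be described with a config file like the following (a sketch; the quick-start guides remain the authoritative setup steps):

```yaml
# kind-config.yaml: one control-plane node and two workers.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

Create the cluster with `kind create cluster --name clp --config kind-config.yaml`.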
+ +For single-host kind deployments, see the [quick-start guides][quick-start], which cover +creating a kind cluster and installing the Helm chart. + +### Option 2: Production Kubernetes cluster + +For production deployments, you can use any Kubernetes distribution: + +* Managed Kubernetes services: [Amazon EKS][eks], [Google GKE][gke], [Azure AKS][aks] +* Self-hosted: [kubeadm], [k3s], [RKE2] + +#### Setting up a cluster with kubeadm + +[kubeadm] is the official Kubernetes tool for bootstrapping clusters. Follow the +[official kubeadm installation guide][kubeadm] to install the prerequisites, container runtime, +and kubeadm on all nodes. + +1. **Initialize the control plane** (on the control-plane node only): + + ```bash + sudo kubeadm init --pod-network-cidr=10.244.0.0/16 + ``` + + :::{tip} + Save the `kubeadm join` command printed at the end of the output. You'll need it to join worker + nodes later. + ::: + + :::{note} + The `--pod-network-cidr` specifies the IP range for pods. If `10.244.0.0/16` conflicts with your + network, use a different private range as [RFC 1918][rfc-1918] specifies (e.g., `192.168.0.0/16`, + `172.16.0.0/16`, or `10.200.0.0/16`). + ::: + + To set up `kubectl` for your user: + + ```bash + mkdir -p "$HOME/.kube" + sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config" + sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config" + ``` + +2. **Install a CNI plugin** (on the control-plane node): + + A CNI plugin is required for pod-to-pod networking. The following installs [Cilium], a + high-performance CNI that uses eBPF: + + ```bash + helm repo add cilium https://helm.cilium.io/ + helm repo update + helm install cilium cilium/cilium --namespace kube-system \ + --set ipam.operator.clusterPoolIPv4PodCIDRList=10.244.0.0/16 + ``` + + :::{note} + The `clusterPoolIPv4PodCIDRList` must match the `--pod-network-cidr` used in `kubeadm init`. + ::: + +3. 
**Join worker nodes** (on each worker node): + + Run `kubeadm join` with the token and hash you saved from step 1: + + ```bash + sudo kubeadm join <control-plane-ip>:6443 \ + --token <token> \ + --discovery-token-ca-cert-hash sha256:<hash> + ``` + + If you lost the command, regenerate it on the control-plane node with + `kubeadm token create --print-join-command`. + +--- + +## Installing the Helm chart + +Once your cluster is ready, you can install CLP using the Helm chart. + +### Getting the chart + +The CLP Helm chart is located in the repository at `tools/deployment/package-helm/`. + +```bash +# Clone the repository (if you haven't already) +git clone https://github.com/y-scope/clp.git +cd clp/tools/deployment/package-helm +``` + +#### Production cluster requirements (optional) + +The following configurations are optional but recommended for production deployments. You can skip +this section for testing or development. + +1. **Storage for CLP Package services' data and logs** (optional, for centralized debugging): + + The Helm chart creates static PersistentVolumes using local host paths by default, so no + StorageClass configuration is required for basic deployments. For easier debugging, you can + configure a centralized storage backend for the following directories: + + * `data_directory` - where CLP stores runtime data + * `logs_directory` - where CLP services write logs + * `tmp_directory` - where temporary files are stored + + :::{note} + We aim to improve the logging infrastructure so mapping log volumes will not be required in the + future. See [issue #1760][logging-infra-issue] for details. + ::: + +2. **Shared storage for workers** (required for multi-node clusters using filesystem storage): + + :::{tip} + [S3 storage][s3-storage] is **strongly recommended** for multi-node clusters as it does not + require shared local storage between workers. If you use S3 storage, you can skip this section. 
+ ::: + + For multi-node clusters using filesystem storage, the following directories **must** be + accessible from all worker nodes at the same paths. Without shared storage, compressed logs + created by one worker cannot be searched by other workers. + + * `archive_output.storage.directory` - where compressed archives are stored + * `stream_output.storage.directory` - where stream files are stored + * `logs_input.directory` - where input logs are read from + + Set up NFS, SeaweedFS, or another shared filesystem to provide this access. See the + [multi-host deployment guide][multi-host-guide] for SeaweedFS setup instructions. + +3. **External databases** (recommended for production): + * See the [external database setup guide][external-db-guide] for using external + MariaDB/MySQL and MongoDB databases + +### Basic installation + +Create the required directories on all worker nodes: + +```bash +export CLP_HOME="/tmp/clp" + +mkdir -p "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ + "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ + "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ + "$CLP_HOME/var/tmp" +``` + +Then on the **control-plane node**, generate credentials and install CLP: + +```bash +export CLP_HOME="/tmp/clp" + +mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ + "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ + "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ + "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ + "$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \ + "$CLP_HOME/var/tmp" + +# Credentials (change these for production) +export CLP_DB_PASS="pass" +export CLP_DB_ROOT_PASS="root-pass" +export CLP_QUEUE_PASS="pass" +export CLP_REDIS_PASS="pass" + +# Worker replicas (increase for multi-node clusters) +export CLP_COMPRESSION_WORKER_REPLICAS=1 +export CLP_QUERY_WORKER_REPLICAS=1 + +helm 
install clp . \ + --set clpConfig.data_directory="$CLP_HOME/var/data" \ + --set clpConfig.logs_directory="$CLP_HOME/var/log" \ + --set clpConfig.tmp_directory="$CLP_HOME/var/tmp" \ + --set clpConfig.archive_output.storage.directory="$CLP_HOME/var/data/archives" \ + --set clpConfig.stream_output.storage.directory="$CLP_HOME/var/data/streams" \ + --set credentials.database.password="$CLP_DB_PASS" \ + --set credentials.database.root_password="$CLP_DB_ROOT_PASS" \ + --set credentials.queue.password="$CLP_QUEUE_PASS" \ + --set credentials.redis.password="$CLP_REDIS_PASS" \ + --set compressionWorker.replicas="$CLP_COMPRESSION_WORKER_REPLICAS" \ + --set queryWorker.replicas="$CLP_QUERY_WORKER_REPLICAS" +``` + +### Multi-node deployment + +For multi-node clusters with shared storage mounted on all nodes (e.g., NFS/CephFS via +`/etc/fstab`), enable distributed storage mode and configure multiple worker replicas: + +```bash +helm install clp . \ + --set distributed=true \ + --set compressionWorker.replicas=3 \ + --set queryWorker.replicas=3 +``` + +### Installation with custom values + +For highly customized deployments, create a values file instead of using many `--set` flags: + +```{code-block} yaml +:caption: custom-values.yaml + +# Use a custom image. For local images, import to each node's container runtime first. 
+image: + clpPackage: + repository: "clp-package" + pullPolicy: "Never" # Use "Never" for local images, "IfNotPresent" for remote + tag: "latest" + +# Adjust worker concurrency +workerConcurrency: 16 + +# Configure CLP settings +clpConfig: + # Use clp-text instead of clp-json (the default) + package: + storage_engine: "clp" # Use "clp-s" for clp-json, "clp" for clp-text + query_engine: "clp" # Use "clp-s" for clp-json, "clp" for clp-text, "presto" for Presto + + # Configure archive output + archive_output: + target_archive_size: 536870912 # 512 MB + compression_level: 6 + retention_period: 10080 # Data retention in minutes (7 days) + + # Enable MCP server + mcp_server: + port: 30800 + logging_level: "INFO" + + # Configure search results retention (in minutes) + results_cache: + retention_period: 120 # 2 hours + +# Override credentials (use secrets in production!) +credentials: + database: + username: "clp-user" + password: "your-db-password" + root_username: "root" + root_password: "your-db-root-password" + queue: + username: "clp-user" + password: "your-queue-password" + redis: + password: "your-redis-password" +``` + +Install with custom values: + +```bash +helm install clp . -f custom-values.yaml +``` + +::::{tip} +To preview the generated Kubernetes manifests before installing, use `helm template`: + +```bash +helm template clp . -f custom-values.yaml +``` +:::: + +### Worker scheduling + +You can control where workers are scheduled using standard Kubernetes scheduling primitives +(`nodeSelector`, `affinity`, `tolerations`, `topologySpreadConstraints`). + +#### Dedicated node pools + +To run compression and query workers on separate node pools: + +1. Label your nodes: + + ```bash + # Label compression nodes + kubectl label nodes node1 node2 yscope.io/nodeType=compression + + # Label query nodes + kubectl label nodes node3 node4 yscope.io/nodeType=query + ``` + +2. 
Configure scheduling: + + ```{code-block} yaml + :caption: dedicated-scheduling.yaml + + compressionWorker: + replicas: 2 + scheduling: + nodeSelector: + yscope.io/nodeType: compression + + queryWorker: + replicas: 2 + scheduling: + nodeSelector: + yscope.io/nodeType: query + ``` + +3. Install: + + ```bash + helm install clp . -f dedicated-scheduling.yaml --set distributed=true + ``` + +#### Shared node pool + +To run both worker types on the same node pool: + +1. Label your nodes: + + ```bash + kubectl label nodes node1 node2 node3 node4 yscope.io/nodeType=compute + ``` + +2. Configure scheduling: + + ```{code-block} yaml + :caption: shared-scheduling.yaml + + compressionWorker: + replicas: 2 + scheduling: + nodeSelector: + yscope.io/nodeType: compute + topologySpreadConstraints: + - maxSkew: 1 + topologyKey: "kubernetes.io/hostname" + whenUnsatisfiable: "DoNotSchedule" + labelSelector: + matchLabels: + app.kubernetes.io/component: compression-worker + + queryWorker: + replicas: 2 + scheduling: + nodeSelector: + yscope.io/nodeType: compute + ``` + +3. Install: + + ```bash + helm install clp . -f shared-scheduling.yaml --set distributed=true + ``` + +### Common configuration options + +The following table lists commonly used Helm values. For a complete list, see `values.yaml` in the +chart directory. 
+ +| Parameter | Description | Default | +|----------------------------------------------|----------------------------------------------|-----------------------------------| +| `image.clpPackage.repository` | CLP package image repository | `ghcr.io/y-scope/clp/clp-package` | +| `image.clpPackage.tag` | Image tag | `main` | +| `workerConcurrency` | Number of worker processes | `8` | +| `distributed` | Distributed/multi-node deployment mode | `false` | +| `compressionWorker.replicas` | Number of compression worker replicas | `1` | +| `compressionWorker.scheduling` | Scheduling config for compression workers | `{}` | +| `queryWorker.replicas` | Number of query worker replicas | `1` | +| `queryWorker.scheduling` | Scheduling config for query workers | `{}` | +| `storage.storageClassName` | StorageClass name (created if "local-storage") | `local-storage` | +| `allowHostAccessForSbinScripts` | Expose database/cache for sbin scripts | `true` | +| `clpConfig.package.storage_engine` | Storage engine (`clp-s` or `clp`) | `clp-s` | +| `clpConfig.package.query_engine` | Query engine (`clp-s`, `clp`, or `presto`) | `clp-s` | +| `clpConfig.webui.port` | Web UI NodePort | `30000` | +| `clpConfig.api_server.port` | API server NodePort | `30301` | +| `clpConfig.database.port` | Database NodePort | `30306` | +| `clpConfig.results_cache.port` | Results cache (MongoDB) NodePort | `30017` | +| `clpConfig.mcp_server.port` | MCP server NodePort | `30800` | +| `clpConfig.logs_input.directory` | Directory containing logs to compress | `/` | +| `clpConfig.data_directory` | Directory for data storage | `/tmp/clp/var/data` | +| `clpConfig.logs_directory` | Directory for log files | `/tmp/clp/var/log` | +| `clpConfig.tmp_directory` | Directory for temporary files | `/tmp/clp/var/tmp` | +| `clpConfig.archive_output.storage.directory` | Directory for compressed archives | `/tmp/clp/var/data/archives` | +| `clpConfig.stream_output.storage.directory` | Directory for stream files | 
`/tmp/clp/var/data/streams` | +| `clpConfig.archive_output.retention_period` | Archive retention (minutes, null to disable) | `null` | +| `clpConfig.results_cache.retention_period` | Search results retention (minutes) | `60` | + +--- + +## Verifying the deployment + +After installing the Helm chart, verify that all components are running correctly. + +### Check pod status + +Wait for all pods to be ready: + +```bash +# Watch pod status +kubectl get pods -w + +# Wait for all pods to be ready +kubectl wait pods --all --for=condition=Ready --timeout=300s +``` + +Expected output shows all pods in `Running` state: + +``` +NAME READY STATUS RESTARTS AGE +clp-api-server-... 1/1 Running 0 2m +clp-compression-scheduler-... 1/1 Running 0 2m +clp-compression-worker-... 1/1 Running 0 2m +clp-database-0 1/1 Running 0 2m +clp-garbage-collector-... 1/1 Running 0 2m +clp-query-scheduler-... 1/1 Running 0 2m +clp-query-worker-... 1/1 Running 0 2m +clp-queue-0 1/1 Running 0 2m +clp-reducer-... 1/1 Running 0 2m +clp-redis-0 1/1 Running 0 2m +clp-results-cache-0 1/1 Running 0 2m +clp-webui-... 1/1 Running 0 2m +``` + +### Check initialization jobs + +CLP runs initialization jobs on first deployment: + +```bash +# Check job completion +kubectl get jobs + +# Expected output: +# NAME COMPLETIONS DURATION AGE +# clp-db-table-creator 1/1 5s 2m +# clp-results-cache-indices-creator 1/1 3s 2m +``` + +### Access the Web UI + +Once all pods are ready, access the CLP Web UI at `http://<node-ip>:30000` (the port is the value of `clpConfig.webui.port`). + +--- + +## Using CLP + +With CLP deployed on Kubernetes, you can compress and search logs using the same workflows as +Docker Compose deployments. Refer to the quick-start guide for your chosen flavor: + +::::{grid} 1 1 2 2 +:gutter: 2 + +:::{grid-item-card} +:link: quick-start/clp-json +Using clp-json +^^^ +How to compress and search JSON logs. 
+::: + +:::{grid-item-card} +:link: quick-start/clp-text +Using clp-text +^^^ +How to compress and search unstructured text logs. +::: +:::: + +:::{note} +By default (`allowHostAccessForSbinScripts: true`), the database and results cache are exposed on +NodePorts, allowing you to use `sbin/` scripts from the CLP package. Download a +[release][clp-releases] matching the chart's `appVersion`, then configure `etc/clp-config.yaml`: + +```yaml +database: + port: 30306 # Match `clpConfig.database.port` in Helm values +results_cache: + port: 30017 # Match `clpConfig.results_cache.port` in Helm values +``` + +Alternatively, use the Web UI ([clp-json][webui-clp-json] | [clp-text][webui-clp-text]) to compress +logs and search interactively, or the [API server][api-server] to submit queries and view results +programmatically. +::: + +--- + +## Monitoring and debugging + +To check the status of pods: + +```bash +kubectl get pods +``` + +To view logs for a specific pod: + +```bash +kubectl logs -f <pod-name> +``` + +To execute commands in a pod: + +```bash +kubectl exec -it <pod-name> -- /bin/bash +``` + +To debug Helm chart issues: + +```bash +helm install clp . --dry-run --debug +``` + +--- + +## Managing releases + +This section covers how to manage your CLP Helm release. + +:::{note} +Upgrade and rollback are not yet supported. We plan to add support as we finalize the migration +mechanism. +::: + +### Uninstall CLP + +```bash +helm uninstall clp +``` + +:::{warning} +Uninstalling the Helm release will delete all CLP pods and services. However, PersistentVolumes +with `Retain` policy will preserve your data. To completely remove all data, delete the PVs and +the data directories manually. +::: + +--- + +## Cleaning up + +To tear down a kubeadm cluster: + +1. **Uninstall Cilium** (on the control-plane): + + ```bash + helm uninstall cilium --namespace kube-system + ``` + +2. 
**Reset each node** (run on all worker nodes first, then the control-plane): + + ```bash + sudo kubeadm reset -f + sudo rm -rf /etc/cni/net.d/* + sudo umount /var/run/cilium/cgroupv2/ + sudo rm -rf /var/run/cilium + ``` + +3. **Clean up kubeconfig** (on the control-plane): + + ```bash + rm -rf ~/.kube + ``` + +--- + +## Related guides + +* [Docker Compose deployment][docker-compose-deployment] - Docker Compose orchestration for single or multi-host setups +* [External database setup][external-db-guide] - Using external MariaDB and MongoDB +* [Using object storage][s3-storage] - Configuring S3 storage +* [Configuring retention periods][retention-guide] - Setting up data retention policies + +[aks]: https://azure.microsoft.com/en-us/products/kubernetes-service +[api-server]: guides-using-the-api-server.md +[Cilium]: https://cilium.io/ +[clp-releases]: https://github.com/y-scope/clp/releases +[design-orchestration]: ../dev-docs/design-deployment-orchestration.md +[docker-compose-deployment]: guides-docker-compose-deployment.md +[eks]: https://aws.amazon.com/eks/ +[external-db-guide]: guides-external-database.md +[gke]: https://cloud.google.com/kubernetes-engine +[Helm]: https://helm.sh/ +[k3s]: https://k3s.io/ +[kind]: https://kind.sigs.k8s.io/ +[kubeadm]: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ +[kubectl]: https://kubernetes.io/docs/tasks/tools/ +[logging-infra-issue]: https://github.com/y-scope/clp/issues/1760 +[multi-host-guide]: guides-docker-compose-deployment.md +[quick-start]: quick-start/index.md +[retention-guide]: guides-retention.md +[rfc-1918]: https://datatracker.ietf.org/doc/html/rfc1918#section-3 +[RKE2]: https://docs.rke2.io/ +[s3-storage]: guides-using-object-storage/index +[SeaweedFS]: https://github.com/seaweedfs/seaweedfs +[webui-clp-json]: quick-start/clp-json.md#searching-from-the-ui +[webui-clp-text]: quick-start/clp-text.md#searching-from-the-ui diff --git a/docs/src/user-docs/guides-overview.md 
b/docs/src/user-docs/guides-overview.md index 8b9186768b..90c6c1ed10 100644 --- a/docs/src/user-docs/guides-overview.md +++ b/docs/src/user-docs/guides-overview.md @@ -1,6 +1,41 @@ # Overview -The guides below describe how to use CLP in different use cases. +These guides cover deploying, configuring, and integrating CLP for various use cases. + +--- + +## Deployment + +Guides for deploying CLP in production environments. + +:::{tip} +For single-host deployments, see the [quick-start guide](quick-start/index), which includes tabs +for both Docker Compose and Kubernetes (kind) orchestration. +::: + +::::{grid} 1 1 2 2 +:gutter: 2 + +:::{grid-item-card} +:link: guides-docker-compose-deployment +Docker Compose deployment +^^^ +Deploy CLP using Docker Compose for single or multi-host setups. +::: + +:::{grid-item-card} +:link: guides-k8s-deployment +Kubernetes deployment +^^^ +Deploy CLP on a Kubernetes cluster using Helm. +::: +:::: + +--- + +## Input & storage + +Guides for configuring data sources and storage backends. ::::{grid} 1 1 2 2 :gutter: 2 @@ -9,41 +44,58 @@ The guides below describe how to use CLP in different use cases. :link: guides-using-object-storage/index Using object storage ^^^ -Using CLP to ingest logs from object storage and store archives on object storage. +Ingest logs from and store archives on S3-compatible object storage. ::: :::{grid-item-card} :link: guides-external-database External database setup ^^^ -Guide for setting up external databases for CLP package components. +Use external MariaDB/MySQL and MongoDB databases. ::: :::{grid-item-card} :link: guides-retention Configuring retention periods ^^^ -How to configure retention periods for CLP archives and search results. +Configure retention periods for archives and search results. ::: +:::: + +--- + +## Package services + +Guides for using services included in the CLP package. 
+ +::::{grid} 1 1 2 2 +:gutter: 2 :::{grid-item-card} -:link: guides-multi-host -Multi-host deployment +:link: guides-using-the-api-server +Using the API server ^^^ -How to deploy CLP across multiple hosts. +Submit queries, view results, and manage jobs programmatically. ::: :::{grid-item-card} -:link: guides-using-the-api-server -Using the API server +:link: guides-mcp-server/index +MCP server ^^^ -How to use the API server to interact with CLP. +Integrate CLP with AI assistants using the Model Context Protocol. ::: :::{grid-item-card} :link: guides-using-presto -Using Presto with CLP +Using Presto +^^^ +Use Presto for distributed SQL queries on compressed logs. +::: + +:::{grid-item-card} +:link: guides-using-spider +Using Spider ^^^ -How to use Presto to query compressed logs in CLP. +Use Spider for compression and query job task distribution. ::: :::: diff --git a/docs/src/user-docs/index.md b/docs/src/user-docs/index.md index 715dcefe55..2dd116bc42 100644 --- a/docs/src/user-docs/index.md +++ b/docs/src/user-docs/index.md @@ -57,15 +57,33 @@ quick-start/clp-text :::{toctree} :hidden: :caption: Guides -:glob: guides-overview -guides-mcp-server/index +::: + +:::{toctree} +:hidden: +:caption: Deployment + +guides-docker-compose-deployment +guides-k8s-deployment +::: + +:::{toctree} +:hidden: +:caption: Input & storage + guides-using-object-storage/index -guides-using-the-api-server guides-external-database -guides-multi-host guides-retention +::: + +:::{toctree} +:hidden: +:caption: Package services + +guides-using-the-api-server +guides-mcp-server/index guides-using-presto guides-using-spider ::: diff --git a/docs/src/user-docs/quick-start/clp-json.md b/docs/src/user-docs/quick-start/clp-json.md index 385736e339..c93e7cce90 100644 --- a/docs/src/user-docs/quick-start/clp-json.md +++ b/docs/src/user-docs/quick-start/clp-json.md @@ -11,26 +11,146 @@ text logs, refer to [this section below](#compressing-unstructured-text-logs). 
## Starting CLP +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + To start CLP, run: ```bash sbin/start-clp.sh ``` -:::{tip} +```{tip} To validate configuration and prepare directories without launching services, add the `--setup-only` flag (e.g., `sbin/start-clp.sh --setup-only`). -::: +``` -:::{note} +```{note} If CLP fails to start (e.g., due to a port conflict), try adjusting the settings in `etc/clp-config.yaml` and then run the start command again. +``` + +For more details on Docker Compose deployment, see the [Docker Compose deployment guide][docker-compose-deployment]. ::: +:::{tab-item} Kubernetes (kind) +:sync: kind + +First, create a kind cluster: + +```bash +# Data and logs directory for the CLP Package +export CLP_HOME="$HOME/clp" + +# Host port mappings +export CLP_WEBUI_PORT=30000 +export CLP_RESULTS_CACHE_PORT=30017 +export CLP_API_SERVER_PORT=30301 +export CLP_DATABASE_PORT=30306 +export CLP_MCP_SERVER_PORT=30800 + +# Credentials (generate random or use your own) +export CLP_DB_PASS=$(openssl rand -hex 16) +export CLP_DB_ROOT_PASS=$(openssl rand -hex 16) +export CLP_QUEUE_PASS=$(openssl rand -hex 16) +export CLP_REDIS_PASS=$(openssl rand -hex 16) + +# Create required directories +mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ + "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ + "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ + "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ + "$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \ + "$CLP_HOME/var/tmp" + +# Create the kind cluster +cat <' [ ...] * `` are paths to JSON log files or directories containing such files. * Each JSON log file should contain each log event as a - [separate JSON object](./index.md#clp-json), i.e., *not* as an array. + [separate JSON object][clp-json-format], i.e., *not* as an array. 
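The one-object-per-line requirement above can be checked before compressing. This is a sketch, not part of CLP: the file name is hypothetical, and the `{"$date": ...}` timestamp shape is just one example of a valid JSON log event.

```bash
# Hypothetical sample file; clp-json expects newline-delimited JSON
# (one object per line), *not* a JSON array.
cat > /tmp/sample.clp.log <<'EOF'
{"t": {"$date": "2024-01-15T10:00:00.000Z"}, "level": "INFO", "msg": "service started"}
{"t": {"$date": "2024-01-15T10:00:01.250Z"}, "level": "ERROR", "msg": "write concern timeout"}
EOF

# Every line must parse as a standalone JSON value.
ok=1
while IFS= read -r line; do
    printf '%s' "$line" | python3 -c 'import json, sys; json.load(sys.stdin)' || ok=0
done < /tmp/sample.clp.log
[ "$ok" -eq 1 ] && echo "valid NDJSON"
```

A file containing `[{...}, {...}]` would fail this check on its first line, which is the signal to reformat it before compression.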
The compression script will output the compression ratio of each dataset you compress, or you can use the UI to view overall statistics. @@ -61,10 +181,29 @@ config option in `etc/clp-config.yaml` (`archive_output.storage.directory` defau :::{tip} To compress logs from object storage, see -[Using object storage](../guides-using-object-storage/index). +[Using object storage][object-storage]. ::: -## Compressing unstructured text logs +### Compressing unstructured text logs + +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + +No additional configuration is required. +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +Configure `etc/clp-config.yaml` to connect to the kind-deployed database: + +```yaml +database: + port: 30306 +``` +::: +:::: clp-json supports compressing unstructured text logs by converting them into JSON. To enable this conversion, run the compression script with the `--unstructured` flag: @@ -103,7 +242,7 @@ When the `--unstructured` flag is used, clp-json will always use `"timestamp"` a ### Sample logs -For some sample logs, check out the [open-source datasets](../resources-datasets). +For some sample logs, check out the [open-source datasets][datasets]. --- @@ -155,13 +294,16 @@ as well as a kv-pair with key `"msg"` and a value that matches the wildcard quer `"*write concern*"`. A complete reference for clp-json's query syntax is available on the -[syntax reference page](../reference-json-search-syntax). +[syntax reference page][json-search-syntax]. ### Searching from the UI -To search your compressed logs from CLP's UI, open [http://localhost:4000](http://localhost:4000) in -your browser (if you changed `webui.host` or `webui.port` in `etc/clp-config.yaml`, use the new -values). +To search your compressed logs from CLP's UI, open [http://localhost:4000](http://localhost:4000) +(Docker Compose) or [http://localhost:30000](http://localhost:30000) (Kubernetes) in your browser. 
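Since the UI port differs by orchestration method, a tiny helper can keep scripts method-agnostic. This is an illustrative sketch (the `clp_webui_url` function is hypothetical, not part of the package) using the default ports from this guide: 4000 for Docker Compose and NodePort 30000 for the kind deployment.

```bash
# Hypothetical helper: default Web UI URL per orchestration method.
# Adjust the ports if you changed `webui.port` or the chart's NodePorts.
clp_webui_url() {
    case "$1" in
        docker-compose) echo "http://localhost:4000" ;;
        kind)           echo "http://localhost:30000" ;;
        *)              echo "unknown orchestration method: $1" >&2; return 1 ;;
    esac
}

clp_webui_url docker-compose
clp_webui_url kind
```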
+ +:::{note} +If you changed `webui.host` or `webui.port` in the configuration, use the new values. +::: [Figure 3](#figure-3) shows the search page after running a query. @@ -177,7 +319,7 @@ values). The numbered circles in [Figure 3](#figure-3) correspond to the following elements: 1. **The query input box**. The format of your query should conform to CLP's - [JSON search syntax](../reference-json-search-syntax.md). + [JSON search syntax][json-search-syntax]. 2. **The query case-sensitivity toggle**. When turned on, CLP will search for log events that match the case of your query. 3. **The time range selector**. CLP will search for log events that are in the specified time range. @@ -196,11 +338,51 @@ The numbered circles in [Figure 3](#figure-3) correspond to the following elemen :::{note} By default, the UI will only return 1,000 of the latest search results. To perform searches which -return more results, use the [command line](#searching-from-the-command-line). +return more results, use the [command line](#searching-from-the-command-line) or +[API server](#searching-via-the-api-server). ::: +### Searching via the API server + +To search via the API server: + +```bash +curl -X POST "http://localhost:30301/query/submit" \ + -H "Content-Type: application/json" \ + -d '{ + "query_string": "", + "max_num_results": 1000, + "timestamp_begin": null, + "timestamp_end": null, + "case_sensitive": false + }' +``` + +For more details on the API, see [Using the API server][api-server]. + ### Searching from the command line +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + +No additional configuration is required. +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +Configure `etc/clp-config.yaml` to connect to the kind-deployed services: + +```yaml +database: + port: 30306 +results_cache: + port: 30017 +``` +::: +:::: + To search your compressed logs from the command line, run: ```bash @@ -224,8 +406,39 @@ searches are case-**sensitive** on the command line. 
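When submitting queries to the API server from scripts, it is easy to break the JSON payload with shell quoting. One way around that is to build the payload with `python3` first; this sketch mirrors the `/query/submit` field names from the curl example above, with `"level: ERROR"` as a placeholder query. The final curl is commented out because it only succeeds against a running API server.

```bash
# Build the /query/submit payload programmatically to avoid quoting pitfalls.
payload=$(python3 - <<'EOF'
import json

print(json.dumps({
    "query_string": "level: ERROR",  # placeholder query
    "max_num_results": 1000,
    "timestamp_begin": None,
    "timestamp_end": None,
    "case_sensitive": False,
}))
EOF
)
echo "$payload"

# With the API server running (NodePort 30301 in the kind deployment):
# curl -X POST "http://localhost:30301/query/submit" \
#     -H "Content-Type: application/json" -d "$payload"
```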
## Stopping CLP +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + If you need to stop CLP, run: ```bash sbin/stop-clp.sh ``` +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +To stop CLP, uninstall the Helm release: + +```bash +helm uninstall clp +``` + +To also delete the kind cluster: + +```bash +kind delete cluster --name clp +``` +::: +:::: + +[api-server]: ../guides-using-the-api-server.md +[clp-json-format]: ./index.md#clp-json +[clp-releases]: https://github.com/y-scope/clp/releases +[datasets]: ../resources-datasets +[docker-compose-deployment]: ../guides-docker-compose-deployment.md +[json-search-syntax]: ../reference-json-search-syntax.md +[k8s-deployment]: ../guides-k8s-deployment.md +[object-storage]: ../guides-using-object-storage/index diff --git a/docs/src/user-docs/quick-start/clp-text.md b/docs/src/user-docs/quick-start/clp-text.md index f1f637c66c..b21dcadc9f 100644 --- a/docs/src/user-docs/quick-start/clp-text.md +++ b/docs/src/user-docs/quick-start/clp-text.md @@ -13,23 +13,27 @@ query individual fields. This limitation will be addressed in a future version o ## Starting CLP +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + To start CLP, run: ```bash sbin/start-clp.sh ``` -:::{tip} +```{tip} To validate configuration and prepare directories without launching services, add the `--setup-only` flag (e.g., `sbin/start-clp.sh --setup-only`). -::: +``` -:::{note} +```{note} If CLP fails to start (e.g., due to a port conflict), try adjusting the settings in `etc/clp-config.yaml` and then run the start command again. -::: +``` -:::{warning} +````{warning} **Do not comment out or remove the `package` block in `etc/clp-config.yaml`**; otherwise, the storage and query engines will default to `clp-s`, which is optimized for JSON logs rather than unstructured text logs. 
@@ -41,12 +45,130 @@ package: storage_engine: "clp" query_engine: "clp" ``` +```` + +For more details on Docker Compose deployment, see the [Docker Compose deployment guide][docker-compose-deployment]. +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +First, create a kind cluster: + +```bash +# Data and logs directory for the CLP Package +export CLP_HOME="$HOME/clp" + +# Host port mappings +export CLP_WEBUI_PORT=30000 +export CLP_RESULTS_CACHE_PORT=30017 +export CLP_API_SERVER_PORT=30301 +export CLP_DATABASE_PORT=30306 +export CLP_MCP_SERVER_PORT=30800 + +# Credentials (generate random or use your own) +export CLP_DB_PASS=$(openssl rand -hex 16) +export CLP_DB_ROOT_PASS=$(openssl rand -hex 16) +export CLP_QUEUE_PASS=$(openssl rand -hex 16) +export CLP_REDIS_PASS=$(openssl rand -hex 16) + +# Create required directories +mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ + "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ + "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ + "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ + "$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \ + "$CLP_HOME/var/tmp" + +# Create the kind cluster +cat <", + "max_num_results": 1000, + "timestamp_begin": null, + "timestamp_end": null, + "case_sensitive": false + }' +``` + +For more details on the API, see [Using the API server][api-server]. + ### Searching from the command line +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + +No additional configuration is required. +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +Configure `etc/clp-config.yaml` to connect to the kind-deployed services: + +```yaml +database: + port: 30306 +results_cache: + port: 30017 +``` +::: +:::: + To search your compressed logs from the command line, run: ```bash @@ -181,8 +346,37 @@ searches are case-**sensitive** on the command line. 
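For reference, the unstructured text format clp-text targets looks like the following (a hypothetical sample, not shipped with CLP): each event begins with a timestamp, and a single event may span multiple lines, such as a stack trace.

```bash
cat > /tmp/sample-text.log <<'EOF'
2024-01-15T10:00:00.000 INFO Task started
2024-01-15T10:00:01.250 ERROR Task failed:
    java.lang.NullPointerException
        at com.example.Task.run(Task.java:42)
EOF

# Two events: only lines that begin with a timestamp start a new event.
grep -c '^2024-' /tmp/sample-text.log   # prints 2
```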
## Stopping CLP +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + If you need to stop CLP, run: ```bash sbin/stop-clp.sh ``` +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +To stop CLP, uninstall the Helm release: + +```bash +helm uninstall clp +``` + +To also delete the kind cluster: + +```bash +kind delete cluster --name clp +``` +::: +:::: + +[api-server]: ../guides-using-the-api-server.md +[clp-releases]: https://github.com/y-scope/clp/releases +[datasets]: ../resources-datasets +[docker-compose-deployment]: ../guides-docker-compose-deployment.md +[k8s-deployment]: ../guides-k8s-deployment.md +[text-search-syntax]: ../reference-text-search-syntax.md diff --git a/docs/src/user-docs/quick-start/index.md b/docs/src/user-docs/quick-start/index.md index dc165f1c6e..c1d2350773 100644 --- a/docs/src/user-docs/quick-start/index.md +++ b/docs/src/user-docs/quick-start/index.md @@ -4,35 +4,72 @@ This guide describes the following: * [CLP's system requirements](#system-requirements) * [How to choose a CLP flavor](#choosing-a-flavor) -* [How to use CLP](#using-clp). +* [How to use CLP](#using-clp) --- ## System Requirements -To run a CLP release, you'll need: +This quick start guide covers **single-host** deployment using Docker Compose or Kubernetes with +Helm. For deployments that scale across multiple machines for higher throughput, see: -* [Docker](#docker) +* [Docker Compose deployment][docker-compose-deployment] for advanced Docker Compose configurations +* [Kubernetes deployment][k8s-deployment] for production Kubernetes clusters + +Choose the requirements below based on your preferred orchestration method. 
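The version minimums listed in the tabs below can be checked in a script. This is a sketch using `sort -V` (GNU coreutils); the `version_ge` helper is hypothetical, not part of the CLP package.

```bash
# Returns 0 when version $1 >= minimum $2, comparing with sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

version_ge "27.1.0" "27.0.3" && echo "docker-ce OK"
version_ge "2.27.0" "2.28.1" || echo "docker-compose-plugin too old"
```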
+ +::::{tab-set} +:::{tab-item} Docker Compose +:sync: docker + +* [Docker][Docker] * `containerd.io` >= 1.7.18 * `docker-ce` >= 27.0.3 * `docker-ce-cli` >= 27.0.3 * `docker-compose-plugin` >= 2.28.1 -### Docker - -To check whether Docker is installed on your system, run: +To check whether the required tools are installed on your system, run: ```bash -docker version +containerd --version +docker version --format '{{.Server.Version}}' +docker compose version --short ``` -If Docker isn't installed, follow [these instructions][Docker] to install it. - -NOTE: - +```{note} * If you're not running as root, ensure Docker can be run [without superuser privileges][docker-non-root]. * If you're using Docker Desktop, ensure version 4.34 or higher is installed. +``` + +::: + +:::{tab-item} Kubernetes (kind) +:sync: kind + +[kind] (Kubernetes in Docker) runs a Kubernetes cluster inside Docker containers, making it ideal +for local Kubernetes testing and development. + +* [Docker][Docker] (required for kind) + * `containerd.io` >= 1.7.18 + * `docker-ce` >= 27.0.3 + * `docker-ce-cli` >= 27.0.3 +* [kubectl][kubectl] >= 1.30 +* [Helm][Helm] >= 4.0 +* [kind][kind] >= 0.23 + +To check whether the tools are installed on your system, run: + +```bash +containerd --version +docker version --format '{{.Server.Version}}' +kubectl version --client --output=yaml | grep gitVersion +helm version --short +kind version +``` + +::: +:::: --- @@ -73,7 +110,7 @@ The log file above contains two log events represented by two JSON objects print other. Whitespace is ignored, so the log events could also appear with no newlines and indentation. If you're using JSON logs, download and extract the `clp-json` release from the -[Releases][clp-releases] page, then proceed to the [clp-json quick-start](./clp-json.md) guide. +[Releases][clp-releases] page, then proceed to the [clp-json quick-start][clp-json] guide. 
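If you are unsure which flavor fits your logs, a rough check is whether a log line parses as JSON. The helper below is a heuristic sketch (not part of CLP, and it can be fooled by, e.g., a line that is a bare number): it inspects the first non-empty line of a file.

```bash
# Heuristic: suggest clp-json if the first non-empty line parses as JSON,
# otherwise clp-text.
suggest_flavor() {
    first_line=$(grep -m1 -v '^[[:space:]]*$' "$1")
    if printf '%s' "$first_line" \
        | python3 -c 'import json, sys; json.load(sys.stdin)' 2>/dev/null; then
        echo "clp-json"
    else
        echo "clp-text"
    fi
}

printf '{"level": "INFO", "msg": "hi"}\n' > /tmp/f1.log
suggest_flavor /tmp/f1.log   # prints clp-json
printf '2024-01-01 INFO hi\n' > /tmp/f2.log
suggest_flavor /tmp/f2.log   # prints clp-text
```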
### clp-text @@ -101,7 +138,7 @@ The log file above contains two log events, both beginning with a timestamp. The line, while the second contains multiple lines. If you're using unstructured text logs, download and extract the `clp-text` release from the -[Releases][clp-releases] page, then proceed to the [clp-text quick-start](./clp-text.md) guide. +[Releases][clp-releases] page, then proceed to the [clp-text quick-start][clp-text] guide. --- @@ -128,6 +165,13 @@ How to compress and search unstructured text logs. ::: :::: +[clp-json]: ./clp-json.md [clp-releases]: https://github.com/y-scope/clp/releases +[clp-text]: ./clp-text.md [Docker]: https://docs.docker.com/engine/install/ +[docker-compose-deployment]: ../guides-docker-compose-deployment.md [docker-non-root]: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user +[Helm]: https://helm.sh/docs/intro/install/ +[k8s-deployment]: ../guides-k8s-deployment.md +[kind]: https://kind.sigs.k8s.io/docs/user/quick-start/#installation +[kubectl]: https://kubernetes.io/docs/tasks/tools/ diff --git a/tools/deployment/package-helm/.test-common.sh b/tools/deployment/package-helm/.test-common.sh new file mode 100644 index 0000000000..3333fdc544 --- /dev/null +++ b/tools/deployment/package-helm/.test-common.sh @@ -0,0 +1,109 @@ +#!/usr/bin/env bash + +# Common utilities for CLP Helm chart tests + +set -o errexit +set -o nounset +set -o pipefail + +CLP_HOME="${CLP_HOME:-/tmp/clp}" + +# Waits for all jobs to complete and all non-job pods to be ready. 
+# +# @param {int} timeout_seconds Overall timeout in seconds +# @param {int} poll_interval_seconds Interval between status checks +# @param {int} wait_timeout_seconds Timeout for each kubectl wait call +# @param {string} kubectl_get_opts Additional options for kubectl get pods (optional) +# @return {int} 0 on success, 1 on timeout +wait_for_pods() { + local timeout_seconds=$1 + local poll_interval_seconds=$2 + local wait_timeout_seconds=$3 + local kubectl_get_opts="${4:-}" + + echo "Waiting for all pods to be ready" \ + "(timeout=${timeout_seconds}s, poll=${poll_interval_seconds}s," \ + "wait=${wait_timeout_seconds}s)..." + + # Reset bash built-in SECONDS counter + SECONDS=0 + + while true; do + sleep "${poll_interval_seconds}" + # shellcheck disable=SC2086 + kubectl get pods ${kubectl_get_opts} + + if kubectl wait job \ + --all \ + --for=condition=Complete \ + --timeout="${wait_timeout_seconds}s" 2>/dev/null \ + && kubectl wait pods \ + --all \ + --selector='!job-name' \ + --for=condition=Ready \ + --timeout="${wait_timeout_seconds}s" 2>/dev/null + then + echo "All jobs completed and services are ready." + return 0 + fi + + if [[ ${SECONDS} -ge ${timeout_seconds} ]]; then + echo "ERROR: Timed out waiting for pods to be ready" + return 1 + fi + + echo "---" + done +} + +# Initializes the CLP home directory structure. +init_clp_home() { + rm -rf "$CLP_HOME" + mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ + "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ + "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ + "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ + "$CLP_HOME/var/log/"{garbage_collector,api_server,log_ingestor,mcp_server} \ + "$CLP_HOME/var/tmp" \ + "$CLP_HOME/samples" +} + +# Downloads sample datasets in the background. +# Sets SAMPLE_DOWNLOAD_PID to the background process ID. 
+download_samples() { + wget -O - https://zenodo.org/records/10516402/files/postgresql.tar.gz?download=1 \ + | tar xz -C "$CLP_HOME/samples" & + SAMPLE_DOWNLOAD_PID=$! +} + +# Waits for sample download to complete. +wait_for_samples() { + wait "$SAMPLE_DOWNLOAD_PID" + echo "Sample download and extraction complete" +} + +# Generates kind extra port mappings for control-plane node. +# These are the NodePort services exposed by the Helm chart. +generate_kind_port_mappings() { + cat <<'EOF' + extraPortMappings: + - containerPort: 30306 + hostPort: 30306 + protocol: TCP + - containerPort: 30017 + hostPort: 30017 + protocol: TCP + - containerPort: 30000 + hostPort: 30000 + protocol: TCP + - containerPort: 30301 + hostPort: 30301 + protocol: TCP + - containerPort: 30302 + hostPort: 30302 + protocol: TCP + - containerPort: 30800 + hostPort: 30800 + protocol: TCP +EOF +} diff --git a/tools/deployment/package-helm/Chart.yaml b/tools/deployment/package-helm/Chart.yaml index 1da0d1598e..d26da895a4 100644 --- a/tools/deployment/package-helm/Chart.yaml +++ b/tools/deployment/package-helm/Chart.yaml @@ -1,6 +1,6 @@ apiVersion: "v2" name: "clp" -version: "0.1.2-dev.7" +version: "0.1.2-dev.8" description: "A Helm chart for CLP's (Compressed Log Processor) package deployment" type: "application" appVersion: "0.7.1-dev" diff --git a/tools/deployment/package-helm/templates/NOTES.txt b/tools/deployment/package-helm/templates/NOTES.txt new file mode 100644 index 0000000000..97a261da8a --- /dev/null +++ b/tools/deployment/package-helm/templates/NOTES.txt @@ -0,0 +1 @@ +TODO: This should be filled with usage instructions. \ No newline at end of file diff --git a/tools/deployment/package-helm/templates/_helpers.tpl b/tools/deployment/package-helm/templates/_helpers.tpl index 882c8a3b17..f948742c86 100644 --- a/tools/deployment/package-helm/templates/_helpers.tpl +++ b/tools/deployment/package-helm/templates/_helpers.tpl @@ -23,9 +23,9 @@ used as a full name. 
{{- .Release.Name | trunc 63 | trimSuffix "-" }} {{- else }} {{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }} -{{- end }} -{{- end }} -{{- end }} +{{- end }}{{/* if contains $name .Release.Name */}} +{{- end }}{{/* if .Values.fullnameOverride */}} +{{- end }}{{/* define "clp.fullname" */}} {{/* Creates chart name and version as used by the chart label. @@ -111,20 +111,17 @@ Used for: {{- end }} {{/* -Creates a local PersistentVolume. +Creates a PersistentVolume that does not use dynamic provisioning. @param {object} root Root template context @param {string} component_category (e.g., "database", "shared-data") @param {string} name (e.g., "archives", "data", "logs") -@param {string} nodeRole Node role for affinity. Targets nodes with label - "node-role.kubernetes.io/". Always falls back to - "node-role.kubernetes.io/control-plane" @param {string} capacity Storage capacity @param {string[]} accessModes Access modes @param {string} hostPath Absolute path on host @return {string} YAML-formatted PersistentVolume resource */}} -{{- define "clp.createLocalPv" -}} +{{- define "clp.createStaticPv" -}} apiVersion: "v1" kind: "PersistentVolume" metadata: @@ -137,19 +134,22 @@ spec: storage: {{ .capacity }} accessModes: {{ .accessModes }} persistentVolumeReclaimPolicy: "Retain" - storageClassName: "local-storage" + storageClassName: {{ .root.Values.storage.storageClassName | quote }} + {{- if .root.Values.distributed }} + hostPath: + path: {{ .hostPath | quote }} + type: "DirectoryOrCreate" + {{- else }} local: path: {{ .hostPath | quote }} nodeAffinity: required: nodeSelectorTerms: - - matchExpressions: - - key: {{ printf "node-role.kubernetes.io/%s" .nodeRole | quote }} - operator: "Exists" - matchExpressions: - key: "node-role.kubernetes.io/control-plane" operator: "Exists" -{{- end }} + {{- end }}{{/* if .root.Values.distributed */}} +{{- end }}{{/* define "clp.createStaticPv" */}} {{/* Creates a PersistentVolumeClaim for the given component. 
@@ -171,7 +171,7 @@ metadata: app.kubernetes.io/component: {{ .component_category | quote }} spec: accessModes: {{ .accessModes }} - storageClassName: "local-storage" + storageClassName: {{ .root.Values.storage.storageClassName | quote }} selector: matchLabels: {{- include "clp.selectorLabels" .root | nindent 6 }} @@ -247,6 +247,47 @@ hostPath: type: "Directory" {{- end }} +{{/* +Creates scheduling configuration (nodeSelector, affinity, tolerations, topologySpreadConstraints) +for a component. + +When distributed is false (single-node mode), a control-plane toleration is automatically added +so pods can be scheduled on tainted control-plane nodes without manual untainting. + +@param {object} root Root template context +@param {string} component Top-level values key (e.g., "compressionWorker", "queryWorker") +@return {string} YAML-formatted scheduling fields (nodeSelector, affinity, tolerations, + topologySpreadConstraints) +*/}} +{{- define "clp.createSchedulingConfigs" -}} +{{- $componentConfig := index .root.Values .component | default dict -}} +{{- $scheduling := $componentConfig.scheduling | default dict -}} +{{- $tolerations := $scheduling.tolerations | default list -}} +{{- if not .root.Values.distributed -}} +{{- $tolerations = append $tolerations (dict + "key" "node-role.kubernetes.io/control-plane" + "operator" "Exists" + "effect" "NoSchedule" +) -}} +{{- end -}} +{{- with $scheduling.nodeSelector }} +nodeSelector: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with $scheduling.affinity }} +affinity: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with $tolerations }} +tolerations: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with $scheduling.topologySpreadConstraints }} +topologySpreadConstraints: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- end }}{{/* define "clp.createSchedulingConfigs" */}} + {{/* Creates an initContainer that waits for a Kubernetes resource to be ready. 
diff --git a/tools/deployment/package-helm/templates/api-server-deployment.yaml b/tools/deployment/package-helm/templates/api-server-deployment.yaml new file mode 100644 index 0000000000..a36cfde8fd --- /dev/null +++ b/tools/deployment/package-helm/templates/api-server-deployment.yaml @@ -0,0 +1,107 @@ +{{- if .Values.clpConfig.api_server }} +apiVersion: "apps/v1" +kind: "Deployment" +metadata: + name: {{ include "clp.fullname" . }}-api-server + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "api-server" +spec: + replicas: 1 + selector: + matchLabels: + {{- include "clp.selectorLabels" . | nindent 6 }} + app.kubernetes.io/component: "api-server" + template: + metadata: + labels: + {{- include "clp.labels" . | nindent 8 }} + app.kubernetes.io/component: "api-server" + spec: + serviceAccountName: {{ include "clp.fullname" . }}-job-watcher + terminationGracePeriodSeconds: 60 + securityContext: + runAsUser: {{ .Values.securityContext.firstParty.uid }} + runAsGroup: {{ .Values.securityContext.firstParty.gid }} + fsGroup: {{ .Values.securityContext.firstParty.gid }} + initContainers: + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "db-table-creator" + ) | nindent 10 }} + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "results-cache-indices-creator" + ) | nindent 10 }} + containers: + - name: "api-server" + image: "{{ include "clp.image.ref" . }}" + imagePullPolicy: "{{ .Values.image.clpPackage.pullPolicy }}" + env: + - name: "CLP_DB_PASS" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "password" + - name: "CLP_DB_USER" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . 
}}-database + key: "username" + - name: "CLP_LOGS_DIR" + value: "/var/log/api_server" + - name: "RUST_LOG" + value: "INFO" + ports: + - name: "api-server" + containerPort: 3001 + volumeMounts: + - name: {{ include "clp.volumeName" (dict + "component_category" "api-server" + "name" "logs" + ) | quote }} + mountPath: "/var/log/api_server" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} + - name: {{ include "clp.volumeName" (dict + "component_category" "shared-data" + "name" "streams" + ) | quote }} + mountPath: "/var/data/streams" + {{- end }} + command: [ + "/opt/clp/bin/api_server", + "--host", "0.0.0.0", + "--port", "3001", + "--config", "/etc/clp-config.yaml" + ] + readinessProbe: + {{- include "clp.readinessProbeTimings" . | nindent 12 }} + httpGet: &api-server-health-check + path: "/health" + port: 3001 + livenessProbe: + {{- include "clp.livenessProbeTimings" . | nindent 12 }} + httpGet: *api-server-health-check + volumes: + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "api-server" + "name" "logs" + ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . }}-config + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "shared-data" + "name" "streams" + ) | nindent 10 }} + {{- end }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/api-server-logs-pv.yaml b/tools/deployment/package-helm/templates/api-server-logs-pv.yaml new file mode 100644 index 0000000000..e117f83bd1 --- /dev/null +++ b/tools/deployment/package-helm/templates/api-server-logs-pv.yaml @@ -0,0 +1,10 @@ +{{- if .Values.clpConfig.api_server }} +{{- include "clp.createStaticPv" (dict + "root" . 
+ "component_category" "api-server" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/api_server" .Values.clpConfig.logs_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/api-server-logs-pvc.yaml b/tools/deployment/package-helm/templates/api-server-logs-pvc.yaml new file mode 100644 index 0000000000..d9429b6dad --- /dev/null +++ b/tools/deployment/package-helm/templates/api-server-logs-pvc.yaml @@ -0,0 +1,9 @@ +{{- if .Values.clpConfig.api_server }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "api-server" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/api-server-service.yaml b/tools/deployment/package-helm/templates/api-server-service.yaml new file mode 100644 index 0000000000..0aed0e7efa --- /dev/null +++ b/tools/deployment/package-helm/templates/api-server-service.yaml @@ -0,0 +1,18 @@ +{{- if .Values.clpConfig.api_server }} +apiVersion: "v1" +kind: "Service" +metadata: + name: {{ include "clp.fullname" . }}-api-server + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "api-server" +spec: + type: "NodePort" + selector: + {{- include "clp.selectorLabels" . 
| nindent 4 }} + app.kubernetes.io/component: "api-server" + ports: + - port: 3001 + targetPort: "api-server" + nodePort: {{ .Values.clpConfig.api_server.port }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml b/tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml index f58e9594b8..37a86ca019 100644 --- a/tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml +++ b/tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml @@ -66,11 +66,6 @@ spec: - name: "PYTHONPATH" value: "/opt/clp/lib/python3/site-packages" volumeMounts: - - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} - - name: "config" - mountPath: "/etc/clp-config.yaml" - subPath: "clp-config.yaml" - readOnly: true - name: {{ include "clp.volumeName" (dict "component_category" "compression-scheduler" "name" "logs" @@ -81,13 +76,24 @@ spec: "name" "user-logs" ) | quote }} mountPath: "/var/log/user" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + {{- if .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + mountPath: "/opt/clp/.aws" + readOnly: true + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} + {{- end }} command: [ "python3", "-u", "-m", "job_orchestration.scheduler.compress.compression_scheduler", "--config", "/etc/clp-config.yaml" ] volumes: - - {{- include "clp.logsInputVolume" . | nindent 10 }} - {{- include "clp.pvcVolume" (dict "root" . "component_category" "compression-scheduler" @@ -101,3 +107,12 @@ spec: - name: "config" configMap: name: {{ include "clp.fullname" . }}-config + {{- with .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + hostPath: + path: {{ . | quote }} + type: "Directory" + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolume" . 
| nindent 10 }} + {{- end }} diff --git a/tools/deployment/package-helm/templates/compression-scheduler-logs-pv.yaml b/tools/deployment/package-helm/templates/compression-scheduler-logs-pv.yaml index 565f5dc40b..0a886c7c53 100644 --- a/tools/deployment/package-helm/templates/compression-scheduler-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/compression-scheduler-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "compression-scheduler" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/compression_scheduler" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yaml b/tools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yaml index 9b667a58bf..2e30551fbd 100644 --- a/tools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "compression-scheduler" "name" "user-logs" - "nodeRole" "control-plane" "capacity" "10Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/user" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/compression-worker-deployment.yaml b/tools/deployment/package-helm/templates/compression-worker-deployment.yaml index 70a6165154..192d521ba7 100644 --- a/tools/deployment/package-helm/templates/compression-worker-deployment.yaml +++ b/tools/deployment/package-helm/templates/compression-worker-deployment.yaml @@ -6,7 +6,7 @@ metadata: {{- include "clp.labels" . 
| nindent 4 }} app.kubernetes.io/component: "compression-worker" spec: - replicas: 1 + replicas: {{ .Values.compressionWorker.replicas }} selector: matchLabels: {{- include "clp.selectorLabels" . | nindent 6 }} @@ -17,6 +17,10 @@ spec: {{- include "clp.labels" . | nindent 8 }} app.kubernetes.io/component: "compression-worker" spec: + {{- include "clp.createSchedulingConfigs" (dict + "root" . + "component" "compressionWorker" + ) | nindent 6 }} terminationGracePeriodSeconds: 60 securityContext: runAsUser: {{ .Values.securityContext.firstParty.uid }} @@ -45,7 +49,6 @@ spec: - name: "PYTHONPATH" value: "/opt/clp/lib/python3/site-packages" volumeMounts: - - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} - name: {{ include "clp.volumeName" (dict "component_category" "compression-worker" "name" "logs" @@ -60,11 +63,28 @@ spec: mountPath: "/etc/clp-config.yaml" subPath: "clp-config.yaml" readOnly: true + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} - name: {{ include "clp.volumeName" (dict "component_category" "shared-data" "name" "archives" ) | quote }} mountPath: "/var/data/archives" + {{- end }} + {{- if .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + mountPath: "/opt/clp/.aws" + readOnly: true + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} + {{- end }} + {{- if eq .Values.clpConfig.archive_output.storage.type "s3" }} + - name: {{ include "clp.volumeName" (dict + "component_category" "compression-worker" + "name" "staged-archives" + ) | quote }} + mountPath: "/var/data/staged-archives" + {{- end }} command: [ "python3", "-u", "/opt/clp/lib/python3/site-packages/bin/celery", @@ -77,7 +97,6 @@ spec: "-n", "compression-worker" ] volumes: - - {{- include "clp.logsInputVolume" . | nindent 10 }} - {{- include "clp.pvcVolume" (dict "root" . 
"component_category" "compression-worker" @@ -88,11 +107,29 @@ spec: "component_category" "compression-worker" "name" "tmp" ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . }}-config + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} - {{- include "clp.pvcVolume" (dict "root" . "component_category" "shared-data" "name" "archives" ) | nindent 10 }} - - name: "config" - configMap: - name: {{ include "clp.fullname" . }}-config + {{- end }} + {{- with .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + hostPath: + path: {{ . | quote }} + type: "Directory" + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolume" . | nindent 10 }} + {{- end }} + {{- if eq .Values.clpConfig.archive_output.storage.type "s3" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "compression-worker" + "name" "staged-archives" + ) | nindent 10 }} + {{- end }} diff --git a/tools/deployment/package-helm/templates/compression-worker-logs-pv.yaml b/tools/deployment/package-helm/templates/compression-worker-logs-pv.yaml index 4b6d55466b..55f243cf8c 100644 --- a/tools/deployment/package-helm/templates/compression-worker-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/compression-worker-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "compression-worker" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/compression_worker" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/compression-worker-staged-archives-pv.yaml b/tools/deployment/package-helm/templates/compression-worker-staged-archives-pv.yaml new file mode 100644 index 0000000000..26f20ed732 --- /dev/null +++ b/tools/deployment/package-helm/templates/compression-worker-staged-archives-pv.yaml @@ -0,0 +1,10 @@ +{{- if eq .Values.clpConfig.archive_output.storage.type "s3" }} +{{- include "clp.createStaticPv" (dict + "root" . + "component_category" "compression-worker" + "name" "staged-archives" + "capacity" "20Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/staged-archives" .Values.clpConfig.data_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/compression-worker-staged-archives-pvc.yaml b/tools/deployment/package-helm/templates/compression-worker-staged-archives-pvc.yaml new file mode 100644 index 0000000000..b8a9367236 --- /dev/null +++ b/tools/deployment/package-helm/templates/compression-worker-staged-archives-pvc.yaml @@ -0,0 +1,9 @@ +{{- if eq .Values.clpConfig.archive_output.storage.type "s3" }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "compression-worker" + "name" "staged-archives" + "capacity" "20Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/compression-worker-tmp-pv.yaml b/tools/deployment/package-helm/templates/compression-worker-tmp-pv.yaml index 43e46c2503..d7107e14ae 100644 --- a/tools/deployment/package-helm/templates/compression-worker-tmp-pv.yaml +++ b/tools/deployment/package-helm/templates/compression-worker-tmp-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "compression-worker" "name" "tmp" - "nodeRole" "control-plane" "capacity" "10Gi" "accessModes" (list "ReadWriteOnce") "hostPath" .Values.clpConfig.tmp_directory diff --git a/tools/deployment/package-helm/templates/configmap.yaml b/tools/deployment/package-helm/templates/configmap.yaml index f61296931f..881dfbb972 100644 --- a/tools/deployment/package-helm/templates/configmap.yaml +++ b/tools/deployment/package-helm/templates/configmap.yaml @@ -6,78 +6,198 @@ metadata: {{- include "clp.labels" . | nindent 4 }} data: clp-config.yaml: | + {{- with .Values.clpConfig.api_server }} + api_server: + default_max_num_query_results: {{ .default_max_num_query_results | int }} + host: "localhost" + port: 3001 + query_job_polling: + initial_backoff_ms: {{ .query_job_polling.initial_backoff_ms | int }} + max_backoff_ms: {{ .query_job_polling.max_backoff_ms | int }} + {{- else }} + api_server: null + {{- end }} + {{- with .Values.clpConfig.archive_output }} archive_output: - compression_level: {{ .Values.clpConfig.archive_output.compression_level }} + compression_level: {{ .compression_level | int }} + {{ with .retention_period }} + retention_period: {{ . 
| int }} + {{ else }} + retention_period: null + {{ end }} storage: - directory: "/var/data/archives" + {{- if eq .storage.type "fs" }} type: "fs" - target_archive_size: {{ .Values.clpConfig.archive_output.target_archive_size | int }} - target_dictionaries_size: {{ .Values.clpConfig.archive_output.target_dictionaries_size - | int }} - target_encoded_file_size: {{ .Values.clpConfig.archive_output.target_encoded_file_size - | int }} - target_segment_size: {{ .Values.clpConfig.archive_output.target_segment_size - | int }} + directory: "/var/data/archives" + {{- else }} + type: "s3" + staging_directory: "/var/data/staged-archives" + {{- with .storage.s3_config }} + s3_config: + endpoint_url: {{ .endpoint_url | default "null" }} + region_code: {{ .region_code | quote }} + bucket: {{ .bucket | quote }} + key_prefix: {{ .key_prefix | quote }} + {{- with .aws_authentication }} + aws_authentication: + type: {{ .type | quote }} + {{- if eq .type "credentials" }} + credentials: + access_key_id: {{ .credentials.access_key_id | quote }} + secret_access_key: {{ .credentials.secret_access_key | quote }} + {{- end }} + {{- if eq .type "profile" }} + profile: {{ .profile | quote }} + {{- end }} + {{- end }}{{/* with .aws_authentication */}} + {{- end }}{{/* with .storage.s3_config */}} + {{- end }}{{/* if eq .storage.type "fs" */}} + target_archive_size: {{ .target_archive_size | int }} + target_dictionaries_size: {{ .target_dictionaries_size | int }} + target_encoded_file_size: {{ .target_encoded_file_size | int }} + target_segment_size: {{ .target_segment_size | int }} + {{- end }}{{/* with .Values.clpConfig.archive_output */}} + {{- with .Values.clpConfig.aws_config_directory }} + aws_config_directory: {{ . 
| quote }} + {{- else }} + aws_config_directory: null + {{- end }} compression_scheduler: jobs_poll_delay: {{ .Values.clpConfig.compression_scheduler.jobs_poll_delay }} logging_level: {{ .Values.clpConfig.compression_scheduler.logging_level | quote }} max_concurrent_tasks_per_job: {{ - .Values.clpConfig.compression_scheduler.max_concurrent_tasks_per_job }} + .Values.clpConfig.compression_scheduler.max_concurrent_tasks_per_job | int }} + type: {{ .Values.clpConfig.compression_scheduler.type | quote }} compression_worker: logging_level: {{ .Values.clpConfig.compression_worker.logging_level | quote }} data_directory: "/var/data" database: auto_commit: false compress: true - host: {{ include "clp.fullname" . }}-database - name: {{ .Values.clpConfig.database.name | quote }} + host: "{{ include "clp.fullname" . }}-database" + names: + clp: {{ .Values.clpConfig.database.names.clp | quote }} port: 3306 ssl_cert: null type: {{ .Values.clpConfig.database.type | quote }} + garbage_collector: + logging_level: {{ .Values.clpConfig.garbage_collector.logging_level | quote }} + sweep_interval: + archive: {{ .Values.clpConfig.garbage_collector.sweep_interval.archive | int }} + search_result: {{ .Values.clpConfig.garbage_collector.sweep_interval.search_result | int }} logs_directory: "/var/log" + {{- with .Values.clpConfig.logs_input }} logs_input: - directory: "/mnt/logs" + {{- if eq .type "fs" }} type: "fs" + directory: "/mnt/logs" + {{- else }} + type: "s3" + {{- with .aws_authentication }} + aws_authentication: + type: {{ .type | quote }} + {{- if eq .type "credentials" }} + credentials: + access_key_id: {{ .credentials.access_key_id | quote }} + secret_access_key: {{ .credentials.secret_access_key | quote }} + {{- end }} + {{- if eq .type "profile" }} + profile: {{ .profile | quote }} + {{- end }} + {{- end }}{{/* with .aws_authentication */}} + {{- end }}{{/* if eq .type "fs" */}} + {{- end }}{{/* with .Values.clpConfig.logs_input */}} + {{- with .Values.clpConfig.log_ingestor 
}} + log_ingestor: + buffer_flush_threshold: {{ .buffer_flush_threshold | int }} + buffer_flush_timeout: {{ .buffer_flush_timeout | int }} + channel_capacity: {{ .channel_capacity | int }} + host: "localhost" + logging_level: {{ .logging_level | quote }} + port: 3002 + {{- else }} + log_ingestor: null + {{- end }} + {{- with .Values.clpConfig.mcp_server }} + mcp_server: + host: "localhost" + logging_level: {{ .logging_level | quote }} + port: 8000 + {{- else }} + mcp_server: null + {{- end }} package: query_engine: {{ .Values.clpConfig.package.query_engine | quote }} storage_engine: {{ .Values.clpConfig.package.storage_engine | quote }} + presto: {{ .Values.clpConfig.presto }} query_scheduler: - host: {{ include "clp.fullname" . }}-query-scheduler + host: "{{ include "clp.fullname" . }}-query-scheduler" jobs_poll_delay: {{ .Values.clpConfig.query_scheduler.jobs_poll_delay }} logging_level: {{ .Values.clpConfig.query_scheduler.logging_level | quote }} num_archives_to_search_per_sub_job: {{ - .Values.clpConfig.query_scheduler.num_archives_to_search_per_sub_job }} + .Values.clpConfig.query_scheduler.num_archives_to_search_per_sub_job | int }} port: 7000 query_worker: logging_level: {{ .Values.clpConfig.query_worker.logging_level | quote }} queue: - host: {{ include "clp.fullname" . }}-queue + host: "{{ include "clp.fullname" . }}-queue" port: 5672 redis: - compression_backend_database: {{ .Values.clpConfig.redis.compression_backend_database }} - host: {{ include "clp.fullname" . }}-redis + compression_backend_database: {{ .Values.clpConfig.redis.compression_backend_database | int }} + host: "{{ include "clp.fullname" . }}-redis" port: 6379 - query_backend_database: {{ .Values.clpConfig.redis.query_backend_database }} + query_backend_database: {{ .Values.clpConfig.redis.query_backend_database | int }} reducer: base_port: 14009 - host: {{ include "clp.fullname" . }}-reducer + host: "{{ include "clp.fullname" . 
}}-reducer" logging_level: {{ .Values.clpConfig.reducer.logging_level | quote }} - upsert_interval: {{ .Values.clpConfig.reducer.upsert_interval }} + upsert_interval: {{ .Values.clpConfig.reducer.upsert_interval | int }} results_cache: db_name: {{ .Values.clpConfig.results_cache.db_name | quote }} - host: {{ include "clp.fullname" . }}-results-cache + host: "{{ include "clp.fullname" . }}-results-cache" port: 27017 + {{ with .Values.clpConfig.results_cache.retention_period }} + retention_period: {{ . | int }} + {{ else }} + retention_period: null + {{ end }} stream_collection_name: {{ .Values.clpConfig.results_cache.stream_collection_name | quote }} + {{- with .Values.clpConfig.stream_output }} stream_output: storage: - directory: "/var/data/streams" + {{- if eq .storage.type "fs" }} type: "fs" - target_uncompressed_size: {{ .Values.clpConfig.stream_output.target_uncompressed_size | int }} + directory: "/var/data/streams" + {{- else }} + type: "s3" + staging_directory: "/var/data/staged-streams" + {{- with .storage.s3_config }} + s3_config: + endpoint_url: {{ .endpoint_url | default "null" }} + region_code: {{ .region_code | quote }} + bucket: {{ .bucket | quote }} + key_prefix: {{ .key_prefix | quote }} + {{- with .aws_authentication }} + aws_authentication: + type: {{ .type | quote }} + {{- if eq .type "credentials" }} + credentials: + access_key_id: {{ .credentials.access_key_id | quote }} + secret_access_key: {{ .credentials.secret_access_key | quote }} + {{- end }} + {{- if eq .type "profile" }} + profile: {{ .profile | quote }} + {{- end }} + {{- end }}{{/* with .aws_authentication */}} + {{- end }}{{/* with .storage.s3_config */}} + {{- end }}{{/* if eq .storage.type "fs" */}} + target_uncompressed_size: {{ .target_uncompressed_size | int }} + {{- end }}{{/* with .Values.clpConfig.stream_output */}} tmp_directory: "/var/tmp" webui: host: "localhost" port: 4000 - rate_limit: {{ .Values.clpConfig.webui.rate_limit }} + rate_limit: {{ 
.Values.clpConfig.webui.rate_limit | int }} results_metadata_collection_name: {{ .Values.clpConfig.webui.results_metadata_collection_name | quote }} @@ -114,7 +234,11 @@ data: "ClpStorageEngine": {{ .Values.clpConfig.package.storage_engine | quote }}, "ClpQueryEngine": {{ .Values.clpConfig.package.query_engine | quote }}, "LogsInputType": {{ .Values.clpConfig.logs_input.type | quote }}, + {{- if eq .Values.clpConfig.logs_input.type "fs" }} "LogsInputRootDir": "/mnt/logs", + {{- else }} + "LogsInputRootDir": null, + {{- end }} "MongoDbSearchResultsMetadataCollectionName": {{ .Values.clpConfig.webui.results_metadata_collection_name | quote }}, "SqlDbClpArchivesTableName": "clp_archives", @@ -128,7 +252,7 @@ data: { "SqlDbHost": "{{ include "clp.fullname" . }}-database", "SqlDbPort": 3306, - "SqlDbName": {{ .Values.clpConfig.database.name | quote }}, + "SqlDbName": {{ .Values.clpConfig.database.names.clp | quote }}, "SqlDbQueryJobsTableName": "query_jobs", "SqlDbCompressionJobsTableName": "compression_jobs", "MongoDbHost": "{{ include "clp.fullname" . 
}}-results-cache", @@ -140,14 +264,31 @@ data: {{ .Values.clpConfig.results_cache.stream_collection_name | quote }}, "ClientDir": "/opt/clp/var/www/webui/client", "LogViewerDir": "/opt/clp/var/www/webui/yscope-log-viewer", + {{- if eq .Values.clpConfig.logs_input.type "fs" }} "LogsInputRootDir": "/mnt/logs", + {{- else }} + "LogsInputRootDir": null, + {{- end }} + {{- with .Values.clpConfig.stream_output.storage }} + {{- if eq .type "fs" }} "StreamFilesDir": "/var/data/streams", - "StreamTargetUncompressedSize": - {{ .Values.clpConfig.stream_output.target_uncompressed_size | int }}, "StreamFilesS3Region": null, "StreamFilesS3PathPrefix": null, "StreamFilesS3Profile": null, - "ArchiveOutputCompressionLevel": {{ .Values.clpConfig.archive_output.compression_level }}, + {{- else }} + "StreamFilesDir": null, + "StreamFilesS3Region": {{ .s3_config.region_code | quote }}, + "StreamFilesS3PathPrefix": "{{ .s3_config.bucket }}/{{ .s3_config.key_prefix }}", + {{- if eq .s3_config.aws_authentication.type "profile" }} + "StreamFilesS3Profile": {{ .s3_config.aws_authentication.profile | quote }}, + {{- else }} + "StreamFilesS3Profile": null, + {{- end }} + {{- end }}{{/* if eq .type "fs" */}} + {{- end }}{{/* with .Values.clpConfig.stream_output.storage */}} + "StreamTargetUncompressedSize": + {{ .Values.clpConfig.stream_output.target_uncompressed_size | int }}, + "ArchiveOutputCompressionLevel": {{ .Values.clpConfig.archive_output.compression_level | int }}, "ArchiveOutputTargetArchiveSize": {{ .Values.clpConfig.archive_output.target_archive_size | int }}, "ArchiveOutputTargetDictionariesSize": @@ -158,6 +299,11 @@ data: {{ .Values.clpConfig.archive_output.target_segment_size | int }}, "ClpQueryEngine": {{ .Values.clpConfig.package.query_engine | quote }}, "ClpStorageEngine": {{ .Values.clpConfig.package.storage_engine | quote }}, + {{- with .Values.clpConfig.presto }} + "PrestoHost": {{ .host | quote }}, + "PrestoPort": {{ .port | int }} + {{- else }} "PrestoHost": null, 
"PrestoPort": null + {{- end }} } diff --git a/tools/deployment/package-helm/templates/database-data-pv.yaml b/tools/deployment/package-helm/templates/database-data-pv.yaml index 3bd6ea2b9a..1456cb9f9f 100644 --- a/tools/deployment/package-helm/templates/database-data-pv.yaml +++ b/tools/deployment/package-helm/templates/database-data-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "database" "name" "data" - "nodeRole" "control-plane" "capacity" "20Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/database" .Values.clpConfig.data_directory) diff --git a/tools/deployment/package-helm/templates/database-logs-pv.yaml b/tools/deployment/package-helm/templates/database-logs-pv.yaml index 794215cf3a..c9f2e63793 100644 --- a/tools/deployment/package-helm/templates/database-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/database-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "database" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/database" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/database-statefulset.yaml b/tools/deployment/package-helm/templates/database-statefulset.yaml index 8e67cee934..4c510c57c8 100644 --- a/tools/deployment/package-helm/templates/database-statefulset.yaml +++ b/tools/deployment/package-helm/templates/database-statefulset.yaml @@ -33,7 +33,7 @@ spec: imagePullPolicy: "Always" env: - name: "MYSQL_DATABASE" - value: {{ .Values.clpConfig.database.name | quote }} + value: {{ .Values.clpConfig.database.names.clp | quote }} - name: "MYSQL_USER" valueFrom: secretKeyRef: diff --git a/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml b/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml new file mode 100644 index 0000000000..f50d0e4e57 --- /dev/null +++ b/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml @@ -0,0 +1,125 @@ +{{- if or .Values.clpConfig.archive_output.retention_period + .Values.clpConfig.results_cache.retention_period }} +apiVersion: "apps/v1" +kind: "Deployment" +metadata: + name: {{ include "clp.fullname" . }}-garbage-collector + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "garbage-collector" +spec: + replicas: 1 + selector: + matchLabels: + {{- include "clp.selectorLabels" . | nindent 6 }} + app.kubernetes.io/component: "garbage-collector" + template: + metadata: + labels: + {{- include "clp.labels" . | nindent 8 }} + app.kubernetes.io/component: "garbage-collector" + spec: + serviceAccountName: {{ include "clp.fullname" . 
}}-job-watcher + terminationGracePeriodSeconds: 10 + securityContext: + runAsUser: {{ .Values.securityContext.firstParty.uid }} + runAsGroup: {{ .Values.securityContext.firstParty.gid }} + fsGroup: {{ .Values.securityContext.firstParty.gid }} + initContainers: + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "db-table-creator" + ) | nindent 10 }} + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "results-cache-indices-creator" + ) | nindent 10 }} + containers: + - name: "garbage-collector" + image: "{{ include "clp.image.ref" . }}" + imagePullPolicy: "{{ .Values.image.clpPackage.pullPolicy }}" + env: + - name: "CLP_DB_PASS" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "password" + - name: "CLP_DB_USER" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "username" + - name: "CLP_HOME" + value: "/opt/clp" + - name: "CLP_LOGGING_LEVEL" + value: {{ .Values.clpConfig.garbage_collector.logging_level | quote }} + - name: "CLP_LOGS_DIR" + value: "/var/log/garbage_collector" + - name: "PYTHONPATH" + value: "/opt/clp/lib/python3/site-packages" + volumeMounts: + - name: {{ include "clp.volumeName" (dict + "component_category" "garbage-collector" + "name" "logs" + ) | quote }} + mountPath: "/var/log/garbage_collector" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} + - name: {{ include "clp.volumeName" (dict + "component_category" "shared-data" + "name" "archives" + ) | quote }} + mountPath: "/var/data/archives" + {{- end }} + {{- if .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + mountPath: "/opt/clp/.aws" + readOnly: true + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} + - name: {{ include "clp.volumeName" (dict + "component_category" "shared-data" + "name" "streams" + ) | quote }} + mountPath: 
"/var/data/streams" + {{- end }} + command: [ + "python3", "-u", + "-m", "job_orchestration.garbage_collector.garbage_collector", + "--config", "/etc/clp-config.yaml" + ] + volumes: + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "garbage-collector" + "name" "logs" + ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . }}-config + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "shared-data" + "name" "archives" + ) | nindent 10 }} + {{- end }} + {{- with .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + hostPath: + path: {{ . | quote }} + type: "Directory" + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "shared-data" + "name" "streams" + ) | nindent 10 }} + {{- end }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/garbage-collector-logs-pv.yaml b/tools/deployment/package-helm/templates/garbage-collector-logs-pv.yaml new file mode 100644 index 0000000000..31a3484de4 --- /dev/null +++ b/tools/deployment/package-helm/templates/garbage-collector-logs-pv.yaml @@ -0,0 +1,11 @@ +{{- if or .Values.clpConfig.archive_output.retention_period + .Values.clpConfig.results_cache.retention_period }} +{{- include "clp.createStaticPv" (dict + "root" . 
+ "component_category" "garbage-collector" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/garbage_collector" .Values.clpConfig.logs_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/garbage-collector-logs-pvc.yaml b/tools/deployment/package-helm/templates/garbage-collector-logs-pvc.yaml new file mode 100644 index 0000000000..af6f5a0392 --- /dev/null +++ b/tools/deployment/package-helm/templates/garbage-collector-logs-pvc.yaml @@ -0,0 +1,10 @@ +{{- if or .Values.clpConfig.archive_output.retention_period + .Values.clpConfig.results_cache.retention_period }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "garbage-collector" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/log-ingestor-deployment.yaml b/tools/deployment/package-helm/templates/log-ingestor-deployment.yaml new file mode 100644 index 0000000000..bf753c7636 --- /dev/null +++ b/tools/deployment/package-helm/templates/log-ingestor-deployment.yaml @@ -0,0 +1,88 @@ +{{- if eq .Values.clpConfig.logs_input.type "s3" }} +apiVersion: "apps/v1" +kind: "Deployment" +metadata: + name: {{ include "clp.fullname" . }}-log-ingestor + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "log-ingestor" +spec: + replicas: 1 + selector: + matchLabels: + {{- include "clp.selectorLabels" . | nindent 6 }} + app.kubernetes.io/component: "log-ingestor" + template: + metadata: + labels: + {{- include "clp.labels" . | nindent 8 }} + app.kubernetes.io/component: "log-ingestor" + spec: + serviceAccountName: {{ include "clp.fullname" . 
}}-job-watcher + terminationGracePeriodSeconds: 60 + securityContext: + runAsUser: {{ .Values.securityContext.firstParty.uid }} + runAsGroup: {{ .Values.securityContext.firstParty.gid }} + fsGroup: {{ .Values.securityContext.firstParty.gid }} + initContainers: + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "db-table-creator" + ) | nindent 10 }} + containers: + - name: "log-ingestor" + image: "{{ include "clp.image.ref" . }}" + imagePullPolicy: "{{ .Values.image.clpPackage.pullPolicy }}" + env: + - name: "CLP_DB_PASS" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "password" + - name: "CLP_DB_USER" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "username" + - name: "CLP_LOGS_DIR" + value: "/var/log/log_ingestor" + - name: "RUST_LOG" + value: {{ .Values.clpConfig.log_ingestor.logging_level | quote }} + ports: + - name: "log-ingestor" + containerPort: 3002 + volumeMounts: + - name: {{ include "clp.volumeName" (dict + "component_category" "log-ingestor" + "name" "logs" + ) | quote }} + mountPath: "/var/log/log_ingestor" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + command: [ + "/opt/clp/bin/log-ingestor", + "--config", "/etc/clp-config.yaml", + "--host", "0.0.0.0", + "--port", "3002" + ] + readinessProbe: + {{- include "clp.readinessProbeTimings" . | nindent 12 }} + httpGet: &log-ingestor-health-check + path: "/health" + port: 3002 + livenessProbe: + {{- include "clp.livenessProbeTimings" . | nindent 12 }} + httpGet: *log-ingestor-health-check + volumes: + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "log-ingestor" + "name" "logs" + ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . 
}}-config +{{- end }} diff --git a/tools/deployment/package-helm/templates/log-ingestor-logs-pv.yaml b/tools/deployment/package-helm/templates/log-ingestor-logs-pv.yaml new file mode 100644 index 0000000000..0324e4127e --- /dev/null +++ b/tools/deployment/package-helm/templates/log-ingestor-logs-pv.yaml @@ -0,0 +1,10 @@ +{{- if eq .Values.clpConfig.logs_input.type "s3" }} +{{- include "clp.createStaticPv" (dict + "root" . + "component_category" "log-ingestor" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/log_ingestor" .Values.clpConfig.logs_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/log-ingestor-logs-pvc.yaml b/tools/deployment/package-helm/templates/log-ingestor-logs-pvc.yaml new file mode 100644 index 0000000000..62fc3c9740 --- /dev/null +++ b/tools/deployment/package-helm/templates/log-ingestor-logs-pvc.yaml @@ -0,0 +1,9 @@ +{{- if eq .Values.clpConfig.logs_input.type "s3" }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "log-ingestor" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/log-ingestor-service.yaml b/tools/deployment/package-helm/templates/log-ingestor-service.yaml new file mode 100644 index 0000000000..f79a324e61 --- /dev/null +++ b/tools/deployment/package-helm/templates/log-ingestor-service.yaml @@ -0,0 +1,18 @@ +{{- if eq .Values.clpConfig.logs_input.type "s3" }} +apiVersion: "v1" +kind: "Service" +metadata: + name: {{ include "clp.fullname" . }}-log-ingestor + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "log-ingestor" +spec: + type: "NodePort" + selector: + {{- include "clp.selectorLabels" . 
| nindent 4 }} + app.kubernetes.io/component: "log-ingestor" + ports: + - port: 3002 + targetPort: "log-ingestor" + nodePort: {{ .Values.clpConfig.log_ingestor.port }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/mcp-server-deployment.yaml b/tools/deployment/package-helm/templates/mcp-server-deployment.yaml new file mode 100644 index 0000000000..f648a6f232 --- /dev/null +++ b/tools/deployment/package-helm/templates/mcp-server-deployment.yaml @@ -0,0 +1,96 @@ +{{- if .Values.clpConfig.mcp_server }} +apiVersion: "apps/v1" +kind: "Deployment" +metadata: + name: {{ include "clp.fullname" . }}-mcp-server + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "mcp-server" +spec: + replicas: 1 + selector: + matchLabels: + {{- include "clp.selectorLabels" . | nindent 6 }} + app.kubernetes.io/component: "mcp-server" + template: + metadata: + labels: + {{- include "clp.labels" . | nindent 8 }} + app.kubernetes.io/component: "mcp-server" + spec: + serviceAccountName: {{ include "clp.fullname" . }}-job-watcher + terminationGracePeriodSeconds: 60 + securityContext: + runAsUser: {{ .Values.securityContext.firstParty.uid }} + runAsGroup: {{ .Values.securityContext.firstParty.gid }} + fsGroup: {{ .Values.securityContext.firstParty.gid }} + initContainers: + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "db-table-creator" + ) | nindent 10 }} + - {{- include "clp.waitFor" (dict + "root" . + "type" "job" + "name" "results-cache-indices-creator" + ) | nindent 10 }} + containers: + - name: "mcp-server" + image: "{{ include "clp.image.ref" . }}" + imagePullPolicy: "{{ .Values.image.clpPackage.pullPolicy }}" + env: + - name: "CLP_DB_PASS" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . }}-database + key: "password" + - name: "CLP_DB_USER" + valueFrom: + secretKeyRef: + name: {{ include "clp.fullname" . 
}}-database + key: "username" + - name: "CLP_LOGGING_LEVEL" + value: {{ .Values.clpConfig.mcp_server.logging_level | quote }} + - name: "CLP_LOGS_DIR" + value: "/var/log/mcp_server" + - name: "PYTHONPATH" + value: "/opt/clp/lib/python3/site-packages" + ports: + - name: "mcp-server" + containerPort: 8000 + volumeMounts: + - name: {{ include "clp.volumeName" (dict + "component_category" "mcp-server" + "name" "logs" + ) | quote }} + mountPath: "/var/log/mcp_server" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + command: [ + "python3", "-u", + "-m", "clp_mcp_server.clp_mcp_server", + "--host", "0.0.0.0", + "--port", "8000", + "--config-path", "/etc/clp-config.yaml" + ] + readinessProbe: + {{- include "clp.readinessProbeTimings" . | nindent 12 }} + httpGet: &mcp-server-health-check + path: "/health" + port: 8000 + livenessProbe: + {{- include "clp.livenessProbeTimings" . | nindent 12 }} + httpGet: *mcp-server-health-check + volumes: + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "mcp-server" + "name" "logs" + ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . }}-config +{{- end }} diff --git a/tools/deployment/package-helm/templates/mcp-server-logs-pv.yaml b/tools/deployment/package-helm/templates/mcp-server-logs-pv.yaml new file mode 100644 index 0000000000..fc297f5020 --- /dev/null +++ b/tools/deployment/package-helm/templates/mcp-server-logs-pv.yaml @@ -0,0 +1,10 @@ +{{- if .Values.clpConfig.mcp_server }} +{{- include "clp.createStaticPv" (dict + "root" . 
+ "component_category" "mcp-server" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/mcp_server" .Values.clpConfig.logs_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/mcp-server-logs-pvc.yaml b/tools/deployment/package-helm/templates/mcp-server-logs-pvc.yaml new file mode 100644 index 0000000000..7ab19f231d --- /dev/null +++ b/tools/deployment/package-helm/templates/mcp-server-logs-pvc.yaml @@ -0,0 +1,9 @@ +{{- if .Values.clpConfig.mcp_server }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "mcp-server" + "name" "logs" + "capacity" "5Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/mcp-server-service.yaml b/tools/deployment/package-helm/templates/mcp-server-service.yaml new file mode 100644 index 0000000000..9acbe48f9e --- /dev/null +++ b/tools/deployment/package-helm/templates/mcp-server-service.yaml @@ -0,0 +1,18 @@ +{{- if .Values.clpConfig.mcp_server }} +apiVersion: "v1" +kind: "Service" +metadata: + name: {{ include "clp.fullname" . }}-mcp-server + labels: + {{- include "clp.labels" . | nindent 4 }} + app.kubernetes.io/component: "mcp-server" +spec: + type: "NodePort" + selector: + {{- include "clp.selectorLabels" . 
| nindent 4 }} + app.kubernetes.io/component: "mcp-server" + ports: + - port: 8000 + targetPort: "mcp-server" + nodePort: {{ .Values.clpConfig.mcp_server.port }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml b/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml index ca7149dd75..5eba71bc30 100644 --- a/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml +++ b/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml @@ -71,20 +71,27 @@ spec: - name: "query-scheduler" containerPort: 7000 volumeMounts: - - name: "config" - mountPath: "/etc/clp-config.yaml" - subPath: "clp-config.yaml" - readOnly: true - name: {{ include "clp.volumeName" (dict "component_category" "query-scheduler" "name" "logs" ) | quote }} mountPath: "/var/log/query_scheduler" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true command: [ "python3", "-u", "-m", "job_orchestration.scheduler.query.query_scheduler", "--config", "/etc/clp-config.yaml" ] + readinessProbe: + {{- include "clp.readinessProbeTimings" . | nindent 12 }} + tcpSocket: &query-scheduler-health-check + port: "query-scheduler" + livenessProbe: + {{- include "clp.livenessProbeTimings" . | nindent 12 }} + tcpSocket: *query-scheduler-health-check volumes: - {{- include "clp.pvcVolume" (dict "root" . diff --git a/tools/deployment/package-helm/templates/query-scheduler-logs-pv.yaml b/tools/deployment/package-helm/templates/query-scheduler-logs-pv.yaml index de7633da55..bc66ac1ae3 100644 --- a/tools/deployment/package-helm/templates/query-scheduler-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/query-scheduler-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "query-scheduler" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/query_scheduler" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/query-worker-deployment.yaml b/tools/deployment/package-helm/templates/query-worker-deployment.yaml index f95e4ddaf7..0fed67a154 100644 --- a/tools/deployment/package-helm/templates/query-worker-deployment.yaml +++ b/tools/deployment/package-helm/templates/query-worker-deployment.yaml @@ -6,7 +6,7 @@ metadata: {{- include "clp.labels" . | nindent 4 }} app.kubernetes.io/component: "query-worker" spec: - replicas: 1 + replicas: {{ .Values.queryWorker.replicas }} selector: matchLabels: {{- include "clp.selectorLabels" . | nindent 6 }} @@ -17,6 +17,10 @@ spec: {{- include "clp.labels" . | nindent 8 }} app.kubernetes.io/component: "query-worker" spec: + {{- include "clp.createSchedulingConfigs" (dict + "root" . + "component" "queryWorker" + ) | nindent 6 }} terminationGracePeriodSeconds: 60 securityContext: runAsUser: {{ .Values.securityContext.firstParty.uid }} @@ -45,25 +49,41 @@ spec: - name: "PYTHONPATH" value: "/opt/clp/lib/python3/site-packages" volumeMounts: - - name: "config" - mountPath: "/etc/clp-config.yaml" - subPath: "clp-config.yaml" - readOnly: true - name: {{ include "clp.volumeName" (dict "component_category" "query-worker" "name" "logs" ) | quote }} mountPath: "/var/log/query_worker" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} - name: {{ include "clp.volumeName" (dict "component_category" "shared-data" "name" "archives" ) | quote }} mountPath: "/var/data/archives" + {{- end }} + {{- if .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + mountPath: "/opt/clp/.aws" + readOnly: true + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "s3" }} + 
- name: {{ include "clp.volumeName" (dict + "component_category" "query-worker" + "name" "staged-streams" + ) | quote }} + mountPath: "/var/data/staged-streams" + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} - name: {{ include "clp.volumeName" (dict "component_category" "shared-data" "name" "streams" ) | quote }} mountPath: "/var/data/streams" + {{- end }} command: [ "python3", "-u", "/opt/clp/lib/python3/site-packages/bin/celery", @@ -81,16 +101,33 @@ spec: "component_category" "query-worker" "name" "logs" ) | nindent 10 }} + - name: "config" + configMap: + name: {{ include "clp.fullname" . }}-config + {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }} - {{- include "clp.pvcVolume" (dict "root" . "component_category" "shared-data" "name" "archives" ) | nindent 10 }} + {{- end }} + {{- with .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + hostPath: + path: {{ . | quote }} + type: "Directory" + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "s3" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "query-worker" + "name" "staged-streams" + ) | nindent 10 }} + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} - {{- include "clp.pvcVolume" (dict "root" . "component_category" "shared-data" "name" "streams" ) | nindent 10 }} - - name: "config" - configMap: - name: {{ include "clp.fullname" . }}-config + {{- end }} diff --git a/tools/deployment/package-helm/templates/query-worker-logs-pv.yaml b/tools/deployment/package-helm/templates/query-worker-logs-pv.yaml index 4f602a8c74..1fd12c2e84 100644 --- a/tools/deployment/package-helm/templates/query-worker-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/query-worker-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "query-worker" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/query_worker" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/query-worker-staged-streams-pv.yaml b/tools/deployment/package-helm/templates/query-worker-staged-streams-pv.yaml new file mode 100644 index 0000000000..a404a36f81 --- /dev/null +++ b/tools/deployment/package-helm/templates/query-worker-staged-streams-pv.yaml @@ -0,0 +1,10 @@ +{{- if eq .Values.clpConfig.stream_output.storage.type "s3" }} +{{- include "clp.createStaticPv" (dict + "root" . + "component_category" "query-worker" + "name" "staged-streams" + "capacity" "20Gi" + "accessModes" (list "ReadWriteOnce") + "hostPath" (printf "%s/staged-streams" .Values.clpConfig.data_directory) +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/query-worker-staged-streams-pvc.yaml b/tools/deployment/package-helm/templates/query-worker-staged-streams-pvc.yaml new file mode 100644 index 0000000000..e909a57b21 --- /dev/null +++ b/tools/deployment/package-helm/templates/query-worker-staged-streams-pvc.yaml @@ -0,0 +1,9 @@ +{{- if eq .Values.clpConfig.stream_output.storage.type "s3" }} +{{- include "clp.createPvc" (dict + "root" . + "component_category" "query-worker" + "name" "staged-streams" + "capacity" "20Gi" + "accessModes" (list "ReadWriteOnce") +) }} +{{- end }} diff --git a/tools/deployment/package-helm/templates/queue-logs-pv.yaml b/tools/deployment/package-helm/templates/queue-logs-pv.yaml index c9315630f5..37f45a5d4e 100644 --- a/tools/deployment/package-helm/templates/queue-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/queue-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "queue" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/queue" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/redis-data-pv.yaml b/tools/deployment/package-helm/templates/redis-data-pv.yaml index 56efc9d19c..5e30c1c1c4 100644 --- a/tools/deployment/package-helm/templates/redis-data-pv.yaml +++ b/tools/deployment/package-helm/templates/redis-data-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "redis" "name" "data" - "nodeRole" "control-plane" "capacity" "20Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/redis" .Values.clpConfig.data_directory) diff --git a/tools/deployment/package-helm/templates/redis-logs-pv.yaml b/tools/deployment/package-helm/templates/redis-logs-pv.yaml index 7f03c8cdad..811b48a4d8 100644 --- a/tools/deployment/package-helm/templates/redis-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/redis-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "redis" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/redis" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/redis-statefulset.yaml b/tools/deployment/package-helm/templates/redis-statefulset.yaml index 3929561c01..96f13eec8d 100644 --- a/tools/deployment/package-helm/templates/redis-statefulset.yaml +++ b/tools/deployment/package-helm/templates/redis-statefulset.yaml @@ -26,11 +26,6 @@ spec: - name: "redis" image: "redis:7.2.4" imagePullPolicy: "Always" - command: [ - "redis-server", - "/etc/redis/redis.conf", - "--requirepass", "$(REDIS_PASSWORD)" - ] env: - name: "REDIS_PASSWORD" valueFrom: @@ -55,6 +50,11 @@ spec: "name" "logs" ) | quote }} mountPath: "/var/log/redis" + command: [ + "redis-server", + "/etc/redis/redis.conf", + "--requirepass", "$(REDIS_PASSWORD)" + ] readinessProbe: {{- include "clp.readinessProbeTimings" . | nindent 12 }} exec: &redis-health-check diff --git a/tools/deployment/package-helm/templates/reducer-deployment.yaml b/tools/deployment/package-helm/templates/reducer-deployment.yaml index 09954063b5..51970785f7 100644 --- a/tools/deployment/package-helm/templates/reducer-deployment.yaml +++ b/tools/deployment/package-helm/templates/reducer-deployment.yaml @@ -50,15 +50,15 @@ spec: - name: "PYTHONPATH" value: "/opt/clp/lib/python3/site-packages" volumeMounts: - - name: "config" - mountPath: "/etc/clp-config.yaml" - subPath: "clp-config.yaml" - readOnly: true - name: {{ include "clp.volumeName" (dict "component_category" "reducer" "name" "logs" ) | quote }} mountPath: "/var/log/reducer" + - name: "config" + mountPath: "/etc/clp-config.yaml" + subPath: "clp-config.yaml" + readOnly: true command: [ "python3", "-u", "-m", "job_orchestration.reducer.reducer", diff --git a/tools/deployment/package-helm/templates/reducer-logs-pv.yaml b/tools/deployment/package-helm/templates/reducer-logs-pv.yaml index 
0aab54233a..4c467877ae 100644 --- a/tools/deployment/package-helm/templates/reducer-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/reducer-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "reducer" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/reducer" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/results-cache-data-pv.yaml b/tools/deployment/package-helm/templates/results-cache-data-pv.yaml index 8410734b2e..9e8ccea38e 100644 --- a/tools/deployment/package-helm/templates/results-cache-data-pv.yaml +++ b/tools/deployment/package-helm/templates/results-cache-data-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "results-cache" "name" "data" - "nodeRole" "control-plane" "capacity" "20Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/results_cache" .Values.clpConfig.data_directory) diff --git a/tools/deployment/package-helm/templates/results-cache-logs-pv.yaml b/tools/deployment/package-helm/templates/results-cache-logs-pv.yaml index e76aaf90af..2064685ee0 100644 --- a/tools/deployment/package-helm/templates/results-cache-logs-pv.yaml +++ b/tools/deployment/package-helm/templates/results-cache-logs-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "results-cache" "name" "logs" - "nodeRole" "control-plane" "capacity" "5Gi" "accessModes" (list "ReadWriteOnce") "hostPath" (printf "%s/results_cache" .Values.clpConfig.logs_directory) diff --git a/tools/deployment/package-helm/templates/results-cache-statefulset.yaml b/tools/deployment/package-helm/templates/results-cache-statefulset.yaml index 56efc28ee0..bc3f5aea63 100644 --- a/tools/deployment/package-helm/templates/results-cache-statefulset.yaml +++ b/tools/deployment/package-helm/templates/results-cache-statefulset.yaml @@ -26,10 +26,6 @@ spec: - name: "results-cache" image: "mongo:7.0.1" imagePullPolicy: "Always" - args: [ - "--config", "/etc/mongo/mongod.conf", - "--bind_ip", "0.0.0.0" - ] ports: - name: "results-cache" containerPort: 27017 @@ -48,6 +44,10 @@ spec: "name" "logs" ) | quote }} mountPath: "/var/log/mongodb" + args: [ + "--config", "/etc/mongo/mongod.conf", + "--bind_ip", "0.0.0.0" + ] readinessProbe: {{- include "clp.readinessProbeTimings" . | nindent 12 }} exec: &results-cache-health-check diff --git a/tools/deployment/package-helm/templates/shared-data-archives-pv.yaml b/tools/deployment/package-helm/templates/shared-data-archives-pv.yaml index 3e17545925..4260827682 100644 --- a/tools/deployment/package-helm/templates/shared-data-archives-pv.yaml +++ b/tools/deployment/package-helm/templates/shared-data-archives-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . 
"component_category" "shared-data" "name" "archives" - "nodeRole" "control-plane" "capacity" "50Gi" "accessModes" (list "ReadWriteMany") "hostPath" .Values.clpConfig.archive_output.storage.directory diff --git a/tools/deployment/package-helm/templates/shared-data-streams-pv.yaml b/tools/deployment/package-helm/templates/shared-data-streams-pv.yaml index 8cc2cd8019..d235870390 100644 --- a/tools/deployment/package-helm/templates/shared-data-streams-pv.yaml +++ b/tools/deployment/package-helm/templates/shared-data-streams-pv.yaml @@ -1,8 +1,7 @@ -{{- include "clp.createLocalPv" (dict +{{- include "clp.createStaticPv" (dict "root" . "component_category" "shared-data" "name" "streams" - "nodeRole" "control-plane" "capacity" "20Gi" "accessModes" (list "ReadWriteMany") "hostPath" .Values.clpConfig.stream_output.storage.directory diff --git a/tools/deployment/package-helm/templates/storage-class.yaml b/tools/deployment/package-helm/templates/storage-class.yaml new file mode 100644 index 0000000000..acbb3c4334 --- /dev/null +++ b/tools/deployment/package-helm/templates/storage-class.yaml @@ -0,0 +1,10 @@ +{{- if eq .Values.storage.storageClassName "local-storage" }} +apiVersion: "storage.k8s.io/v1" +kind: "StorageClass" +metadata: + name: "local-storage" + labels: + {{- include "clp.labels" . 
| nindent 4 }} +provisioner: "kubernetes.io/no-provisioner" +volumeBindingMode: "WaitForFirstConsumer" +{{- end }} diff --git a/tools/deployment/package-helm/templates/webui-deployment.yaml b/tools/deployment/package-helm/templates/webui-deployment.yaml index ec4ae5c13a..80e5376a58 100644 --- a/tools/deployment/package-helm/templates/webui-deployment.yaml +++ b/tools/deployment/package-helm/templates/webui-deployment.yaml @@ -59,11 +59,18 @@ spec: value: "4000" - name: "RATE_LIMIT" value: {{ .Values.clpConfig.webui.rate_limit | quote }} + {{- with .Values.clpConfig.stream_output.storage }} + {{- if and (eq .type "s3") (eq .s3_config.aws_authentication.type "credentials") }} + - name: "AWS_ACCESS_KEY_ID" + value: {{ .s3_config.aws_authentication.credentials.access_key_id | quote }} + - name: "AWS_SECRET_ACCESS_KEY" + value: {{ .s3_config.aws_authentication.credentials.secret_access_key | quote }} + {{- end }} + {{- end }} ports: - name: "webui" containerPort: 4000 volumeMounts: - - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} - name: "client-settings" mountPath: "/opt/clp/var/www/webui/client/settings.json" subPath: "webui-client-settings.json" @@ -72,11 +79,21 @@ spec: mountPath: "/opt/clp/var/www/webui/server/dist/settings.json" subPath: "webui-server-settings.json" readOnly: true + {{- if .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + mountPath: "/opt/clp/.aws" + readOnly: true + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolumeMount" . | nindent 14 }} + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} - name: {{ include "clp.volumeName" (dict "component_category" "shared-data" "name" "streams" ) | quote }} mountPath: "/var/data/streams" + {{- end }} command: [ "/opt/clp/bin/node-22", "/opt/clp/var/www/webui/server/dist/src/main.js" @@ -89,15 +106,25 @@ spec: {{- include "clp.livenessProbeTimings" . 
| nindent 12 }} tcpSocket: *webui-health-check volumes: - - {{- include "clp.logsInputVolume" . | nindent 10 }} - - {{- include "clp.pvcVolume" (dict - "root" . - "component_category" "shared-data" - "name" "streams" - ) | nindent 10 }} - name: "client-settings" configMap: name: {{ include "clp.fullname" . }}-config - name: "server-settings" configMap: name: {{ include "clp.fullname" . }}-config + {{- with .Values.clpConfig.aws_config_directory }} + - name: "aws-config" + hostPath: + path: {{ . | quote }} + type: "Directory" + {{- end }} + {{- if eq .Values.clpConfig.logs_input.type "fs" }} + - {{- include "clp.logsInputVolume" . | nindent 10 }} + {{- end }} + {{- if eq .Values.clpConfig.stream_output.storage.type "fs" }} + - {{- include "clp.pvcVolume" (dict + "root" . + "component_category" "shared-data" + "name" "streams" + ) | nindent 10 }} + {{- end }} diff --git a/tools/deployment/package-helm/test-multi-dedicated.sh b/tools/deployment/package-helm/test-multi-dedicated.sh new file mode 100755 index 0000000000..357f76ca99 --- /dev/null +++ b/tools/deployment/package-helm/test-multi-dedicated.sh @@ -0,0 +1,95 @@ +#!/usr/bin/env bash + +# Multi-node cluster test with dedicated worker nodes for each worker type +# Demonstrates nodeSelector scheduling with separate node pools + +script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=.test-common.sh +source "${script_dir}/.test-common.sh" + +CLUSTER_NAME="${CLUSTER_NAME:-clp-test-dedicated}" +NUM_COMPRESSION_NODES="${NUM_COMPRESSION_NODES:-2}" +NUM_QUERY_NODES="${NUM_QUERY_NODES:-2}" +COMPRESSION_WORKER_REPLICAS="${COMPRESSION_WORKER_REPLICAS:-2}" +QUERY_WORKER_REPLICAS="${QUERY_WORKER_REPLICAS:-2}" + +echo "=== Multi-node CLP Helm Chart Test (Dedicated Workers) ===" +echo "Compression nodes: ${NUM_COMPRESSION_NODES}" +echo "Query nodes: ${NUM_QUERY_NODES}" +echo "Compression worker replicas: ${COMPRESSION_WORKER_REPLICAS}" +echo "Query worker replicas: ${QUERY_WORKER_REPLICAS}" +echo "" + 
+kind delete cluster --name "${CLUSTER_NAME}" 2>/dev/null || true +init_clp_home +download_samples + +total_workers=$((NUM_COMPRESSION_NODES + NUM_QUERY_NODES)) +echo "Creating kind cluster with 1 control-plane + ${total_workers} worker nodes..." +{ + cat </dev/null || true + +helm uninstall test --ignore-not-found 2>/dev/null || true +sleep 2 + +echo "Installing Helm chart with dedicated worker nodes..." +helm install test . \ + --set "distributed=true" \ + --set "compressionWorker.replicas=${COMPRESSION_WORKER_REPLICAS}" \ + --set "compressionWorker.scheduling.nodeSelector.yscope\.io/nodeType=compression" \ + --set "queryWorker.replicas=${QUERY_WORKER_REPLICAS}" \ + --set "queryWorker.scheduling.nodeSelector.yscope\.io/nodeType=query" + +wait_for_samples +wait_for_pods 300 5 5 + +echo "" +echo "=== Pod Distribution ===" +kubectl get pods -o wide + +echo "" +echo "To clean up:" +echo " kind delete cluster --name ${CLUSTER_NAME}" diff --git a/tools/deployment/package-helm/test-multi-shared.sh b/tools/deployment/package-helm/test-multi-shared.sh new file mode 100755 index 0000000000..d72f5e8974 --- /dev/null +++ b/tools/deployment/package-helm/test-multi-shared.sh @@ -0,0 +1,84 @@ +#!/usr/bin/env bash + +# Multi-node cluster test with shared worker nodes +# Both compression and query workers share the same node pool + +script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=.test-common.sh +source "${script_dir}/.test-common.sh" + +CLUSTER_NAME="${CLUSTER_NAME:-clp-test-multi}" +NUM_WORKER_NODES="${NUM_WORKER_NODES:-2}" +COMPRESSION_WORKER_REPLICAS="${COMPRESSION_WORKER_REPLICAS:-2}" +QUERY_WORKER_REPLICAS="${QUERY_WORKER_REPLICAS:-2}" + +echo "=== Multi-node CLP Helm Chart Test ===" +echo "Worker nodes: ${NUM_WORKER_NODES}" +echo "Compression worker replicas: ${COMPRESSION_WORKER_REPLICAS}" +echo "Query worker replicas: ${QUERY_WORKER_REPLICAS}" +echo "" + +kind delete cluster --name "${CLUSTER_NAME}" 2>/dev/null || true +init_clp_home 
+download_samples + +echo "Creating kind cluster with 1 control-plane + ${NUM_WORKER_NODES} worker nodes..." +{ + cat </dev/null || true + +helm uninstall test --ignore-not-found 2>/dev/null || true +sleep 2 + +echo "Installing Helm chart with distributed storage mode..." +helm install test . \ + --set "distributed=true" \ + --set "compressionWorker.replicas=${COMPRESSION_WORKER_REPLICAS}" \ + --set "queryWorker.replicas=${QUERY_WORKER_REPLICAS}" + +wait_for_samples +wait_for_pods 300 5 5 + +echo "" +echo "=== Pod Distribution ===" +kubectl get pods -o wide + +echo "" +echo "To clean up:" +echo " kind delete cluster --name ${CLUSTER_NAME}" diff --git a/tools/deployment/package-helm/test.sh b/tools/deployment/package-helm/test.sh index abdd857512..ab70528e3b 100755 --- a/tools/deployment/package-helm/test.sh +++ b/tools/deployment/package-helm/test.sh @@ -1,99 +1,46 @@ #!/usr/bin/env bash +# Single-node cluster test for CLP Helm chart # TODO: Migrate into integration test -set -o errexit -set -o nounset -set -o pipefail - -CLP_HOME="/tmp/clp" - -# Waits for all jobs to complete and all non-job pods to be ready. -# -# @param {int} timeout_seconds Overall timeout in seconds -# @param {int} poll_interval_seconds Interval between status checks -# @param {int} wait_timeout_seconds Timeout for each kubectl wait call -# @return {int} 0 on success, 1 on timeout -wait_for_pods() { - local timeout_seconds=$1 - local poll_interval_seconds=$2 - local wait_timeout_seconds=$3 - - echo "Waiting for all pods to be ready" \ - "(timeout=${timeout_seconds}s, poll=${poll_interval_seconds}s," \ - "wait=${wait_timeout_seconds}s)..." 
- - # Reset bash built-in SECONDS counter - SECONDS=0 - - while true; do - sleep "${poll_interval_seconds}" - kubectl get pods - - if kubectl wait job \ - --all \ - --for=condition=Complete \ - --timeout="${wait_timeout_seconds}s" 2>/dev/null \ - && kubectl wait pods \ - --all \ - --selector='!job-name' \ - --for=condition=Ready \ - --timeout="${wait_timeout_seconds}s" 2>/dev/null - then - echo "All jobs completed and services are ready." - return 0 - fi - - if [[ ${SECONDS} -ge ${timeout_seconds} ]]; then - echo "ERROR: Timed out waiting for pods to be ready" - return 1 - fi - - echo "---" - done -} - -kind delete cluster --name clp-test -rm -rf "$CLP_HOME" -mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ - "$CLP_HOME/var/data/"{archives,streams} \ - "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ - "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ - "$CLP_HOME/var/tmp" \ - "$CLP_HOME/samples" - -# Download sample datasets in the background -wget -O - https://zenodo.org/records/10516402/files/postgresql.tar.gz?download=1 \ - | tar xz -C "$CLP_HOME/samples" & -SAMPLE_DOWNLOAD_PID=$! - -cat </dev/null || true +init_clp_home +download_samples + +echo "Creating kind cluster..." +{ + cat </dev/null || true sleep 2 -helm install test . -wait $SAMPLE_DOWNLOAD_PID -echo "Sample download and extraction complete" +echo "Installing Helm chart..." +helm install test . +wait_for_samples wait_for_pods 300 5 5 + +echo "" +echo "To clean up:" +echo " kind delete cluster --name ${CLUSTER_NAME}" diff --git a/tools/deployment/package-helm/values.yaml b/tools/deployment/package-helm/values.yaml index 190fde33ed..421d1ba006 100644 --- a/tools/deployment/package-helm/values.yaml +++ b/tools/deployment/package-helm/values.yaml @@ -18,8 +18,54 @@ image: pullPolicy: "Always" tag: "main" +# Deployment mode: +# - false: Single-node deployment. PVs use local volumes bound to one node. 
Pods automatically +# tolerate control-plane taints. Only works with worker replicas=1. +# - true: Multi-node deployment. PVs use hostPath without node affinity, assuming unmanaged +# shared storage (e.g., NFS/CephFS mounted via /etc/fstab) at the same path on all nodes. +distributed: false + +# Number of concurrent processes per worker pod. workerConcurrency: 8 +compressionWorker: + replicas: 1 + # Controls which nodes run compression workers + # scheduling: + # nodeSelector: + # yscope.io/nodeType: compute + # tolerations: + # - key: "yscope.io/dedicated" + # operator: "Equal" + # value: "compression" + # effect: "NoSchedule" + # topologySpreadConstraints: + # - maxSkew: 1 + # topologyKey: "kubernetes.io/hostname" + # whenUnsatisfiable: "DoNotSchedule" + +queryWorker: + replicas: 1 + # Controls which nodes run query workers + # scheduling: + # nodeSelector: + # yscope.io/nodeType: compute + # tolerations: + # - key: "yscope.io/dedicated" + # operator: "Equal" + # value: "query" + # effect: "NoSchedule" + # topologySpreadConstraints: + # - maxSkew: 1 + # topologyKey: "kubernetes.io/hostname" + # whenUnsatisfiable: "DoNotSchedule" + +storage: + # Name of the StorageClass for PVs and PVCs. + # - If "local-storage" (default), the chart creates a StorageClass with WaitForFirstConsumer + # - If a different name, the StorageClass must already exist in your cluster + storageClassName: "local-storage" + clpConfig: package: storage_engine: "clp-s" @@ -34,18 +80,21 @@ clpConfig: # this directory will be ignored. 
directory: "/" + bundled: ["database", "queue", "redis", "results_cache"] + database: type: "mariadb" # "mariadb" or "mysql" port: 30306 - name: "clp-db" + names: + clp: "clp-db" compression_scheduler: jobs_poll_delay: 0.1 # seconds logging_level: "INFO" max_concurrent_tasks_per_job: 0 # A value of 0 disables the limit + type: "celery" # "celery" or "spider" query_scheduler: - port: 7000 jobs_poll_delay: 0.1 # seconds num_archives_to_search_per_sub_job: 16 logging_level: "INFO" @@ -63,6 +112,9 @@ clpConfig: db_name: "clp-query-results" stream_collection_name: "stream-files" + # Retention period for search results, in minutes. Set to null to disable automatic deletion. + retention_period: 60 + compression_worker: logging_level: "INFO" @@ -74,6 +126,10 @@ clpConfig: results_metadata_collection_name: "results-metadata" rate_limit: 1000 + mcp_server: null + # port: 30800 + # logging_level: "INFO" + # Where archives should be output to archive_output: storage: @@ -113,6 +169,26 @@ clpConfig: # How large each stream file should be before being split into a new stream file target_uncompressed_size: 134217728 # 128 MB + # Garbage collector config + garbage_collector: + logging_level: "INFO" + + # Interval (in minutes) at which garbage collector jobs run + sweep_interval: + archive: 60 + search_result: 30 + + # API server config + api_server: null + + # log-ingestor config. Currently, the config is applicable only if `logs_input.type` is "s3". + log_ingestor: null + + # Presto client config + presto: null + # host: "localhost" + # port: 8080 + # Location where other data (besides archives) are stored. It will be created if # it doesn't exist. # NOTE: This directory must not overlap with any path used in CLP's execution container. An error @@ -130,6 +206,9 @@ clpConfig: # will be raised if so. tmp_directory: "/tmp/clp/var/tmp" + # Location of the AWS tools' config files (e.g., `~/.aws`) + aws_config_directory: null + credentials: database: username: "clp-user"