
Centralized operational logging architecture for multi-host CLP Package deployments #1760

@junhaoliao

Description


Centralized operational logging architecture for multi-host CLP deployments

Document structure: This document covers Docker Compose deployments first, followed by a
Kubernetes-specific section that explains how the same approaches
apply to Kubernetes.

Request

Background

Currently, CLP's deployment architecture requires container-to-host volume mounts for all service
logs. Each service writes logs to files within /var/log/<component>/ which are mounted from
${CLP_LOGS_DIR_HOST:-./var/log} on the host. While this approach has been convenient for
single-host deployments (allowing users to easily provide logs by archiving host files), it creates
significant challenges for multi-host deployments:

  1. Multi-host incompatibility: In Kubernetes or multi-node Docker Compose deployments, logs are
    scattered across different hosts, making centralized log access difficult
  2. Storage overhead: Each host requires dedicated storage for log retention
  3. Operational complexity: Admins must access (e.g. SSH into) individual hosts or set up
    additional log aggregation infrastructure
  4. Scaling limitations: Each added node multiplies the operational burden of collecting and
    retaining logs

With planned support for multi-host deployments through Kubernetes, and addition of the log-ingestor
component, there's an opportunity to modernize the operational logging architecture to leverage:

  1. CLP's own compression technology - operational logs can benefit from the same
    high-compression-ratio storage that user logs enjoy
  2. Container-native logging - Docker/Kubernetes native log drivers eliminate the need for host
    mounts
  3. Centralized access - all logs accessible from a single control node
  4. WebUI integration - operational logs viewable alongside user logs in the existing CLP
    interface

Requirements

The new operational logging solution must satisfy the following requirements:

R1: Multi-host Support

  • R1.1: Support both Kubernetes (multi-node) and Docker Compose (single or multi-host)
    deployments
  • R1.2: All logs accessible from a central control node, regardless of where services are
    running
  • R1.3: No per-host log file access required for normal operations

R2: Tiered Access

  • R2.1 (Hot): Real-time access to recent logs (0-X minutes) for debugging active/crashed
    services
    • Maximum acceptable lag: < 30 seconds
    • Must support live tailing
  • R2.2 (Warm): Recent historical logs (X minutes - Y hours) available uncompressed for immediate
    grep/analysis
    • Maximum acceptable lag: < 5 minutes
    • Must support live tailing
  • R2.3 (Cold): Older logs compressed using CLP for long-term storage and efficient retrieval
    • Maximum acceptable lag: < 24 hours
    • Must support full-text search

R3: Admin Access & Export

  • R3.1: Deployment admins can view all logs from all services
  • R3.2: Easy export mechanism for sending logs to support/developers
  • R3.3: Export should include both real-time and historical logs

R4: WebUI Integration

  • R4.1: Dedicated WebUI page for viewing real-time operational logs (files on disk)
  • R4.2: Operational logs searchable through existing Search page once archived
  • R4.3: Support filtering by service name, log level, time range
  • R4.4: (Future) Admin-only access control for operational logs

R5: Lightweight & Efficient

  • R5.1: Minimal additional resource overhead (< 50MB memory, < 0.1 CPU core)
  • R5.2: Use CLP's compression capabilities for long-term storage
  • R5.3: No heavyweight dependencies

R6: Incremental Migration

  • R6.1: Support gradual service-by-service migration from file-based to centralized logging
  • R6.2: Maintain backward compatibility during transition
  • R6.3: Clear deprecation path for CLP_LOGS_DIR environment variables

Possible implementation

Architecture Overview

Current

flowchart TD
    A["All services write to files<br/>(some also write to stdout)"]
    B["Docker local logging driver"]

    subgraph Host1 [Host 1]
        C1["/var/log/&lt;component&gt;/*.log<br/>(via volume mount)"]
        D1["docker logs container-name"]
    end

    subgraph Host2 [Host 2]
        C2["/var/log/&lt;component&gt;/*.log<br/>(via volume mount)"]
        D2["docker logs container-name"]
    end

    A --> B
    B --> C1
    B --> D1
    B --> C2
    B --> D2

    E["Admin must access each host separately<br/>(SSH, copy files, etc.)"]
    C1 -.-> E
    C2 -.-> E

Characteristics:

  • Services write logs to files via volume mounts from ${CLP_LOGS_DIR_HOST}
  • Logs scattered across hosts - no centralized access
  • Admin must SSH to each host to view/export logs

After (CLP-managed Fluent Bit)

flowchart TD
    A[All services write to stdout]
    B["Docker fluentd logging driver"]

    subgraph ControlNode [Control Node]
        C["Fluent Bit<br/>(receives logs from all hosts)"]

        subgraph Output1 [Output 1: File with rotation]
            D["/var/log/&lt;component&gt;/*.log"]
            E["WebUI reads directly<br/>(real-time access)"]
            D --> E
        end

        subgraph Output2 [Output 2: S3 + CLP archives]
            F["S3 (IRv2 compressed logs)"]
            G["Log-ingestor (periodic ingestion)"]
            H["CLP Archives (dataset='_clp')"]
            I["WebUI Search page"]
            F --> G --> H --> I
        end

        C --> D
        C --> F
    end

    subgraph WorkerNode1 [Worker Node 1]
        W1["Container logs"]
    end

    subgraph WorkerNode2 [Worker Node 2]
        W2["Container logs"]
    end

    A --> B
    B --> W1 -->|"fluentd-address"| C
    B --> W2 -->|"fluentd-address"| C

    J["docker logs (via dual logging cache)"]
    B -.-> J

Characteristics:

  • All logs centralized on control node via Fluent Bit
  • Organized path structure (/var/log/<component>/)
  • Automatic S3 upload with IRv2 compression
  • Historical logs searchable in WebUI via _clp dataset
  • docker logs still works via dual logging (Docker 20.10+)

Three-Tier Data Lifecycle

  1. Hot Tier (0-X minutes): Files on disk at /var/log/<component>/*.log

    • Access method: new WebUI endpoint /os/cat (similar to existing /os/ls)
    • Retention: Managed by Fluent Bit file rotation (time-based or size-based)
    • Purpose: Real-time debugging, live tail, recent log access
  2. Warm Tier (X min - Y hours): IRv2 files on S3

    • Access method: (Future optimization) Query directly from IRv2 without full archive
      ingestion
    • Retention: Until log-ingestor processes and archives them
    • Purpose: Transition period; reduces ingestion urgency
  3. Cold Tier (>Y hours): CLP Archives

    • Access method: Existing WebUI Search page with dataset=_clp filter
    • Retention: Configurable archive retention policy
    • Purpose: Long-term searchable storage with high compression

Component Changes

1. Fluent Bit deployment

  • Docker Compose: a single Fluent Bit container on the control node, receiving logs from all
    hosts over the fluentd forward protocol
  • Kubernetes: DaemonSet or single Deployment on control node (to be determined based on performance
    testing)
    • Log collection via file tailing (see the Kubernetes considerations section)

2. Fluent Bit Configuration for Docker Compose

fluent-bit.conf

[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    info

[INPUT]
    Name          forward
    Listen        0.0.0.0
    Port          24224

# Output 1: File for real-time access
[OUTPUT]
    Name          file
    Match         *
    Path          /var/log
    Format        json
    # Rotation policy (matches CLP plugin flush)
    # TBD: time-based or size-based to align with CLP plugin

# Output 2: CLP plugin for S3 + IRv2 compression
[OUTPUT]
    Name          clp_s3
    Match         *
    s3_region     ${CLP_S3_REGION}
    s3_bucket     ${CLP_S3_BUCKET}
    s3_bucket_prefix    ir/${FLUENT_BIT_TAG}/%Y/%m/%d/
    upload_size_mb      16
    use_disk_buffer     true
    # Uses Zstd compression for IRv2 format

Open Questions:

  • What should be the rotation policy for file output?
    • Time-based (e.g., rotate every 5 minutes)?
    • Size-based (e.g., rotate at 50MB)?
    • Should align with CLP plugin's flush policy to ensure synchronization
  • What is the exact flush/upload behavior of the CLP Fluent Bit plugin (irv2-beta)?
    • Triggered by upload_size_mb threshold?
    • Time-based interval?
    • Both?

3. Service migration (Incremental)

Phase 1: Migrate CLP first-party services

compression-worker:
  # Remove old configuration
  # environment:
  #   CLP_LOGS_DIR: "/var/log/compression_worker"  # DEPRECATED
  #   CLP_WORKER_LOG_PATH: "/var/log/compression_worker/worker.log"  # DEPRECATED
  # volumes:
  #   - *volume_clp_logs  # DEPRECATED

  # Add new logging driver
  logging:
    driver: "fluentd"
    options:
      fluentd-address: "fluent-bit:24224"
      tag: "clp.compression-worker"
      labels: "service,component"
      fluentd-async: "true"

Phase 2: Migrate third-party services (database, queue, redis, results-cache)

  • Similar logging driver configuration
  • Verify compatibility with each service's log format

Backward Compatibility:

  • Keep ${CLP_LOGS_DIR_HOST} volume mounts during Phase 1-2 transition
    • Services log to both files (old) and Fluent Bit (new) temporarily
      After validation, remove old mounts and deprecate CLP_LOGS_DIR env vars

4. S3 configuration (clp-config.yaml)

New bundled services:

bundled: ["database", "queue", "redis", "results_cache", "fluentbit", "minio"]

S3 path structure:

s3://<clp-bucket-name>/
├── ir/
│   ├── clp.compression-worker/
│   │   ├── 2025/01/15/
│   │   │   ├── clp.compression-worker_0_2025-01-15T10:30:00Z_<uuid>.zst
│   │   │   └── clp.compression-worker_1_2025-01-15T11:00:00Z_<uuid>.zst
│   │   └── 2025/01/16/...
│   ├── clp.query-worker/...
│   └── ...
└── archive/
    └── _clp/
        ├── 2025/01/
        └── ...
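The date-partitioned prefix above can be derived from the Fluent Bit tag and upload time. A small sketch of that mapping (the helper name is hypothetical, not part of the proposal):

```python
from datetime import datetime, timezone


def build_s3_prefix(tag: str, when: datetime) -> str:
    """Build the per-day IR prefix, mirroring ir/${FLUENT_BIT_TAG}/%Y/%m/%d/."""
    return f"ir/{tag}/{when:%Y/%m/%d}/"


# Example: an upload for clp.compression-worker on 2025-01-15 (UTC)
prefix = build_s3_prefix(
    "clp.compression-worker",
    datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc),
)
# → "ir/clp.compression-worker/2025/01/15/"
```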

5. Log-ingestor configuration

Ingest into dataset "_clp"

  • "_clp" is reserved for operational logs; the underscore prefix differentiates it from user
    datasets
  • Shows alongside other datasets in the dataset selector dropdown

Open Questions:

  • What should be the scan interval for operational logs?
    • 5 minutes (matching buffer_timeout)?
    • Faster for quicker transition to searchable archives?

6. WebUI enhancements

6.1 New /os/cat API Endpoint

Volume mount (add to webui service in docker-compose-all.yaml):
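The draft does not spell out the mount itself; a sketch, assuming the control node's Fluent Bit file output lands in the existing `${CLP_LOGS_DIR_HOST}` directory:

```yaml
# docker-compose-all.yaml (webui service) — read-only is sufficient for /os/ls and /os/cat
webui:
  volumes:
    - "${CLP_LOGS_DIR_HOST:-./var/log}:/var/log:ro"
```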

6.2 New "Operational Logs" Page

For viewing real-time logs.

Location: /components/webui/client/src/pages/OperationalLogsPage/

Features:

  • File browser using /os/ls API
  • Log viewer using /os/cat API
  • Download/export button
  • For historical logs, redirect to the search page with the dataset filter set to _clp (see 6.3)

Open Questions:

  • Should we also add /os/tail to support live tailing with server-sent events?
    • Pro: True live tail without client polling
    • Con: More complex implementation

6.3 Search Page Enhancement

Once the log-ingestor for CLP operational logs is ready, we can verify that the logs are searchable
in the search page.

Dataset filter:

  • Add URL parameter support: /search?dataset=_clp

Migration Timeline

Phase 1: Infrastructure Setup

  • Add Fluent Bit service to docker-compose-all.yaml
  • Create Fluent Bit configuration with dual outputs (file + CLP plugin)
  • Add fluentbit and minio (optional) to bundled services in config schema
  • Update clp-config.yaml templates with S3 path structure

Phase 2: WebUI Development

  • Implement /os/cat API endpoint
  • Create Operational Logs page UI
  • Add dataset URL parameter support to Search page
  • Mount /var/log volume to webui service

Phase 3: Service Migration for Third-Party Services

Migrate bundled services (no changes to our code are required)

  • database (MariaDB)
  • queue (RabbitMQ)
  • redis
  • results-cache (MongoDB)

Challenges:

  • Each service has different log format
  • May require custom Fluent Bit parsers
  • Verify no log loss during migration

Phase 4: Service Migration for First-Party Services (First Wave)

Migrate Python-based services (easier log format standardization):

  • compression-scheduler
  • compression-worker
  • query-scheduler
  • query-worker
  • garbage-collector
  • reducer

Per-service checklist:

  • Update logging driver to fluentd
  • Keep CLP_LOGS_DIR env var but mark as deprecated
  • Test real-time log access via WebUI
  • Test log ingestion to archives
  • Validate search functionality

Phase 5: Service Migration for First-Party Services (Second Wave)

Migrate remaining services:

  • webui
  • mcp-server
  • api-server
  • log-ingestor (tricky: logging about logging)
  • spider-scheduler
  • spider-compression-worker

Phase 6: Cleanup & Optimization

  • Remove CLP_LOGS_DIR environment variables
  • Remove *volume_clp_logs mounts from services (keep only in Fluent Bit and webui)
  • Remove ${CLP_LOGS_DIR_HOST} host mounts
  • Documentation updates
  • Performance tuning

Future Optimizations

  • Direct IRv2 querying (Warm tier optimization):

    • Query worker currently supports archives only
    • Extend to support IRv2 stream files on S3
    • Would enable searching logs before full archive ingestion
  • WebUI live tail (Server-sent events):

    • Current proposal: Client-side polling of /os/cat
    • Optimization: Server-sent events for true push-based tail
  • Authentication & Authorization:

    • Current: No access control on operational logs
    • Future: Admin-only access to _clp dataset
    • Requires: an authentication system in the CLP Package (TBA)
  • Structured logging standardization:

    • Ensure all CLP services output JSON logs
    • Consistent field names (timestamp, level, message, component, etc.)
    • Easier filtering and parsing in WebUI
  • Multi-cluster support:

    • Current design: Single S3 bucket per deployment
    • Future: Multiple clusters writing to the same bucket with cluster ID prefix
    • Use case: Multi-region deployments for legal compliance
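The structured-logging item above can be sketched with Python's stdlib logging. The field names follow the proposal (timestamp, level, message, component); the formatter class itself is hypothetical:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record, using the proposed field names."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "component": record.name,
        })


# Usage: attach to a stdout handler so the container logging driver picks it up.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
```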

Alternative approaches: Native Docker logging drivers

This section evaluates lighter-weight alternatives to the CLP-managed Fluent Bit approach, using
Docker's native logging drivers. These alternatives may appeal to users who:

  • Prefer a simpler CLP Package without log aggregation infrastructure
  • Already have their own log aggregation systems (Fluentd, Vector, OpenTelemetry, Loki, etc.)
  • Deploy on single-host environments only

Candidate logging drivers

1. json-file driver

The default Docker logging driver. Writes JSON-formatted logs to local files.

flowchart TD
    A[All services write to stdout]
    B["Docker json-file logging driver<br/>(with rotation: max-size, max-file)"]

    subgraph Host1 [Host 1]
        C1["/var/lib/docker/containers/&lt;id&gt;/*.log"]
        D1["docker logs container-name"]
    end

    subgraph Host2 [Host 2]
        C2["/var/lib/docker/containers/&lt;id&gt;/*.log"]
        D2["docker logs container-name"]
    end

    A --> B
    B --> C1 --> D1
    B --> C2 --> D2

    E["External log aggregation (optional)<br/>Fluentd, Vector, OpenTelemetry, Loki, etc."]
    C1 -.-> E
    C2 -.-> E

Configuration example:

x-service-defaults: &service_defaults
  logging:
    driver: "json-file"
    options:
      max-size: "50m"
      max-file: "5"
      compress: "true"

Key options (Docker Docs: JSON File logging driver):

| Option   | Default        | Description                                         |
|----------|----------------|-----------------------------------------------------|
| max-size | -1 (unlimited) | Maximum size before rotation (e.g., 10m, 1g)        |
| max-file | 1              | Maximum number of rotated files to keep             |
| compress | false          | Gzip compression for rotated files                  |
| labels   | -              | Comma-separated labels to include in log metadata   |
| env      | -              | Comma-separated env vars to include in log metadata |

2. syslog driver

Routes container logs to a syslog server (local or remote).

flowchart TD
    A[All services write to stdout]
    B["Docker syslog logging driver"]

    subgraph ControlNode [Control Node]
        C["rsyslog container<br/>(receives logs from all hosts)"]
        D["/var/log/&lt;component&gt;/*.log"]
        E["WebUI reads directly<br/>(real-time access)"]
        C --> D --> E
    end

    subgraph WorkerNode1 [Worker Node 1]
        F1["Container logs"]
    end

    subgraph WorkerNode2 [Worker Node 2]
        F2["Container logs"]
    end

    A --> B
    B --> F1 -->|"tcp://rsyslog:514"| C
    B --> F2 -->|"tcp://rsyslog:514"| C

    G["docker logs (via dual logging cache)"]
    B -.-> G

Configuration example:

x-service-defaults: &service_defaults
  logging:
    driver: "syslog"
    options:
      syslog-address: "tcp://rsyslog:514"
      syslog-facility: "daemon"
      syslog-format: "rfc5424"
      tag: "{{.Name}}"

Key options (Docker Docs: Syslog logging driver):

| Option          | Description                                                                                      |
|-----------------|--------------------------------------------------------------------------------------------------|
| syslog-address  | Address of syslog server: udp://host:port, tcp://host:port, tcp+tls://host:port, or unix:///path |
| syslog-facility | Syslog facility (e.g., daemon, local0-local7)                                                    |
| syslog-format   | Message format: rfc3164, rfc5424, rfc5424micro                                                   |
| syslog-tls-*    | TLS options for tcp+tls connections                                                              |
| tag             | Custom tag; supports Go templates (e.g., {{.Name}}, {{.ID}})                                     |

docker logs command availability

A critical consideration is whether docker logs remains functional with each driver.

| Driver    | docker logs works?      | Notes                                                     |
|-----------|-------------------------|-----------------------------------------------------------|
| json-file | Yes                     | Native support                                            |
| local     | Yes                     | Native support                                            |
| journald  | Yes                     | Native support                                            |
| syslog    | Yes (with dual logging) | Requires Docker Engine 20.10+ (Docker Docs: Dual logging) |
| fluentd   | Yes (with dual logging) | Requires Docker Engine 20.10+                             |

Dual logging (Docker Docs: Dual logging): Starting with Docker Engine 20.10, Docker automatically
caches logs locally when using remote logging drivers (like syslog or fluentd), enabling docker
logs to work. No configuration is required to enable this feature.

Cache configuration: The cache options below can be configured either:

  • Per-container via --log-opt flags (e.g., --log-opt cache-max-size=50m)
  • Globally in /etc/docker/daemon.json (applies to all new containers)

Note: The Docker documentation does not provide explicit docker-compose examples for cache-*
options. While the docs state these "can be specified per container", only daemon.json examples
are shown. In docker-compose, you would use:

logging:
  driver: "syslog"
  options:
    syslog-address: "tcp://rsyslog:514"
    cache-max-size: "50m"  # Unverified - not explicitly documented

| Option         | Default | Description                  |
|----------------|---------|------------------------------|
| cache-disabled | false   | Disable local caching        |
| cache-max-size | 20m     | Max cache file size          |
| cache-max-file | 5       | Max number of cache files    |
| cache-compress | true    | Compress rotated cache files |
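For the daemon-wide route mentioned above, the Docker docs show a daemon.json form like the following (values here are illustrative; changes apply only to containers created after a daemon restart):

```json
{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "tcp://rsyslog:514",
    "cache-max-size": "50m",
    "cache-max-file": "3"
  }
}
```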

Component changes impact analysis

1. Fluent Bit deployment

| Aspect                      | CLP-managed Fluent Bit     | json-file | syslog                                          |
|-----------------------------|----------------------------|-----------|-------------------------------------------------|
| Additional service required | Yes (Fluent Bit container) | No        | Optional (rsyslog container for centralization) |
| Memory overhead             | ~50MB                      | 0         | ~10-20MB (rsyslog)                              |
| CPU overhead                | ~0.1 core                  | 0         | Minimal                                         |

Verdict:

  • json-file: Simplest, zero overhead
  • syslog: Lightweight if rsyslog already deployed; can centralize to control node

2. Fluent Bit configuration (dual output)

| Aspect                                            | CLP-managed Fluent Bit  | json-file                                  | syslog                              |
|---------------------------------------------------|-------------------------|--------------------------------------------|-------------------------------------|
| File output for real-time access                  | Yes (configurable path) | Yes (Docker-managed path)                  | Via rsyslog file output             |
| S3/IRv2 output                                    | Yes (CLP plugin)        | No (would need separate collector to scan) | No (would need separate "shipper")  |
| Log rotation                                      | Fluent Bit managed      | Docker managed                             | rsyslog managed                     |
| Organized path structure (/var/log/<component>/)  | Yes                     | No (/var/lib/docker/containers/<id>/)      | Yes (rsyslog templates)             |

Verdict:

  • json-file: Loses organized path structure and S3 pipeline
  • syslog: Can achieve organized paths via rsyslog templates; still loses S3 pipeline

3. Service migration

| Aspect                          | CLP-managed Fluent Bit | json-file | syslog     |
|---------------------------------|------------------------|-----------|------------|
| Services write to stdout        | Yes                    | Yes       | Yes        |
| Remove CLP_LOGS_DIR env vars    | Yes                    | Yes       | Yes        |
| Remove *volume_clp_logs mounts  | Yes                    | Yes       | Yes        |
| Migration complexity            | Medium                 | Low       | Low-Medium |

Verdict: All approaches support the same service migration pattern (stdout-based logging).

4. S3 configuration

| Aspect                    | CLP-managed Fluent Bit       | json-file | syslog |
|---------------------------|------------------------------|-----------|--------|
| Automatic S3 upload       | Yes                          | No        | No     |
| IRv2 compression on upload | Yes                         | No        | No     |
| Path structure on S3      | Yes (ir/<component>/<date>/) | N/A       | N/A    |

Verdict:

  • json-file / syslog: Lose automatic S3 ingestion. Users must implement their own log
    scanning / shipping if needed.

5. Log-ingestor configuration

| Aspect                           | CLP-managed Fluent Bit | json-file | syslog |
|----------------------------------|------------------------|-----------|--------|
| Automatic _clp dataset ingestion | Yes                    | No        | No     |
| Scan interval configurable       | Yes                    | N/A       | N/A    |
| Historical log searchability     | Yes                    | No        | No     |

Verdict:

  • json-file / syslog: No automated path to CLP archives. Historical operational logs not
    searchable in WebUI.

6. WebUI enhancements

6.1 /os/cat API Endpoint

| Aspect                    | CLP-managed Fluent Bit | json-file                        | syslog (centralized)            |
|---------------------------|------------------------|----------------------------------|---------------------------------|
| Log file location         | /var/log/<component>/  | /var/lib/docker/containers/<id>/ | /var/log/<component>/ (rsyslog) |
| WebUI mount required      | Yes (/var/log)         | Docker socket or different path  | Yes (/var/log)                  |
| Implementation complexity | Low                    | Medium-High                      | Low                             |

Verdict:

  • json-file: WebUI would need to mount Docker's container directory or use Docker API
  • syslog: Can achieve same organized structure as CLP-managed Fluent Bit via rsyslog templates
6.2 "Operational Logs" Page

| Aspect                            | CLP-managed Fluent Bit | json-file          | syslog             |
|-----------------------------------|------------------------|--------------------|--------------------|
| File browser works                | Yes                    | Needs adaptation   | Yes                |
| Download/export                   | Yes                    | Yes                | Yes                |
| Redirect to Search for historical | Yes                    | No (no historical) | No (no historical) |

6.3 Search Page Enhancement (?dataset=_clp)

| Aspect                            | CLP-managed Fluent Bit | json-file | syslog |
|-----------------------------------|------------------------|-----------|--------|
| Historical operational log search | Yes                    | No        | No     |
| Dataset filter functional         | Yes                    | No        | No     |

Verdict:

  • json-file / syslog: Historical search feature not available.

Requirements impact matrix

| Requirement                               | CLP-managed Fluent Bit | json-file                     | syslog (centralized)    |
|-------------------------------------------|------------------------|-------------------------------|-------------------------|
| R1.1: Multi-host K8s support              | Yes                    | No                            | Yes (with rsyslog)      |
| R1.2: Central control node access         | Yes                    | No                            | Yes                     |
| R1.3: No per-host access required         | Yes                    | No                            | Yes                     |
| R2.1 (Hot): Real-time access (<30s lag)   | Yes                    | Yes                           | Yes                     |
| R2.2 (Warm): Recent historical (<5min lag)| Yes                    | Partial                       | Partial                 |
| R2.3 (Cold): CLP compressed archives      | Yes                    | No                            | No                      |
| R3.1: Admin view all logs                 | Yes                    | Per-host only                 | Yes                     |
| R3.2: Easy export                         | Yes                    | docker logs > file            | Yes                     |
| R3.3: Real-time + historical export       | Yes                    | Real-time only                | Real-time only          |
| R4.1: WebUI real-time page                | Yes                    | Needs adaptation              | Yes                     |
| R4.2: Search page for archived            | Yes                    | No                            | No                      |
| R4.3: Filter by service/level/time        | Yes                    | Limited                       | Yes (rsyslog parsing)   |
| R5.1: Lightweight (<50MB)                 | Marginal               | Yes                           | Yes                     |
| R5.2: CLP compression for storage         | Yes                    | No                            | No                      |
| R5.3: No heavyweight deps                 | Fluent Bit required    | Yes                           | rsyslog (lightweight)   |
| docker logs command                       | Yes (dual logging)     | Yes (native)                  | Yes (dual logging)      |
| Organized path structure                  | Yes                    | No (Docker internal paths)    | Yes (rsyslog templates) |
| Compatible with external log aggregation  | May conflict           | Yes (users plug in their own) | Yes (standard syslog)   |
| TLS encryption for log transport          | Yes (if configured)    | N/A (local only)              | Yes (tcp+tls://)        |

Comparing approaches across deployment scenarios

| Scenario                   | CLP-managed Fluent Bit               | json-file      | syslog                           |
|----------------------------|--------------------------------------|----------------|----------------------------------|
| Single-host Docker Compose | Works well                           | Works well     | Works well                       |
| Multi-host Docker Compose  | Centralized via Fluent Bit           | Logs scattered | Centralized via rsyslog          |
| Kubernetes                 | Centralized via Fluent Bit DaemonSet | Logs per node  | Centralized via rsyslog DaemonSet |

To set up multi-host deployments with syslog:

  • Deploy rsyslog as a container on the control
    node (GitHub: puzzle/kubernetes-rsyslog-logging)
  • Configure rsyslog to write to /var/log/<component>/ using templates
  • All containers forward to the central rsyslog via syslog-address
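A sketch of the rsyslog side of the steps above, using RainerScript templates; the template and file layout are assumptions matching the bullets, not a tested configuration (the Docker syslog driver's tag, e.g. {{.Name}}, is what rsyslog sees as programname):

```conf
# Hypothetical /etc/rsyslog.d/clp.conf on the control node

# One log file per component, derived from the syslog tag
template(name="ClpPerComponentFile" type="string"
         string="/var/log/%programname%/%programname%.log")

# Listen for TCP syslog from the Docker syslog driver
module(load="imtcp")
input(type="imtcp" port="514")

# Route every message into its per-component file
action(type="omfile" dynaFile="ClpPerComponentFile")
```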

Recommendation: Configurable modes

Consider offering multiple operational logging modes via configuration:

# clp-config.yaml
operational_logging:
  mode: "simple"  # Options: "simple", "centralized", "full"

| Mode        | Logging Driver         | S3 Pipeline | WebUI Integration | Use Case                                |
|-------------|------------------------|-------------|-------------------|-----------------------------------------|
| simple      | json-file              | No          | Limited           | Single-host, users have own aggregation |
| centralized | syslog + rsyslog       | No          | Real-time only    | Multi-host, no CLP archive needed       |
| full        | CLP-managed Fluent Bit | Yes         | Full              | Multi-host, full CLP integration        |

This allows users to choose based on their deployment complexity and existing infrastructure.


Kubernetes considerations

In Kubernetes, there is no per-container logging driver configuration as there is in Docker
Compose. Instead, the container runtime (containerd, CRI-O) handles log collection differently.

How Kubernetes logging works

  1. Container runtime writes logs to files on the node:

    • Location: /var/log/containers/<pod>_<namespace>_<container>-<id>.log
    • These are symlinks to /var/log/pods/<namespace>_<pod>_<uid>/<container>/0.log
  2. Log format: JSON by default (similar to Docker's json-file driver)

  3. kubectl logs: Reads from these node files (always works, no driver dependency)

  4. Log rotation: Configured via kubelet, not per-container
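The filename layout in step 1 encodes the pod, namespace, and container, which a log collector can recover. A small sketch (the function is hypothetical and assumes the trailing container ID contains no '-' itself):

```python
def parse_container_log_name(filename: str) -> dict:
    """Split '<pod>_<namespace>_<container>-<id>.log' into its parts."""
    stem = filename.removesuffix(".log")
    pod, namespace, rest = stem.split("_", 2)
    # Container names may contain '-', so split the ID off from the right.
    container, container_id = rest.rsplit("-", 1)
    return {
        "pod": pod,
        "namespace": namespace,
        "container": container,
        "id": container_id,
    }
```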

Log rotation configuration (Helm)

Since CLP plans to use Helm for Kubernetes deployments, log rotation is configured in the kubelet
settings rather than per-container:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "50Mi"
containerLogMaxFiles: 5

This applies to all containers on the node. Unlike Docker Compose, you cannot configure rotation
per-service.

Default behavior (equivalent to json-file)

With no additional configuration, Kubernetes behaves like Docker's json-file driver:

  • Logs stored per-node in /var/log/containers/
  • No centralized access - must access each node separately
  • kubectl logs <pod> works natively

syslog equivalent

There is no direct syslog logging driver in Kubernetes. Achieving centralized logging requires a
DaemonSet-based log forwarder—which is essentially the Fluent Bit approach described below.

CLP-managed Fluent Bit (recommended for Kubernetes)

This is the standard pattern for Kubernetes log aggregation:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: clp
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              mountPath: /var/log/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/log/containers

Key difference from Docker Compose:

| Aspect                 | Docker Compose                                 | Kubernetes                             |
|------------------------|------------------------------------------------|----------------------------------------|
| Log delivery mechanism | Push (container → fluentd driver → Fluent Bit) | Pull (Fluent Bit tails files on node)  |
| Configuration location | Per-service logging: block                     | DaemonSet + ConfigMap                  |
| Reliability            | Depends on fluentd-async setting               | File-based, survives pod restarts      |

How it works:

  1. Fluent Bit runs as a DaemonSet (one pod per node)
  2. Mounts /var/log/containers/ from the host (read-only)
  3. Tails the JSON log files written by the container runtime
  4. Forwards to:
    • Central Fluent Bit aggregator on control node, or
    • Directly to S3 with CLP plugin

This approach achieves the same result as Docker's fluentd logging driver, but through file tailing
rather than network push.
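A minimal sketch of the DaemonSet's Fluent Bit configuration under this pattern. The aggregator hostname is an assumption, and the parser choice depends on the runtime (containerd/CRI-O emit CRI-format lines and need a CRI parser rather than the docker one):

```conf
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            docker    # use a CRI parser for containerd/CRI-O
    Tag               kube.*
    Refresh_Interval  5

[OUTPUT]
    Name          forward
    Match         *
    Host          fluent-bit-aggregator   # central aggregator on the control node (assumed name)
    Port          24224
```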

