Comprehensive Audit Logs for Enterprise Compliance #163

@rucka

Description

Story Statement

As an enterprise security officer
I want comprehensive, tamper-evident audit logs for all KB operations
So that I can ensure compliance with enterprise security policies and investigate security incidents

Where: Knowledge service — enhanced audit subsystem extending Epic #66 foundation

Epic Context

Parent Epic: Platform Hardening & Enterprise Readiness #68
Status: Refined
Priority: P0 (Must-Have)

Status Workflow

  • Refined: Story is detailed, estimated, and ready for development
  • In Progress: Story is actively being developed
  • Done: Story delivered and accepted

Acceptance Criteria

Functional Requirements

  1. Given the basic audit logging from Epic Centralized Knowledge Service #66 (Audit Logs for KB Operations #154) is in place
    When an enterprise security officer sends GET /api/v1/organizations/acme/audit-logs/verify
    Then the service verifies hash chain integrity and returns { "status": "valid", "entries_verified": N, "range": { "from": "...", "to": "..." } }

  2. Given audit log entries are stored with hash chain
    When any entry is tampered with (modified, deleted, inserted)
    Then the integrity verification endpoint detects the break: { "status": "invalid", "first_break_at": "...", "entry_id": "..." }

  3. Given an enterprise admin with SIEM integration configured
    When audit events occur
    Then events are exported in CEF (Common Event Format) via webhook to the configured SIEM endpoint in near-real-time (within 60s of occurrence)

  4. Given an enterprise admin requests a compliance report
    When they send GET /api/v1/organizations/acme/audit-logs/compliance-report?standard=soc2&period=2026-Q1
    Then the service returns a structured report: total events, event type breakdown, access patterns, anomalies, failed auth count, data access summary

  5. Given a configurable retention policy (e.g., 365 days for enterprise)
    When the retention period is set via org settings
    Then audit logs are retained for the configured period; after expiry, entries are archived to cold storage (S3) before deletion

  6. Given the audit log API
    When any user queries it
    Then the query response includes an X-Audit-Integrity: verified header confirming chain validity for the returned range
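
To make criterion 4 more concrete, here is a minimal sketch of the report aggregation. It is an in-memory stand-in for the SQL aggregation the story calls for. The `AuditEvent` fields, the `auth.` event-type prefix, and the report key names are assumptions for illustration, and access patterns and anomaly detection are left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    # Hypothetical event shape; the real audit_logs schema comes from #154.
    event_type: str  # e.g. "kb.read", "auth.login"
    actor: str
    outcome: str     # "success" or "failure"


def build_compliance_report(events, standard, period):
    """Aggregate audit events into a report dict (key names are assumptions)."""
    by_type = Counter(e.event_type for e in events)
    failed_auth = sum(
        1 for e in events
        if e.event_type.startswith("auth.") and e.outcome == "failure"
    )
    return {
        "standard": standard,
        "period": period,
        "total_events": len(events),
        "event_type_breakdown": dict(by_type),
        "failed_auth_count": failed_auth,
        "distinct_actors": len({e.actor for e in events}),
    }
```

An empty event list yields a report with zero counts rather than an error, which matches the empty-period edge case.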

Business Rules

  • Hash chain: each entry stores prev_hash, the SHA-256 hash of the previous entry (that entry's entry_hash), which creates a tamper-evident chain
  • Chain verification: O(n) scan from first to last entry in range, verify each hash link
  • SIEM export: webhook POST to configured URL with CEF-formatted events, retry with exponential backoff (3 retries)
  • Compliance report standards: SOC 2 Type II, GDPR (extensible)
  • Retention: configurable per org (30-365 days), default 90 days; archived entries moved to S3 before deletion
  • Audit log access restricted to admin role (inherited from Audit Logs for KB Operations #154)
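
The hash chain and O(n) verification rules can be sketched as follows. This is an illustration under assumptions, not the final design: the `GENESIS_HASH` anchor for the first entry, the unit-separator field delimiter, and the use of the entry timestamp for `first_break_at` are choices made here, not taken from the story.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass

GENESIS_HASH = "0" * 64  # assumed prev_hash for the very first entry


@dataclass
class AuditEntry:
    entry_id: str
    action: str
    actor: str
    timestamp: str
    resource_id: str
    prev_hash: str = ""
    entry_hash: str = ""


def compute_entry_hash(e: AuditEntry) -> str:
    # Join fields with a separator (assumed absent from the values) so that
    # ("ab", "c") and ("a", "bc") cannot produce the same digest input.
    fields = [e.entry_id, e.action, e.actor, e.timestamp, e.resource_id, e.prev_hash]
    return hashlib.sha256("\x1f".join(fields).encode("utf-8")).hexdigest()


def append_entry(chain: list[AuditEntry], e: AuditEntry) -> None:
    """Link the new entry to the current tail and store its own hash."""
    e.prev_hash = chain[-1].entry_hash if chain else GENESIS_HASH
    e.entry_hash = compute_entry_hash(e)
    chain.append(e)


def verify_chain(chain: list[AuditEntry], anchor: str = GENESIS_HASH) -> dict:
    """O(n) scan: check every link and recompute every entry hash.

    `anchor` is the hash the first entry in the range must point at.
    """
    expected_prev = anchor
    for e in chain:
        if e.prev_hash != expected_prev or e.entry_hash != compute_entry_hash(e):
            return {
                "status": "invalid",
                "first_break_at": e.timestamp,
                "entry_id": e.entry_id,
            }
        expected_prev = e.entry_hash
    return {
        "status": "valid",
        "entries_verified": len(chain),
        "range": {
            "from": chain[0].timestamp if chain else None,
            "to": chain[-1].timestamp if chain else None,
        },
    }
```

A modified entry fails the recomputed-hash check. A deleted or inserted entry fails the link check at the next surviving entry. When verifying a sub-range, the caller would pass the entry_hash of the entry just before the range as `anchor`.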

Edge Cases and Error Handling

  • Hash chain gap (entries deleted by DB admin): Verification reports gap location and affected range
  • SIEM endpoint unreachable: Queue events locally (max 1000 entries), retry; alert if queue exceeds threshold
  • Compliance report for empty period: Return report with zero counts
  • Very large audit log (>1M entries): Verification runs in batches; progress indicator in response (or async job)
  • Archive to S3 fails: Retain entries beyond retention period; alert ops team
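
The unreachable-SIEM case can be sketched as a bounded local queue with exponential backoff. The `SiemExporter` class, the injected `send`, `alert`, and `sleep` callables, and the 1-second base delay are hypothetical. The story also does not say what happens to an event that arrives while the queue is full. This sketch raises an alert and does not queue it, on the assumption that the entry is already persisted in audit_logs and can be replayed from there.

```python
import time
from collections import deque
from typing import Callable

MAX_QUEUE = 1000  # local queue bound from the edge case above
MAX_RETRIES = 3   # retry count from the business rules


class SiemExporter:
    def __init__(
        self,
        send: Callable[[str], bool],    # returns True when the webhook POST succeeded
        alert: Callable[[str], None],   # notifies ops
        base_delay: float = 1.0,        # assumed first backoff delay, in seconds
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.send = send
        self.alert = alert
        self.base_delay = base_delay
        self.sleep = sleep
        self.pending = deque()

    def _deliver(self, event: str) -> bool:
        """One attempt plus up to MAX_RETRIES retries, doubling the delay each time."""
        if self.send(event):
            return True
        for attempt in range(MAX_RETRIES):
            self.sleep(self.base_delay * 2 ** attempt)
            if self.send(event):
                return True
        return False

    def export(self, event: str) -> None:
        if len(self.pending) >= MAX_QUEUE:
            self.alert("SIEM export queue is full; event left for replay from audit_logs")
            return
        self.pending.append(event)
        self.flush()

    def flush(self) -> None:
        """Send queued events in order; stop at the first one that cannot be delivered."""
        while self.pending:
            if not self._deliver(self.pending[0]):
                return
            self.pending.popleft()
```

Stopping at the first undeliverable event keeps events in order for the SIEM. In a real service the retries would run on a background worker so they do not block the audit write path.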

Definition of Done Checklist

Development Completion

  • All 6 acceptance criteria implemented and verified
  • Hash chain implementation on audit entries
  • Integrity verification endpoint
  • SIEM webhook export in CEF format
  • Compliance report generator (SOC 2, GDPR templates)
  • Configurable retention with cold storage archival
  • Unit tests for hash chain, verification, compliance report
  • Integration tests for end-to-end audit flow

Quality Assurance

  • Hash chain verification detects all tampering scenarios (modify, delete, insert)
  • SIEM export delivers events within 60s of occurrence
  • Compliance report generates in <30s for 1-year period
  • Archive/deletion does not affect active audit queries

Deployment and Release

  • S3 bucket for cold storage archival configured
  • SIEM webhook configuration documented
  • Compliance report templates reviewable

Story Sizing and Sprint Readiness

Refined Story Points

Final Story Points: XL(8)
Confidence Level: Medium
Sizing Justification: Extends #154 foundation with hash chain, SIEM integration, compliance reporting, cold archival. Each is a distinct subsystem. Moderate total effort.

Sprint Capacity Validation

Sprint Fit Assessment: Fits in single sprint
Total Effort Assessment: Yes

Story Splitting Recommendations

  1. Comprehensive Audit Logs for Enterprise Compliance #163-A: Hash chain + integrity verification (L(5))
  2. Comprehensive Audit Logs for Enterprise Compliance #163-B: SIEM export + compliance report + cold archival (M(3))

Dependencies and Coordination

Story Dependencies

Prerequisite Stories: Audit Logs for KB Operations #154 (Epic #66), the foundation this story extends
Dependent Stories: None

External Dependencies

Infrastructure Requirements: S3 cold storage bucket, SIEM endpoint for testing

Validation and Testing Strategy

Acceptance Testing Approach

Testing Methods: Unit tests for hash chain logic; integration tests: insert entries → tamper → verify → detect; mock SIEM endpoint for webhook testing
Test Data Requirements: Seeded audit entries with known hash chain, tampered entries
Environment Requirements: PostgreSQL test container, mock SIEM webhook, S3 mock for archival

Notes

Refinement Insights: Hash chain is the core innovation over basic audit logs. SIEM export and compliance reports are relatively straightforward on top of that.

Technical Analysis

Implementation Approach

Technical Strategy: Extend the audit_logs table with prev_hash and entry_hash columns. On insert, copy the previous entry's entry_hash into the new entry's prev_hash, then compute the new entry's entry_hash as SHA-256 over (entry_id + action + actor + timestamp + resource_id + prev_hash). Verification: iterate the chain and recompute hashes. SIEM: webhook emitter triggered by the audit event emitter. Compliance: SQL aggregation queries with a report template.
Key Components: Hash chain module, integrity verifier, SIEM webhook emitter, compliance report generator, archival cron job
Data Flow: Audit event → compute hash chain → insert → emit to SIEM webhook → (async) archive expired entries to S3
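
The archival step at the end of the data flow might look like the sketch below. The function names and the `upload` callable, which stands in for an S3 client, are hypothetical. Clamping an out-of-range retention setting to 30-365 days is also an assumption, since the service could just as well reject it.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

DEFAULT_RETENTION_DAYS = 90
MIN_RETENTION_DAYS, MAX_RETENTION_DAYS = 30, 365


def effective_retention_days(configured: Optional[int]) -> int:
    """Apply the default and clamp to the allowed range (clamping is an assumption)."""
    if configured is None:
        return DEFAULT_RETENTION_DAYS
    return max(MIN_RETENTION_DAYS, min(MAX_RETENTION_DAYS, configured))


def archive_expired(
    entries: list,                   # dicts with an ISO 8601 "timestamp" including an offset
    retention_days: int,
    now: datetime,                   # timezone-aware
    upload: Callable[[bytes], bool], # returns True only after a confirmed upload
) -> list:
    """Return the entries that stay in the database after one archival pass."""
    cutoff = now - timedelta(days=retention_days)
    expired, kept = [], []
    for e in entries:
        target = expired if datetime.fromisoformat(e["timestamp"]) < cutoff else kept
        target.append(e)
    if not expired:
        return entries
    # Newline-delimited JSON: one entry per line.
    body = "".join(json.dumps(e, sort_keys=True) + "\n" for e in expired).encode("utf-8")
    if not upload(body):
        return entries  # upload not confirmed: retain everything past the period
    return kept
```

Deleting only after `upload` reports success means a failed archive leaves expired entries in place, which is the behavior the S3-failure edge case asks for.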

Technical Requirements

  • Extend audit_logs table: add prev_hash VARCHAR(64), entry_hash VARCHAR(64) columns
  • Hash computation: SHA-256(entry_id + action + actor + timestamp + resource_id + prev_hash)
  • SIEM CEF format: CEF:0|pair|knowledge-service|1.0|<action>|<description>|<severity>|...
  • Cold archival: export to S3 as newline-delimited JSON; delete from DB after confirmed upload
  • Compliance report: parameterized SQL queries with template rendering
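
As an illustration of the CEF requirement, the formatter below fills the header shown above and escapes values the way the CEF format expects: backslash and pipe in header fields, and backslash, equals sign, and newline in extension values. The `to_cef` function and its parameters are hypothetical. The `suser` and `msg` keys in the usage note are standard CEF extension keys, but which extensions the service emits is not specified in this story.

```python
from typing import Mapping, Optional


def _escape_header(value: str) -> str:
    # Header fields: escape backslash first, then the pipe delimiter.
    return value.replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value: str) -> str:
    # Extension values: escape backslash, equals sign, and newlines.
    return value.replace("\\", "\\\\").replace("=", "\\=").replace("\n", "\\n")


def to_cef(
    action: str,
    description: str,
    severity: int,
    extensions: Optional[Mapping[str, object]] = None,
) -> str:
    """Render one audit event as a CEF line using the header from this story."""
    if not 0 <= severity <= 10:
        raise ValueError("CEF severity must be between 0 and 10")
    header = "|".join([
        "CEF:0",
        "pair",
        "knowledge-service",
        "1.0",
        _escape_header(action),
        _escape_header(description),
        str(severity),
    ])
    ext = " ".join(
        f"{key}={_escape_extension(str(value))}"
        for key, value in (extensions or {}).items()
    )
    return f"{header}|{ext}"
```

For example, `to_cef("kb.delete", "Article deleted", 5, {"suser": "alice"})` produces `CEF:0|pair|knowledge-service|1.0|kb.delete|Article deleted|5|suser=alice`.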

Technical Risks and Mitigation

| Risk | Impact | Probability | Mitigation Strategy |
|---|---|---|---|
| Hash chain computation adds latency to audit writes | Medium | Low | Async chain computation; batch verification |
| Concurrent audit writes break hash chain ordering | High | Medium | Sequence audit inserts via single-writer pattern or advisory lock |

Spike Requirements

Required Spikes: Evaluate single-writer pattern vs advisory lock for hash chain ordering under concurrent load
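
As input to the spike, here is a minimal in-process sketch of the single-writer option: callers on any thread enqueue events, and one writer thread assigns prev_hash and entry_hash, so the chain order is simply the queue order. The `SingleWriter` class and the in-memory `entries` list (a stand-in for the audit_logs table) are hypothetical, and the payload is hashed as one opaque string for brevity. This only serializes writers inside one process. With several service instances, the advisory-lock option (for example PostgreSQL's `pg_advisory_xact_lock`) or a dedicated writer service would still be needed.

```python
import hashlib
import queue
import threading

GENESIS_HASH = "0" * 64  # assumed prev_hash for the first entry


class SingleWriter:
    def __init__(self):
        self._inbox = queue.Queue()
        self.entries = []  # stand-in for the audit_logs table
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, payload: str) -> None:
        """Safe to call from any thread; ordering is decided by the queue."""
        self._inbox.put(payload)

    def _run(self) -> None:
        prev_hash = GENESIS_HASH
        while True:
            payload = self._inbox.get()
            if payload is None:  # sentinel from close()
                return
            digest_input = (prev_hash + "\x1f" + payload).encode("utf-8")
            entry_hash = hashlib.sha256(digest_input).hexdigest()
            self.entries.append(
                {"payload": payload, "prev_hash": prev_hash, "entry_hash": entry_hash}
            )
            prev_hash = entry_hash

    def close(self) -> None:
        """Drain everything submitted so far, then stop the writer thread."""
        self._inbox.put(None)
        self._thread.join()
```

The spike would compare this against the advisory-lock approach under concurrent load, in particular write latency and behavior when the writer falls behind.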
