feat: add SecOps Alert Triage Agent for intelligent security alert processing #6182

Open
Samir-atra wants to merge 1 commit into aden-hive:main from Samir-atra:feat/secops-alert-triage-agent-5866

@Samir-atra

Description

Implements a comprehensive SecOps Alert Triage Agent that intelligently processes security alerts from monitoring tools (Datadog, Wiz, Snyk, PagerDuty, GitHub), correlates related events, suppresses false positives, classifies threats by severity, enriches with contextual intelligence, and escalates to on-call engineers with actionable incident briefs. Addresses the alert fatigue problem faced by security teams (thousands of alerts per day, 40%+ false positives).

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

Related Issues

Fixes #5866

Changes Made

Agent Architecture

  • 7 nodes with 6 edges forming the processing pipeline
  • Human-in-the-loop (HITL) gate at hitl-escalation node for Critical/High alerts
  • Conversational mode with checkpointing for long-running sessions

Workflow Pipeline

  1. intake - Receives and normalizes alerts from multiple sources
  2. dedup - Correlates alerts by asset, time window, and attack pattern
  3. fp-filter - Suppresses false positives using configurable rules
  4. severity - Classifies alerts as Critical/High/Medium/Low
  5. enrichment - Adds contextual intelligence (service owner, deployments, prior incidents, threat intel)
  6. hitl-escalation - Escalates Critical/High alerts with human acknowledgment
  7. digest - Generates daily SecOps summaries with metrics
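As a minimal sketch of how the seven stages above form a linear graph with six edges: the node names are taken from this description, but the data structures and the `walk` helper are hypothetical and are not the framework's actual API.

```python
# Hypothetical sketch: node names come from the PR description; the
# structures below are illustrative, not the agent framework's real API.
NODES = [
    "intake",
    "dedup",
    "fp-filter",
    "severity",
    "enrichment",
    "hitl-escalation",
    "digest",
]

# Each edge links a node to its successor, so 7 nodes yield 6 edges.
EDGES = list(zip(NODES, NODES[1:]))


def walk(start: str = "intake") -> list:
    """Follow edges from `start` and return the order in which nodes are visited."""
    successor = dict(EDGES)
    order = [start]
    while order[-1] in successor:
        order.append(successor[order[-1]])
    return order
```

Calling `walk()` visits every stage from `intake` to `digest`; starting from a later node visits only the remaining stages.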

Success Criteria (Measurable)

  • False positive suppression rate: >= 35%
  • Escalation accuracy (no real threats missed): >= 90%
  • Human confirmation rate for Critical/High alerts: 100%
  • MTTR improvement vs manual triage baseline: >= 40%
  • Daily digest generation: Automatic
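The criteria above could be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, and "escalation accuracy" is interpreted here as the share of real threats that were escalated, which may differ from how the agent itself measures it.

```python
# Hypothetical helpers; names and the exact metric definitions are assumptions.

def suppression_rate(total_alerts: int, suppressed: int) -> float:
    """Fraction of incoming alerts suppressed as false positives."""
    if total_alerts <= 0:
        return 0.0
    return suppressed / total_alerts


def escalation_accuracy(real_threats: int, escalated_real_threats: int) -> float:
    """Fraction of real threats that were escalated (none missed means 1.0)."""
    if real_threats <= 0:
        return 1.0
    return escalated_real_threats / real_threats


def mttr_improvement(baseline_minutes: float, agent_minutes: float) -> float:
    """Relative MTTR reduction versus the manual triage baseline."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return (baseline_minutes - agent_minutes) / baseline_minutes


def meets_targets(fp_rate: float, accuracy: float, mttr_gain: float) -> bool:
    """Check the thresholds listed in the success criteria."""
    return fp_rate >= 0.35 and accuracy >= 0.90 and mttr_gain >= 0.40
```

For example, suppressing 400 of 1000 alerts, escalating 19 of 20 real threats, and cutting MTTR from 60 to 30 minutes would satisfy all three numeric targets.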

Constraints

  • Mandatory HITL: No automated response for Critical/High alerts without human acknowledgment (hard constraint)
  • Audit trail: Full audit trail for all triage decisions (hard constraint)
  • Alert preservation: Original alert data preserved (no deletion, only filtering) (hard constraint)
  • Rationale logging: All false positive determinations documented (soft constraint)
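One way to picture the HITL and audit-trail constraints together is a gate that refuses automated response for Critical/High alerts until a human has acknowledged them, and records every decision. The class and function names below are hypothetical; this is an illustration, not the PR's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Severities that require human acknowledgment, per the hard constraint above.
HITL_SEVERITIES = frozenset({"Critical", "High"})


@dataclass
class AuditTrail:
    """Hypothetical append-only log of triage decisions."""
    entries: list = field(default_factory=list)

    def record(self, alert_id: str, decision: str, rationale: str) -> None:
        self.entries.append({
            "alert_id": alert_id,
            "decision": decision,
            "rationale": rationale,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def may_auto_respond(alert_id: str, severity: str,
                     acknowledged_by: Optional[str], audit: AuditTrail) -> bool:
    """Return True only if automated response is permitted; always log why."""
    needs_human = severity in HITL_SEVERITIES
    if needs_human and not acknowledged_by:
        audit.record(alert_id, "blocked",
                     f"{severity} alert awaiting human acknowledgment")
        return False
    rationale = (f"acknowledged by {acknowledged_by}" if needs_human
                 else f"{severity} alert does not require acknowledgment")
    audit.record(alert_id, "allowed", rationale)
    return True
```

Both outcomes are written to the trail, so a blocked response leaves the same evidence as an allowed one.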

Configuration

  • Suppression rules: CI/CD IPs, approved scanners (Nessus, Qualys, Rapid7), maintenance windows
  • Asset criticality: Production (1.0x), Staging (0.7x), Development (0.3x), Internal (0.2x)
  • Severity thresholds: Based on CVSS scores with multi-factor scoring

Test Coverage

  • 20 passing tests covering:
    • Agent structure validation
    • Goal and constraint definitions
    • Node and edge configurations
    • Client-facing nodes
    • HITL gate presence
    • Success criteria validation
    • Configuration definitions

CLI Interface

  • info - Display agent information
  • validate - Validate agent structure
  • shell - Interactive CLI session
  • tui - Launch TUI (requires textual)

Testing

  • Unit tests pass (PYTHONPATH=.:core:exports uv run pytest exports/secops_alert_triage_agent/tests/test_structure.py)
  • Lint passes (cd core && ruff check .) - Not applicable (new agent in exports)
  • Manual testing performed - Ready for manual testing

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Screenshots (if applicable)

N/A - This is an agent implementation, not a UI change.

Market Validation

Security teams face:

  • Thousands of alerts per day with 40%+ false positives
  • Alert fatigue leading to missed incidents
  • High cost of hiring senior SecOps engineers for 24/7 triage
  • Security is the #1 gating factor for enterprise AI adoption

Torq built $20M ARR on this exact problem (no-code/low-code security automation for SecOps). This open-source implementation gives startups, scale-ups, and security-conscious engineering teams access to enterprise-grade alert intelligence.

Technical Details

Input Formats Supported:

  • Datadog alerts
  • Wiz alerts
  • Snyk alerts
  • PagerDuty alerts
  • GitHub Advanced Security alerts
  • Generic webhook payloads
  • Manual alert descriptions

Alert Normalization:
Standard schema includes: alert_id, source, timestamp, title, description, severity, affected_asset, indicators (ips, domains, hashes, users), raw_alert
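
A minimal sketch of that schema as Python dataclasses follows. The field names match the list above; the field types, the `normalize_generic` helper, and the payload keys it reads are hypothetical, since the description does not specify them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Indicators:
    ips: list = field(default_factory=list)
    domains: list = field(default_factory=list)
    hashes: list = field(default_factory=list)
    users: list = field(default_factory=list)


@dataclass
class NormalizedAlert:
    alert_id: str
    source: str
    timestamp: str  # assumed to be an ISO 8601 string
    title: str
    description: str
    severity: str
    affected_asset: str
    indicators: Indicators
    raw_alert: dict  # original payload, kept unmodified


def normalize_generic(payload: dict, source: str = "webhook") -> NormalizedAlert:
    """Map a generic webhook payload (hypothetical keys) onto the schema."""
    return NormalizedAlert(
        alert_id=str(payload.get("id", "")),
        source=source,
        timestamp=str(payload.get("timestamp", "")),
        title=str(payload.get("title", "")),
        description=str(payload.get("description", "")),
        severity=str(payload.get("severity", "unknown")),
        affected_asset=str(payload.get("asset", "")),
        indicators=Indicators(ips=list(payload.get("ips", []))),
        raw_alert=dict(payload),  # shallow copy so the original is preserved
    )
```

Keeping `raw_alert` alongside the normalized fields is one way to satisfy the alert preservation constraint.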

Escalation Format for Critical/High Alerts:
Complete incident brief with executive summary, risk assessment, recommended actions, and contextual information (service owner, deployments, prior incidents, threat intel)

…ocessing

Implements a comprehensive SecOps Alert Triage Agent that:

- Ingests security alerts from Datadog, Wiz, Snyk, PagerDuty, GitHub, and webhook sources
- Normalizes alert schemas into a standard format
- Deduplicates and correlates related alerts by asset, time window, and attack pattern
- Filters false positives using configurable suppression rules (CI/CD IPs, approved scanners, maintenance windows)
- Classifies severity using CVSS scores, asset criticality, blast radius, and exploit likelihood
- Enriches alerts with contextual intelligence (service owner, deployments, prior incidents, threat intel)
- Requires human acknowledgment for Critical/High alerts before any automated response (HITL)
- Generates daily SecOps digests with comprehensive metrics
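
The deduplication step in the list above could be sketched as grouping alerts that share an asset and attack pattern and arrive within a time window of each other. The `correlate` function, the `attack_pattern` key, and the 15-minute default window are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def correlate(alerts: list, window: timedelta = timedelta(minutes=15)) -> list:
    """Group alerts with the same (affected_asset, attack_pattern) whose
    timestamps fall within `window` of the previous alert in the group."""
    groups = []
    open_group = {}  # correlation key -> most recent group for that key
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        key = (alert["affected_asset"], alert["attack_pattern"])
        group = open_group.get(key)
        if group is not None and alert["timestamp"] - group[-1]["timestamp"] <= window:
            group.append(alert)
        else:
            # Start a new group when the key is new or the window has lapsed.
            group = [alert]
            groups.append(group)
            open_group[key] = group
    return groups
```

Original alerts are only grouped, never dropped, which is consistent with the alert preservation constraint.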

Agent workflow:
intake -> dedup -> fp-filter -> severity -> enrichment -> hitl-escalation -> digest

Success criteria (measurable):
- False positive suppression rate >= 35%
- Critical/High escalation accuracy >= 90%
- 100% human confirmation for Critical/High alerts
- MTTR improvement >= 40% vs manual triage baseline
- Automatic daily digest generation

Constraints:
- Mandatory HITL for Critical/High alerts (hard constraint)
- Full audit trail for all triage decisions (hard constraint)
- Alert preservation (no deletion, only filtering) (hard constraint)
- Rationale logging for false positive determinations (soft constraint)

Includes:
- Complete agent implementation with 7 nodes and 6 edges
- Comprehensive test suite with 20 passing tests
- CLI interface (info, validate, shell, tui)
- Full documentation and README
- Configuration for suppression rules, asset criticality, severity thresholds

Resolves aden-hive#5866
@github-actions

PR Requirements Warning

This PR does not meet the contribution requirements.
If the issue is not fixed within ~24 hours, it may be automatically closed.

PR Author: @Samir-atra
Found issues: #5866 (assignees: none), #1 (assignees: none)
Problem: The PR author must be assigned to the linked issue.

To fix:

  1. Assign yourself (@Samir-atra) to one of the linked issues
  2. Re-open this PR

Exception: To bypass this requirement, you can:

  • Add the micro-fix label or include micro-fix in your PR title for trivial fixes
  • Add the documentation label or include doc/docs in your PR title for documentation changes

Micro-fix requirements (must meet ALL):

| Qualifies | Disqualifies |
| --- | --- |
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |

Why is this required? See #472 for details.

github-actions bot added the pr-requirements-warning label on Mar 11, 2026
Labels

pr-requirements-warning PR doesn't follow contribution guidelines. Please fix or it will be auto-closed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Agent Idea]: SecOps Alert Triage Agent — Intelligent Security Alert Correlation, False Positive Suppression & Escalation

1 participant