Security Documentation

This directory contains security hardening scripts for the cagent-action GitHub Action.

🔒 Security Features

This action includes built-in security features for all agent executions:

Authorization Check - Users are verified for comment-triggered events:
- Only OWNER, MEMBER, and COLLABORATOR roles can trigger via comments (e.g., /review)
- External contributors (CONTRIBUTOR, FIRST_TIME_CONTRIBUTOR, NONE) are blocked
- Skipped for non-comment events (PR triggers, scheduled jobs, workflow_dispatch)
- Comment-triggered actions are the main abuse vector - this protects against cost/spam attacks
Output Scanning - All agent responses are scanned for leaked secrets:
- API key patterns: sk-ant-*, sk-*, sk-proj-*
- GitHub tokens: ghp_*, gho_*, ghu_*, ghs_*, github_pat_*
- Environment variable names in output
- If secrets are detected, the workflow fails and a security issue is created
Prompt Sanitization - User prompts are checked in three tiers:
- Critical patterns (block): Direct secret exfiltration commands (echo $API_KEY, console.log(process.env))
- Suspicious patterns (strip + warn): Behavioral/natural language injection ("ignore previous instructions", "base64 decode", etc.). Matching lines are stripped from the prompt before it reaches the agent
- Medium-risk patterns (warn): API key variable names in configuration
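
The authorization rule described above can be sketched in a few lines of shell. This is a minimal illustration, not the contents of check-auth.sh: the function name `is_authorized` is an assumption, and the role value is assumed to come from the `author_association` field of the comment event payload.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the role check; the real check-auth.sh may differ.

# Returns 0 when the given author_association may trigger the agent via a comment.
is_authorized() {
  local association="$1"
  case "$association" in
    OWNER|MEMBER|COLLABORATOR) return 0 ;;
    *) return 1 ;;  # CONTRIBUTOR, FIRST_TIME_CONTRIBUTOR, NONE, or anything unexpected
  esac
}

# Example: in a workflow the value would come from the event payload.
if is_authorized "COLLABORATOR"; then
  echo "authorized"
else
  echo "blocked"
fi
```

Using an allowlist with a default deny means an unknown or empty role is rejected rather than accepted by accident.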

Security Architecture

The action implements a defense-in-depth approach:

┌─────────────────────────────────────────────────────────────┐
│ 1. Authorization Check (check-auth.sh)                      │
│    ✓ Verify user's author_association role                  │
│    ✓ Block external contributors by default                 │
│    ✓ Only OWNER, MEMBER, COLLABORATOR allowed               │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. Prompt Sanitization                                      │
│    ✓ Detect prompt injection attempts                       │
│    ✓ Warn about suspicious patterns                         │
│    ✓ Check for encoded malicious content                    │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. Agent Execution                                          │
│    ✓ User-provided agent runs in isolated cagent runtime    │
│    ✓ No direct access to secrets or environment vars        │
│    ✓ Controlled execution environment                       │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ 4. Output Scanning                                          │
│    ✓ Scan for leaked API keys (Anthropic, OpenAI, etc.)     │
│    ✓ Scan for leaked tokens (GitHub PAT, OAuth, etc.)       │
│    ✓ Block execution if secrets found                       │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ 5. Incident Response                                        │
│    ✓ Create security issue with details                     │
│    ✓ Fail workflow with clear error                         │
│    ✓ Prevent secret exposure in logs                        │
└─────────────────────────────────────────────────────────────┘

Security Scripts

Shared Patterns (`secret-patterns.sh`)

Central source of truth for secret detection patterns. This file is sourced by:

sanitize-output.sh - Uses SECRET_PATTERNS array for comprehensive regex matching
action.yml (Build safe prompt step) - Uses SECRET_PATTERNS for prompt verification

Why shared patterns?

DRY principle: Single source of truth prevents drift
Consistency: Same patterns across all security layers
Maintainability: Update patterns in one place

Secret patterns detected:

SECRET_PATTERNS=(
  'sk-ant-[a-zA-Z0-9_-]{30,}'        # Anthropic API keys
  'ghp_[a-zA-Z0-9]{36}'              # GitHub personal access tokens
  'gho_[a-zA-Z0-9]{36}'              # GitHub OAuth tokens
  'ghu_[a-zA-Z0-9]{36}'              # GitHub user tokens
  'ghs_[a-zA-Z0-9]{36}'              # GitHub server tokens
  'github_pat_[a-zA-Z0-9_]+'         # GitHub fine-grained tokens
  'sk-[a-zA-Z0-9]{48}'               # OpenAI API keys
  'sk-proj-[a-zA-Z0-9]{48}'          # OpenAI project keys
)
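
As a rough illustration of how an array like this might be consumed, the sketch below checks a string against a subset of the patterns with `grep -E`. The helper name `contains_secret` is an assumption made for this example and is not taken from the scripts in this directory.

```shell
#!/usr/bin/env bash
# Sketch only: a subset of the patterns listed above, checked with grep -E.
SECRET_PATTERNS=(
  'sk-ant-[a-zA-Z0-9_-]{30,}'
  'ghp_[a-zA-Z0-9]{36}'
  'github_pat_[a-zA-Z0-9_]+'
)

# Hypothetical helper: returns 0 if the text matches any pattern, 1 otherwise.
contains_secret() {
  local text="$1" pattern
  for pattern in "${SECRET_PATTERNS[@]}"; do
    # -e keeps grep from treating a pattern as an option.
    if printf '%s\n' "$text" | grep -Eq -e "$pattern"; then
      return 0
    fi
  done
  return 1
}

# Build a synthetic token (ghp_ followed by 36 zeros) for the demo.
fake_token="ghp_$(printf '%036d' 0)"
contains_secret "token=$fake_token" && echo "match"
contains_secret "ghp_tooshort" || echo "no match"
```

The fixed lengths in the patterns are what keep short look-alike strings such as `ghp_tooshort` from being flagged.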

`sanitize-output.sh`

Purpose: Output scanning for leaked secrets

Function: Last line of defense - scans AI responses for leaked API keys/tokens

Patterns: Sources from secret-patterns.sh for comprehensive detection

Usage:

./sanitize-output.sh output-file.txt

Outputs:

leaked=true/false to $GITHUB_OUTPUT
Exits with code 1 if secrets detected
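
The contract above (a `leaked` output plus a non-zero exit code) could look roughly like the following. This is a sketch under stated assumptions, not the real sanitize-output.sh: the function name `scan_output_file` is hypothetical, and a single simplified GitHub token pattern stands in for the full shared list.

```shell
#!/usr/bin/env bash
# Sketch of the documented contract; not the real sanitize-output.sh.

# Hypothetical helper: scans a file for one illustrative token pattern.
# Writes leaked=true/false and returns 1 when a match is found.
scan_output_file() {
  local file="$1"
  local out="${GITHUB_OUTPUT:-/dev/stdout}"  # fall back to stdout outside Actions
  if grep -Eq -e 'gh[pous]_[a-zA-Z0-9]{36}' "$file"; then
    echo "leaked=true" >> "$out"
    return 1
  fi
  echo "leaked=false" >> "$out"
  return 0
}
```

A script built around this function would typically end with `scan_output_file "$1"` followed by `exit $?`, so that the calling workflow step fails when a leak is found.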

`sanitize-input.sh`

Purpose: Input sanitization for PR diffs and user prompts

Function:

Removes code comments from diffs (prevents hidden instructions)
Detects CRITICAL patterns (blocks execution with exit 1)
- Direct secret extraction commands (echo $API_KEY, console.log(process.env))
- Environment variable extraction (printenv ANTHROPIC_API_KEY)
- Secret file access (cat .env)
Detects SUSPICIOUS patterns (strips matching lines from output, warns, exit 0)
- Instruction override attempts ("ignore previous instructions")
- System/mode overrides ("system mode", "debug mode")
- Natural language secret requests ("show me the API key")
- System prompt extraction attempts
- Jailbreak attempts
- Encoding/obfuscation (base64, hex)
Detects MEDIUM-RISK patterns (warns but allows execution)
- API key variable names in configuration

Usage:

./sanitize-input.sh input-file.txt output-file.txt

Outputs:

blocked=true/false to $GITHUB_OUTPUT (true only for CRITICAL patterns)
stripped=true/false to $GITHUB_OUTPUT (true when suspicious content was removed)
risk-level=low/medium/high to $GITHUB_OUTPUT
Exits with code 1 only for CRITICAL patterns (direct secret exfiltration)
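
To make the difference between blocking and stripping concrete, here is a simplified sketch of the critical and suspicious tiers. The regular expressions and the function name `sanitize_input` are stand-ins chosen for this example; the real script's patterns are not reproduced here, and the medium-risk tier and `risk-level` output are left out for brevity.

```shell
#!/usr/bin/env bash
# Illustrative two-tier check; the patterns are simplified stand-ins,
# not the ones used by the real sanitize-input.sh.

CRITICAL_RE='echo[[:space:]]+\$[A-Z_]*API_KEY|printenv[[:space:]]+[A-Z_]*API_KEY|cat[[:space:]]+\.env'
SUSPICIOUS_RE='ignore previous instructions|system mode|debug mode|show me the api key'

# Hypothetical helper: reads file $1 and writes sanitized text to file $2.
# Returns 1 for critical content, 0 otherwise.
sanitize_input() {
  local src="$1" dst="$2"
  local gh_out="${GITHUB_OUTPUT:-/dev/stdout}"
  if grep -Eq -e "$CRITICAL_RE" "$src"; then
    echo "blocked=true" >> "$gh_out"
    return 1
  fi
  if grep -Eiq -e "$SUSPICIOUS_RE" "$src"; then
    # Keep only non-matching lines; grep -v exits 1 when nothing is left.
    grep -Eiv -e "$SUSPICIOUS_RE" "$src" > "$dst" || true
    echo "stripped=true" >> "$gh_out"
  else
    cp "$src" "$dst"
    echo "stripped=false" >> "$gh_out"
  fi
  echo "blocked=false" >> "$gh_out"
  return 0
}
```

Critical matches stop before any output file is written, while suspicious matches still produce a usable prompt with the offending lines removed.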

Built-in Protections

Prompt Injection Protection

Removes all code comments before analysis (prevents hidden instructions)
Strips lines matching patterns like "ignore previous instructions" or "show me the API key" before the prompt reaches the agent
Detects encoded requests (base64, hex, ROT13)

Secret Leak Prevention

Scans for API key patterns with specific lengths and formats
Checks for environment variable names in output
Fails the workflow if any secrets are detected
Creates security incident issues automatically

Security Testing

Running Tests

cd tests

# Run security test suite (21 tests)
./test-security.sh

# Run exploit simulation tests (6 tests)
./test-exploits.sh

Test Coverage

test-security.sh (21 tests):

Clean input (should pass)
Prompt injection in comment (should strip, not block)
Clean output (should pass)
Leaked API key (should block)
Leaked GitHub token (should block)
Authorization - OWNER (should pass)
Authorization - COLLABORATOR (should pass)
Authorization - CONTRIBUTOR (should block)
Clean prompt (should pass)
Prompt injection in user prompt (should strip, not block)
Encoded content in prompt (should strip, not block)
Low risk input - normal code (should pass)
Medium risk input - API key variable (should warn but pass)
Critical input - secret exfiltration command (should block)
Regex pattern in output (should NOT flag as leak)
Real GitHub server token (should flag as leak)
Release notes with 'system...models' (should NOT block)
Real 'system mode' injection (should strip, not block)
Verify suspicious content physically removed from output file
Critical pattern (echo $ANTHROPIC_API_KEY) still blocks with exit 1
Mixed suspicious + clean content preserves clean parts

test-exploits.sh (6 tests):

Prompt injection via comment (should be stripped)
High-risk behavioral injection (should be blocked)
Output token leak (should be blocked)
Prompt override attempt (should warn)
Extra args parsing sanity check
Quoted arguments handling

All tests must pass before deployment.

Security in Practice

Basic Usage with Security Checks

- name: Run Agent
  id: agent
  uses: docker/cagent-action@latest
  with:
    agent: my-agent
    prompt: "Analyze the logs"
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

- name: Check for security issues
  if: always()
  run: |
    if [ "${{ steps.agent.outputs.secrets-detected }}" == "true" ]; then
      echo "⚠️ Secret leak detected - incident issue created"
    fi
    if [ "${{ steps.agent.outputs.prompt-suspicious }}" == "true" ]; then
      echo "⚠️ Prompt had suspicious patterns"
    fi

All executions automatically include:

Prompt sanitization warnings
Output scanning for secrets
Incident issue creation if secrets detected
Workflow failure on security violations

Maintenance

Adding New Secret Patterns

When adding new secret patterns:

Update secret-patterns.sh with new regex pattern:

SECRET_PATTERNS=(
  # ... existing patterns ...
  'new-provider-[a-zA-Z0-9]{40}'  # New provider API keys
)

Add to SECRET_PREFIXES if needed for quick checks:
```
SECRET_PREFIXES='(sk-ant-|...|new-provider-)'
```

Run tests to verify:

cd tests
./test-security.sh
./test-exploits.sh

Consider adding a specific test case for the new pattern in test-security.sh
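
A test for a new pattern might look something like the sketch below. It uses the illustrative `new-provider-` pattern from the step above and synthetic values only; how the check would be wired into test-security.sh is not shown, since that file's structure is not documented here.

```shell
#!/usr/bin/env bash
# Hypothetical check for the example 'new-provider-' pattern shown above.
NEW_PATTERN='new-provider-[a-zA-Z0-9]{40}'

# Synthetic values only; never put a real key in a test.
valid_key="new-provider-$(printf '%040d' 0)"
short_key="new-provider-abc123"

if printf '%s\n' "$valid_key" | grep -Eq -e "$NEW_PATTERN"; then
  echo "PASS: full-length key is detected"
else
  echo "FAIL: full-length key was missed"
fi

if printf '%s\n' "$short_key" | grep -Eq -e "$NEW_PATTERN"; then
  echo "FAIL: short string was flagged"
else
  echo "PASS: short string is ignored"
fi
```

Testing both a positive and a negative case helps catch patterns that are too loose as well as ones that are too strict.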

Security Review Checklist

Before deploying changes:

All security tests pass (test-security.sh)
All exploit tests pass (test-exploits.sh)
Shared patterns are used consistently
New patterns added to secret-patterns.sh only
No hardcoded secrets in code
Authorization checks cannot be bypassed
Output scanning covers all execution paths

Security Outputs

The action provides security-related outputs that can be checked in subsequent steps:

| Output | Description |
|--------|-------------|
| `secrets-detected` | `true` if secrets were detected in output |
| `prompt-suspicious` | `true` if suspicious patterns were detected in prompt |

Reporting Security Issues

If you discover a security vulnerability, please:

Do NOT open a public issue
Email security concerns to the maintainers
Provide detailed information about the vulnerability
Allow time for a fix before public disclosure
