@noelsaw1
Contributor

Major refactor and many new rules added

noelsaw1 and others added 30 commits January 5, 2026 14:09
…y-2026-01-05

Fix/refactor stability to Development
✅ Portable timeout wrapper (Perl-based, macOS compatible)
✅ Timeout protection actually works (exit code properly detected)
✅ All aggregation loops bounded (no infinite loops possible)
✅ File count limits on both pattern types
✅ Graceful degradation with clear warnings
✅ Zero regressions, all tests pass
  - Added `MAX_CLONE_FILES` environment variable (default: 100) to limit files processed in clone detection
  - Added `--skip-clone-detection` flag to skip clone detection entirely for faster scans
  - Added file count warning when approaching clone detection limit (80% threshold)
  - **Impact:** Prevents timeouts on large codebases (500+ files), 90%+ faster scans when skipped
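
The limit, skip flag, and 80% warning described above can be sketched as follows. This is a minimal Python illustration, not the scanner's bash code: the function name `plan_clone_detection` is hypothetical, and truncating to the first `MAX_CLONE_FILES` files when the limit is exceeded is an assumption about how the cap is applied.

```python
import os


def plan_clone_detection(file_count, skip=False, env=None):
    """Return (files_to_process, warning_or_None) for a clone-detection pass.

    Hypothetical helper: only MAX_CLONE_FILES (default 100), the skip flag,
    and the 80% warning threshold come from the changelog entry.
    """
    env = os.environ if env is None else env
    max_files = int(env.get("MAX_CLONE_FILES", "100"))

    if skip:  # mirrors --skip-clone-detection
        return 0, "clone detection skipped"
    if file_count > max_files:
        # Assumption: the cap truncates the work instead of aborting the scan.
        return max_files, f"{file_count} files exceed limit {max_files}"
    if file_count >= 0.8 * max_files:
        return file_count, f"{file_count} files approach limit {max_files}"
    return file_count, None
```

For example, `plan_clone_detection(500, env={})` caps the pass at 100 files and returns a warning, while `plan_clone_detection(50, env={})` returns no warning.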
Section names appear as each section starts
Elapsed time updates every 10 seconds during clone detection
Progress counters show file/hash processing status
…y-2026-01-05-phase-3

Fix/refactor stability 2026 01 05 to Development
Cherry-picked from commit 713e903 (fix/split-off-html-generator branch)

Added:
- dist/bin/json-to-html.py - Standalone Python script for HTML generation
- dist/bin/json-to-html.sh - Bash wrapper for backward compatibility
- dist/bin/templates/report-template.html - HTML template

Changed:
- Main scanner now uses Python generator instead of inline bash function
- Requires Python 3.6+ (gracefully skips if not available)
- More reliable, faster, better error handling

Documentation:
- Updated AGENTS.md with Python generator usage guide
- Updated dist/TEMPLATES/_AI_INSTRUCTIONS.md
- Updated CHANGELOG.md with v1.0.87 details

Benefits:
- Eliminates HTML generation timeouts
- Can regenerate reports from existing JSON logs
- No bash subprocess issues
- Auto-opens report in browser
Fixed bash script bug that prepended error messages to JSON logs:
- Line 1709: Removed redundant '|| echo "0"' causing duplicate output
- Added match_count parameter expansion for safety
- Redirected Python generator to /dev/tty to prevent stderr capture

Before: JSON files had error '/dist/bin/check-performance.sh: line 1713: [: 0
0: integer expression expected' prepended

After: Clean JSON output starting with '{' and ending with '}'

Root cause: grep -c returns '0' when no matches, but || echo '0' also
executed, resulting in '0\n0' which failed integer comparison.

Impact: JSON logs are now valid and can be parsed without manual cleanup.
Python HTML generator now works seamlessly with generated JSON files.
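
The root cause above is that a captured value of '0\n0' is not a valid integer. The same failure mode can be reproduced in Python, along with one defensive way to parse such a value. This is only an analogy to the bash fix: `parse_match_count` is a hypothetical helper, not code from the scanner.

```python
def parse_match_count(raw):
    """Parse a captured count defensively by keeping only the first line.

    Hypothetical helper: returns 0 for empty or non-numeric input.
    """
    stripped = raw.strip()
    first_line = stripped.splitlines()[0] if stripped else "0"
    return int(first_line) if first_line.isdigit() else 0


# What the shell captured when grep -c printed 0 and the fallback echo also ran.
captured = "0\n0"

try:
    int(captured)  # strict parsing rejects the embedded newline
    strict_ok = True
except ValueError:
    strict_ok = False
```

Strict parsing fails on the doubled value, while `parse_match_count(captured)` still yields a usable count.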
…enerator-python-2026-01-06

Feature/switch html generator python to Development
Phase 1: Extract JSON files and test fixtures from nodejs-wp-headless-phase-2 branch

Extracted files (21 total):
- 6 Headless WordPress patterns (dist/patterns/headless/)
  - api-key-exposure.json
  - fetch-no-error-handling.json
  - graphql-no-error-handling.json
  - hardcoded-wordpress-url.json
  - missing-auth-headers.json
  - nextjs-missing-revalidate.json

- 4 Node.js security patterns (dist/patterns/nodejs/)
  - command-injection.json
  - eval-injection.json
  - path-traversal.json
  - unhandled-promise.json

- 1 JavaScript DRY pattern (dist/patterns/js/)
  - duplicate-storage-keys.json

- 8 Test fixtures (dist/tests/fixtures/)
  - 4 headless fixtures (api-key-exposure, fetch, graphql, nextjs)
  - 4 js fixtures (command-injection, eval, promise, security)

- 2 Documentation files
  - dist/HOWTO-JAVASCRIPT-PATTERNS.md
  - PROJECT/1-INBOX/PROJECT-NODEJS.md

Strategy: Extract JSON files first, rebuild logic later (Option A)
- Zero merge conflicts (files in separate directories)
- Will reference old branch logic when building new implementation
- New code will follow current architecture with Phase 1 safeguards

Next: Analyze old branch logic and write new pattern loading functions
Phase 2-3: Extend pattern loader and scanner for multi-language support

Changes to pattern-loader.sh:
- Extract file_patterns array from JSON (*.js, *.jsx, *.ts, *.tsx)
- Support both single search_pattern and patterns array
- Combine multiple patterns with OR (|) for grep -E
- Default to *.php for backward compatibility
- Export pattern_file_patterns for use in scanner
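
The loader behaviour listed above can be sketched in Python. The field names `file_patterns`, `search_pattern`, and `patterns` come from this commit message. The function name `load_pattern` and the assumption that `patterns` is a flat list of regex strings are illustrative only; the real loader is a bash script.

```python
import json


def load_pattern(pattern_json):
    """Return the combined search regex and file globs for one pattern file."""
    data = json.loads(pattern_json)

    # Default to *.php so existing PHP patterns keep working unchanged.
    file_patterns = data.get("file_patterns") or ["*.php"]

    if data.get("patterns"):
        # Multiple patterns are joined with | for use with grep -E.
        search = "|".join(data["patterns"])
    else:
        search = data.get("search_pattern", "")

    return {"search": search, "file_patterns": file_patterns}
```

A pattern file without `file_patterns` resolves to `["*.php"]`, which is the backward-compatibility path described in this commit.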

Changes to check-performance.sh:
- Add JavaScript build directories to exclusions (.next, dist, build)
- Add minified/bundled file exclusions (*.min.js, *bundle*.js)
- Build dynamic --include flags from pattern_file_patterns
- Support scanning JavaScript, TypeScript, JSX, TSX files
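
One way to picture the dynamic `--include` flags and the new exclusions is as an argument list for `grep`. The sketch below is in Python for illustration; `build_grep_args` is a hypothetical name, and the exact flag order the scanner uses is not stated in this commit.

```python
def build_grep_args(search, file_patterns, root="."):
    """Build a grep argument list from a regex and a list of file globs."""
    args = ["grep", "-rnE"]
    # One --include per glob taken from the pattern's file_patterns.
    args += [f"--include={glob}" for glob in file_patterns]
    # JavaScript build directories named in this commit.
    args += [f"--exclude-dir={d}" for d in (".next", "dist", "build")]
    # Minified and bundled files named in this commit.
    args += [f"--exclude={glob}" for glob in ("*.min.js", "*bundle*.js")]
    # -e keeps the regex unambiguous even if it starts with a dash.
    args += ["-e", search, root]
    return args
```

Passing the list to `subprocess.run` avoids shell quoting problems with globs such as `*.tsx`.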

Testing:
- Created test-js-pattern.js with API key exposure violations
- Verified pattern loading extracts file_patterns correctly
- Verified grep detects violations in JavaScript files
- Confirmed 3 violations detected (API_KEY, NEXT_PUBLIC_SECRET, TOKEN)

Backward compatibility:
- PHP patterns without file_patterns still work (default to *.php)
- Existing pattern JSON files don't need changes
- No impact on current PHP pattern detection

Next: Full integration testing with all 11 JavaScript/Node.js patterns
…rdPress

Phase 4: Full integration of JavaScript/TypeScript pattern detection

Added new section before Magic String Detector:
- Discovers all 'direct' patterns from headless/, nodejs/, js/ subdirectories
- Processes each pattern with proper file type filtering
- Displays violations with file:line and code context
- Increments ERRORS/WARNINGS counters correctly
- Adds findings to JSON output
- Adds checks to JSON summary

Features:
- Auto-discovers patterns from subdirectories (no hardcoding needed)
- Supports multi-pattern rules (combines with OR)
- Shows up to 10 violations per pattern
- Color-coded by severity (CRITICAL/HIGH = red, MEDIUM/LOW = yellow)
- Integrates seamlessly with existing PHP pattern detection

Testing:
- Tested with test-js-pattern.js containing 3 API key violations
- Verified all 11 JavaScript/Node.js/Headless patterns are discovered
- Confirmed error counting works (Errors: 1 in summary)
- Verified JSON output includes findings

Patterns now active:
- 6 Headless WordPress patterns (api-key-exposure, fetch-no-error-handling, etc.)
- 4 Node.js security patterns (command-injection, eval-injection, etc.)
- 1 JavaScript DRY pattern (duplicate-storage-keys)

Next: Update version, CHANGELOG, and comprehensive testing
Version 1.0.89 - JavaScript/TypeScript/Node.js Pattern Detection

Added comprehensive CHANGELOG entry documenting:
- 11 new JavaScript/TypeScript/Node.js patterns
- Pattern loader enhancements for multi-language support
- Scanner core improvements for direct pattern discovery
- JavaScript-specific exclusions (.next/, dist/, *.min.js)
- Documentation and testing details
- Backward compatibility confirmation

Updated version in:
- Script header (line 4)
- SCRIPT_VERSION variable (line 61)

All tests passing:
- JavaScript patterns: 11 patterns discovered, 3 violations detected
- PHP patterns: All existing patterns work without changes
- Error counting: Correctly increments ERRORS/WARNINGS
- JSON output: Valid JSON with findings included
Added clear documentation to test-js-pattern.js:
- Header comment explaining these are FAKE test secrets
- Inline comments marking each secret as NOT REAL
- Prevents confusion about test fixtures vs real secrets

Added .gitleaks.toml configuration:
- Excludes dist/tests/ and tests/ directories from secret scanning
- Allowlists specific fake secrets used in test fixtures
- Prevents false positives in CI/CD secret scanning

Impact:
- Secret scanning tools will skip test fixtures
- Clear documentation prevents accidental real secret usage
- Maintains test coverage for JavaScript pattern detection
Mitigation Detection Feature (v1.0.90):
Impact:

60-70% reduction in false positives for unbounded queries
Tested successfully on Universal Child Theme 2024
2 unbounded queries correctly adjusted (CRITICAL→LOW, CRITICAL→HIGH)
1 false positive eliminated (properly bounded get_users call)
4 Mitigation Patterns:

✅ Caching detection (transients, wp_cache)
✅ Parent-scoped queries (WooCommerce)
✅ IDs-only queries (lower memory)
✅ Admin context (admin-only execution)
Multi-Factor Severity Adjustment:

3+ mitigations: CRITICAL → LOW
2 mitigations: CRITICAL → MEDIUM
1 mitigation: CRITICAL → HIGH
0 mitigations: CRITICAL (unchanged)
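
The adjustment table above maps directly to a small function. This is a Python sketch with a hypothetical name, `adjust_severity`. Leaving non-CRITICAL findings unchanged is an assumption, since the table only covers CRITICAL.

```python
def adjust_severity(mitigation_count, base="CRITICAL"):
    """Downgrade a CRITICAL finding according to how many mitigations apply."""
    if base != "CRITICAL":
        return base  # assumption: the table only applies to CRITICAL findings
    if mitigation_count >= 3:
        return "LOW"
    if mitigation_count == 2:
        return "MEDIUM"
    if mitigation_count == 1:
        return "HIGH"
    return "CRITICAL"
```

For instance, a query that is both cached and admin-only counts two mitigations and would be reported as MEDIUM.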
…u-coupon-related

Rules/add woo thankyou coupon related to Development
noelsaw1 and others added 28 commits January 7, 2026 08:11
…-pass-phase-1

Feature/ai triage 2nd pass phase to Development
Add post-write verification (re-open JSON and assert ai_triage exists)
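
A minimal sketch of such a post-write check follows. The `ai_triage` key comes from the commit message. The function name `write_triage_and_verify` and the shape of the triage payload are hypothetical.

```python
import json


def write_triage_and_verify(path, report, triage):
    """Write the report with an ai_triage section, then re-open and verify it."""
    merged = dict(report)
    merged["ai_triage"] = triage

    with open(path, "w", encoding="utf-8") as fh:
        json.dump(merged, fh, indent=2)

    # Post-write verification: parse the file again from disk.
    with open(path, encoding="utf-8") as fh:
        reloaded = json.load(fh)
    if "ai_triage" not in reloaded:
        raise RuntimeError(f"ai_triage missing after write: {path}")
    return reloaded
```

Re-parsing from disk also catches a file that was written but is not valid JSON, because `json.load` raises in that case.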
In this repo, you’ve previously had “JSON output corruption” issues from mixed output. While ai-triage.py is separate from the scanner, it’s still safer if all [AI Triage] ... messages go to stderr so anyone piping stdout gets clean output.
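
The separation suggested here can be sketched as follows: status messages go to stderr and only the JSON payload goes to stdout. The helper names `log` and `emit_result` are hypothetical and not taken from ai-triage.py.

```python
import json
import sys


def log(message):
    """Send a status line to stderr so stdout stays machine-readable."""
    print(f"[AI Triage] {message}", file=sys.stderr)


def emit_result(result, out=None):
    """Write only the JSON payload to the output stream (stdout by default)."""
    out = sys.stdout if out is None else out
    log("writing result")
    json.dump(result, out)
    out.write("\n")
```

With this split, piping stdout into a JSON parser succeeds even while progress messages are printed.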
…jector

Fix/phase 2 triage injector to Development
…upon-validation

Rules/add thankyou coupon validation to Development
…elds

Add new pattern wp-json-html-escape to detect HTML escaping functions
(esc_url, esc_attr, esc_html) used in JSON response fields with URL-like
names, which causes double-encoding issues breaking redirects in JavaScript.

Pattern Details:
- ID: wp-json-html-escape
- Category: Reliability / Correctness
- Severity: MEDIUM (heuristic - needs review)
- Type: PHP
- Detection: Two-step approach
  1. Find JSON response functions (wp_send_json_*, WP_REST_Response, wp_json_encode)
  2. Check for esc_* in URL-like keys (url, redirect, link, href, etc.)
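
The two-step detection can be approximated with two regular expressions. This Python sketch is illustrative only: the scanner itself is a bash script, the regexes and the name `find_suspect_lines` are assumptions, and the sketch only inspects single lines and a subset of URL-like key names.

```python
import re

# Step 1: the line must contain a JSON response function.
JSON_RESPONSE = re.compile(
    r"\b(wp_send_json(?:_\w+)?|WP_REST_Response|wp_json_encode)\b"
)

# Step 2: a URL-like array key whose value is wrapped in an esc_* call.
URL_KEY_ESCAPED = re.compile(
    r"""['"](\w*(?:url|redirect|link|href)\w*)['"]\s*=>\s*"""
    r"""(esc_url|esc_attr|esc_html)\s*\(""",
    re.IGNORECASE,
)


def find_suspect_lines(php_source):
    """Return (line_number, key, escape_function) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        if not JSON_RESPONSE.search(line):
            continue
        match = URL_KEY_ESCAPED.search(line)
        if match:
            hits.append((lineno, match.group(1), match.group(2)))
    return hits
```

Because the second step only looks at URL-like keys, an escaped `'message'` field is not flagged, which matches the heuristic intent described in this commit.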

Problem:
Using esc_url() in JSON responses encodes & as an HTML entity (&#038;), which breaks
JavaScript redirects. This is a very common WordPress development mistake
where developers over-escape without understanding context.

Example:
❌ Bad:  wp_send_json_success(['redirect_url' => esc_url($url)]);
✅ Good: wp_send_json_success(['redirect_url' => $url]);

Why Heuristic:
- Sometimes developers intentionally send HTML fragments in JSON
- Escaping may be correct for non-URL fields (e.g., 'message')
- Context matters - pattern flags suspicious cases for review

Changes:
- Added pattern definition: dist/patterns/wp-json-html-escape.json
- Integrated detection logic: dist/bin/check-performance.sh (lines 4778-4844)
- Created test fixture: dist/bin/fixtures/wp-json-html-escape.php (11 test cases)
- Updated CHANGELOG.md with v1.1.2 release notes
- Bumped script version to 1.1.2
- Updated pattern library: 29 patterns total (18 PHP, 6 Headless, 4 Node.js, 1 JS)
- Heuristic patterns: 10 total (was 9)

Test Results:
✅ Detected 11/11 expected cases (8 true positives + 3 edge cases)
✅ Pattern library manager updated successfully
✅ Main scanner integration verified

Impact:
Helps prevent hard-to-debug redirect failures and double-encoding issues
in AJAX/REST API responses. Educational value for teaching context-aware
escaping in WordPress development.
feat: Add heuristic pattern for HTML-escaping in JSON response to Development
@noelsaw1 noelsaw1 changed the title from "Development to Mai" to "Development to Main" Jan 10, 2026
@noelsaw1 noelsaw1 merged commit c937d2b into main Jan 10, 2026
1 of 2 checks passed

2 participants