Skip to content

Conversation

@mostlikelee
Copy link
Contributor

@mostlikelee mostlikelee commented Nov 6, 2025

Related issue: Resolves #35043

The meat of this issue is an atomic table swap when aggregating host counts for vulnerabilities. This approach turned out to be much more performant to the existing approach as well as just moving the inserts into a single transaction.

To ensure no performance regressions, new data seeder and performance tooling was created to seed data based on customer environments.

Checklist for submitter

If some of the following don't apply, delete the relevant line.

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.
    See Changes files for more information.

  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements)

Testing

  • Added/updated automated tests (unable to replicate issue in automated tests)

  • QA'd all new/changed functionality manually

Summary by CodeRabbit

  • Bug Fixes

    • Resolved an issue where vulnerability host counts could be incorrectly reported as missing.
  • Chores

    • Introduced vulnerability performance testing tooling and documentation for quality assurance.

@codecov
Copy link

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 67.44186% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.14%. Comparing base (cfd54cf) to head (53878cd).
⚠️ Report is 73 commits behind head on main.

Files with missing lines Patch % Lines
server/datastore/mysql/vulnerabilities.go 67.44% 8 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #35317      +/-   ##
==========================================
+ Coverage   66.12%   66.14%   +0.01%     
==========================================
  Files        2075     2084       +9     
  Lines      175818   176465     +647     
  Branches     7198     7198              
==========================================
+ Hits       116268   116714     +446     
- Misses      48871    49014     +143     
- Partials    10679    10737      +58     
Flag Coverage Δ
backend 67.75% <67.44%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mostlikelee
Copy link
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 7, 2025

✅ Actions performed

Full review triggered.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 7, 2025

Walkthrough

This PR fixes vulnerabilities disappearing from Fleet during the vulnerability update job by replacing the unsafe in-place count zeroing with an atomic table-swap pattern. It also introduces performance testing tools—a vulnerability data seeder and performance tester—for validating the fix.

Changes

Cohort / File(s) Summary
Vulnerability counting fix
server/datastore/mysql/vulnerabilities.go
Replaces per-scope in-place updates with atomic table-swap mechanism. Introduces ensureSwapTableExists, insertHostCountsIntoTable, and atomicTableSwapVulnerabilityCounts helpers. Refactors batch insertion to accept parameterized table name and transaction context. Eliminates the unsafe zeroing step that caused counts to disappear during failures.
Performance testing documentation
tools/software/vulnerabilities/performance_test/README.md
New README documenting seeder and performance tester usage, CLI options, examples, workflow guidance, and extension patterns.
Vulnerability data seeder
tools/software/vulnerabilities/performance_test/seeder/volume_vuln_seeder.go
New CLI tool for generating test vulnerability data at scale. Includes batch host/team creation, OS updates, software installation, CVE/software mapping, and OS vulnerability seeding. Incorporates deadlock-aware retry logic and step-wise timing reports.
Performance tester
tools/software/vulnerabilities/performance_test/tester/performance_tester.go
New CLI tool for performance testing MySQL datastore operations. Executes configurable iterations of test functions, tracks timing statistics (total, average, min, max, percentiles), and supports verbose output and detailed result summaries.

Sequence Diagram

sequenceDiagram
    participant Job as Vulnerability Job
    participant DS as Datastore
    participant SwapTbl as Swap Table
    participant MainTbl as Main Counts Table

    Note over Job,MainTbl: Atomic Table Swap Pattern (New)
    Job->>DS: UpdateVulnerabilityHostCounts()
    DS->>SwapTbl: Ensure swap table exists & clear
    DS->>DS: Aggregate counts (global, team, no-team)
    DS->>SwapTbl: Batch insert new counts into swap table
    rect rgb(150, 200, 150)
        Note over DS,MainTbl: Atomic operation
        DS->>MainTbl: RENAME TABLE swap→main, main→old
        DS->>MainTbl: DROP old table
    end
    DS->>SwapTbl: Recreate empty swap table
    Job->>Job: Continue (counts always visible, even if job interrupted)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • vulnerabilities.go: Critical fix to core vulnerability counting logic; requires verification of atomic swap correctness, transaction safety, and absence of race conditions.
  • volume_vuln_seeder.go: ~350 lines of new code with deadlock retry logic, batch operations, and orchestrated workflow; moderate complexity in data generation but straightforward control flow.
  • performance_tester.go: ~200 lines of new code; straightforward test harness with CLI parsing and metrics collection; requires validation that timing captures are accurate.
  • README.md: Documentation only; low effort.

Areas requiring extra attention:

  • Verify atomic RENAME TABLE operations are truly atomic and handle concurrent reads correctly
  • Confirm transaction boundaries in insertHostCountsIntoTable prevent partial inserts
  • Validate seeder deadlock retry exhaustion handling
  • Ensure performance tester percentile calculations are statistically sound

Suggested reviewers

  • getvictor
  • jahzielv
  • sgress454

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description check ✅ Passed The PR description adequately covers the main change (atomic table swap), rationale (performance improvement), and includes related issue reference and relevant checklist items, though automated tests are marked as unable to be replicated.
Linked Issues check ✅ Passed The PR implements the atomic table swap solution to fix vulnerabilities disappearing during job execution by preventing the transient zeroing window, directly addressing the root cause in issue #35043.
Out of Scope Changes check ✅ Passed All changes are in-scope: atomic table swap fix in vulnerabilities.go addresses #35043, and new seeding/performance testing tools directly support ensuring no regressions per PR objectives.
Title check ✅ Passed The title 'Atomic vulnerability count calculations' accurately reflects the main change: replacing per-scope updates with an atomic table-swap pattern for vulnerability host counts.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 35043-vuln-counts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f3888f and 03d2c36.

📒 Files selected for processing (5)
  • changes/35043-missing-vuln-counts (1 hunks)
  • server/datastore/mysql/vulnerabilities.go (2 hunks)
  • tools/software/vulnerabilities/performance_test/README.md (1 hunks)
  • tools/software/vulnerabilities/performance_test/seeder/volume_vuln_seeder.go (1 hunks)
  • tools/software/vulnerabilities/performance_test/tester/performance_tester.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

⚙️ CodeRabbit configuration file

When reviewing SQL queries that are added or modified, ensure that appropriate filtering criteria are applied—especially when a query is intended to return data for a specific entity (e.g., a single host). Check for missing WHERE clauses or incorrect filtering that could lead to incorrect or non-deterministic results (e.g., returning the first row instead of the correct one). Flag any queries that may return unintended results due to lack of precise scoping.

Files:

  • tools/software/vulnerabilities/performance_test/tester/performance_tester.go
  • server/datastore/mysql/vulnerabilities.go
  • tools/software/vulnerabilities/performance_test/seeder/volume_vuln_seeder.go
🧠 Learnings (2)
📚 Learning: 2025-08-13T18:20:42.136Z
Learnt from: titanous
Repo: fleetdm/fleet PR: 31075
File: tools/redis-tests/elasticache/iam_auth.go:4-10
Timestamp: 2025-08-13T18:20:42.136Z
Learning: For test harnesses and CLI tools in the Fleet codebase, resource cleanup on error paths (like closing connections before log.Fatalf) may not be necessary since the OS handles cleanup when the process exits. These tools prioritize simplicity over defensive programming patterns used in production code.

Applied to files:

  • tools/software/vulnerabilities/performance_test/README.md
  • tools/software/vulnerabilities/performance_test/tester/performance_tester.go
  • tools/software/vulnerabilities/performance_test/seeder/volume_vuln_seeder.go
📚 Learning: 2025-08-08T07:40:05.301Z
Learnt from: getvictor
Repo: fleetdm/fleet PR: 31726
File: server/datastore/mysql/labels_test.go:2031-2031
Timestamp: 2025-08-08T07:40:05.301Z
Learning: Fleet repo targets Go 1.24.5 (root go.mod), which supports testing.T.Context(). Do not flag usage of t.Context() or suggest replacing it with context.Background() in tests (e.g., server/datastore/mysql/labels_test.go Line 2031 and similar).

Applied to files:

  • tools/software/vulnerabilities/performance_test/tester/performance_tester.go
🔇 Additional comments (1)
tools/software/vulnerabilities/performance_test/seeder/volume_vuln_seeder.go (1)

374-384: Fix compile-time range error.

for i := range softwareCount does not compile—softwareCount is an int, not a slice/map/string/channel. This prevents the seeder from building or running.

-	for i := range softwareCount {
+	for i := 0; i < softwareCount; i++ {
⛔ Skipped due to learnings
Learnt from: getvictor
Repo: fleetdm/fleet PR: 33218
File: orbit/pkg/table/santa/ringbuffer_test.go:48-50
Timestamp: 2025-10-07T19:42:55.988Z
Learning: Go 1.22 introduced the range-over-int feature, allowing `for i := range N` syntax to iterate from 0 to N-1. This is valid and idiomatic Go code in projects using Go 1.22 or later.

@mostlikelee mostlikelee changed the title 35043 vuln counts Atomic vulnerability count calculations Nov 7, 2025
@mostlikelee mostlikelee marked this pull request as ready for review November 7, 2025 15:51
@mostlikelee mostlikelee requested a review from a team as a code owner November 7, 2025 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Vulnerabilities disappearing in Fleet

4 participants