-
Notifications
You must be signed in to change notification settings - Fork 32
fix: package ingestion race condition (TC-3152) #2080
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR eliminates race conditions in PURL ingestion by replacing non-atomic SELECT-then-INSERT sequences with PostgreSQL’s atomic INSERT…ON CONFLICT DO NOTHING pattern (with fallback SELECT), introduces batch and deduplication helpers for base PURLs and PURL status entries, refactors status-entry builders into pure functions, and updates tests to validate parallel ingestion and correct helper ordering. Sequence diagram for atomic PURL ingestion with ON CONFLICT DO NOTHINGsequenceDiagram
participant Loader
participant DB
Loader->>DB: INSERT base_purl ... ON CONFLICT DO NOTHING RETURNING row
alt Insert successful
DB-->>Loader: Return new row
else Conflict (row exists)
DB-->>Loader: Return NULL
Loader->>DB: SELECT base_purl WHERE id = uuid
DB-->>Loader: Return existing row
end
ER diagram for atomic base_purl and purl_status ingestionerDiagram
BASE_PURL {
id UUID PK
type TEXT
namespace TEXT
name TEXT
}
VERSION_RANGE {
id UUID PK
}
PURL_STATUS {
id UUID PK
advisory_id UUID FK
vulnerability_id TEXT
status_id UUID FK
base_purl_id UUID FK
version_range_id UUID FK
context_cpe_id UUID FK
}
BASE_PURL ||--o{ PURL_STATUS : "base_purl_id"
VERSION_RANGE ||--o{ PURL_STATUS : "version_range_id"
PURL_STATUS }o--|| BASE_PURL : "base_purl_id"
PURL_STATUS }o--|| VERSION_RANGE : "version_range_id"
Class diagram for new and refactored PURL status ingestion helpersclassDiagram
class PurlStatusEntry {
+advisory_id: Uuid
+vulnerability_id: String
+purl: Purl
+status: String
+version_info: VersionInfo
+context_cpe: Option<Cpe>
}
class PurlStatusCreator {
-entries: Vec<PurlStatusEntry>
+new()
+add(entry: &PurlStatusEntry)
+create(connection: &C)
}
class PurlCreator {
+create(db)
}
class Purl {
+package_uuid()
+clone()
}
PurlStatusCreator "1" *-- "*" PurlStatusEntry
PurlStatusEntry "1" *-- "1" Purl
PurlStatusEntry "1" *-- "1" VersionInfo
PurlStatusEntry "1" *-- "0..1" Cpe
PurlCreator "1" *-- "*" Purl
Class diagram for refactored status entry builder functionsclassDiagram
class AdvisoryVulnerabilityContext {
+advisory: Advisory
+advisory_vulnerability: AdvisoryVulnerability
}
class Purl {
}
class Range {
+events: Vec<Event>
}
class PurlStatusEntry {
}
class VersionScheme {
}
class VersionSpec {
}
class VersionInfo {
}
AdvisoryVulnerabilityContext "1" *-- "1" Advisory
AdvisoryVulnerabilityContext "1" *-- "1" AdvisoryVulnerability
Range "1" *-- "*" Event
PurlStatusEntry "1" *-- "1" Purl
PurlStatusEntry "1" *-- "1" VersionInfo
VersionInfo "1" *-- "1" VersionScheme
VersionInfo "1" *-- "1" VersionSpec
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `modules/ingestor/tests/parallel.rs:301-302` </location>
<code_context>
+ // progress ingestion tasks
+ let result = futures::future::join_all(tasks).await;
+
+ // now test
+ assert_all_ok(data.len(), result);
+
+ // done
</code_context>
<issue_to_address>
**question (testing):** Question: Is there coverage for concurrent ingestion of identical advisories?
If not, please add a test for concurrent ingestion of the same advisory to verify the race condition fix and ON CONFLICT handling.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| // now test | ||
| assert_all_ok(data.len(), result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (testing): Question: Is there coverage for concurrent ingestion of identical advisories?
If not, please add a test for concurrent ingestion of the same advisory to verify the race condition fix and ON CONFLICT handling.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2080 +/- ##
==========================================
+ Coverage 68.16% 68.24% +0.08%
==========================================
Files 368 369 +1
Lines 20682 20730 +48
Branches 20682 20730 +48
==========================================
+ Hits 14097 14147 +50
+ Misses 5747 5745 -2
Partials 838 838 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ctron
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For creating packages, we do have the PackageCreator. Which should actually take care of all of this. Bundling the logic. Using the same approach.
I'd hate to see another place, with a different style implementing this. So unless it's impossible to re-use that logic, or not able to extend, we should re-use it.
Not so sure but I can not see any usage of |
Most likely because most of the graph creation code wasn't used anymore. There is some inconsistency with all the |
|
I think the situation is more nuanced than the comments suggest. First of all, to my understanding, here is not a matter of adopting The usage of the old (decomissioned?) I've anyway looking into Now, going back to this PR, are we saying that the long-term solution should be to eliminate this mixed |
Yes, I think that's the way forward. |
Can this PR leverage the same exception applied to #2022 that allowed the |
Sorry, I don't understand. |
What's wrong with this PR and its usage of the |
|
So unless there's a very good reason not to use PurlCreator, I don't see a reason duplicate code. Especially it if deviates that much from the pattern we already have. Taking a closer look at this. There seems to be exactly one call, outside of a lot of test code, which uses that code path. It looking at it, the whole logic is useless there anyway, because there is no need to return the ID, as is pre-known. So why do we keep all of this alive? What is the benefit of keeping quite a good portion of code alive just for a single case which can be replaced with code that is known to work in all other places? |
I don't know the benefit but it's the same benefit we had for merging one month ago #2022: which was the benefit? |
|
The benefit was to solve https://issues.redhat.com/browse/TC-2983. |
Great, this PR is solving an issue as well, https://issues.redhat.com/browse/TC-3152. |
|
#2022 re-uses existing code. It's a 3 line change, calling into existing code. This PR creates some new. Rewriting stuff that is currently broken. I am against creating new code or rewriting existing for stuff that already exists. |
Sorry, I don't understand how this explains why not having used the proper |
|
It is in this PR. Back to my question: what are reasons not to use PurlCreator? |
This PR is not using I'm just asking for you to share how |
|
I don't see why 2022 should its using an existing function. |
And this PR is fixing the very same existing function 😉 Before #2022, With #2022, the My questions have all been meant to understand the reason for having introduced |
|
That's the change of #2022:
|
Signed-off-by: mrizzi <mrizzi@redhat.com> Assisted-by: Claude Code
…3152) Signed-off-by: mrizzi <mrizzi@redhat.com> Assisted-by: Claude Code
724557b to
8d1df91
Compare
|
@sourcery-ai summary |
|
@sourcery-ai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider moving the pure helper functions (build_package_status, build_range_from, build_package_status_versions) into their own module to declutter the loader and improve separation of concerns.
- The PurlStatusCreator collects all entries into a single Vec before inserting—if you expect very large ingestions, you may want to chunk or stream entries into the creator to bound memory usage.
- There’s a lot of boilerplate around constructing PurlStatusEntry in different branches—consider a small builder or helper to DRY up the repeated entry‐construction logic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider moving the pure helper functions (build_package_status, build_range_from, build_package_status_versions) into their own module to declutter the loader and improve separation of concerns.
- The PurlStatusCreator collects all entries into a single Vec before inserting—if you expect very large ingestions, you may want to chunk or stream entries into the creator to bound memory usage.
- There’s a lot of boilerplate around constructing PurlStatusEntry in different branches—consider a small builder or helper to DRY up the repeated entry‐construction logic.
## Individual Comments
### Comment 1
<location> `modules/ingestor/src/graph/purl/status_creator.rs:79` </location>
<code_context>
+
+ for entry in self.entries {
+ // Validate status exists
+ let status_id = *status_map
+ .get(&entry.status)
+ .ok_or_else(|| Error::InvalidStatus(entry.status.clone()))?;
</code_context>
<issue_to_address>
**suggestion:** Error variant Error::InvalidStatus may benefit from more context.
Consider adding relevant fields (e.g., advisory_id or purl) to Error::InvalidStatus to make debugging easier.
Suggested implementation:
```rust
.get(&entry.status)
.ok_or_else(|| Error::InvalidStatus {
status: entry.status.clone(),
advisory_id: entry.advisory_id.clone(),
purl: entry.purl.clone(),
})?;
```
You will also need to:
1. Update the definition of `Error::InvalidStatus` in your error enum (likely in a file like `modules/ingestor/src/error.rs` or similar) to be a struct variant with fields: `status: String`, `advisory_id: AdvisoryIdType`, `purl: PurlType` (replace types as appropriate).
2. Update any `Display`, `Debug`, or error handling implementations to format and use these new fields.
3. Update any other code that constructs or matches on `Error::InvalidStatus` to use the new struct variant.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| for entry in self.entries { | ||
| // Validate status exists | ||
| let status_id = *status_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Error variant Error::InvalidStatus may benefit from more context.
Consider adding relevant fields (e.g., advisory_id or purl) to Error::InvalidStatus to make debugging easier.
Suggested implementation:
.get(&entry.status)
.ok_or_else(|| Error::InvalidStatus {
status: entry.status.clone(),
advisory_id: entry.advisory_id.clone(),
purl: entry.purl.clone(),
})?;You will also need to:
- Update the definition of
Error::InvalidStatusin your error enum (likely in a file likemodules/ingestor/src/error.rsor similar) to be a struct variant with fields:status: String,advisory_id: AdvisoryIdType,purl: PurlType(replace types as appropriate). - Update any
Display,Debug, or error handling implementations to format and use these new fields. - Update any other code that constructs or matches on
Error::InvalidStatusto use the new struct variant.

Problem
The original PURL ingestion code used a SELECT-then-INSERT pattern that caused race conditions during parallel ingestion.
When multiple concurrent tasks try to ingest the same PURL:
This happened because:
The
advisories_paralleltest was failing withduplicate key value violates unique constraint "package_pkey".Solution
Introduced
PurlStatusCreatorfor safe and performant purl_status ingestion.Bonus fix: the
assert_all_okfunction respects the input ordering usingnumas first parameter in the asserts reflectingnumbeing the first parameter in the function's signature.Related to https://issues.redhat.com/browse/TC-3152
Summary by Sourcery
Improve package ingestion reliability and performance by switching to atomic upserts, introducing batch insertion utilities for base PURLs and package statuses, refactoring loaders to use these utilities, and adding a parallel ingestion test to ensure thread-safety.
New Features:
Bug Fixes:
Enhancements:
Tests: