feat(fracmanager): implement fraction snapshots with wait group reference counting#277
feat(fracmanager): implement fraction snapshots with wait group reference counting#277eguguchkin wants to merge 1 commit intomainfrom
Conversation
d7cfd54 to
6edb32d
Compare
79ea9fe to
4054ad1
Compare
6edb32d to
38a7993
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #277 +/- ##
==========================================
+ Coverage 71.65% 71.67% +0.02%
==========================================
Files 204 205 +1
Lines 14770 14735 -35
==========================================
- Hits 10583 10562 -21
+ Misses 3435 3423 -12
+ Partials 752 750 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🔴 Performance DegradationSome benchmarks have degraded compared to the previous run. Show table
|
| searchParams, | ||
| tr, | ||
| ) | ||
| release() |
There was a problem hiding this comment.
If search panics, then release() is never called.
|
|
||
| // Collect fractions in correct chronological order: from oldest (remote) to newest (active) | ||
| for _, r := range remotes { | ||
| r.wg = &fs.wg |
There was a problem hiding this comment.
What happens in the following scenario:
- we already have a snapshot built - S1
- search request arrives
- while search request is being executed, snapshot rebuild is called. Let's call it S2
- now we overwriting a wait group inside
...SyncDestroyablestruct with S2 wait group. - at the same time, if we want to destroy a fraction, we call
wg.Wait(). But we might issue wait for wait group S2. Since nobody has calledwg.Addfor a S2 wg, it will return immediately. However, search request might still be working and touch this fraction.
There was a problem hiding this comment.
Besides, we issue s.wg = &fs.wg under fraction registry lock (only a single snapshot can be being under construction at the same time).
However, we read this wg group d.wg.Wait() in Destroy.
i.e. techically it's a data race if understand correctly - it's a unsynchronized read/write of a non-atomic variable. The whole question is what guarantees do we have in Go.
| all []frac.Fraction // all fractions in creation order (read-only view) | ||
| muAll sync.RWMutex // protects active, all, and oldestTotal fields | ||
| appender *syncAppender // currently active writable fraction | ||
| all *fractionsSnapshot // all fractions in creation order (read-only view) |
There was a problem hiding this comment.
If we have search request constantly arriving (let's say 5 requests per second) and taking same fraction snapshot. Can we potentially run into any problems?
Fixes #275